Fundamentals of Data Transformation
Learn how to build aggregation pipelines to process, transform, and analyze data efficiently in MongoDB.
Upon completion of the Fundamentals of Data Transformation skill and assessment, you will earn a Credly badge that you can share with your network.
Learning Objectives

Define Aggregation
Understand how the aggregation framework works in MongoDB, including its purpose, benefits, and common aggregation patterns.

Understand Aggregation Stage Ordering
Learn how the order of aggregation stages impacts performance and results, and apply best practices for efficient pipeline structuring.

Build an Aggregation Pipeline
Learn how to build and optimize aggregation pipelines to efficiently process, transform, and manipulate data using various pipeline stages.
Who is this Course Good for?
If you are a developer who needs to move beyond basic queries and start transforming data directly in MongoDB, this Fundamentals of Data Transformation Skill Badge is designed for you. Developers who are familiar with basic MongoDB CRUD operations and now need to perform more complex data processing in the database will find this skill particularly useful. Here, you’ll learn to use MongoDB’s aggregation framework to process, filter, and reshape data for analytics, reporting, or application features. Whether you are building dashboards, powering APIs, or preparing data for downstream services, this badge helps you understand how to design efficient aggregation pipelines. By the end, you will be comfortable treating MongoDB not just as a data store, but as a powerful engine for data transformation and analysis.
What to Expect in this Course
In this skill badge, you will learn what aggregation means in MongoDB and how the aggregation framework enables you to process and analyze data directly in your database. The course starts by defining aggregation and introducing the concept of an aggregation pipeline as a sequence of stages where each stage performs a specific operation on the data and passes the transformed results to the next stage.
From there, you dive into common aggregation stages and how they work together. You explore typical pipeline sequences such as $match, $group, and $project to filter documents, group them into categories, compute new fields, and return just the fields your application needs. You then expand this sequence with stages like $sort and $limit to prioritize and reduce results, giving you a clear, focused view of your data. These examples show how MongoDB aggregation supports everything from simple filtering to more advanced reporting patterns.
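As a rough sketch of these sequences, the pipelines below are written as plain Python data, which is the form a driver such as PyMongo accepts. The `status`, `customer_id`, and `amount` fields are hypothetical and are not taken from the course material.

```python
# Hypothetical order documents with "status", "customer_id", and "amount" fields.
base_pipeline = [
    # Filter: keep only completed orders.
    {"$match": {"status": "completed"}},
    # Group: one result per customer, with a computed total and order count.
    {"$group": {
        "_id": "$customer_id",
        "total_spent": {"$sum": "$amount"},
        "order_count": {"$sum": 1},
    }},
    # Project: return only the fields the application needs.
    {"$project": {"_id": 0, "customer_id": "$_id", "total_spent": 1, "order_count": 1}},
]

# Expand the sequence: order by the computed total, then keep the top five.
extended_pipeline = base_pipeline + [
    {"$sort": {"total_spent": -1}},
    {"$limit": 5},
]


def stage_names(pipeline):
    """Return the stage operator of each stage, in pipeline order."""
    return [next(iter(stage)) for stage in pipeline]


print(stage_names(base_pipeline))      # ['$match', '$group', '$project']
print(stage_names(extended_pipeline))  # ['$match', '$group', '$project', '$sort', '$limit']
```

With PyMongo, a list like this would typically be passed to a collection's `aggregate()` method; the collection itself is left out here so the sketch runs without a database.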
The course also introduces more advanced stages such as $lookup and $set. If you have experience with SQL, you will recognize $lookup as a way to perform join-like operations between collections when your data model and requirements call for it. You will learn when a $lookup-based approach is appropriate and how it compares to embedding related data in a single document.
To help you build high-performing pipelines, you will examine why the order of stages matters and how it can significantly impact efficiency. You will also learn how to use explain plans to assess the performance characteristics of your aggregation pipelines and identify potential optimizations before they impact production workloads. Hands-on labs give you practice building and iterating on aggregation pipelines in realistic scenarios so you can confidently apply MongoDB aggregation to your own projects.
Summary of the Course
- Define aggregation in MongoDB and explain how aggregation pipelines process and transform data.
- Create aggregation pipelines that combine stages such as $match, $group, and $project to filter, group, and reshape results.
- Extend pipelines with stages like $sort and $limit to order and constrain result sets for clearer insights.
- Understand how stage ordering affects the performance and behavior of MongoDB aggregation pipelines.
- Use common pipeline patterns as starting points and adapt them to specific data and business requirements.
- Apply $lookup and $set to handle scenarios that require combining data from multiple collections or adjusting fields in-flight.
- Evaluate aggregation pipeline performance using explain plans and adjust designs accordingly.
- Build confidence working with MongoDB aggregation through hands-on practice that mirrors real-world data transformation tasks.
Sarah Evans | Senior Curriculum Engineer
Sarah is a Senior Curriculum Engineer on the Curriculum team at MongoDB. Prior to MongoDB, she taught and developed curricula for developer bootcamps. Sarah has a MAT degree from Columbia University Teachers College and studied Software Engineering at Flatiron School in Chicago, IL.
Aaron Becker | Technologist, Education
Aaron Becker is a Technical Trainer, Instructional Designer, and Training Manager who has worked in the tech sector for over 13 years. Before joining the Curriculum team at MongoDB, Aaron worked in DevOps at CircleCI, creating their first Certification course (CircleCI Associate Developer) and leading a team responsible for creating and managing the educational content for CircleCI Academy for external/customer training, as well as CircleCI University for internal team member training.
Prior to that, Aaron worked in data protection, redundancy, and security at Carbonite, where he headed up the Training team, created and delivered ILT training courses for Carbonite's Mid-Market and Enterprise level products, and assisted over 150 employees in earning Microsoft certifications.
Aaron enjoys writing, performing, recording, mixing and mastering music, playing video games, and writing biographical text in the third person.
Manuel Fontan Garcia | Senior Technologist, Education
Manuel is a Senior Technologist on the Curriculum team at MongoDB. Previously, he was a Senior Technical Services Engineer in the Core team at MongoDB. In between, Manuel worked as a database reliability engineer at Slack for a little over two years and then at Cognite until he rejoined MongoDB. With over 15 years of experience in software development and distributed systems, he is naturally curious and holds a Telecommunications Engineering MSc from Vigo University (Spain) and a Free and Open Source Software MSc from Rey Juan Carlos University (Spain).
Daniel Curran | Senior Software Engineer
Daniel is a Senior Software Engineer at MongoDB. Before joining MongoDB, he worked as an Instructional Designer and Content Developer specialising in technical content for a host of international clients. Daniel's goal is to remove obstacles so learners can feel confident on their journey to become masters of MongoDB.
My name is Aaron. I'm a curriculum engineer and technologist at MongoDB, and I'll be your guide as you learn this skill.
The MongoDB aggregation framework is a powerful set of tools and operations that allow you to process and analyze data stored in MongoDB collections.
It enables various data manipulations and transformations through a pipeline of stages, each performing a specific operation on the data and passing the results to the next stage.
You may already be familiar with this concept from Linux, where pipelines allow you to process streams of text data with a series of command line utilities.
Similarly, MongoDB's aggregation pipelines enable you to process streams of data using a series of stages that can filter, project, group, sort, and transform data directly within the database.
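To make the pipe analogy concrete, here is a toy model in pure Python in which each stage is a function that consumes a stream of documents and hands its output to the next stage. This is only an illustration of the idea, not how MongoDB implements aggregation; the function names and sample documents are made up for this sketch.

```python
from collections import defaultdict


def match(docs, predicate):
    """Pass along only the documents that satisfy the predicate."""
    return (d for d in docs if predicate(d))


def group_sum(docs, key, field):
    """Group documents by `key` and sum `field` within each group."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key]] += d[field]
    return ({"_id": k, "total": v} for k, v in totals.items())


def sort_desc(docs, field):
    """Order documents by `field`, largest first."""
    return iter(sorted(docs, key=lambda d: d[field], reverse=True))


# Made-up sample data.
docs = [
    {"category": "books", "qty": 2, "in_stock": True},
    {"category": "games", "qty": 5, "in_stock": True},
    {"category": "books", "qty": 4, "in_stock": True},
    {"category": "games", "qty": 9, "in_stock": False},
]

# Each stage's output becomes the next stage's input, like `a | b | c` in a shell.
filtered = match(docs, lambda d: d["in_stock"])
grouped = group_sum(filtered, "category", "qty")
result = list(sort_desc(grouped, "total"))

print(result)  # [{'_id': 'books', 'total': 6}, {'_id': 'games', 'total': 5}]
```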
To learn about this skill, we'll start with an introduction to MongoDB's aggregation framework. We'll discuss common aggregation stages and how they process data within a pipeline.
As part of this, we'll talk about why the order of the stages in your pipeline matters, since it can have a big impact on overall performance.
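One way to see why order matters is to count how many documents an expensive step has to handle. The pure-Python sketch below uses made-up data and compares sorting before filtering with filtering before sorting. It only models the principle: the MongoDB server has its own optimizer, which can reorder some stage sequences for you, so treat this as an illustration rather than a description of server behavior.

```python
# Made-up data: 1,000 documents with years 2000 through 2024 and a unique score.
docs = [{"year": 2000 + (i % 25), "score": i} for i in range(1000)]


def is_recent(doc):
    return doc["year"] >= 2020


def by_score(doc):
    return doc["score"]


# Order A: sort every document, then filter.
sorted_all = sorted(docs, key=by_score, reverse=True)
result_a = [d for d in sorted_all if is_recent(d)]

# Order B: filter first, then sort only what is left.
filtered = [d for d in docs if is_recent(d)]
result_b = sorted(filtered, key=by_score, reverse=True)

print(len(sorted_all))       # 1000 documents reached the sort in order A
print(len(filtered))         # 200 documents reached the sort in order B
print(result_a == result_b)  # True: same results, less work in order B
```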
After that, we'll explore some common pipeline stage sequences. These are common combinations of stages used to process data. You can use these as a starting point and modify them to fit your data and business goals.
Our first common stage sequence is $match, $group, and $project. This is used to filter data, group it into new categories and compute new fields, and finally return only the fields you want.
After that, we'll introduce two new pipeline stages in another common stage sequence: $match, $group, $sort, and $limit. Here, you can filter data, group it into a category, and compute additional fields, then sort the results and limit them for a crystal clear picture of the results.
The final common pipeline stage sequence we'll cover is $lookup and $set. If you have experience with SQL, you've surely performed a SQL join to combine data from multiple tables. The $lookup stage can help us do just that. While in many instances in MongoDB we can avoid performing a $lookup operation by embedding a one-to-many relationship within a single document, sometimes using $lookup is necessary.
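A minimal sketch of a $lookup and $set sequence, written as Python data, might look like the following. It assumes a hypothetical `orders` collection whose documents carry a `customer_id` that matches `_id` in a hypothetical `customers` collection; none of these names come from the course.

```python
# Intended to run against a hypothetical "orders" collection.
pipeline = [
    # Join-like step: attach matching customer documents as an array field.
    {"$lookup": {
        "from": "customers",          # hypothetical collection to join with
        "localField": "customer_id",  # field on the order documents
        "foreignField": "_id",        # field on the customer documents
        "as": "customer",             # name of the new array field
    }},
    # Adjust fields in-flight: copy the first matched customer's name to the top level.
    {"$set": {"customer_name": {"$arrayElemAt": ["$customer.name", 0]}}},
]

lookup_spec = pipeline[0]["$lookup"]
print(lookup_spec["from"], "->", lookup_spec["as"])  # customers -> customer
```

Because $lookup always writes its matches into an array, the $set stage here pulls out the first element. If the customer data were embedded in each order document instead, neither stage would be needed.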
Finally, we'll talk about how you can assess the performance of your aggregation pipelines using explain plans. Once you're done, you'll have opportunities to practice building your own pipelines by completing hands-on labs that present real-world scenarios.
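As a hedged sketch of what requesting an explain plan can look like, the snippet below builds the document for MongoDB's `explain` database command around an `aggregate` command. The `orders` collection and the pipeline contents are hypothetical. The snippet only constructs the command so that it runs without a server; actually sending it, for example through PyMongo's `Database.command`, is left as a comment.

```python
# Hypothetical pipeline to be explained.
pipeline = [
    {"$match": {"status": "completed"}},
    {"$group": {"_id": "$customer_id", "total_spent": {"$sum": "$amount"}}},
]

# The command name must be the first key; Python dicts preserve insertion order.
explain_command = {
    "explain": {
        "aggregate": "orders",  # hypothetical collection name
        "pipeline": pipeline,
        "cursor": {},
    },
    # "executionStats" asks the server to run the plan and report runtime statistics.
    "verbosity": "executionStats",
}

# Assumption: with a connected PyMongo database handle `db`, this could be sent as
#     plan = db.command(explain_command)
# and the returned document inspected for index use and documents examined.

print(next(iter(explain_command)))  # explain
```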
When you're finished, you'll be ready to put your new skills to the test. To earn your badge, simply complete all the related content and then take the short test at the end.
After passing the test, you'll receive an official Credly badge via the email you provided. Be sure to share your badge on LinkedIn to show off your new skills. By completing this skill badge, you'll know how to transform your data using MongoDB's aggregation framework. Let's get started.
