Fundamentals of Data Transformation

Learn how to build aggregation pipelines to process, transform, and analyze data efficiently in MongoDB.

Upon completion of the Fundamentals of Data Transformation skill and skill check, you will earn a Credly badge that you can share with your network.


Learning Objectives

Define Aggregation Framework

Understand how the aggregation framework works in MongoDB, including its purpose, benefits, and common aggregation patterns.

Understand Aggregation Pipeline Stage Ordering

Learn how the order of aggregation pipeline stages impacts performance and results, and apply best practices for efficient pipeline structuring.

Build an Aggregation Pipeline

Learn how to build and optimize aggregation pipelines to efficiently process, transform, and manipulate data using various pipeline stages.

Sarah Evans | Senior Curriculum Engineer

Sarah is a Senior Curriculum Engineer on the Curriculum team at MongoDB. Prior to MongoDB, she taught and developed curricula for developer bootcamps. Sarah has a MAT degree from Columbia University Teachers College and studied Software Engineering at Flatiron School in Chicago, IL.

Aaron Becker | Technologist, Education

Aaron Becker is a Technical Trainer, Instructional Designer, and Training Manager who has worked in the tech sector for over 13 years. Before joining the Curriculum team at MongoDB, Aaron worked in DevOps at CircleCI, creating their first Certification course (CircleCI Associate Developer) and leading a team responsible for creating and managing the educational content for CircleCI Academy for external/customer training, as well as CircleCI University for internal team member training.

Prior to that, Aaron worked in data protection, redundancy, and security at Carbonite, where he headed up the Training team, created and delivered ILT training courses for Carbonite's Mid-Market and Enterprise level products, and assisted over 150 employees in earning Microsoft certifications.

Aaron enjoys writing, performing, recording, mixing and mastering music, playing video games, and writing biographical text in the third person.

Manuel Fontan Garcia | Senior Technologist, Education

Manuel is a Senior Technologist on the Curriculum team at MongoDB. Previously, he was a Senior Technical Services Engineer on the Core team at MongoDB. In between, Manuel worked as a database reliability engineer at Slack for a little over 2 years and then at Cognite until he rejoined MongoDB. With over 15 years of experience in software development and distributed systems, he is naturally curious and holds a Telecommunications Engineering MSc from Vigo University (Spain) and a Free and Open Source Software MSc from Rey Juan Carlos University (Spain).

Daniel Curran | Senior Software Engineer

Daniel is a Senior Software Engineer at MongoDB. Before joining MongoDB, he worked as an Instructional Designer and Content Developer specialising in technical content for a host of international clients. Daniel's goal is to remove obstacles so learners can feel confident on their journey to become masters of MongoDB.

Welcome. We're excited to help you learn how to process, transform, and analyze your data by using aggregation pipelines in MongoDB.

My name is Aaron. I'm a curriculum engineer and technologist at MongoDB, and I'll be your guide as you learn this skill.

The MongoDB aggregation framework is a powerful set of tools and operations that allow you to process and analyze data stored in MongoDB collections.

It enables various data manipulations and transformations through a pipeline of stages, each performing a specific operation on the data and passing the results to the next stage.

You may already be familiar with this concept from Linux, where pipelines allow you to process streams of text data with a series of command-line utilities.

Similarly, MongoDB's aggregation pipelines enable you to process streams of data using a series of stages that can filter, project, group, sort, and transform data directly within the database.
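To make the idea of stages passing results to one another more concrete, here is a minimal pure-Python analogy. It is only a sketch of the concept, not how MongoDB implements pipelines. The sample documents and the helper names (match, sort_by, project, run_pipeline) are hypothetical and exist only for this illustration.

```python
from functools import reduce

# Hypothetical sample documents, invented for this illustration.
orders = [
    {"item": "pen", "qty": 5, "status": "shipped"},
    {"item": "ink", "qty": 2, "status": "pending"},
    {"item": "pad", "qty": 9, "status": "shipped"},
]


def match(predicate):
    """Return a stage that keeps only documents satisfying the predicate."""
    return lambda docs: (d for d in docs if predicate(d))


def sort_by(field):
    """Return a stage that orders documents by one field, ascending."""
    return lambda docs: iter(sorted(docs, key=lambda d: d[field]))


def project(*fields):
    """Return a stage that keeps only the named fields of each document."""
    return lambda docs: ({f: d[f] for f in fields} for d in docs)


def run_pipeline(docs, stages):
    """Feed the output of each stage into the next one, in order."""
    return list(reduce(lambda stream, stage: stage(stream), stages, docs))


result = run_pipeline(orders, [
    match(lambda d: d["status"] == "shipped"),  # filter
    sort_by("qty"),                             # sort
    project("item", "qty"),                     # keep selected fields
])
print(result)
```

Each helper returns a function that takes a stream of documents and yields a new stream, which loosely mirrors how each pipeline stage hands its output to the next one.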

To learn about this skill, we'll start with an introduction to MongoDB's aggregation framework. We'll discuss common aggregation stages and how they process data within a pipeline.

We'll also talk about why the order of the stages in your pipeline matters: it can have a big impact on overall performance.
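As a rough illustration of why stage order matters, the sketch below defines two pipelines that use the same three stages in a different order. The pipelines are written as Python lists of dictionaries, the form PyMongo accepts. The field names (status, total) and the stage_names helper are assumptions made up for this example.

```python
# Hypothetical field names ("status", "total"), used for illustration only.

# Filter first: only shipped orders are sorted, then the top 5 are kept.
filter_first = [
    {"$match": {"status": "shipped"}},
    {"$sort": {"total": -1}},
    {"$limit": 5},
]

# Limit first: the pipeline takes whichever 5 documents arrive first,
# sorts only those, and then filters them. It can return fewer than 5
# documents, and they are not necessarily the highest totals.
limit_first = [
    {"$limit": 5},
    {"$sort": {"total": -1}},
    {"$match": {"status": "shipped"}},
]


def stage_names(pipeline):
    """Return the stage operator of each stage, in pipeline order."""
    return [next(iter(stage)) for stage in pipeline]


print(stage_names(filter_first))
print(stage_names(limit_first))

# With PyMongo and a running deployment, either list could be passed to
# Collection.aggregate, for example: db.orders.aggregate(filter_first)
# ("db" and "orders" are hypothetical names).
```

Filtering early generally means later stages have fewer documents to work on, which is one reason a match stage often appears at the start of a pipeline.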

After that, we'll explore some common pipeline stage sequences: combinations of stages that are frequently used together to process data. You can use these as a starting point and modify them to fit your data and business goals.

Our first common stage sequence is match, group, and project.

This is used to filter data, group it into new categories and compute new fields, and finally return only the fields you want. After that, we'll introduce two new pipeline stages in another common stage sequence: match, group, sort, and limit.
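As a sketch of the first sequence mentioned above (match, group, and project), the pipeline below is written as a Python list of dictionaries, the form PyMongo accepts. The collection and field names (orders, status, customer_id, total) are hypothetical and chosen only for illustration.

```python
# Hypothetical schema: each order document has "status", "customer_id",
# and "total" fields. These names are assumptions for this example.
match_group_project = [
    # 1. Filter: keep only shipped orders.
    {"$match": {"status": "shipped"}},
    # 2. Group: one output document per customer, with computed fields.
    {"$group": {
        "_id": "$customer_id",
        "order_count": {"$sum": 1},
        "total_spent": {"$sum": "$total"},
    }},
    # 3. Project: return only the fields we want, renaming _id.
    {"$project": {
        "_id": 0,
        "customer_id": "$_id",
        "total_spent": 1,
    }},
]

stage_order = [next(iter(stage)) for stage in match_group_project]
print(stage_order)

# With PyMongo and a running deployment, this could be run as:
#   results = list(db.orders.aggregate(match_group_project))
# ("db" and "orders" are hypothetical names).
```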

Here, you can filter data, group it into a category, and compute additional fields, then sort the results and limit them to get a crystal-clear picture.

The final common pipeline stage sequence we'll cover is lookup and set. If you have experience with SQL, you've surely performed a SQL join to combine data from multiple tables. The lookup stage can help us do just that. In many instances in MongoDB, we can avoid performing a lookup operation by embedding a one-to-many relationship within a single document, but sometimes using lookup is necessary.
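The two sequences described above can be sketched as follows, again as Python lists of dictionaries in the form PyMongo accepts. All collection and field names (orders, customers, status, customer_id, total) are hypothetical. Treat this as one possible shape for these pipelines rather than a recommended design.

```python
# Hypothetical schema: "orders" documents reference "customers" documents
# through a "customer_id" field. All names are assumptions.

# match, group, sort, limit: the three customers who spent the most.
top_customers = [
    {"$match": {"status": "shipped"}},
    {"$group": {"_id": "$customer_id", "total_spent": {"$sum": "$total"}}},
    {"$sort": {"total_spent": -1}},  # highest spenders first
    {"$limit": 3},                   # keep only the top three
]

# lookup, set: attach the matching customer document to each order.
orders_with_customer = [
    {"$lookup": {
        "from": "customers",          # collection to join with
        "localField": "customer_id",  # field in the orders documents
        "foreignField": "_id",        # field in the customers documents
        "as": "customer",             # output array field
    }},
    # $lookup always produces an array, so replace it with its first
    # (and here, presumably only) element.
    {"$set": {"customer": {"$arrayElemAt": ["$customer", 0]}}},
]

print([next(iter(stage)) for stage in top_customers])
print([next(iter(stage)) for stage in orders_with_customer])
```

The set stage here is only one way to tidy up the joined data. Depending on your needs, you might keep the array as is or compute other fields from it.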

Finally, we'll talk about how you can assess the performance of your aggregation pipelines using explain plans. Once you're done, you'll have opportunities to practice building your own pipelines by completing hands-on labs that present real-world scenarios.
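As a hedged sketch of how an explain plan might be requested from Python, the block below builds the document for MongoDB's explain database command around an aggregate command. The collection name (orders) and field names are hypothetical, and the call that sends the command is left as a comment because it needs a running deployment.

```python
# Hypothetical pipeline and names, used for illustration only.
pipeline = [
    {"$match": {"status": "shipped"}},
    {"$group": {"_id": "$customer_id", "total_spent": {"$sum": "$total"}}},
]

# The explain command wraps the command to be explained. The command
# name must be the first key; Python dicts preserve insertion order.
explain_command = {
    "explain": {
        "aggregate": "orders",  # hypothetical collection name
        "pipeline": pipeline,
        "cursor": {},
    },
    # "executionStats" asks the server to run the pipeline and report
    # execution statistics in addition to the chosen plan.
    "verbosity": "executionStats",
}

print(explain_command)

# With PyMongo and a running deployment, this could be sent as:
#   plan = db.command(explain_command)
# ("db" is a hypothetical pymongo Database object).
```

The returned plan can then be inspected to see, for example, whether the match stage was able to use an index.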

When you're finished, you'll be ready to put your new skills to the test. To earn your badge, simply complete all the related content and then take the short test at the end.

After passing the test, you'll receive an official Credly badge via the email you provided. Be sure to share your badge on LinkedIn to show off your new skills. By completing this skill badge, you'll know how to transform your data using MongoDB's aggregation framework. Let's get started.