Lesson 3: Using MongoDB Aggregation Stages with Python: $sort and $project / Learn
Using MongoDB Aggregation Stages with Python: $sort and $project
Review the following code, which demonstrates how to use the $sort and $project stages in a MongoDB aggregation pipeline by using PyMongo.
Using $sort
When we build queries by using the aggregation framework, each stage transforms or organizes data in a specific way. In this lesson, we focused on the $sort and $project stages.
Use the $sort operator to organize the input documents in ascending or descending order. $sort takes a document that specifies the field(s) to sort by and the respective sort order. To sort in ascending order, use the value of 1. For descending order, use the value of -1.
Here's an example of a $sort stage:
# Organize documents in order from highest balance to lowest.
organize_by_original_balance = {"$sort": {"balance": -1}}
Using $project
Use the $project stage to specify the fields returned by the aggregation. $project can be used to include or exclude existing fields by setting a field to 1 to include or 0 to exclude. It can also be used to add new fields or reset the value of existing fields.
To add a new field by using $project, specify the field name and set its value to an expression like this: <field>: <expression>. In this example, the new field name is gbp_balance. The expression contains the $divide arithmetic operator, the $balance field reference, and the conversion_rate_usd_to_gbp variable.
When creating an aggregation pipeline, the $project stage should usually be the last stage in a pipeline because it specifies the exact fields to be returned to the client.
Here's an example of a $project stage:
# Return only the account type & balance fields, plus a new field containing balance in Great British Pounds (GBP).
return_specified_fields = {
"$project": {
"account_type": 1,
"balance": 1,
"gbp_balance": {"$divide": ["$balance", conversion_rate_usd_to_gbp]},
"_id": 0,
}
}
Aggregation Example That Uses $match, $sort, and $project
The following is an example of an aggregation pipeline that uses $match, $sort, and $project:
# Connect to MongoDB cluster with MongoClient
client = MongoClient(MONGODB_URI)
# Get reference to 'bank' database
db = client.bank
# Get a reference to the 'accounts' collection
accounts_collection = db.accounts
# Return the account type, original balance, and balance converted to Great British Pounds (GBP)
# of all checking accounts with an original balance of greater than $1,500 US dollars, in order from highest original balance to lowest.
# To calculate the balance in GBP, divide the original balance by the conversion rate
conversion_rate_usd_to_gbp = 1.3
# Select checking accounts with balances of more than $1,500.
select_accounts = {"$match": {"account_type": "checking", "balance": {"$gt": 1500}}}
# Organize documents in order from highest balance to lowest.
organize_by_original_balance = {"$sort": {"balance": -1}}
# Return only the account type & balance fields, plus a new field containing balance in Great British Pounds (GBP).
return_specified_fields = {
"$project": {
"account_type": 1,
"balance": 1,
"gbp_balance": {"$divide": ["$balance", conversion_rate_usd_to_gbp]},
"_id": 0,
}
}
# Create an aggegation pipeline containing the four stages created above
pipeline = [
select_accounts,
organize_by_original_balance,
return_specified_fields,
]
# Perform an aggregation on 'pipeline'.
results = accounts_collection.aggregate(pipeline)
print(
"Account type, original balance and balance in GDP of checking accounts with original balance greater than $1,500,"
"in order from highest original balance to lowest: ", "\n"
)
for item in results:
pprint.pprint(item)
client.close()