Schema Design Optimization / Optimize Your Schema
When using our bookstore app, we know many entities will be queried frequently, such as reviews, user information, and related books. To optimize overall performance, we want to use as few queries as possible and avoid $lookup operations. Embedding these entities in a single document is not an option, since this will either result in an unbounded array or exceed the 16-megabyte document size limit. However, storing these entities in different collections can impact performance, since we'll need to use expensive $lookup operations or multiple queries to retrieve the data.
So what do we do? In this case, the single collection pattern is a solution to consider. The single collection pattern groups related documents of different types into a single collection. This makes data retrieval more efficient by avoiding multiple queries when reading related documents that aren't embedded.
It also helps avoid expensive $lookup operations to retrieve data from entities that live in different documents. The single collection pattern is particularly useful for modeling many-to-many relationships where you are concerned about data duplication, and embedding isn't a good option. It can also be used for one-to-many relationships. Real-world use cases for this pattern include a catalog of product items or an online shopping cart.
The single collection pattern has two variants. The first is mainly used for many-to-many relationships, and the second applies only to one-to-many relationships. Let's take a look at what each variant might look like in our bookstore application to help us build a books catalog. The first variant uses an array of references and a document type field, allowing us to model both one-to-many and many-to-many relationships.
In our bookstore example, we have three collections, one for each entity used to model the books catalog feature: reviews, users, and books. Let's apply the single collection pattern to one book. To implement the single collection pattern, we first need to add a doc-type field to each document. This will allow us to query the documents by type, as if they were entities living in separate collections.
You may recognize this implementation detail from another pattern, known as the inheritance pattern. Next, we add the related-to field. This will help model the relationships between documents in the new collection using an array of references. All the documents in our example are related to the same book ID, and the book document points to itself.
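As a rough illustration of these two steps, the sketch below builds one book, one user, and one review as plain Python dictionaries and adds the two fields. The field spellings `docType` and `relatedTo`, the helper `tag`, the ID values, and the other fields are assumptions made up for this example; the lesson only names the fields in speech.

```python
# Hypothetical documents for one book. IDs, titles, and field spellings
# (docType, relatedTo) are illustrative assumptions, not the lesson's data.
BOOK_ID = "book-202"

book = {"_id": BOOK_ID, "title": "Sample Title"}
user = {"_id": "user-7", "name": "Sample Reader"}
review = {"_id": "review-31", "rating": 4, "text": "Worth reading."}


def tag(doc, doc_type, related_ids):
    """Return a copy of doc with a type field and an array of references."""
    tagged = dict(doc)
    tagged["docType"] = doc_type
    tagged["relatedTo"] = list(related_ids)
    return tagged


# Every document references the same book; the book points to itself.
catalog_docs = [
    tag(book, "book", [BOOK_ID]),
    tag(user, "user", [BOOK_ID]),
    tag(review, "review", [BOOK_ID]),
]
```

Because the references live in an array, a document could list more than one related ID, which is what lets this variant cover many-to-many relationships.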
Now we just need to move the documents to the books catalog collection. To complete the implementation of the single collection pattern, we have to create an index on the related-to array to support our application queries. We can now easily retrieve all documents related to the specified book, as shown here. We can also query all documents for a given doc type.
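To make the two queries concrete, here is a minimal sketch that imitates them in plain Python, with no database involved. The helpers `matches` and `find` are hypothetical stand-ins for a MongoDB equality filter, and the field spellings and IDs are assumptions. The filter dictionaries have the same shape a MongoDB query document would have, and `index_spec` is written in the key and direction form that PyMongo's `create_index` accepts.

```python
# Hypothetical contents of the single collection (names and IDs assumed).
catalog = [
    {"_id": "book-202", "docType": "book", "relatedTo": ["book-202"]},
    {"_id": "user-7", "docType": "user", "relatedTo": ["book-202"]},
    {"_id": "review-31", "docType": "review", "relatedTo": ["book-202"]},
    {"_id": "review-55", "docType": "review", "relatedTo": ["book-900"]},
]

# Index an application might request on the array of references.
index_spec = [("relatedTo", 1)]


def matches(doc, query):
    """Equality matching only: a scalar condition on an array field
    matches when any element of the array equals it."""
    for field, wanted in query.items():
        value = doc.get(field)
        if isinstance(value, list):
            if wanted not in value:
                return False
        elif value != wanted:
            return False
    return True


def find(collection, query):
    """Return the documents that satisfy every condition in query."""
    return [doc for doc in collection if matches(doc, query)]


# All documents related to one book.
all_for_book = find(catalog, {"relatedTo": "book-202"})

# Only the reviews related to that book.
reviews_for_book = find(catalog, {"relatedTo": "book-202", "docType": "review"})
```

In this sample data the first query returns the book, the user, and one review, while the second returns only that review.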
For example, here, we're querying for all the reviews given to a book.
The second variant of the single collection pattern uses an overloaded field. A field is overloaded when it is used for a purpose other than its original intent. In this case, we're using the field to identify the documents in the single collection and also to enable efficient queries, hence the term "overloaded." This variant is used to model one-to-many relationships only. Let's look at an example of using the single collection pattern to model a one-to-many relationship. In our bookstore app, we want to model the relationship between books and reviews.
To do this, we add and overload the single collection ID field. The book documents will use the book ID as the value of the overloaded field. For the review documents, the single collection ID field value will be overloaded using both the book ID and the review ID, separated by a slash. Now let's move our documents to the books catalog collection.
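The overloaded values described above could be built along these lines. This is only a sketch: the field name `sc_id`, the helper functions, and the sample IDs are assumptions chosen for illustration.

```python
def book_sc_id(book_id):
    """A book document stores just its own ID in the overloaded field."""
    return str(book_id)


def review_sc_id(book_id, review_id):
    """A review stores the parent book ID and its own ID, joined by a slash."""
    return f"{book_id}/{review_id}"


# Hypothetical documents for one book and two of its reviews.
books_catalog = [
    {"sc_id": book_sc_id(202), "title": "Sample Title"},
    {"sc_id": review_sc_id(202, 1), "rating": 5},
    {"sc_id": review_sc_id(202, 2), "rating": 3},
]
```

Each review carries its parent's ID as a prefix, so the one-to-many relationship is encoded in the field value itself rather than in a separate array of references.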
Finally, we need to create an index for the single collection ID to support the application queries. Now that we have implemented the second variant of the single collection pattern, let's see how to query the single collection using the SC ID field. First, we will use a regular expression to query all documents starting with "book ID." Here's what that looks like.
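A minimal sketch of the prefix query follows, again in plain Python rather than against a real database. The helper `find_by_regex`, the field name `sc_id`, and the sample data are assumptions. The second pattern narrows the match to values that continue with a slash, which leaves out the book document itself.

```python
import re

# Hypothetical single collection using the overloaded field (names assumed).
catalog = [
    {"sc_id": "202", "title": "Sample Title"},
    {"sc_id": "202/1", "rating": 5},
    {"sc_id": "202/2", "rating": 3},
    {"sc_id": "317", "title": "Another Title"},
    {"sc_id": "317/1", "rating": 4},
]


def find_by_regex(collection, field, pattern):
    """Return documents whose field value matches the regular expression."""
    compiled = re.compile(pattern)
    return [doc for doc in collection if compiled.search(str(doc.get(field, "")))]


book_id = "202"

# Anchored prefix: the book and everything stored under it.
# Caveat: a bare prefix would also match a longer ID that starts with the
# same characters, so real IDs need a fixed width or a stricter pattern.
book_and_reviews = find_by_regex(catalog, "sc_id", "^" + re.escape(book_id))

# Prefix plus the slash separator: reviews only.
reviews_only = find_by_regex(catalog, "sc_id", "^" + re.escape(book_id) + "/")
```

Both patterns are anchored at the start of the string, which is the kind of expression an index on the field can generally serve well.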
Now we will update the regular expression to filter out the book document and get only the related reviews. Check it out. Nicely done. Let's recap what we learned about the single collection pattern.
The single collection pattern groups related documents of different types into a single collection. It is useful for modeling many-to-many relationships when embedding is not a good option. And it can be used to model one-to-many relationships. We learned how to implement the two variants of the single collection pattern.
First, we can use the document type field and an array of references to related documents. This is useful for modeling many-to-many relationships. When modeling one-to-many relationships, we have an additional option using an overloaded field. Great job.
See you in the next lesson.
