Relational to Document Model / Design Relationships

After identifying relationships between entities, we must now translate them into the MongoDB document model. Should related entities be separate documents that are linked, or should both entities be stored within one document? We must decide whether to use referencing or embedding for these relationships. In this video, we'll focus on the printed book entity. We'll also review a set of guidelines that can help you decide whether to reference or embed a relationship.
A reference establishes a relationship between entities where each entity is typically stored in a separate document and the documents are linked together using a common key. Most often these documents come from different collections, but that is not a requirement. For example, imagine I have separate documents for a printed book and its authors. We can link the documents together using a common key, in this case, an author ID. Within a book document, I have a set of author IDs that reference the author documents. The alternative to referencing is embedding. When embedding, both entities are inside a single document. In our example, the author information is stored within the printed book document in the books collection.
Now the question is, how do we decide whether to reference or embed a relationship? First and foremost, it's important to follow the MongoDB golden rule of schema design: data that is accessed together should be stored together. For example, if we access a book and its author together in our application, the two entities should be stored together. In addition, MongoDB has a set of guidelines that can help you make this decision. You can apply these guidelines to any relationship when deciding whether to embed or reference. For now, let's apply these guidelines to the many-to-many relationship that exists between the printed book and author entities from our bookstore app.
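To make the two options concrete, here is a minimal sketch of the same printed book modeled both ways, using plain Python dictionaries in place of MongoDB documents. The field names (`author_ids`, `authors`, `bio`), the sample values, and the `resolve_authors` helper are assumptions made for illustration; they are not taken from the bookstore app.

```python
# Hypothetical sample data: all field names and values are illustrative only.

# Option 1: referencing. Authors are separate documents, and the book
# document stores only their IDs (the common key).
authors = [
    {"_id": 1, "name": "Author One", "bio": "Writes about databases."},
    {"_id": 2, "name": "Author Two", "bio": "Writes about data modeling."},
]
book_referenced = {
    "_id": 100,
    "title": "Sample Printed Book",
    "author_ids": [1, 2],
}

# Option 2: embedding. The author information lives inside the book document.
book_embedded = {
    "_id": 100,
    "title": "Sample Printed Book",
    "authors": [
        {"name": "Author One", "bio": "Writes about databases."},
        {"name": "Author Two", "bio": "Writes about data modeling."},
    ],
}


def resolve_authors(book, author_docs):
    """Follow the references by looking up each author ID (hypothetical helper)."""
    by_id = {doc["_id"]: doc for doc in author_docs}
    return [by_id[author_id] for author_id in book["author_ids"]]


# With referencing, reading the authors takes an extra lookup step.
referenced_names = [a["name"] for a in resolve_authors(book_referenced, authors)]

# With embedding, the authors arrive together with the book.
embedded_names = [a["name"] for a in book_embedded["authors"]]
```

Both shapes carry the same information. The difference is whether the application needs a second lookup to read a book together with its authors, which is what the golden rule asks us to think about.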
We'll use a series of questions based on each guideline shown here to help us determine whether we should embed or reference our many-to-many relationship. As we move through the questions, we'll keep a simple scorecard of embedding versus referencing using points. We can then compare scores at the end. OK, let's do it.
First up is "Simplicity." Would keeping entities together lead to a simpler data model and simpler code? Yes. Collapsing book and author entities into a single document simplifies our application code. A yes answer in this case is 1 point for embedding.
The second guideline is "Go Together." Do the entities go together? Do they have a "has-a", "contains", or "go together" relationship? Yes. A book has an author. This is another point for embedding.
The next guideline is "Query Atomicity." Does the app query the entities together? We want to load the book's information together with its authors, so yes. This is another point for embedding.
The fourth guideline is "Update Complexity." Are the entities updated together? No. We could change an author's biography without changing information for a book that they wrote. In this case, a no is a point for referencing.
The fifth guideline is "Archival." Should the entities be archived at the same time? No. An author may have more than one book listed in our app. We wouldn't want to archive the author if just one of their books was deleted. This is another point for referencing.
The sixth guideline is "Cardinality." Is there a high cardinality, current or growing, in the child side of the relationship, in our case, the author? This question is asking us to think about the possibility of an unbounded array, which we want to avoid. For our app, we aren't in danger of a high cardinality because it is unlikely that the list of authors for one book will be very long or will change over time. So the answer is no. In this case, no is actually a point for embedding.
Next we have "Data Duplication."
Would data duplication be too complicated to manage, and would it be undesired? In this example, data duplication is not difficult to manage, so we answer no. Once again, this no is actually a point for embedding.
The eighth guideline is "Document Size." Would the combined size of the entities take too much memory or transfer bandwidth for the application? No. One document with both author and book entities would be relatively small. This is another point for embedding.
The ninth guideline is "Document Growth." If we embed, will the embedded piece grow without bound? For the entities in our example, there would be little growth over time, so our answer is no. This is another point for embedding.
The 10th guideline is "Workload." Are the entities written at different times in a write-heavy workload? The entities for printed books and authors will be written at the same time. For adding or updating books, we expect this to occur at a rate of 10 times per hour, so this is not a write-heavy workload. Our answer is no. This is another point for embedding.
The 11th guideline is "Individuality." For the child side of the relationship, in our case, the author, can the entity exist by itself without a parent, in our case, the book? No. An author cannot exist in our application without a book. This is another point for embedding.
Tallying the results, it's clear that we should embed the author or authors with the book for this particular relationship. Whenever we use these guidelines, we need to consider the priority of each in relation to our application requirements. There are times when one guideline will take priority and dictate whether we should embed or reference a relationship. Let's recap what you learned in this video.
When modeling with MongoDB, we can establish a relationship between entities through a reference by including the ID of the child document in the parent document, or we can establish the relationship by embedding, where we include the child document within the parent document. When deciding whether to reference or embed, the golden rule is that data that is accessed together should be stored together. And finally, we covered guidelines that can help you decide whether to define a relationship by embedding or referencing.
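The scorecard built up during the video can be written down and tallied in a few lines. The sketch below records each guideline with the answer and the point awarded in the walkthrough for the printed book and author relationship. The tuple layout and variable names are assumptions for illustration, and the simple count deliberately treats every guideline as equally important.

```python
from collections import Counter

# (guideline, answer, option that receives the point), as given in the walkthrough.
scorecard = [
    ("Simplicity", "yes", "embed"),
    ("Go Together", "yes", "embed"),
    ("Query Atomicity", "yes", "embed"),
    ("Update Complexity", "no", "reference"),
    ("Archival", "no", "reference"),
    ("Cardinality", "no", "embed"),
    ("Data Duplication", "no", "embed"),
    ("Document Size", "no", "embed"),
    ("Document Growth", "no", "embed"),
    ("Workload", "no", "embed"),
    ("Individuality", "no", "embed"),
]

# Count one point per guideline for the option it favors.
tally = Counter(favors for _guideline, _answer, favors in scorecard)

# Pick the option with the most points (an unweighted comparison).
decision = max(tally, key=tally.get)

print(dict(tally))
print(decision)
```

Embedding collects nine points and referencing two, which matches the conclusion to embed the authors in the book document. Because an unweighted count cannot express that one guideline may take priority over the others, the total is best read as a starting point for the decision rather than the decision itself.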

Click below to access the Embed vs Reference PDF.
