Data Modeling in MongoDB: A Complete Guide

In this article, we will learn about data modeling in MongoDB and its different use cases.

So, let's get started...

What is data modeling?

Data modeling is the process of taking the unstructured data generated by a real-world scenario and structuring it into a logical data model in a database, based on a set of criteria.

Different Types of relationships:

There are mainly three types of relationships between data:

1) One to one

In a one-to-one relationship, one field can have only one value.

For example, one movie can have only one name.

2) One to many

In MongoDB, we can divide this relationship into three sub-types:

i) One to Few

For example, one movie can win many awards, but not thousands; there will only be a few.

ii) One to Many

This is the most important relationship and the one most commonly used in MongoDB.

For example, one movie can have many reviews, perhaps hundreds or thousands.

iii) One to Ton

For example, application logs: if you capture the login activity of your application, the logs can eventually grow to millions of entries once you have a large user base.

3) Many to many

For example, one movie can have many actors, and one actor can also play in many movies.



Referenced vs. embedded data:

1) Referenced / Normalized:

Here we create two separate documents and store the reference ID of one document in the other.

For example, we can create one movie document and separate actor documents, then connect them by storing the actors' IDs in the movie document. This is also called child referencing.


๐Ÿ‘ Its easier to query each document on its own 


๐Ÿ‘Ž We need 2 queries to get data from referenced documents
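To make the two-query cost concrete, here is a minimal sketch of the referenced model. It uses plain Python lists as in-memory stand-ins for the two collections rather than a real MongoDB driver, and the sample data and field names (`actor_ids`, `find_movie`, `find_actors`) are hypothetical, chosen only for illustration.

```python
# In-memory stand-ins for two MongoDB collections (hypothetical sample data).
actors = [
    {"_id": 1, "name": "Actor A"},
    {"_id": 2, "name": "Actor B"},
    {"_id": 3, "name": "Actor C"},
]
movies = [
    # The movie stores only the ids of its actors, not the actor data itself.
    {"_id": 100, "title": "Movie X", "actor_ids": [1, 3]},
]


def find_movie(movie_id):
    """Query 1: fetch the movie document by its _id."""
    return next((m for m in movies if m["_id"] == movie_id), None)


def find_actors(actor_ids):
    """Query 2: fetch the actor documents whose _id is in actor_ids."""
    wanted = set(actor_ids)
    return [a for a in actors if a["_id"] in wanted]


movie = find_movie(100)                   # first lookup
cast = find_actors(movie["actor_ids"])    # second lookup, driven by the references
cast_names = [a["name"] for a in cast]
print(cast_names)  # ['Actor A', 'Actor C']
```

Note that the actors can still be listed or updated without touching any movie, which is the advantage mentioned above.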

2) Embedded / Denormalized:

Here we embed the related documents directly inside the main document. An embedded document holds all of its data in one place, so there is no need to create other documents.


๐Ÿ‘This can improve performance as we need fewer queries to get all data.


๐Ÿ‘ŽWe can't get only embedded data, if is there any requirements that happens, so in this case, you have to use normalized data.
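As a rough illustration of the embedded model, the sketch below keeps a movie's awards inside the movie document itself. Again this is plain Python standing in for a collection, not a real driver call, and the field names and sample data are hypothetical.

```python
# In-memory stand-in for a "movies" collection (hypothetical sample data).
movies = [
    {
        "_id": 100,
        "title": "Movie X",
        # The awards are embedded: they live inside the movie document.
        "awards": [
            {"name": "Best Picture", "year": 2020},
            {"name": "Best Director", "year": 2020},
        ],
    },
]


def find_movie(movie_id):
    """Single lookup: the embedded awards come back with the movie."""
    return next((m for m in movies if m["_id"] == movie_id), None)


movie = find_movie(100)
award_names = [a["name"] for a in movie["awards"]]
print(award_names)  # ['Best Picture', 'Best Director']

# The downside: awards have no collection of their own, so listing
# "all awards" means walking through every movie document.
all_awards = [a for m in movies for a in m["awards"]]
print(len(all_awards))  # 2
```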

When to use embedding and referencing:

1) Embedding:

We can use embedding when the following criteria apply:

  • One to Few relationships
  • Data is mostly read
  • Data does not change quickly
  • High read/low write ratio
  • Datasets really belong together (user + email address)

For example, the images of a movie: once they are added, they are not updated regularly.

2) Referencing:

We can use referencing when the following criteria apply:

  • One to Many or One to Ton relationships
  • Data gets updated a lot
  • Low read/high write ratio. For example, movies and reviews: a review can be edited multiple times, and it also changes whenever a user likes it, dislikes it, or marks it as helpful
  • We frequently need to query both datasets on their own. For example, if we often need to fetch only the images, we have to use referencing
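The criteria above can be condensed into a small rule-of-thumb helper. This is only a sketch of the checklist in this article, not an official MongoDB rule; the function name `suggest_model` and its parameters are made up for illustration.

```python
def suggest_model(relationship, write_heavy, queried_on_its_own):
    """Suggest "embed" or "reference" using the checklist from this article.

    relationship: "one-to-few", "one-to-many" or "one-to-ton"
    write_heavy: True if the related data is updated a lot
    queried_on_its_own: True if the related data is often fetched by itself
    """
    # Larger relationships point to referencing.
    if relationship != "one-to-few":
        return "reference"
    # Frequently updated or independently queried data also points to referencing.
    if write_heavy or queried_on_its_own:
        return "reference"
    # Small, mostly-read data that belongs with its parent can be embedded.
    return "embed"


print(suggest_model("one-to-few", False, False))  # embed      (movie + images)
print(suggest_model("one-to-many", True, False))  # reference  (movie + reviews)
print(suggest_model("one-to-ton", True, True))    # reference  (app + login logs)
```

Treat the result as a starting point: real schemas usually weigh these criteria together rather than checking them one by one.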

Types of Referencing:

i) Child referencing:

Here we store the references to the other (child) documents as an array in the main parent document.

Best for: One to Few

ii) Parent referencing:

Here we store the parent's reference ID in each child document.

Suppose we used child referencing to store login logs. The array of references could become very large in the future, and since a BSON document is limited to 16 MB, it could easily exceed that limit. That makes child referencing a poor fit here, so in this case we use parent referencing.

Best for: One to Many, One to Ton
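Here is a small sketch of parent referencing for the login-log case. Each log document carries the id of its parent user, so the user document never grows, no matter how many logs exist. The collections are plain Python lists and the field names (`user_id`, `event`) are hypothetical.

```python
# In-memory stand-ins for two collections (hypothetical sample data).
users = [
    {"_id": 1, "name": "alice"},  # the parent holds no array of log ids
    {"_id": 2, "name": "bob"},
]
login_logs = [
    # Each child document points back to its parent via user_id.
    {"_id": 1001, "user_id": 1, "event": "login"},
    {"_id": 1002, "user_id": 1, "event": "logout"},
    {"_id": 1003, "user_id": 2, "event": "login"},
]


def logs_for_user(user_id):
    """Find all child documents that reference the given parent."""
    return [log for log in login_logs if log["user_id"] == user_id]


alice_logs = logs_for_user(1)
print(len(alice_logs))  # 2

# Adding a new log only inserts a child document; the user document is untouched.
login_logs.append({"_id": 1004, "user_id": 1, "event": "login"})
print(len(logs_for_user(1)))  # 3
```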

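iii) Two-way referencing:

Here both sides store references to each other. For example, a movie stores the IDs of its actors, and each actor stores the IDs of their movies.

Best for: Many to Many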
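As a sketch of two-way referencing for the movies and actors example, the snippet below records each relationship on both documents. It uses plain Python dictionaries keyed by id instead of a real database, and the helper name `link` and the fields `actor_ids` and `movie_ids` are assumptions made for illustration.

```python
# In-memory stand-ins for two collections, keyed by _id (hypothetical data).
movies = {
    100: {"_id": 100, "title": "Movie X", "actor_ids": []},
    101: {"_id": 101, "title": "Movie Y", "actor_ids": []},
}
actors = {
    1: {"_id": 1, "name": "Actor A", "movie_ids": []},
    2: {"_id": 2, "name": "Actor B", "movie_ids": []},
}


def link(movie_id, actor_id):
    """Store the relationship on both sides so the two stay in sync."""
    movie, actor = movies[movie_id], actors[actor_id]
    if actor_id not in movie["actor_ids"]:
        movie["actor_ids"].append(actor_id)
    if movie_id not in actor["movie_ids"]:
        actor["movie_ids"].append(movie_id)


link(100, 1)  # Actor A plays in Movie X
link(100, 2)  # Actor B plays in Movie X
link(101, 1)  # Actor A also plays in Movie Y

print(movies[100]["actor_ids"])  # [1, 2]
print(actors[1]["movie_ids"])    # [100, 101]
```

The trade-off is that every change has to be written in two places, which is why the helper updates both documents together.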
