Unlike relational databases, there are no native methods for relations in MongoDB. However, we can try to create relations by referencing the _id between documents.
We will continue to use the sample database, sample_supplies, for this demo. In the last chapter, we created the collection customers and applies some validation rules. I will drop the existing db and recreate it with db.createCollection(...)
method.
> use sample_supplies
switched to db sample_supplies
> show collections
customers
sales
> db.customers.drop()
true
Create a customers collection with email as unique index
> db.createCollection( "customers",
{
validator: {
$jsonSchema: {
required: [ "email" ],
properties: {
email: {
bsonType: "string",
description: "unique email address"
}
}
}
}
}
)
{ "ok" : 1 }
> db.customers.createIndex( { email: 1 }, { unique: true })
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
Create a backup of sales collection
Before we modify the sales collection, we want to create a copy of it in case anything goes wrong.
Two aggregation stages can help us do that: $merge
and $out
. A key difference is that, $out
replaces a collection if it exists, while $merge
will merge with an existing collection. Note that $out
is only available since MongoDB 4.4, while $merge
is available since MongoDB 4.2.
Since we want to clone the collection as a backup, we can use the $out
stage:
> db.sales.aggregate([
{
$out: "sales_backup"
}
])
// No output
Check that the backup collection is created:
> show collections
customers
sales
sales_backup
> db.sales.count()
5000
> db.sales_backup.count()
5000
Copy customer data to customers collection
> db.sales.aggregate([
{
$project: {
_id: 0,
customer: 1
}
},
{
$set: {
email: "$customer.email",
age: "$customer.age",
gender: "$customer.gender"
}
},
{
$unset: "customer"
},
{
$merge: {
into: "customers",
on: "email"
}
}
])
// No output
Explaining the aggregation stages:
- Only get the customer field and remove _id field.
- Set email, age and gender fields.
- Unset the customer field.
- Merge the resulting documents with the collection customers.
Check the customers collection:
> db.customers.find()
{ "_id" : ObjectId("60b4e5964913a878e72bcf62"), "email" : "[email protected]", "age" : 42, "gender" : "M" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf63"), "email" : "[email protected]", "age" : 50, "gender" : "M" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf64"), "email" : "[email protected]", "age" : 51, "gender" : "M" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf65"), "email" : "[email protected]", "age" : 45, "gender" : "F" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf66"), "email" : "[email protected]", "age" : 40, "gender" : "M" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf67"), "email" : "[email protected]", "age" : 44, "gender" : "M" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf68"), "email" : "[email protected]", "age" : 40, "gender" : "F" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf69"), "email" : "[email protected]", "age" : 71, "gender" : "M" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf6a"), "email" : "[email protected]", "age" : 26, "gender" : "M" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf6b"), "email" : "[email protected]", "age" : 57, "gender" : "M" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf6c"), "email" : "[email protected]", "age" : 34, "gender" : "F" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf6d"), "email" : "[email protected]", "age" : 49, "gender" : "F" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf6e"), "email" : "[email protected]", "age" : 59, "gender" : "F" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf6f"), "email" : "[email protected]", "age" : 55, "gender" : "M" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf70"), "email" : "[email protected]", "age" : 36, "gender" : "F" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf71"), "email" : "[email protected]", "age" : 28, "gender" : "M" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf72"), "email" : "[email protected]", "age" : 53, "gender" : "F" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf73"), "email" : "[email protected]", "age" : 50, "gender" : "F" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf74"), "email" : "[email protected]", "age" : 40, "gender" : "F" }
{ "_id" : ObjectId("60b4e5964913a878e72bcf75"), "email" : "[email protected]", "age" : 39, "gender" : "F" }
Type "it" for more
Move satisfaction score to a separate field in sales
> db.sales.aggregate([
{
$project: {
customer: 1
}
},
{
$set: {
satisfaction: "$customer.satisfaction"
}
},
{
$unset: "customer.satisfaction"
},
{
$merge: {
into: "sales",
on: "_id",
}
}
])
Check that it worked:
> db.sales.findOne({}, { items: 0 })
{
"_id" : ObjectId("5bd761dcae323e45a93ccfe8"),
"saleDate" : ISODate("2015-03-23T21:06:49.506Z"),
"storeLocation" : "Denver",
"customer" : {
"gender" : "M",
"age" : 42,
"email" : "[email protected]"
},
"couponUsed" : true,
"purchaseMethod" : "Online",
"satisfaction" : 4
}
Create a many-to-one relation
Replace customer with the *id of the customer:
> db.sales.aggregate([
{
$project: {
customer: 1
}
},
{
$lookup: {
from: "customers",
localField: "customer.email",
foreignField: "email",
as: "customer",
}
},
{
$set: {
customer: {
$arrayElemAt: [ "$customer", 0 ]
}
}
},
{
$set: {
customer: "$customer._id"
}
},
{
$merge: {
into: "sales",
on: "_id"
}
}
])
Confirm that it worked:
> db.sales.findOne({}, { items: 0 })
{
"_id" : ObjectId("5bd761dcae323e45a93ccfe8"),
"saleDate" : ISODate("2015-03-23T21:06:49.506Z"),
"storeLocation" : "Denver",
"customer" : ObjectId("60b4e5964913a878e72bcf62"),
"couponUsed" : true,
"purchaseMethod" : "Online",
"satisfaction" : 4
}
Now we have a one-way many-to-one relation, which means multiple sales documents can point to one customer. However, to find all the sales records of a customer, we still need to loop through the whole sales collection. To create a bidirectional one-to-many relation, we can create an array in a customer document to store all the _id of sales documents.
Bidirectional one-to-many relation
Backup again:
> db.sales.aggregate([
{
$out: "sales_backup_related"
}
])
> db.cuctomers.aggregate([
{
$out: "customers_backup"
}
])
Populate customer documents with sales document _id:
> db.customers.aggregate([
{
$project: {
_id: 1
}
},
{
$lookup: {
from: "sales",
let: { customer_id: "$_id" },
pipeline: [
{
$match: {
$expr: {
$eq: [ "$$customer_id", "$customer" ]
}
}
},
{
$project: {
_id: 1
}
}
],
as: "sales"
}
},
{
$unwind: "$sales"
},
{
$set: {
sales: "$sales._id"
}
},
{
$group: {
_id: "$_id",
sales: {
$addToSet: "$sales"
}
}
},
{
$merge: {
into: "customers",
on: "_id"
}
}
])
This aggregation may take some time, because, for every customer, MongoDB needs to loop through all the sales records to find matching documents.
In the end, we will have a customer collection with every customer having an array of sales document _id. Doing so allows us to:
- Fetch individual sales document by _id
- Count related sales documents by array length
- Update customer details only once, without modifying multiple sales documents
> let customer = db.customers.findOne({ email: "[email protected]" })
> db.sales.find(
{
_id: {
$in: customer.sales
}
},
{
items: 0
}
).pretty()
Output:
{
"_id" : ObjectId("5bd761dcae323e45a93cd04b"),
"saleDate" : ISODate("2014-07-16T04:13:47.488Z"),
"storeLocation" : "Austin",
"customer" : ObjectId("60b4e5964913a878e72bcfc5"),
"couponUsed" : false,
"purchaseMethod" : "Online",
"satisfaction" : 3
}