Sample data source
The developers of MongoDB provide a variety of sample datasets, which are described here:
https://developer.mongodb.com/article/atlas-sample-datasets/
All available sample datasets can be downloaded directly as a single archive (307 MB):
https://atlas-education.s3.amazonaws.com/sampledata.archive
Import data
- Navigate to the folder /home/ricky/Downloads.
- Download the sample data:
  wget https://atlas-education.s3.amazonaws.com/sampledata.archive
- Import the data using the mongorestore tool:
  mongorestore --archive=sampledata.archive
Last line of output:
433281 document(s) restored successfully. 0 document(s) failed to restore.
Explore sample data
Show all DBs:
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
logbook 0.000GB
people 0.000GB
sample_airbnb 0.051GB
sample_analytics 0.009GB
sample_geospatial 0.001GB
sample_mflix 0.040GB
sample_restaurants 0.006GB
sample_supplies 0.001GB
sample_training 0.040GB
sample_weatherdata 0.002GB
Use DB sample_supplies, and show all collections:
> use sample_supplies
switched to db sample_supplies
> show collections
sales
Explore the collection sales:
// Count the number of documents in this collection.
// (count() is deprecated in newer shells; countDocuments() is the current equivalent.)
> db.sales.count()
5000
// Find the first document in the collection.
> db.sales.findOne()
{
"_id" : ObjectId("5bd761dcae323e45a93ccfe8"),
"saleDate" : ISODate("2015-03-23T21:06:49.506Z"),
"items" : [
{
"name" : "printer paper",
"tags" : [
"office",
"stationary"
],
"price" : NumberDecimal("40.01"),
"quantity" : 2
},
{
"name" : "notepad",
"tags" : [
"office",
"writing",
"school"
],
"price" : NumberDecimal("35.29"),
"quantity" : 2
},
{
"name" : "pens",
"tags" : [
"writing",
"office",
"school",
"stationary"
],
"price" : NumberDecimal("56.12"),
"quantity" : 5
},
{
"name" : "backpack",
"tags" : [
"school",
"travel",
"kids"
],
"price" : NumberDecimal("77.71"),
"quantity" : 2
},
{
"name" : "notepad",
"tags" : [
"office",
"writing",
"school"
],
"price" : NumberDecimal("18.47"),
"quantity" : 2
},
{
"name" : "envelopes",
"tags" : [
"stationary",
"office",
"general"
],
"price" : NumberDecimal("19.95"),
"quantity" : 8
},
{
"name" : "envelopes",
"tags" : [
"stationary",
"office",
"general"
],
"price" : NumberDecimal("8.08"),
"quantity" : 3
},
{
"name" : "binder",
"tags" : [
"school",
"general",
"organization"
],
"price" : NumberDecimal("14.16"),
"quantity" : 3
}
],
"storeLocation" : "Denver",
"customer" : {
"gender" : "M",
"age" : 42,
"email" : "[email protected]",
"satisfaction" : 4
},
"couponUsed" : true,
"purchaseMethod" : "Online"
}
From the output, we can see the structure of the document:
- _id: ObjectId
- saleDate: ISODate
- items: An array of items
  - name: Item name
  - tags: An array of item tags
  - price: Price of the item, a NumberDecimal with 2 decimal places
  - quantity: An integer quantity
- storeLocation: The name of the store location
- customer: An object recording the customer details
  - gender: "M" or "F"
  - age: Integer age
  - email: An email address
  - satisfaction: An integer satisfaction score
- couponUsed: A boolean indicating whether a coupon was used
- purchaseMethod: The name of the purchase method
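The structure listed above can also be derived programmatically. The following is a minimal sketch in plain JavaScript (runnable in Node.js, no MongoDB connection needed). The helper name describeShape is hypothetical, and the sale object is a trimmed stand-in for the document shown earlier: ObjectId and NumberDecimal values are replaced by plain strings, only one item is kept, and the email field is left out.

```javascript
// Hypothetical helper: summarizes the shape of a plain JavaScript value.
function describeShape(value) {
  if (Array.isArray(value)) {
    // Describe an array by its first element (assumes uniform elements).
    return value.length > 0 ? [describeShape(value[0])] : [];
  }
  if (value instanceof Date) return "date";
  if (value !== null && typeof value === "object") {
    const shape = {};
    for (const [key, inner] of Object.entries(value)) {
      shape[key] = describeShape(inner);
    }
    return shape;
  }
  return typeof value; // "string", "number", "boolean", ...
}

// Trimmed stand-in for the sales document (assumption: plain strings
// replace ObjectId and NumberDecimal, which only exist in the shell).
const sale = {
  _id: "5bd761dcae323e45a93ccfe8",
  saleDate: new Date("2015-03-23T21:06:49.506Z"),
  items: [
    { name: "printer paper", tags: ["office", "stationary"], price: "40.01", quantity: 2 },
  ],
  storeLocation: "Denver",
  customer: { gender: "M", age: 42, satisfaction: 4 },
  couponUsed: true,
  purchaseMethod: "Online",
};

const shape = describeShape(sale);
console.log(JSON.stringify(shape, null, 2));
```

Because the sketch only inspects the first element of each array, it would miss fields that appear only in later elements; for a real collection, sampling several documents would be more reliable.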
distinct() method
We can further explore the collection by using some query methods:
// Find all the distinct values of customer gender.
> db.sales.distinct("customer.gender")
[ "F", "M" ]
// Find all the distinct values of item tags.
> db.sales.distinct("items.tags");
[
"electronics",
"general",
"kids",
"office",
"organization",
"school",
"stationary",
"travel",
"writing"
]
// Find all the distinct item names.
> db.sales.distinct("items.name");
[
"backpack",
"binder",
"envelopes",
"laptop",
"notepad",
"pens",
"printer paper"
]
// Find all the purchase methods.
> db.sales.distinct("purchaseMethod");
[ "In store", "Online", "Phone" ]
// Find all the store locations.
> db.sales.distinct("storeLocation");
[ "Austin", "Denver", "London", "New York", "San Diego", "Seattle" ]
// Find the number of documents with store location "Austin".
> db.sales.count({
... storeLocation: "Austin"
... })
676
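To make it clearer what the queries above do with dotted paths such as "items.tags", here is a simplified sketch in plain JavaScript (runnable in Node.js without a database). It is not MongoDB's implementation: the helpers valuesAtPath, distinctValues, and countMatching are hypothetical names, the two documents are made-up stand-ins rather than real rows from the collection, and the filter only supports equality on top-level fields.

```javascript
// Hypothetical helper: resolve a dotted path, fanning out over arrays on the way.
function valuesAtPath(value, parts) {
  if (Array.isArray(value)) {
    // Each array element is treated as a separate candidate value.
    return value.flatMap((element) => valuesAtPath(element, parts));
  }
  if (parts.length === 0) return value === undefined ? [] : [value];
  if (value === null || typeof value !== "object") return []; // path does not exist
  return valuesAtPath(value[parts[0]], parts.slice(1));
}

// Rough analogue of distinct(path); sorting is only for readable output.
function distinctValues(docs, path) {
  const all = docs.flatMap((doc) => valuesAtPath(doc, path.split(".")));
  return [...new Set(all)].sort();
}

// Rough analogue of count(filter), equality on top-level fields only.
function countMatching(docs, filter) {
  return docs.filter((doc) =>
    Object.entries(filter).every(([field, expected]) => doc[field] === expected)
  ).length;
}

// Made-up stand-in documents (assumption: not taken from the real dataset).
const docs = [
  {
    storeLocation: "Denver",
    purchaseMethod: "Online",
    items: [
      { name: "printer paper", tags: ["office", "stationary"] },
      { name: "notepad", tags: ["office", "writing", "school"] },
    ],
  },
  {
    storeLocation: "Austin",
    purchaseMethod: "In store",
    items: [{ name: "laptop", tags: ["electronics", "school", "office"] }],
  },
];

console.log(distinctValues(docs, "items.tags"));
console.log(distinctValues(docs, "storeLocation"));
console.log(countMatching(docs, { storeLocation: "Austin" }));
```

The key point is the fan-out in valuesAtPath: a path that crosses an array (items) and ends in another array (tags) still yields a flat list of individual values, which is why a single distinct call returns individual tags rather than whole tag arrays.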