Schema Design Basics
  Roger Bodamer, roger@10gen.com, @rogerb
A brief history of Data Modeling
  ISAM COBOL
  Network
  Hierarchical
  Relational
    1970 E.F. Codd introduces 1st Normal Form (1NF)
    1971 E.F. Codd introduces 2nd and 3rd Normal Form (2NF, 3NF)
    1974 Codd & Boyce define Boyce/Codd Normal Form (BCNF)
    2002 Date, Darwen, Lorentzos define 6th Normal Form (6NF)
  Object
So why model data?
Modeling goals
  Goals:
  - Avoid anomalies when inserting, updating or deleting
  - Minimize redesign when extending the schema
  - Make the model informative to users
  - Avoid bias towards a particular style of query
  * source: Wikipedia
Relational made normalized data look like this
Document databases make normalized data look like this
Some terms before we proceed
  RDBMS             Document DBs
  Table             Collection
  View / Row(s)     JSON Document
  Index             Index
  Join              Embedding & Linking across documents
  Partition         Shard
  Partition Key     Shard Key
Recap
  Design documents that simply map to your application
  post = { author : "roger",
           date : new Date(),
           text : "I love J.Biebs...",
           tags : ["rockstar", "puppy-love"] }
Query operators
  Conditional operators:
    $ne, $in, $nin, $mod, $all, $size, $exists, $type, ..
    $lt, $lte, $gt, $gte
    // find posts with any tags
    > db.posts.find({ tags : { $exists : true } })
  Regular expressions:
    // posts where author starts with r (case-insensitive)
    > db.posts.find({ author : /^r/i })
  Counting:
    // posts written by roger
    > db.posts.find({ author : "roger" }).count()
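
The three query styles on this slide can be tried out on a plain JavaScript array to see which documents each one selects. This is a minimal in-memory sketch, not MongoDB code: the sample `posts` data and the variable names are assumptions made for illustration.

```javascript
// In-memory stand-in for the posts collection (sample data is made up).
const posts = [
  { author: "roger", text: "I love J.Biebs...", tags: ["rockstar", "puppy-love"] },
  { author: "Rita", text: "No tags on this one" },
  { author: "mike", text: "Hello", tags: [] },
];

// Roughly what { tags : { $exists : true } } selects: the field is present,
// even when the array is empty.
const withTags = posts.filter((p) => "tags" in p);

// Roughly what an anchored, case-insensitive prefix regex such as /^r/i selects.
const startsWithR = posts.filter((p) => /^r/i.test(p.author));

// Roughly what find({ author : "roger" }).count() returns.
const rogerCount = posts.filter((p) => p.author === "roger").length;

console.log(withTags.length, startsWithR.map((p) => p.author), rogerCount);
```

Note that a pattern like `/^r*/i` would match every author, because `r*` also matches zero characters; the prefix test needs `/^r/i`.
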
Extending the Schema
  new_comment = { author : "Gretchen",
                  date : new Date(),
                  text : "Biebs is Toll!!!!" }
  new_info = { "$push" : { comments : new_comment },
               "$inc" : { comments_count : 1 } }
  > db.posts.update({ _id : "..." }, new_info)
Extending the Schema
  { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
    author : "roger",
    date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
    text : "I love J.Biebs...",
    tags : [ "rockstar", "puppy-love" ],
    comments_count : 1,
    comments : [
      { author : "Gretchen",
        date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
        text : "Biebs is Toll!!!!" }
    ] }
Extending the Schema
  // create index on nested documents:
  > db.posts.ensureIndex({ "comments.author" : 1 })
  > db.posts.find({ "comments.author" : "Gretchen" })
  // find last 5 posts:
  > db.posts.find().sort({ date : -1 }).limit(5)
  // most commented post:
  > db.posts.find().sort({ comments_count : -1 }).limit(1)
  When sorting, check if you need an index
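
To make the effect of the combined `$push` / `$inc` update and the follow-up queries concrete, here is a small in-memory sketch in plain JavaScript. It is not driver code: `addComment` is a hypothetical helper, and the second post is made-up sample data.

```javascript
// In-memory stand-in for one post document (values are illustrative).
const post = { author: "roger", comments: [], comments_count: 0 };

// Hypothetical helper: appends a comment and bumps the counter together,
// which is the effect of combining $push and $inc in a single update.
function addComment(doc, comment) {
  doc.comments.push(comment);
  doc.comments_count += 1;
  return doc;
}

addComment(post, { author: "Gretchen", date: new Date(), text: "Biebs is Toll!!!!" });

const allPosts = [{ author: "mike", comments: [], comments_count: 0 }, post];

// Array equivalent of find().sort({ comments_count : -1 }).limit(1).
const mostCommented = [...allPosts].sort((a, b) => b.comments_count - a.comments_count)[0];

// Array equivalent of matching on the nested field "comments.author".
const byGretchen = allPosts.filter((p) => p.comments.some((c) => c.author === "Gretchen"));

console.log(mostCommented.author, byGretchen.length);
```

Keeping `comments_count` next to the array is what lets the "most commented" query sort on a single scalar field instead of computing array lengths.
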
Single Table Inheritance
  > db.shapes.find()
  { _id : ObjectId("..."), type : "circle", area : 3.14, radius : 1 }
  { _id : ObjectId("..."), type : "square", area : 4, d : 2 }
  { _id : ObjectId("..."), type : "rect", area : 10, length : 5, width : 2 }
  // find shapes where radius > 0
  > db.shapes.find({ radius : { $gt : 0 } })
  // create index
  > db.shapes.ensureIndex({ radius : 1 })
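
The point of the shapes example is that documents of different "subtypes" share one collection, and a query on a subtype-specific field simply skips documents that lack it. A minimal in-memory sketch of that behaviour, using a plain array in place of the collection (the variable names are assumptions):

```javascript
// One array holds every shape type, as the shapes collection does.
const shapes = [
  { type: "circle", area: 3.14, radius: 1 },
  { type: "square", area: 4, d: 2 },
  { type: "rect", area: 10, length: 5, width: 2 },
];

// Array equivalent of { radius : { $gt : 0 } }: shapes without a radius
// field do not match and need no special handling.
const withRadius = shapes.filter((s) => typeof s.radius === "number" && s.radius > 0);

console.log(withRadius.map((s) => s.type));
```
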
One to Many
  - Embedded Array / Array Keys
      - slice operator to return subset of array
      - hard to find latest comments across all documents
  - Embedded tree
      - Single document
      - Natural
  - Normalized (2 collections)
      - most flexible
      - more queries
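
The trade-off between embedding and normalizing can be illustrated with plain arrays. The sketch below is an in-memory illustration under made-up sample data (numeric `date` values stand in for timestamps, and `post_id` is an assumed back-reference name); it is not MongoDB code.

```javascript
// Embedded: comments live inside each post.
const posts = [
  { _id: 1, comments: [{ text: "a", date: 3 }, { text: "b", date: 1 }] },
  { _id: 2, comments: [{ text: "c", date: 2 }] },
];

// "Latest comments across all documents" means flattening every array first.
const latestEmbedded = posts
  .flatMap((p) => p.comments)
  .sort((x, y) => y.date - x.date)
  .slice(0, 2)
  .map((c) => c.text);

// Normalized: comments are a second collection pointing back at their post.
const comments = [
  { post_id: 1, text: "a", date: 3 },
  { post_id: 1, text: "b", date: 1 },
  { post_id: 2, text: "c", date: 2 },
];

// The cross-post query is now a plain sort on one collection...
const latestNormalized = [...comments]
  .sort((x, y) => y.date - x.date)
  .slice(0, 2)
  .map((c) => c.text);

// ...but showing a single post with its comments takes a second lookup.
const commentsForPost1 = comments.filter((c) => c.post_id === 1);

console.log(latestEmbedded, latestNormalized, commentsForPost1.length);
```
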
Many - Many
  Example:
  - Product can be in many categories
  - Category can have many products
  Relational tables:
    Products         product_id
    Category         category_id
    Prod_Categories  id, product_id, category_id
Many - Many
  products:
    { _id : ObjectId("4c4ca23933fb5941681b912e"),
      name : "Sumatra Dark Roast",
      category_ids : [ ObjectId("4c4ca25433fb5941681b912f"),
                       ObjectId("4c4ca25433fb5941681b92af") ] }
  categories:
    { _id : ObjectId("4c4ca25433fb5941681b912f"),
      name : "Indonesia",
      product_ids : [ ObjectId("4c4ca23933fb5941681b912e"),
                      ObjectId("4c4ca30433fb5941681b9130"),
                      ObjectId("4c4ca30433fb5941681b913a") ] }
  // All categories for a given product
  > db.categories.find({ product_ids : ObjectId("4c4ca23933fb5941681b912e") })
  // All products for a given category
  > db.products.find({ category_ids : ObjectId("4c4ca25433fb5941681b912f") })
Alternative
  products:
    { _id : ObjectId("4c4ca23933fb5941681b912e"),
      name : "Sumatra Dark Roast",
      category_ids : [ ObjectId("4c4ca25433fb5941681b912f"),
                       ObjectId("4c4ca25433fb5941681b92af") ] }
  categories:
    { _id : ObjectId("4c4ca25433fb5941681b912f"),
      name : "Indonesia" }
  // All products for a given category
  > db.products.find({ category_ids : ObjectId("4c4ca25433fb5941681b912f") })
  // All categories for a given product (two steps)
  > product = db.products.findOne({ _id : some_id })
  > db.categories.find({ _id : { $in : product.category_ids } })
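
When only the product side stores the references, one direction is a single lookup and the other takes two steps. The in-memory sketch below shows both directions on plain arrays. Short string ids stand in for ObjectIds, the second product and second category are made-up sample data, and the function names are assumptions for illustration.

```javascript
// Only products carry the many-to-many references.
const products = [
  { _id: "p1", name: "Sumatra Dark Roast", category_ids: ["c1", "c2"] },
  { _id: "p2", name: "Kona", category_ids: ["c2"] },
];
const categories = [
  { _id: "c1", name: "Indonesia" },
  { _id: "c2", name: "Dark Roast" },
];

// One step: match the category id inside each product's array.
function productsInCategory(categoryId) {
  return products.filter((p) => p.category_ids.includes(categoryId));
}

// Two steps: load the product, then fetch categories whose _id is in its
// list (the array analogue of an $in query).
function categoriesOfProduct(productId) {
  const product = products.find((p) => p._id === productId);
  if (!product) return [];
  return categories.filter((c) => product.category_ids.includes(c._id));
}

console.log(productsInCategory("c2").map((p) => p.name));
console.log(categoriesOfProduct("p1").map((c) => c.name));
```

The design choice is about write cost: with references on one side only, adding a product to a category touches one document instead of two.
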
Trees
  Full Tree in Document
  { comments : [
      { author : "rpb", text : "...",
        replies : [
          { author : "Fred", text : "...", replies : [] }
        ] }
  ] }
  Pros: Single Document, Performance, Intuitive
  Cons: Hard to search, 4MB limit
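
"Hard to search" follows from the nesting: finding a comment at an arbitrary depth requires walking the whole tree in application code. A minimal sketch of such a walk, where `findByAuthor` is a hypothetical helper and the comment texts are placeholders:

```javascript
// A post holding its whole comment tree, shaped like the slide's document.
const post = {
  comments: [
    {
      author: "rpb",
      text: "first",
      replies: [{ author: "Fred", text: "reply", replies: [] }],
    },
  ],
};

// Hypothetical helper: depth-first walk that collects every comment
// written by the given author, at any nesting level.
function findByAuthor(nodes, author, found = []) {
  for (const node of nodes) {
    if (node.author === author) found.push(node);
    findByAuthor(node.replies, author, found);
  }
  return found;
}

console.log(findByAuthor(post.comments, "Fred").map((c) => c.text));
```
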
Trees - continued
  Parent Links
  - Each node is stored as a document
  - Contains the id of the parent
  Child Links
  - Each node contains the ids of the children
  - Can support graphs (multiple parents / child)
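
With parent links, finding children is a single filter, but collecting all ancestors means following one link per level. The sketch below models that on an in-memory `Map`; the node ids and the helper names `ancestorsOf` and `childrenOf` are assumptions for illustration, and each `nodes.get` stands in for what would be a separate lookup against a collection.

```javascript
// Each node stores only the id of its parent (null for the root).
const nodes = new Map([
  ["a", { _id: "a", parent: null }],
  ["b", { _id: "b", parent: "a" }],
  ["d", { _id: "d", parent: "b" }],
]);

// Walk upward one parent at a time; returns ancestors from root to parent.
function ancestorsOf(id) {
  const path = [];
  let current = nodes.get(id);
  while (current && current.parent !== null) {
    path.unshift(current.parent);
    current = nodes.get(current.parent);
  }
  return path;
}

// Children are found with a single match on the parent field.
function childrenOf(id) {
  return [...nodes.values()].filter((n) => n.parent === id).map((n) => n._id);
}

console.log(ancestorsOf("d"), childrenOf("a"));
```
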
Array of Ancestors
  - Store Ancestors of a node
  { _id : "a" }
  { _id : "b", ancestors : [ "a" ], parent : "a" }
  { _id : "c", ancestors : [ "a", "b" ], parent : "b" }
  { _id : "d", ancestors : [ "a", "b" ], parent : "b" }
  { _id : "e", ancestors : [ "a" ], parent : "a" }
  { _id : "f", ancestors : [ "a", "e" ], parent : "e" }
  { _id : "g", ancestors : [ "a", "b", "d" ], parent : "d" }
  // find all descendants of b:
  > db.tree2.find({ ancestors : "b" })
  // find all ancestors of f:
  > ancestors = db.tree2.findOne({ _id : "f" }).ancestors
  > db.tree2.find({ _id : { $in : ancestors } })
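
The two queries on this slide can be checked against the sample tree with plain array operations. This is an in-memory sketch, not shell code; `descendantsOf` and `ancestorsOf` are hypothetical helper names.

```javascript
// The sample tree from the slide, as a plain array.
const tree = [
  { _id: "a" },
  { _id: "b", ancestors: ["a"], parent: "a" },
  { _id: "c", ancestors: ["a", "b"], parent: "b" },
  { _id: "d", ancestors: ["a", "b"], parent: "b" },
  { _id: "e", ancestors: ["a"], parent: "a" },
  { _id: "f", ancestors: ["a", "e"], parent: "e" },
  { _id: "g", ancestors: ["a", "b", "d"], parent: "d" },
];

// Array analogue of find({ ancestors : "b" }): any node listing the id.
function descendantsOf(id) {
  return tree.filter((n) => (n.ancestors || []).includes(id)).map((n) => n._id);
}

// Array analogue of the two-step query: read the node's ancestors list,
// then fetch the nodes whose _id is in that list.
function ancestorsOf(id) {
  const node = tree.find((n) => n._id === id);
  const ids = node && node.ancestors ? node.ancestors : [];
  return tree.filter((n) => ids.includes(n._id)).map((n) => n._id);
}

console.log(descendantsOf("b"), ancestorsOf("f"));
```

Both directions avoid recursion because each node carries its full path; the cost is that moving a subtree means rewriting the `ancestors` array of every node under it.
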
Variable Keys
  How to index?
  { "_id" : "uuid1",
    "field1" : { "ctx1" : { "ctx3" : 5, … },
                 "ctx8" : { "ctx3" : 5, … } } }
  db.MyCollection.find({ "field1.ctx1.ctx3" : { $exists : true } })
  Rewrite (turn the variable key names into values of fixed fields):
  { "_id" : "uuid1",
    "field1" : [ { key : "ctx1", value : { k : "ctx3", v : 5, … } },
                 { key : "ctx8", value : { k : "ctx3", v : 5, … } } ] }
  db.x.ensureIndex({ "field1.key" : 1 })
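
The rewrite moves the variable key names out of field names and into values, so that a fixed path can be indexed. The sketch below performs that transformation in plain JavaScript. It rests on assumptions: `toKeyValueArray` is a hypothetical helper, and the inner map is also turned into a list of `{ k, v }` pairs, which is one possible reading of the elided slide content rather than the slide's exact shape.

```javascript
// Original shape: the interesting names ("ctx1", "ctx3") are field names.
const doc = {
  _id: "uuid1",
  field1: { ctx1: { ctx3: 5 }, ctx8: { ctx3: 5 } },
};

// Hypothetical helper: turns { name: { inner: value } } into
// [{ key: name, value: [{ k: inner, v: value }] }].
function toKeyValueArray(obj) {
  return Object.entries(obj).map(([key, inner]) => ({
    key,
    value: Object.entries(inner).map(([k, v]) => ({ k, v })),
  }));
}

const rewritten = { _id: doc._id, field1: toKeyValueArray(doc.field1) };

// The old "does field1.ctx1.ctx3 exist?" question becomes a value match.
const hasCtx1Ctx3 = rewritten.field1.some(
  (entry) => entry.key === "ctx1" && entry.value.some((pair) => pair.k === "ctx3")
);

console.log(JSON.stringify(rewritten.field1), hasCtx1Ctx3);
```
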
findAndModify
  Queue example
  // Example: find the highest priority job and mark it as in progress
  job = db.jobs.findAndModify({
    query : { inprogress : false },
    sort : { priority : -1 },
    update : { $set : { inprogress : true, started : new Date() } },
    new : true })
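
The queue pattern is "pick the best unclaimed job and mark it claimed in one step". The in-memory sketch below shows that logic on a plain array. `claimNextJob` and the sample jobs are assumptions for illustration; the sketch relies on single-threaded JavaScript for its one-step guarantee, whereas the database command is what provides it across concurrent clients.

```javascript
// In-memory stand-in for the jobs collection (sample data is made up).
const jobs = [
  { _id: 1, priority: 2, inprogress: false },
  { _id: 2, priority: 9, inprogress: false },
  { _id: 3, priority: 5, inprogress: true },
];

// Hypothetical helper: select the highest-priority job that is not in
// progress, mark it, and return the updated job (like new : true).
function claimNextJob(queue, now = new Date()) {
  const candidates = queue.filter((j) => j.inprogress === false);
  if (candidates.length === 0) return null;
  const job = candidates.reduce((best, j) => (j.priority > best.priority ? j : best));
  job.inprogress = true;
  job.started = now;
  return job;
}

const first = claimNextJob(jobs);
const second = claimNextJob(jobs);
const third = claimNextJob(jobs);
console.log(first._id, second._id, third);
```
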
Learn More
  Kyle's presentation + video:
    http://www.slideshare.net/kbanker/mongodb-schema-design
    http://www.blip.tv/file/3704083
  Dwight's presentation:
    http://www.slideshare.net/mongosf/schema-design-with-mongodb-dwight-merriman
  Documentation:
    Trees:        http://www.mongodb.org/display/DOCS/Trees+in+MongoDB
    Queues:       http://www.mongodb.org/display/DOCS/findandmodify+Command
    Aggregation:  http://www.mongodb.org/display/DOCS/Aggregation
    Capped Col.:  http://www.mongodb.org/display/DOCS/Capped+Collections
    Geo:          http://www.mongodb.org/display/DOCS/Geospatial+Indexing
    GridFS:       http://www.mongodb.org/display/DOCS/GridFS+Specification
Thank You :-)
Download MongoDB
  http://www.mongodb.org
  and let us know what you think
  @mongodb
DBRef
  DBRef { $ref : collection, $id : id_value }
  - Think URL
  - YDSMV: your driver support may vary
  Sample Schema:
    nr = { note_refs : [ { "$ref" : "notes", "$id" : 5 }, ... ] }
  Dereferencing:
    nr.note_refs.forEach(function(r) {
      printjson(db[r.$ref].findOne({ _id : r.$id }));
    });
BSON
  MongoDB stores data in BSON internally
  - Lightweight, Traversable, Efficient encoding
  - Typed: boolean, integer, float, date, string, binary, array...
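
Dereferencing a DBRef amounts to "open the collection named by `$ref`, then look up the document whose `_id` equals `$id`". The sketch below models that on in-memory arrays. The `db` object, the note texts, the second reference and the `deref` helper are assumptions for illustration, not driver API.

```javascript
// In-memory stand-in: collection name -> array of documents.
const db = {
  notes: [
    { _id: 5, text: "buy milk" },
    { _id: 6, text: "call Fred" },
  ],
};

// A document holding references in the { $ref, $id } shape.
const nr = {
  note_refs: [
    { $ref: "notes", $id: 5 },
    { $ref: "notes", $id: 6 },
  ],
};

// Hypothetical helper: resolve one reference, or null if it dangles.
function deref(ref) {
  const collection = db[ref.$ref] || [];
  return collection.find((d) => d._id === ref.$id) || null;
}

const resolved = nr.note_refs.map(deref);
console.log(resolved.map((n) => n.text));
```

Note that the loop runs over `nr.note_refs`, the array of references, rather than over `nr` itself, which is an object and has no `forEach`.
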

Schema design short


Editor's Notes

  • #37: blog post twitter