How MongoDB Stores Data Internally

MongoDB stores data using a document-oriented model with a hierarchy of databases, collections, and documents, utilizing BSON for efficient storage. The default storage engine, WiredTiger, employs a B-Tree structure for data organization and includes features like journaling for durability and various compression techniques. Understanding these internals aids in schema design, performance tuning, and troubleshooting storage issues.

Uploaded by

biharmesudhar

@ sanuj bansal
Introduction
When we use MongoDB, we usually think of simple JSON
documents.

But have you ever wondered how those documents are stored?

Understanding the internals helps with:

✅Schema design
✅Performance tuning
✅Index optimization
✅Troubleshooting storage issues

MongoDB’s High-Level Data Structure
MongoDB uses a document-oriented model with the following
hierarchy:
Database: Top-level container
Collection: Groups of documents (like SQL tables)
Document: JSON-like data (stored as BSON)

MongoDB is schema-flexible: documents in the same collection
can have different fields.

What is BSON?
MongoDB doesn’t actually store documents as raw JSON.
Instead, it uses BSON (Binary JSON).

Why BSON?
Faster to encode/decode
Supports rich types like Date, ObjectId, Decimal128
Designed for fast traversal and machine parsing (not always smaller than the equivalent JSON)

Example Comparison:
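The comparison can be reproduced by hand. The sketch below is a minimal, hand-rolled encoder that follows the public BSON specification for just two types (UTF-8 string and int32). It is not the encoder MongoDB or its drivers use, and the function name encode_document is made up for this example.

```python
import json
import struct


def encode_document(doc: dict) -> bytes:
    """Encode a flat dict of str / int32 values as a BSON document."""
    body = b""
    for key, value in doc.items():
        name = key.encode("utf-8") + b"\x00"  # field name as a C string
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # type 0x02: int32 byte length (including the trailing NUL), then the bytes
            body += b"\x02" + name + struct.pack("<i", len(data)) + data
        elif isinstance(value, int) and not isinstance(value, bool):
            # type 0x10: 32-bit little-endian integer
            body += b"\x10" + name + struct.pack("<i", value)
        else:
            raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")
    # total length (an int32 that counts itself) + elements + terminating NUL
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"


doc = {"name": "Al", "age": 30}
as_json = json.dumps(doc, separators=(",", ":")).encode("utf-8")
as_bson = encode_document(doc)
print(len(as_json), as_json)
print(len(as_bson), as_bson.hex(" "))
```

For this tiny document the BSON form is 27 bytes against 22 bytes of compact JSON. The length prefixes and type bytes cost space, but they let a reader skip over fields without scanning them and carry type information that JSON lacks.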

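Collections and Namespace Files
Each collection is identified by a namespace: the database name
plus the collection name.

For example:
the users collection in myDatabase has the namespace:
myDatabase.users

In the legacy MMAPv1 engine (removed in MongoDB 4.2), each database had
a namespace file (myDatabase.ns) holding collection and index metadata,
plus numbered data files (myDatabase.0, myDatabase.1, ...) holding the documents.

With WiredTiger there is no namespace file: each collection and index
gets its own .wt file, and the catalog (_mdb_catalog.wt) maps namespaces
to those files.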

WiredTiger – The Default Storage Engine
Since MongoDB 3.2, WiredTiger is the default storage
engine.

Main Functions of WiredTiger:

Write-Ahead Logging (WAL)
Compression
Caching and memory management
Document-level locking

Each collection is typically stored in at least two files:

One .wt file for the documents
One .wt file per index (including the default _id index)

How WiredTiger Stores Documents
WiredTiger uses a B-Tree data structure.
Each node contains sorted key-value pairs.
Internal nodes: Pointers to children
Leaf nodes: Contain actual data

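Data is organized as:

Pages: in-memory B-Tree nodes, the unit that is read into and evicted from the cache
Blocks: the on-disk form of a page, written (and optionally compressed) as one unit
Extents: contiguous ranges of a file that the block manager tracks when allocating or freeing space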
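A minimal sketch of how a lookup walks such a tree. The Internal and Leaf classes and the search function are illustrative names, not WiredTiger's actual structures, and the tree is built by hand instead of through inserts and page splits.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass
class Leaf:
    keys: list    # sorted keys
    values: list  # values[i] belongs to keys[i]


@dataclass
class Internal:
    keys: list      # sorted separator keys
    children: list  # len(children) == len(keys) + 1


def search(node, key):
    """Descend from the root to a leaf, then binary-search the leaf."""
    while isinstance(node, Internal):
        # keys >= a separator live in the child to its right
        node = node.children[bisect_right(node.keys, key)]
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i]
    return None


root = Internal(
    keys=[10, 20],
    children=[
        Leaf([1, 5], ["a", "b"]),
        Leaf([10, 15], ["c", "d"]),
        Leaf([20, 25], ["e", "f"]),
    ],
)
print(search(root, 15), search(root, 20), search(root, 7))  # prints: d e None
```

Because every node is sorted, each level costs one binary search, so a lookup touches only one node per level of the tree.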

Journaling – Ensuring Durability
Before data is committed to the .wt file, it's first written to
the journal.

Why Journaling?
Ensures ACID durability
Allows recovery after crash
Uses write-ahead logging

The journal is flushed to disk every 100 ms by default (configurable via storage.journal.commitIntervalMs).
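The idea behind write-ahead logging can be shown with a toy key-value store. This is a sketch under simplifying assumptions, not how WiredTiger implements its journal: the TinyStore class and its JSON-lines log format are invented for illustration, and it syncs on every write instead of batching flushes on an interval.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store: every write is journaled before it is applied."""

    def __init__(self, journal_path: str):
        self.journal_path = journal_path
        self.data = {}
        self._replay()  # recovery: rebuild state from the journal at startup
        self.journal = open(journal_path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.journal_path):
            return
        with open(self.journal_path, encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key, value):
        # 1. append the change to the journal and force it to stable storage
        self.journal.write(json.dumps({"key": key, "value": value}) + "\n")
        self.journal.flush()
        os.fsync(self.journal.fileno())
        # 2. only then apply it to the in-memory state
        self.data[key] = value

    def close(self):
        self.journal.close()


path = os.path.join(tempfile.mkdtemp(), "journal.log")
store = TinyStore(path)
store.put("user:1", {"name": "Al"})
store.close()  # stand-in for a crash: this instance's memory is not reused

recovered = TinyStore(path)  # a fresh instance replays the journal
print(recovered.data)
recovered.close()
```

Syncing on every write is the safest and slowest choice. Flushing on an interval, as the journal does by default, trades a small window of possible loss for much higher throughput.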

Write Path – Step by Step
Here’s what happens when you insert a document:
1. Document is validated and converted to BSON
2. Stored in in-memory cache (WiredTiger’s cache)
3. Logged in the journal (WAL)
4. Indexes are updated
5. Eventually flushed to .wt files during checkpointing

Checkpointing writes a consistent snapshot of the in-memory data
to the .wt files (every 60 seconds by default).
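The five steps and the role of the checkpoint can be modeled in a few lines. This is a deliberately simplified sketch: ToyEngine, its attributes and the email index are hypothetical, JSON bytes stand in for BSON, and the journal list is assumed to survive a crash the way the on-disk journal would.

```python
import json


class ToyEngine:
    def __init__(self):
        self.cache = {}        # in-memory documents, keyed by _id
        self.email_index = {}  # hypothetical secondary index: email -> _id
        self.journal = []      # changes since the last checkpoint (assumed durable)
        self.data_file = {}    # stand-in for the on-disk .wt file

    def insert(self, doc: dict):
        if "_id" not in doc:                       # step 1: validate
            raise ValueError("document needs an _id")
        encoded = json.dumps(doc).encode("utf-8")  # JSON bytes stand in for BSON
        self.cache[doc["_id"]] = encoded           # step 2: in-memory cache
        self.journal.append(encoded)               # step 3: journal record
        if "email" in doc:                         # step 4: index update
            self.email_index[doc["email"]] = doc["_id"]

    def checkpoint(self):
        # step 5: persist a snapshot; journal entries it covers are no longer needed
        self.data_file = dict(self.cache)
        self.journal.clear()

    def recover(self):
        # after a crash: start from the last checkpoint, then replay the journal
        self.cache = dict(self.data_file)
        for encoded in self.journal:
            self.cache[json.loads(encoded)["_id"]] = encoded
        self.email_index = {}
        for encoded in self.cache.values():
            doc = json.loads(encoded)
            if "email" in doc:
                self.email_index[doc["email"]] = doc["_id"]


engine = ToyEngine()
engine.insert({"_id": 1, "email": "a@example.com"})
engine.checkpoint()
engine.insert({"_id": 2, "email": "b@example.com"})  # journaled, not yet checkpointed
engine.cache.clear()                                 # simulate losing memory in a crash
engine.recover()
print(sorted(engine.cache))  # prints: [1, 2]
```

Document 1 comes back from the checkpoint and document 2 from the journal, which is why the data files do not need to be rewritten on every insert.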

Read Path – Step by Step
When you query MongoDB:
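1. Query is parsed & optimized
2. Index is used (if available)
3. Data is fetched from the cache, or read from disk into the cache
4. Matching documents are returned to the client as BSON
5. The driver decodes BSON into native objects (shown as JSON-like documents)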
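Step 2 is where most of the performance difference comes from. The sketch below contrasts the two access paths on an in-memory toy collection; find_eq and the data layout are invented for illustration, and only the plan labels COLLSCAN and IXSCAN are borrowed from MongoDB's explain output.

```python
from bisect import bisect_left


def find_eq(collection: dict, field: str, value, indexes: dict):
    """Return (plan, matching documents) for the filter {field: value}."""
    index = indexes.get(field)
    if index is None:
        # no index on this field: examine every document
        return "COLLSCAN", [d for d in collection.values() if d.get(field) == value]
    # index is a sorted list of (key, record_id); binary-search to the first match
    matches = []
    i = bisect_left(index, (value,))
    while i < len(index) and index[i][0] == value:
        matches.append(collection[index[i][1]])  # fetch the document by record id
        i += 1
    return "IXSCAN", matches


collection = {
    1: {"name": "Al", "age": 30},
    2: {"name": "Bo", "age": 25},
    3: {"name": "Cy", "age": 30},
}
indexes = {"age": sorted((d["age"], rid) for rid, d in collection.items())}

print(find_eq(collection, "age", 30, indexes))     # uses the index on age
print(find_eq(collection, "name", "Bo", indexes))  # falls back to a full scan
```

The indexed query touches only the two matching entries, while the unindexed one has to look at all three documents. On a real collection the gap grows with the collection size.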

Indexes and Storage

Indexes are stored as separate B-trees in WiredTiger.

Common indexes:
Single Field
Compound Index
Multikey Index
Geospatial / Text

Each index lives in its own .wt file and is updated during
inserts/updates.
Index updates are part of the same journaling and
checkpoint process.
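To see why every extra index adds work to each write, it helps to look at the entries a single document contributes. The function below is a rough sketch with a made-up name (index_keys); it ignores details such as empty arrays, nested fields and the rule that a compound index may contain at most one array field per document.

```python
def index_keys(doc: dict, fields: tuple) -> list:
    """Return the index keys one document contributes for the given field list."""
    keys = [()]
    for f in fields:
        value = doc.get(f)
        # an array value fans out into one key per element (multikey)
        elements = value if isinstance(value, list) else [value]
        keys = [k + (e,) for k in keys for e in elements]
    return keys


doc = {"_id": 7, "user": "al", "tags": ["db", "nosql"]}

print(index_keys(doc, ("user",)))         # single field
print(index_keys(doc, ("user", "tags")))  # compound, multikey on tags
print(index_keys(doc, ("tags",)))         # multikey
```

One insert of this document therefore writes one entry to the single-field index and two entries to each index that includes tags, on top of the document itself.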

Compression Techniques
MongoDB compresses data to reduce disk I/O.

Supported Compression Types:

Snappy (default): Fastest, good for general use
zlib: Higher compression, slower
zstd (MongoDB 4.2+): Balanced performance and compression

Collections and the journal use block compression; indexes use prefix compression by default.
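The speed versus ratio trade-off can be tried out with Python's standard library. Only zlib ships with the standard library, so snappy and zstd are not shown, and JSON text stands in for BSON. The sample documents are invented, and the numbers say nothing about real MongoDB workloads.

```python
import json
import zlib

# 1,000 small, repetitive documents: typical of data that compresses well
docs = [
    {"_id": i, "status": "active", "country": "IN", "score": i % 10}
    for i in range(1000)
]
raw = "\n".join(json.dumps(d) for d in docs).encode("utf-8")

fast = zlib.compress(raw, level=1)   # favors speed
small = zlib.compress(raw, level=9)  # favors compression ratio

print(f"raw={len(raw)} level1={len(fast)} level9={len(small)}")
assert zlib.decompress(small) == raw  # compression is lossless
```

Repeated field names are a large part of what gets compressed away, which is one reason document stores benefit so much from block compression.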

Memory Management

MongoDB uses an internal cache managed by WiredTiger.

Acts like a RAM-based storage layer
Frequently accessed documents are cached
Cache size is configurable (default: the larger of 50% of (RAM - 1 GB) or 256 MB)

Performance can degrade if working set > available memory.
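MongoDB documents the default WiredTiger cache size as the larger of 50% of (RAM - 1 GB) or 256 MB, which is a little less than half of RAM. A small helper makes the rule concrete. The function name is made up, and treating GB and MB as binary units (GiB, MiB) is an assumption of this sketch.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(total_ram_bytes: int) -> int:
    """Larger of 50% of (RAM - 1 GiB) or 256 MiB."""
    return max((total_ram_bytes - GIB) // 2, 256 * MIB)


for ram_gib in (1, 4, 16, 64):
    cache = default_wiredtiger_cache_bytes(ram_gib * GIB)
    print(f"{ram_gib:>3} GiB RAM -> {cache / GIB:.2f} GiB cache")
```

On a 16 GiB machine this gives 7.5 GiB. The rest of RAM is left for the operating system's file cache, connections and other processes, so the working set that fits in the WiredTiger cache is smaller than the machine's total memory.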

Data File Organization

By default, all databases share the dbPath directory; with storage.directoryPerDB enabled, each database has its own folder inside the dbPath.

For WiredTiger:
collection-*.wt – Data files for collections
index-*.wt – Index files
WiredTiger.wt – Metadata and global state
journal/ – Write-ahead logs

You can see these files in your MongoDB data directory.
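A short script can summarize what is in a data directory using the file patterns listed above. The classify function is a made-up helper, and the demo runs against a temporary folder of empty files with invented names, so it does not need a MongoDB installation. To inspect a real deployment, point it at a copy of the dbPath instead.

```python
import tempfile
from collections import Counter
from fnmatch import fnmatchcase
from pathlib import Path

PATTERNS = [
    ("collection data", "collection-*.wt"),
    ("index", "index-*.wt"),
    ("engine metadata", "WiredTiger.wt"),
]


def classify(db_path: Path) -> Counter:
    """Count files under db_path by the role their name suggests."""
    counts = Counter()
    for entry in db_path.rglob("*"):
        if not entry.is_file():
            continue
        # anything below a journal/ folder is a write-ahead log file
        if "journal" in entry.relative_to(db_path).parts[:-1]:
            counts["journal"] += 1
            continue
        for label, pattern in PATTERNS:
            if fnmatchcase(entry.name, pattern):
                counts[label] += 1
                break
        else:
            counts["other"] += 1
    return counts


# Demo directory with fake, empty files (names are illustrative only)
demo = Path(tempfile.mkdtemp())
(demo / "journal").mkdir()
for name in [
    "collection-0-123.wt",
    "collection-2-123.wt",
    "index-1-123.wt",
    "WiredTiger.wt",
    "journal/WiredTigerLog.0000000001",
]:
    (demo / name).touch()

print(classify(demo))
```

Because the walk is recursive, the same function works whether or not each database has its own subfolder.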

Sanuj Bansal
Senior Developer
