How MongoDB Stores Data Internally

MongoDB stores data using a document-oriented model with a hierarchy of databases, collections, and documents, utilizing BSON for efficient storage. The default storage engine, WiredTiger, employs a B-Tree structure for data organization and includes features like journaling for durability and various compression techniques. Understanding these internals aids in schema design, performance tuning, and troubleshooting storage issues.

Uploaded by

biharmesudhar

@ sanuj bansal
Introduction
When we use MongoDB, we usually think of simple JSON
documents.

But have you ever wondered how those documents are stored?

Understanding the internals helps with:

✅ Schema design
✅ Performance tuning
✅ Index optimization
✅ Troubleshooting storage issues

MongoDB’s High-Level Data Structure
MongoDB uses a document-oriented model with the following
hierarchy:
Database: Top-level container
Collection: Groups of documents (like SQL tables)
Document: JSON-like data (stored as BSON)

MongoDB is schema-flexible: documents in the same collection can have different fields.

What is BSON?
MongoDB doesn’t actually store documents as raw JSON.
Instead, it uses BSON (Binary JSON).

Why BSON?
Faster to encode and decode than JSON text
Supports rich types such as Date, ObjectId, and Decimal128
Typed and length-prefixed, so it is optimized for machine parsing (though not always smaller than the equivalent JSON)

Example Comparison:
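
The comparison can be sketched in code. This is a minimal illustration, not MongoDB's actual encoder: `encode_bson` is a hypothetical helper that handles only string and 32-bit integer fields, following the layout in the public BSON specification. It prints the same document as JSON text and as BSON bytes.

```python
import json
import struct


def encode_bson(doc):
    """Encode a flat dict of str/int32 values as BSON (illustrative subset only)."""
    body = b""
    for name, value in doc.items():
        key = name.encode("utf-8") + b"\x00"  # field names are NUL-terminated
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # type 0x02 = string: int32 byte length (including the NUL), then bytes
            body += b"\x02" + key + struct.pack("<i", len(data)) + data
        elif isinstance(value, int) and not isinstance(value, bool):
            # type 0x10 = 32-bit signed integer, little-endian
            body += b"\x10" + key + struct.pack("<i", value)
        else:
            raise TypeError(f"unsupported in this sketch: {type(value).__name__}")
    # document = int32 total size (counting itself) + elements + trailing NUL
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"


doc = {"hello": "world"}
as_json = json.dumps(doc).encode("utf-8")
as_bson = encode_bson(doc)
print(len(as_json), as_json)
print(len(as_bson), as_bson.hex(" "))
```

For this tiny document the BSON form (22 bytes) is slightly larger than the JSON text (18 bytes). The benefit is that every field carries a type tag and strings carry a length, so a parser can skip or decode values without scanning for quotes and braces.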

Collections and Namespace Files
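Each collection is identified by a namespace: the database name and the collection name joined by a dot.

For example:
The users collection in myDatabase has the namespace:
myDatabase.users

In the legacy MMAPv1 storage engine (removed in MongoDB 4.2), each database had a namespace file (myDatabase.ns) holding collection and index metadata, plus numbered data files (myDatabase.0, myDatabase.1, ...) holding the documents.

With WiredTiger there are no namespace files: each collection and each index is stored in its own .wt file.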
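
A namespace string is the database name and the collection name joined by a dot. As a small sketch, the hypothetical helper `split_namespace` below splits at the first dot only, on the assumption that database names cannot contain dots while collection names can.

```python
def split_namespace(namespace):
    """Split 'db.collection' at the first dot; collection names may contain dots."""
    db, sep, collection = namespace.partition(".")
    if not sep or not db or not collection:
        raise ValueError(f"not a valid namespace: {namespace!r}")
    return db, collection


print(split_namespace("myDatabase.users"))
print(split_namespace("myDatabase.system.profile"))
```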

WiredTiger – The Default Storage Engine
Since MongoDB 3.2, WiredTiger is the default storage
engine.

Main Functions of WiredTiger:

Write-Ahead Logging (WAL)
Compression
Caching and memory management
Document-level locking

Each collection is stored in at least two files:

One .wt file for the collection's data
One .wt file for each index

How WiredTiger Stores Documents
WiredTiger uses a B-Tree data structure.
Each node (page) contains sorted key-value pairs.
Internal nodes: Pointers to children
Leaf nodes: Contain actual data

Data is organized into:

Pages (B-Tree nodes held in the cache)
Blocks (the on-disk form of a page, read and written as a unit)
Extents (contiguous ranges of file space tracked by the block manager)
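
The walk from internal nodes down to a leaf can be sketched as follows. This is a toy, read-only tree with hypothetical `Internal` and `Leaf` classes; it only illustrates the lookup idea and says nothing about WiredTiger's real page format.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass
class Leaf:
    keys: list      # sorted keys
    values: list    # values[i] belongs to keys[i]


@dataclass
class Internal:
    separators: list  # sorted; children[i] holds keys below separators[i]
    children: list    # len(children) == len(separators) + 1


def search(node, key):
    """Walk from the root to a leaf and return the value for key, or None."""
    while isinstance(node, Internal):
        # keys equal to a separator live in the child to its right
        node = node.children[bisect_right(node.separators, key)]
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i]
    return None


root = Internal(
    separators=[10, 20],
    children=[
        Leaf([1, 5], ["a", "b"]),
        Leaf([10, 15], ["c", "d"]),
        Leaf([20, 42], ["e", "f"]),
    ],
)
print(search(root, 15))  # found in the middle leaf
print(search(root, 7))   # absent key
```

Because keys are sorted inside every node, each level needs only a binary search, so a lookup touches one node per level of the tree.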

Journaling – Ensuring Durability
Before data is committed to the .wt file, it's first written to
the journal.

Why Journaling?
Ensures ACID durability
Allows recovery after crash
Uses write-ahead logging

Journals are flushed to disk every 100 ms by default (configurable with storage.journal.commitIntervalMs).
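
Write-ahead logging and crash recovery can be sketched with a toy key-value store. Everything here is hypothetical and in memory: `ToyStore`, its `data_file` dict (standing in for the .wt file) and its `journal` list (standing in for the on-disk journal). It only shows why logging first makes recovery possible.

```python
class ToyStore:
    def __init__(self):
        self.data_file = {}  # stands in for the last checkpoint on disk
        self.journal = []    # stands in for the on-disk journal
        self.cache = {}      # in-memory state, lost on a crash

    def put(self, key, value):
        self.journal.append((key, value))  # log the change first ...
        self.cache[key] = value            # ... then apply it in memory

    def checkpoint(self):
        self.data_file = dict(self.cache)  # persist a consistent snapshot
        self.journal.clear()               # entries before it are no longer needed

    def crash_and_recover(self):
        self.cache = dict(self.data_file)  # start from the last checkpoint
        for key, value in self.journal:    # replay what was logged after it
            self.cache[key] = value


store = ToyStore()
store.put("a", 1)
store.checkpoint()
store.put("b", 2)          # journaled, but not yet checkpointed
store.crash_and_recover()  # in-memory state is rebuilt
print(store.cache)
```

In this sketch the write to "b" survives the crash only because it reached the journal before the in-memory state was lost.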

Write Path – Step by Step
Here’s what happens when you insert a document:
1. Document is validated and converted to BSON
2. Stored in the in-memory cache (WiredTiger’s cache)
3. Logged in the journal (WAL)
4. Indexes are updated
5. Eventually flushed to .wt files during checkpointing

Checkpointing writes a consistent snapshot of the in-memory data to the .wt files on disk (every 60 seconds by default).

Read Path – Step by Step
When you query MongoDB:
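1. Query is parsed and optimized
2. An index is used if a suitable one is available; otherwise the collection is scanned
3. Data is fetched from the cache, or from disk on a cache miss
4. Matching documents are returned to the client as BSON
5. The driver decodes BSON into native objects (the shell displays them as JSON-like text)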
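
The difference between using an index and scanning the collection can be sketched with plain dictionaries. The `documents` and `age_index` structures and the `find` helper are hypothetical stand-ins; the plan labels borrow the stage names IXSCAN and COLLSCAN that MongoDB's explain output uses.

```python
documents = {  # record id -> document, standing in for the collection
    1: {"name": "Ada", "age": 36},
    2: {"name": "Linus", "age": 54},
    3: {"name": "Grace", "age": 36},
}
age_index = {36: [1, 3], 54: [2]}  # indexed value -> record ids


def find(field, value, indexes):
    """Return (plan, matching documents) for an equality query."""
    index = indexes.get(field)
    if index is not None:
        # index scan: jump straight to the matching record ids
        return "IXSCAN", [documents[rid] for rid in index.get(value, [])]
    # collection scan: examine every document
    return "COLLSCAN", [d for d in documents.values() if d.get(field) == value]


indexes = {"age": age_index}
print(find("age", 36, indexes))
print(find("name", "Linus", indexes))
```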

Indexes and Storage

Indexes are stored as separate B-trees in WiredTiger.

Common indexes:
Single Field
Compound Index
Multikey Index
Geospatial / Text

Each index lives in its own .wt file and is updated during
inserts/updates.
Index updates are part of the same journaling and
checkpoint process.
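
How an insert touches every index can be sketched like this. `ToyCollection` is a hypothetical class that keeps each index as a sorted list of (key, record id) pairs; a real index is a B-tree, and the handling of missing fields is simplified as noted in the comments.

```python
from bisect import insort


class ToyCollection:
    def __init__(self, indexed_fields):
        self.docs = {}                                  # record id -> document
        self.indexes = {f: [] for f in indexed_fields}  # field -> sorted (key, record id)
        self._next_id = 1

    def insert(self, doc):
        rid = self._next_id
        self._next_id += 1
        self.docs[rid] = doc
        for field, entries in self.indexes.items():
            if field not in doc:
                continue  # simplification: a real index stores a null key here
            value = doc[field]
            keys = value if isinstance(value, list) else [value]
            for key in set(keys):  # multikey: one entry per distinct array element
                insort(entries, (key, rid))
        return rid


coll = ToyCollection(["tags"])
coll.insert({"title": "Intro", "tags": ["db", "nosql"]})
coll.insert({"title": "Notes", "tags": "db"})
print(coll.indexes["tags"])
```

The array-valued field produces two index entries for one document, which is why multikey indexes can hold more entries than the collection has documents.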

Compression Techniques
MongoDB compresses data to reduce disk I/O.

Supported Compression Types:

Snappy (default): Fastest, good for general use
zlib: Higher compression, slower
zstd (MongoDB 4.2+): Balanced performance and compression

Block compression applies to collection data; indexes use prefix compression by default.
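
The speed-versus-ratio trade-off can be felt with Python's standard library. This sketch uses only zlib, because snappy and zstd need third-party packages, and the sample documents are made up; it is not a measurement of MongoDB itself.

```python
import json
import zlib

# Repetitive, document-like sample data (hypothetical)
docs = [{"_id": i, "status": "active", "city": "Berlin"} for i in range(1000)]
raw = json.dumps(docs).encode("utf-8")

fast = zlib.compress(raw, 1)   # level 1 favors speed
small = zlib.compress(raw, 9)  # level 9 favors compression ratio

print(len(raw), len(fast), len(small))
restored = zlib.decompress(small)  # compression is lossless
print(restored == raw)
```

Documents in one collection tend to repeat field names and common values, which is why block compression usually pays off for collection data.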

Memory Management

MongoDB uses an internal cache managed by WiredTiger.

Acts like a RAM-based storage layer
Frequently accessed documents and index pages are cached
Cache size is configurable (defaults to the larger of 50% of (RAM - 1 GB) and 256 MB)

Performance can degrade if the working set is larger than available memory.
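
The documented default for the WiredTiger cache is the larger of 50% of (RAM - 1 GB) and 256 MB, which is a little less than half of RAM on most machines. A small sketch of that arithmetic, using a hypothetical helper name:

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(total_ram_bytes):
    """Larger of 50% of (RAM - 1 GiB) and 256 MiB."""
    return max(int(0.5 * (total_ram_bytes - GIB)), 256 * MIB)


for ram_gib in (1, 4, 16):
    cache = default_wiredtiger_cache_bytes(ram_gib * GIB)
    print(f"{ram_gib} GiB RAM -> {cache / GIB:.2f} GiB cache")
```

On small hosts the 256 MB floor applies, so the cache can be a much smaller share of memory than the 50% rule of thumb suggests.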

Data File Organization

By default, all databases share the dbPath directory; each database gets its own folder only when storage.directoryPerDB is enabled.

For WiredTiger:
collection-*.wt – Data files for collections
index-*.wt – Index files
WiredTiger.wt – Metadata and global state
journal/ – Write-ahead logs

You can see these files in your MongoDB data directory.
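
A data directory can be summarized by the file name patterns listed above. The `classify` and `summarize` helpers are hypothetical, and the sketch assumes only those naming conventions; anything it does not recognize is counted as "other".

```python
from collections import Counter
from pathlib import Path


def classify(rel_path):
    """Map a path relative to dbPath onto one of the file kinds above."""
    name = rel_path.name
    if len(rel_path.parts) > 1 and rel_path.parts[0] == "journal":
        return "journal"
    if name.startswith("collection-") and name.endswith(".wt"):
        return "collection data"
    if name.startswith("index-") and name.endswith(".wt"):
        return "index"
    if name == "WiredTiger.wt":
        return "metadata"
    return "other"


def summarize(db_path):
    """Count the files under db_path by kind."""
    root = Path(db_path)
    return Counter(
        classify(p.relative_to(root)) for p in root.rglob("*") if p.is_file()
    )
```

Calling `summarize` with the dbPath of a test instance returns a count per kind. Read access to the directory is needed, and the files should only be inspected, never edited by hand.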

Sanuj Bansal
Senior Developer
