MongoDB Manual
Release 2.4.9
Contents
1 Install MongoDB
1.1 Installation Guides
1.2 First Steps with MongoDB
3 Data Models
3.1 Data Modeling Introduction
3.2 Data Modeling Concepts
3.3 Data Model Examples and Patterns
3.4 Data Model Reference
4 Administration
4.1 Administration Concepts
4.2 Administration Tutorials
4.3 Administration Reference
5 Security
5.1 Security Introduction
5.2 Security Concepts
5.3 Security Tutorials
5.4 Security Reference
6 Aggregation
6.1 Aggregation Introduction
6.2 Aggregation Concepts
6.3 Aggregation Examples
6.4 Aggregation Reference
7 Indexes
7.1 Index Introduction
7.2 Index Concepts
7.3 Indexing Tutorials
7.4 Indexing Reference
8 Replication
8.1 Replication Introduction
8.2 Replication Concepts
8.3 Replica Set Tutorials
8.4 Replication Reference
9 Sharding
9.1 Sharding Introduction
9.2 Sharding Concepts
9.3 Sharded Cluster Tutorials
9.4 Sharding Reference
11 Release Notes
11.1 Current Stable Release
11.2 Previous Stable Releases
11.3 Other MongoDB Release Notes
11.4 MongoDB Version Numbers
Index
See About MongoDB Documentation (page 637) for more information about the MongoDB Documentation project,
this Manual and additional editions of this text.
Note: This version of the PDF does not include the reference section. See the MongoDB Reference Manual1 for a PDF
edition of all MongoDB reference material.
1 http://docs.mongodb.org/v2.4/MongoDB-reference-manual.pdf
CHAPTER 1
Install MongoDB
MongoDB runs on most platforms and supports both 32-bit and 64-bit architectures.
1.1.1 Linux
Install on Linux (page 3)
Install on Linux
These documents provide instructions to install MongoDB for various Linux systems.
Recommended
For easy installation, MongoDB provides packages for popular Linux distributions. The following guides detail the
installation process for these systems:
Install on Red Hat Enterprise Linux (page 4) Install MongoDB on Red Hat Enterprise, CentOS, Fedora and related
Linux systems using .rpm packages.
Install on Ubuntu (page 6) Install MongoDB on Ubuntu Linux systems using .deb packages.
Install on Debian (page 7) Install MongoDB on Debian systems using .deb packages.
For systems without supported packages, refer to the Manual Installation tutorial.
Manual Installation
Although packages are the preferred installation method, for Linux systems without supported packages, see the
following guide:
Install on Other Linux Systems (page 9) Install MongoDB on other Linux systems from the MongoDB archives.
Install MongoDB on Red Hat Enterprise, CentOS, Fedora, or Amazon Linux
This tutorial outlines the steps to install MongoDB on Red Hat Enterprise Linux, CentOS Linux, Fedora Linux, and
related systems. The tutorial uses .rpm packages to install. While some of these distributions include their own
MongoDB packages, the official MongoDB packages are generally more up to date.
Packages
The MongoDB downloads repository contains two packages:
mongo-10gen-server
This package contains the mongod and mongos daemons from the latest stable release and associated configuration and init scripts. Additionally, you can use this package to install daemons from a previous release
(page 4) of MongoDB.
mongo-10gen
This package contains all MongoDB tools from the latest stable release. Additionally, you can use this package
to install tools from a previous release (page 4) of MongoDB. Install this package on all production MongoDB
hosts and optionally on other systems from which you may need to administer MongoDB systems.
Install MongoDB
Configure Package Management System (YUM)
Create a /etc/yum.repos.d/mongodb.repo file to hold
the following configuration information for the MongoDB repository:
Tip
For production deployments, always run MongoDB on 64-bit systems.
If you are running a 64-bit system, use the following configuration:
[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/x86_64/
gpgcheck=0
enabled=1
If you are running a 32-bit system, which is not recommended for production deployments, use the following configuration:
[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/i686/
gpgcheck=0
enabled=1
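The two repository definitions above differ only in the architecture directory of the baseurl. The following is a minimal sketch of choosing between them from a machine architecture name such as the output of uname -m. The function names and the idea of passing the target file as an argument are assumptions made for illustration; on a real system the target would be /etc/yum.repos.d/mongodb.repo and writing it requires root privileges.

```shell
#!/bin/sh
# Sketch only: mongodb_repo_arch and write_mongodb_repo are hypothetical
# helper names, not part of MongoDB or yum.

mongodb_repo_arch() {
    # Map a machine architecture name onto one of the two repository
    # directories shown above; anything else is rejected.
    case "$1" in
        x86_64) echo "x86_64" ;;
        i386|i486|i586|i686) echo "i686" ;;
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

write_mongodb_repo() {
    # $1: machine architecture, $2: path of the repo file to write
    arch=$(mongodb_repo_arch "$1") || return 1
    cat > "$2" <<EOF
[mongodb]
name=MongoDB Repository
baseurl=http://downloads-distro.mongodb.org/repo/redhat/os/${arch}/
gpgcheck=0
enabled=1
EOF
}

# Example (as root):
#   write_mongodb_repo "$(uname -m)" /etc/yum.repos.d/mongodb.repo
```

Rejecting unknown architectures, rather than falling back to a default, avoids silently pointing yum at a repository that does not match the system.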
Install Packages
Issue the following command (as root or with sudo) to install the latest stable version of MongoDB and the associated tools:
yum install mongo-10gen mongo-10gen-server
The command above installs the latest stable release of the mongo-10gen and mongo-10gen-server packages. You can
instead specify any available version of MongoDB, such as the 2.2.3 release; however, yum will upgrade the
mongo-10gen and mongo-10gen-server packages when a newer version becomes available. Use the following pinning
procedure to prevent unintended upgrades.
To pin a package, add the following line to your /etc/yum.conf file:
exclude=mongo-10gen,mongo-10gen-server
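Adding the exclude line by hand is straightforward, but when the step is scripted it is worth making it idempotent so that repeated runs do not append duplicates. The following is a minimal sketch under that assumption. The function name pin_mongodb_packages is hypothetical, and the sketch does not merge with an exclude line that already lists other packages.

```shell
#!/bin/sh
# Sketch only: pin_mongodb_packages is a hypothetical helper name.

pin_mongodb_packages() {
    # $1: path to the yum configuration file (normally /etc/yum.conf)
    conf="$1"
    line="exclude=mongo-10gen,mongo-10gen-server"
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$conf" 2>/dev/null; then
        printf '%s\n' "$line" >> "$conf"
    fi
}

# Example (as root):
#   pin_mongodb_packages /etc/yum.conf
```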
Control Scripts
Warning: With the introduction of systemd in Fedora 15, the control scripts included in the packages available
in the MongoDB downloads repository are not compatible with Fedora systems. A correction is forthcoming;
see SERVER-7285a for more information. In the meantime, use your own control scripts or install using the
procedure outlined in Install MongoDB on Linux Systems (page 9).
a https://jira.mongodb.org/browse/SERVER-7285
The packages include various control scripts, including the init script /etc/rc.d/init.d/mongod. These
packages configure MongoDB using the /etc/mongod.conf file in conjunction with the control scripts.
As of version 2.4.9, there are no control scripts for mongos. mongos is only used in sharding deployments
(page 484). You can use the mongod init script to derive your own mongos control script.
Run MongoDB
Important: You must configure SELinux to allow MongoDB to start on Fedora systems. Administrators have two
options:
• Enable access to the relevant ports (e.g. 27017) for SELinux. See Configuration Options (page 242) for more
information on MongoDB's default ports (page 275).
• Disable SELinux entirely. This requires a system reboot and may have larger implications for your deployment.
Start MongoDB
The MongoDB instance stores its data files in /var/lib/mongo and its log files in
/var/log/mongo, and runs using the mongod user account. If you change the user that runs the MongoDB process,
you must modify the access control rights to the /var/lib/mongo and /var/log/mongo directories.
Start the mongod process by issuing the following command (as root or with sudo):
service mongod start
You can verify that the mongod process has started successfully by checking the contents of the log file at
/var/log/mongo/mongod.log.
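Checking the log can also be scripted. The following is a minimal sketch that assumes the log contains the "waiting for connections" message once mongod is ready, which is the same message this manual describes for the console output on Windows. The function name mongod_log_ready is hypothetical. Because the log may hold entries from earlier runs, a match does not by itself prove that the current process started successfully.

```shell
#!/bin/sh
# Sketch only: mongod_log_ready is a hypothetical helper name.

mongod_log_ready() {
    # $1: path to the mongod log file
    # Succeeds (exit status 0) if the readiness message appears in the log.
    grep -q "waiting for connections" "$1"
}

# Example:
#   if mongod_log_ready /var/log/mongo/mongod.log; then
#       echo "mongod reported that it is accepting connections"
#   fi
```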
You may optionally ensure that MongoDB will start following a system reboot by issuing the following command
(with root privileges):
chkconfig mongod on
Stop MongoDB
Stop the mongod process by issuing the following command (as root or with sudo):
service mongod stop
Restart MongoDB
You can restart the mongod process by issuing the following command (as root or with sudo):
service mongod restart
Follow the state of this process by watching the /var/log/mongo/mongod.log file for
errors or important messages from the server.
Install MongoDB on Ubuntu
This tutorial outlines the steps to install MongoDB on Ubuntu Linux systems. The
tutorial uses .deb packages to install. Although Ubuntu includes its own MongoDB packages, the official MongoDB
packages are generally more up to date.
Note: If you use an older Ubuntu that does not use Upstart (i.e. any version before 9.10 Karmic), please follow the
instructions in the Install MongoDB on Debian (page 7) tutorial.
Package Options
The MongoDB downloads repository provides the mongodb-10gen package, which contains
the latest stable release. Additionally, you can install previous releases (page 6) of MongoDB.
You cannot install this package concurrently with the mongodb, mongodb-server, or mongodb-clients packages provided by Ubuntu.
Install MongoDB
Configure Package Management System (APT)
The Ubuntu package management tools (i.e. dpkg and apt)
ensure package consistency and authenticity by requiring that distributors sign packages with GPG keys. Issue the
following command to import the MongoDB public GPG Key1:
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
Install Packages
Issue the following command to install the latest stable version of MongoDB:
sudo apt-get install mongodb-10gen
When this command completes, you have successfully installed MongoDB! Continue for configuration and start-up
suggestions.
Manage Installed Versions
You can use the mongodb-10gen package to install previous versions of MongoDB.
To install a specific release, append the version number to the package name, as in the following example:
apt-get install mongodb-10gen=2.2.3
This will install the 2.2.3 release of MongoDB. You can specify any available version of MongoDB; however
apt-get will upgrade the mongodb-10gen package when a newer version becomes available. Use the following
pinning procedure to prevent unintended upgrades.
1 http://docs.mongodb.org/10gen-gpg-key.asc
To pin a package, issue the following command at the system prompt to pin the version of MongoDB at the currently
installed version:
echo "mongodb-10gen hold" | sudo dpkg --set-selections
You can verify that mongod has started successfully by checking the contents of the log file at
/var/log/mongodb/mongodb.log.
Stop MongoDB
As needed, you may stop the mongod process by issuing the following command:
sudo service mongodb stop
Restart MongoDB
You may restart the mongod process by issuing the following command:
sudo service mongodb restart
Install MongoDB on Debian
This tutorial outlines the steps to install MongoDB on Debian systems. The tutorial
uses .deb packages to install. While some Debian distributions include their own MongoDB packages, the official
MongoDB packages are generally more up to date.
Note: This tutorial applies to both Debian systems and versions of Ubuntu Linux prior to 9.10 Karmic which do
not use Upstart. Other Ubuntu users will want to follow the Install MongoDB on Ubuntu (page 6) tutorial.
Package Options
The downloads repository provides the mongodb-10gen package, which contains the latest
stable release. Additionally, you can install previous releases (page 8) of MongoDB.
You cannot install this package concurrently with the mongodb, mongodb-server, or mongodb-clients packages that your release of Debian may include.
Install MongoDB
Configure Package Management System (APT)
The Debian package management tools (i.e. dpkg and apt)
ensure package consistency and authenticity by requiring that distributors sign packages with GPG keys.
Step 1: Import MongoDB PGP key.
Issue the following command to add the MongoDB public GPG Key2 to the
system key ring.
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv 7F0CEB10
Step 3: Reload local package database.
Issue the following command to reload the local package database:
sudo apt-get update
Install Packages
Issue the following command to install the latest stable version of MongoDB:
sudo apt-get install mongodb-10gen
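The command above installs the latest stable release. To install a specific release instead, such as 2.2.3, append
the version number to the package name (for example, mongodb-10gen=2.2.3). You can specify any available version
of MongoDB; however, apt-get will upgrade the mongodb-10gen package when a newer version becomes available. Use
the following pinning procedure to prevent unintended upgrades.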
To pin a package, issue the following command at the system prompt to pin the version of MongoDB at the currently
installed version:
echo "mongodb-10gen hold" | sudo dpkg --set-selections
You can verify that mongod has started successfully by checking the contents of the log file at
/var/log/mongodb/mongodb.log.
2 http://docs.mongodb.org/10gen-gpg-key.asc
Install MongoDB on Linux Systems
Compiled versions of MongoDB for Linux provide a simple option for installing MongoDB on other Linux systems
without supported packages.
Installation Process
MongoDB provides archives for both 64-bit and 32-bit Linux. Follow the installation procedure
appropriate for your system.
Install for 64-bit Linux
Step 1: Download the Latest Release
In a system shell, download the latest release for 64-bit Linux.
curl -O http://downloads.mongodb.org/linux/mongodb-linux-x86_64-2.4.9.tgz
Step 2: Extract MongoDB From Archive
Extract the files from the downloaded archive.
tar -zxvf mongodb-linux-x86_64-2.4.9.tgz
Step 3: Optional. Copy MongoDB to Target Directory
Copy the extracted folder into another location, such as
mongodb.
mkdir -p mongodb
cp -R -n mongodb-linux-x86_64-2.4.9/ mongodb
Step 4: Optional. Configure Search Path
To ensure that the downloaded binaries are in your PATH, you can
modify your PATH and/or create symbolic links to the MongoDB binaries in your /usr/local/bin directory
(/usr/local/bin is already in your PATH). You can find the MongoDB binaries in the bin/ directory within the
archive.
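As an illustration of the first option, modifying PATH, the following is a minimal sketch that prepends a directory only if it is not already listed. The function name add_to_path and the directory in the example are assumptions; substitute the bin/ directory of your own extracted or copied MongoDB folder.

```shell
#!/bin/sh
# Sketch only: add_to_path is a hypothetical helper name.

add_to_path() {
    # $1: directory to prepend to PATH, unless it is already present
    case ":$PATH:" in
        *":$1:"*) ;;                       # already listed, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Example (hypothetical location of the MongoDB binaries):
#   add_to_path "$HOME/mongodb/bin"
```

A change made this way lasts only for the current shell session; to make it permanent, place the call in your shell startup file.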
Install for 32-bit Linux
Step 1: Download the Latest Release
In a system shell, download the latest release for 32-bit Linux.
curl -O http://downloads.mongodb.org/linux/mongodb-linux-i686-2.4.9.tgz
Step 2: Extract MongoDB From Archive
Extract the files from the downloaded archive.
tar -zxvf mongodb-linux-i686-2.4.9.tgz
Step 3: Optional. Copy MongoDB to Target Directory
Copy the extracted folder into another location, such as
mongodb.
mkdir -p mongodb
cp -R -n mongodb-linux-i686-2.4.9/ mongodb
Step 4: Optional. Configure Search Path
To ensure that the downloaded binaries are in your PATH, you can
modify your PATH and/or create symbolic links to the MongoDB binaries in your /usr/local/bin directory
(/usr/local/bin is already in your PATH). You can find the MongoDB binaries in the bin/ directory within the
archive.
Run MongoDB
Set Up the Data Directory
Before you start mongod for the first time, you will need to create the data directory
(i.e. dbpath). By default, mongod writes data to the /data/db directory.
Step 1: Create dbpath
To create the default dbpath directory, use the following command:
mkdir -p /data/db
Step 2: Set dbpath Permissions
Ensure that the user that runs the mongod process has read and write permissions
to this directory. For example, if a user named mongodb will run the mongod process, make that user the owner of
the /data/db directory:
chown mongodb /data/db
If your PATH does not include the location of the mongod binary, enter the full path to the mongod binary.
Starting mongod without any arguments starts a MongoDB instance that writes data to the /data/db directory. To
specify an alternate data directory, start mongod with the --dbpath option:
mongod --dbpath <some alternate directory>
Whether using the default /data/db or an alternate directory, ensure that the user account running mongod has read
and write permissions to the directory.
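The directory and permission requirements above can be checked before starting mongod. The following is a minimal sketch; the function name prepare_dbpath and the example directory are assumptions. The check applies to the user who runs the function, so run it as the account that will run mongod.

```shell
#!/bin/sh
# Sketch only: prepare_dbpath is a hypothetical helper name.

prepare_dbpath() {
    # $1: data directory to create (if needed) and check
    dbpath="$1"
    mkdir -p "$dbpath" || return 1
    if [ -r "$dbpath" ] && [ -w "$dbpath" ]; then
        echo "ok: $dbpath"
    else
        echo "not readable and writable: $dbpath" >&2
        return 1
    fi
}

# Example (hypothetical alternate data directory):
#   prepare_dbpath "$HOME/mongodb-data" && mongod --dbpath "$HOME/mongodb-data"
```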
Stop MongoDB
To stop the mongod instance, press Control+C in the terminal where the mongod instance is
running.
1.1.2 OS X
Install MongoDB on OS X (page 11)
Install MongoDB on OS X
Platform Support
Starting in version 2.4, MongoDB supports only OS X 10.6 (Snow Leopard) and later, on Intel x86-64.
MongoDB is available through the popular OS X package manager Homebrew3 or through the MongoDB Download
site.
Install MongoDB with Homebrew
Homebrew4 5 installs binary packages based on published formulae. In a terminal shell, use the following
sequence of commands to update brew to the latest packages and install MongoDB:
brew update
brew install mongodb
Later, if you need to upgrade MongoDB, run the following sequence of commands to update the MongoDB installation
on your system:
brew update
brew upgrade mongodb
Optionally, you can choose to build MongoDB from source. Use the following command to build MongoDB with
SSL support:
brew install mongodb --with-openssl
You can also install the latest development release of MongoDB for testing and development with the following
command:
brew install mongodb --devel
Manual Installation
Step 1: Download the Latest Release
In a system shell, download the latest release for 64-bit OS X.
curl -O http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.4.9.tgz
5 Homebrew requires some initial setup and configuration. This configuration is beyond the scope of this document.
Step 2: Extract MongoDB From Archive
Extract the files from the downloaded archive.
tar -zxvf mongodb-osx-x86_64-2.4.9.tgz
Step 3: Optional. Copy MongoDB to Target Directory
Copy the extracted folder into another location, such as
mongodb.
mkdir -p mongodb
cp -R -n mongodb-osx-x86_64-2.4.9/ mongodb
Step 4: Optional. Configure Search Path
To ensure that the downloaded binaries are in your PATH, you can
modify your PATH and/or create symbolic links to the MongoDB binaries in your /usr/local/bin directory
(/usr/local/bin is already in your PATH). You can find the MongoDB binaries in the bin/ directory within the
archive.
Run MongoDB
Set Up the Data Directory
Before you start mongod for the first time, you will need to create the data directory.
By default, mongod writes data to the /data/db/ directory.
Step 1: Create dbpath
To create the default dbpath directory, use the following command:
mkdir -p /data/db
Step 2: Set dbpath Permissions
Ensure that the user that runs the mongod process has read and write permissions
to this directory. For example, if you will run the mongod process, change the owner of the /data/db directory:
chown `id -u` /data/db
If your PATH does not include the location of the mongod binary, enter the full path to the mongod binary.
Starting mongod without any arguments starts a mongod instance that writes data to the /data/db/ directory. To
specify an alternate data directory, start mongod with the --dbpath option:
mongod --dbpath <some alternate directory>
Whether using the default /data/db/ or an alternate directory, ensure that the user account running mongod has
read and write permissions to the directory.
Stop MongoDB
To stop the mongod instance, press Control+C in the terminal where the mongod instance is
running.
1.1.3 Windows
Install MongoDB on Windows (page 13)
Install MongoDB on Windows
Platform Support
Starting in version 2.2, MongoDB does not support Windows XP. Please use a more recent version of Windows to use
more recent releases of MongoDB.
Important: If you are running any edition of Windows Server 2008 R2 or Windows 7, please install a hotfix to
resolve an issue with memory mapped files on Windows6.
1. Download the latest production release of MongoDB from the MongoDB downloads page7. Ensure you download
the correct version of MongoDB for your Windows system. The 64-bit versions of MongoDB will not
work with 32-bit Windows.
2. Extract the downloaded archive.
(a) In Windows Explorer, find the MongoDB download file, typically in the default Downloads directory.
(b) Extract the archive to C:\ by right clicking on the archive and selecting Extract All and browsing to C:\.
3. Optional. Move the MongoDB directory to another location. For example, to move the folder to the
C:\mongodb directory, issue the following commands in the Command Prompt:
cd \
move C:\mongodb-win32-* C:\mongodb
Note: MongoDB is self-contained and does not have any other system dependencies. You can run MongoDB from
any folder you choose. You may install MongoDB in any directory (e.g. D:\test\mongodb).
Run MongoDB
Set Up the Data Directory
MongoDB requires a data folder to store its files. The default location for the MongoDB
data directory is C:\data\db. Create this folder using the Command Prompt. Go to the C:\ directory and issue the
following command sequence:
md data
md data\db
You can specify an alternate path for data files using the --dbpath option to mongod.exe.
Start MongoDB
To start MongoDB, execute from the Command Prompt:
C:\mongodb\bin\mongod.exe
This will start the main MongoDB database process. The waiting for connections message in the console
output indicates that the mongod.exe process is running successfully.
Note: Depending on the security level of your system, Windows may issue a Security Alert dialog box about blocking
some features of C:\mongodb\bin\mongod.exe from communicating on networks. All users should select
Private Networks, such as my home or work network and click Allow access. For additional
information on security and MongoDB, please read the Security Concepts (page 239) page.
Warning: Do not allow mongod.exe to be accessible to public networks without running in Secure Mode (i.e.
auth). MongoDB is designed to be run in trusted environments, and the database does not enable authentication
or Secure Mode by default.
You may specify an alternate path for \data\db with the dbpath setting for mongod.exe, as in the following
example:
C:\mongodb\bin\mongod.exe --dbpath d:\test\mongodb\data
If your path includes spaces, enclose the entire path in double quotation marks, for example:
C:\mongodb\bin\mongod.exe --dbpath "d:\test\mongo db data"
Connect to MongoDB
Connect to MongoDB using the mongo.exe shell. Open another Command Prompt and
issue the following command:
C:\mongodb\bin\mongo.exe
Note: Executing the command start C:\mongodb\bin\mongo.exe will automatically start the mongo.exe
shell in a separate Command Prompt window.
The mongo.exe shell will connect to mongod.exe running on the localhost interface and port 27017 by default.
At the mongo.exe prompt, issue the following two commands to insert a record in the test collection of the default
test database and then retrieve that record:
db.test.save( { a: 1 } )
db.test.find()
See also:
mongo and http://docs.mongodb.org/manual/reference/method. If you want to develop applications
using .NET, see the documentation of C# and MongoDB8 for more information.
MongoDB as a Windows Service
Configure the System
The following steps, although optional, are good practice.
You should specify two options when running MongoDB as a Windows Service: a path for the log output (i.e.
logpath) and a configuration file.
1. Optional. Create a specific directory for MongoDB log files:
md C:\mongodb\log
2. Optional. Create a configuration file for the logpath option for MongoDB in the Command Prompt by issuing
this command:
echo logpath=C:\mongodb\log\mongo.log > C:\mongodb\mongod.cfg
Note: Consider setting the logappend option. If you do not, mongod.exe will delete the contents of the existing
log file when starting.
Changed in version 2.2: The default logpath and logappend behavior changed in the 2.2 release.
Install and Run the MongoDB Service
Run all of the following commands in Command Prompt with Administrative Privileges:
1. To install the MongoDB service:
C:\mongodb\bin\mongod.exe --config C:\mongodb\mongod.cfg --install
Modify the path to the mongod.cfg file as needed. For the --install option to succeed, you must specify
a logpath setting or the --logpath run-time option.
2. To run the MongoDB service:
net start MongoDB
8 http://docs.mongodb.org/ecosystem/drivers/csharp
If you wish to use an alternate path for your dbpath, specify it in the config file (e.g. C:\mongodb\mongod.cfg)
that you specified in the --install operation. You may also specify --dbpath on the command line; however,
always prefer the configuration file.
If you have not set up the data directory, set up the data directory (page 14) where MongoDB will store its data files.
If the dbpath directory does not exist, mongod.exe will not be able to start. The default value for dbpath is
\data\db.
Stop or Remove the MongoDB Service
To stop the MongoDB service:
net stop MongoDB
Changed in version 2.4.4: MongoDB Enterprise uses Cyrus SASL instead of GNU SASL. Earlier 2.4 Enterprise
versions use GNU SASL (libgsasl) instead. For required packages for the earlier 2.4 versions, see Earlier 2.4
Versions (page 17).
To use MongoDB Enterprise, you must install several prerequisites. The names of the packages vary by distribution
and are as follows:
Debian or Ubuntu 12.04 require: libssl0.9.8, snmp, snmpd, cyrus-sasl2-dbg, cyrus-sasl2-mit-dbg, libsasl2-2,
libsasl2-dev, libsasl2-modules, and libsasl2-modules-gssapi-mit. Issue a command such as the following to
install these packages:
sudo apt-get install libssl0.9.8 snmp snmpd cyrus-sasl2-dbg cyrus-sasl2-mit-dbg libsasl2-2 libsa
CentOS and Red Hat Enterprise Linux 6.x and 5.x, as well as Amazon Linux AMI require:
net-snmp, net-snmp-libs, openssl, net-snmp-utils, cyrus-sasl, cyrus-sasl-lib,
cyrus-sasl-devel, and cyrus-sasl-gssapi. Issue a command such as the following to install these
packages:
sudo yum install openssl net-snmp net-snmp-libs net-snmp-utils cyrus-sasl cyrus-sasl-lib cyrus-s
Earlier 2.4 Versions
Before version 2.4.4, the 2.4 versions of MongoDB Enterprise use libgsasl10. The required
packages for the different distributions are as follows:
Ubuntu 12.04 requires libssl0.9.8, libgsasl, snmp, and snmpd. Issue a command such as the following to install these packages:
sudo apt-get install libssl0.9.8 libgsasl7 snmp snmpd
Red Hat Enterprise Linux 6.x series and Amazon Linux AMI require openssl, libgsasl7, net-snmp,
net-snmp-libs, and net-snmp-utils. To download libgsasl you must enable the EPEL repository
by issuing the following sequence of commands to add and update the system repositories:
sudo rpm -ivh http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
sudo yum update -y
When you have installed and updated the EPEL repositories, issue the following command to install these packages:
sudo yum install openssl net-snmp net-snmp-libs net-snmp-utils libgsasl
Note: Before 2.4.4, MongoDB Enterprise 2.4 for SUSE requires libgsasl11 which is not available in the default
repositories for SUSE.
When you have installed the required packages and downloaded the Enterprise packages12, you can install the packages
using the same procedure as a standard installation of MongoDB on Linux Systems (page 9).
Note: .deb and .rpm packages for Enterprise releases are available for some platforms. You can use these to install
MongoDB directly using the dpkg and rpm utilities.
Use the sequence of commands below to download and extract MongoDB Enterprise packages appropriate for your
distribution:
Ubuntu 12.04
curl -O http://downloads.10gen.com/linux/mongodb-linux-x86_64-subscription-ubuntu1204-2.4.9.tgz
tar -zxvf mongodb-linux-x86_64-subscription-ubuntu1204-2.4.9.tgz
cp -R -n mongodb-linux-x86_64-subscription-ubuntu1204-2.4.9/ mongodb
Red Hat Enterprise Linux 6.x
curl -O http://downloads.10gen.com/linux/mongodb-linux-x86_64-subscription-rhel62-2.4.9.tgz
tar -zxvf mongodb-linux-x86_64-subscription-rhel62-2.4.9.tgz
cp -R -n mongodb-linux-x86_64-subscription-rhel62-2.4.9/ mongodb
Note: The Enterprise packages currently include an example SNMP configuration file named mongod.conf. This
file is not a MongoDB configuration file.
Before you start mongod for the first time, you will need to create the data directory (i.e. dbpath). By default,
mongod writes data to the /data/db directory.
Step 1: Create dbpath
To create the default dbpath directory, use the following command:
mkdir -p /data/db
Step 2: Set dbpath Permissions
Ensure that the user that runs the mongod process has read and write permissions
to this directory. For example, if a user named mongodb will run the mongod process, make that user the owner of
the /data/db directory:
chown mongodb /data/db
If your PATH does not include the location of the mongod binary, enter the full path to the mongod binary.
Starting mongod without any arguments starts a MongoDB instance that writes data to the /data/db directory. To
specify an alternate data directory, start mongod with the --dbpath option:
mongod --dbpath <some alternate directory>
Whether using the default /data/db or an alternate directory, ensure that the user account running mongod has read
and write permissions to the directory.
Stop MongoDB
To stop the mongod instance, press Control+C in the terminal where the mongod instance is
running.
Further Reading
As you begin to use MongoDB, consider the Getting Started with MongoDB (page 19) and MongoDB Tutorials
(page 186) resources. To read about features only available in MongoDB Enterprise, consider: Monitor MongoDB
with SNMP (page 177) and Deploy MongoDB with Kerberos Authentication (page 261).
From a system prompt, start mongo by issuing the mongo command, as follows:
mongo
By default, mongo looks for a database server listening on port 27017 on the localhost interface. To connect to
a server on a different port or interface, use the --port and --host options.
Select a Database
After starting the mongo shell, your session uses the test database by default. At any time, issue the following
operation at the mongo shell prompt to report the name of the current database:
db
13 http://api.mongodb.org/js
1. From the mongo shell, display the list of databases, with the following operation:
show dbs
2. Switch to a new database named mydb, with the following operation:
use mydb
3. Confirm that your session has the mydb database as context, by checking the value of the db object, which
returns the name of the current database, as follows:
db
At this point, if you issue the show dbs operation again, it will not include the mydb database. MongoDB
will not permanently create a database until you insert data into that database. The Create a Collection and
Insert Documents (page 20) section describes the process for inserting data.
New in version 2.4: show databases also returns a list of databases.
Display mongo Help
At any point, you can access help for the mongo shell using the following operation:
help
Furthermore, you can append the .help() method to some JavaScript methods, any cursor object, as well as the db
and db.collection objects to return additional help information.
Create a Collection and Insert Documents
In this section, you insert documents into a new collection named testData within the new database named mydb.
MongoDB will create a collection implicitly upon its first use. You do not need to create a collection before inserting
data. Furthermore, because MongoDB uses dynamic schemas (page 556), you also need not specify the structure of
your documents before inserting them into the collection.
1. From the mongo shell, confirm you are in the mydb database by issuing the following:
db
2. If mongo does not return mydb for the previous operation, set the context to the mydb database, with the
following operation:
use mydb
3. Create two documents named j and k by using the following sequence of JavaScript operations:
j = { name : "mongo" }
k = { x : 3 }
4. Insert the j and k documents into the testData collection with the following sequence of operations:
db.testData.insert( j )
db.testData.insert( k )
When you insert the first document, the mongod will create both the mydb database and the testData
collection.
5. Confirm that the testData collection exists. Issue the following operation:
show collections
The mongo shell will return the list of the collections in the current (i.e. mydb) database. At this point, the only
collection is testData. All mongod databases also have a system.indexes (page 227) collection.
6. Confirm that the documents exist in the testData collection by issuing a query on the collection using the
find() method:
db.testData.find()
This operation returns the following results. The ObjectId (page 129) values will be unique:
{ "_id" : ObjectId("4c2209f9f3924d31102bd84a"), "name" : "mongo" }
{ "_id" : ObjectId("4c2209fef3924d31102bd84b"), "x" : 3 }
All MongoDB documents must have an _id field with a unique value. These operations do not explicitly
specify a value for the _id field, so mongo creates a unique ObjectId (page 129) value for the field before
inserting it into the collection.
Insert Documents using a For Loop or a JavaScript Function
To perform the remaining procedures in this tutorial, first add more documents to your database using one or both of
the procedures described in Generate Test Data (page 23).
Working with the Cursor
When you query a collection, MongoDB returns a cursor object that contains the results of the query. The mongo
shell then iterates over the cursor to display the results. Rather than returning all results at once, the shell iterates over
the cursor 20 times to display the first 20 results and then waits for a request to iterate over the remaining results. In
the shell, enter it to iterate over the next set of results.
The procedures in this section show other ways to work with a cursor. For comprehensive documentation on cursors,
see Cursors (page 35).
Iterate over the Cursor with a Loop
Before using this procedure, make sure to add at least 25 documents to a collection using one of the procedures in
Generate Test Data (page 23). You can name your database and collections anything you choose, but this procedure
will assume the database named test and a collection named testData.
1. In the MongoDB JavaScript shell, query the testData collection and assign the resulting cursor object to the
c variable:
var c = db.testData.find()
2. Print the full result set by using a while loop to iterate over the c variable:
while ( c.hasNext() ) printjson( c.next() )
The hasNext() function returns true if the cursor has documents. The next() method returns the next
document. The printjson() method renders the document in a JSON-like format.
The operation displays every document that the cursor returns. For example, if the documents have a single field named x, the operation
displays the field as well as each document's ObjectId. The first 20 documents resemble the following:
21
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
ObjectId("51a7dc7b2cacf40b79990be6"),
ObjectId("51a7dc7b2cacf40b79990be7"),
ObjectId("51a7dc7b2cacf40b79990be8"),
ObjectId("51a7dc7b2cacf40b79990be9"),
ObjectId("51a7dc7b2cacf40b79990bea"),
ObjectId("51a7dc7b2cacf40b79990beb"),
ObjectId("51a7dc7b2cacf40b79990bec"),
ObjectId("51a7dc7b2cacf40b79990bed"),
ObjectId("51a7dc7b2cacf40b79990bee"),
ObjectId("51a7dc7b2cacf40b79990bef"),
ObjectId("51a7dc7b2cacf40b79990bf0"),
ObjectId("51a7dc7b2cacf40b79990bf1"),
ObjectId("51a7dc7b2cacf40b79990bf2"),
ObjectId("51a7dc7b2cacf40b79990bf3"),
ObjectId("51a7dc7b2cacf40b79990bf4"),
ObjectId("51a7dc7b2cacf40b79990bf5"),
ObjectId("51a7dc7b2cacf40b79990bf6"),
ObjectId("51a7dc7b2cacf40b79990bf7"),
ObjectId("51a7dc7b2cacf40b79990bf8"),
ObjectId("51a7dc7b2cacf40b79990bf9"),
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
1 }
2 }
3 }
4 }
5 }
6 }
7 }
8 }
9 }
10 }
11 }
12 }
13 }
14 }
15 }
16 }
17 }
18 }
19 }
20 }
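The same hasNext() and next() pattern also works inside a function. The following is a minimal sketch, not part of the mongo shell: collectX is a hypothetical helper name, and the code assumes that each document has an x field, as in the testData collection. It gathers the values into an array rather than printing them:

```javascript
// Hypothetical helper: collect the x field of every document a cursor returns.
function collectX(cursor) {
  var values = [];
  while (cursor.hasNext()) {        // true while the cursor has documents left
    values.push(cursor.next().x);   // next() returns the next document
  }
  return values;
}
```

In the shell, you could call it as collectX( db.testData.find() ). Because the loop exhausts the cursor, the function suits small result sets only.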
The following procedure lets you manipulate a cursor object as if it were an array:
1. In the mongo shell, query the testData collection and assign the resulting cursor object to the c variable:
var c = db.testData.find()
2. To find the document at the array index 4, use the following operation:
printjson( c [ 4 ] )
When you access documents in a cursor using the array index notation, mongo first calls the
cursor.toArray() method and loads into RAM all documents returned by the cursor. The index is then
applied to the resulting array. This operation iterates the cursor completely and exhausts the cursor.
For very large result sets, mongo may run out of available memory.
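When you need only one document, you may be able to avoid loading the full result set. The sketch below is one possible approach, assuming the skip() and limit() cursor methods and a hypothetical helper named documentAt. It asks the server for a single document at a given position instead of converting the whole cursor to an array:

```javascript
// Hypothetical helper: return the document at a zero-based position,
// or null if the collection has fewer documents than that.
function documentAt(collection, index) {
  // skip() passes over the first `index` documents; limit(1) stops after one.
  var cursor = collection.find().skip(index).limit(1);
  return cursor.hasNext() ? cursor.next() : null;
}
```

In the shell, printjson( documentAt( db.testData, 4 ) ) would print the fifth document in the order that the query returns documents. The server still has to walk past the skipped documents, so large index values remain expensive.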
For more information on the cursor, see Cursors (page 35).
Query for Specific Documents
MongoDB has a rich query system that allows you to select and filter the documents in a collection along specific
fields and values. See Query Documents (page 60) and Read Operations (page 31) for a full account of queries in
MongoDB.
In this procedure, you query for specific documents in the testData collection by passing a query document as a
parameter to the find() method. A query document specifies the criteria the query must match to return a document.
In the mongo shell, query for all documents where the x field has a value of 18 by passing the { x : 18 } query
document as a parameter to the find() method:
db.testData.find( { x : 18 } )
With the findOne() method you can return a single document from a MongoDB collection. The findOne()
method takes the same parameters as find(), but returns a document rather than a cursor.
To retrieve one document from the testData collection, issue the following command:
db.testData.findOne()
For more information on querying for documents, see the Query Documents (page 60) and Read Operations (page 31)
documentation.
Limit the Number of Documents in the Result Set
To increase performance, you can constrain the size of the result by limiting the amount of data your application must
receive over the network.
To specify the maximum number of documents in the result set, call the limit() method on a cursor, as in the
following command:
db.testData.find().limit(3)
MongoDB will return the following result, with different ObjectId (page 129) values:
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be6"), "x" : 1 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be7"), "x" : 2 }
{ "_id" : ObjectId("51a7dc7b2cacf40b79990be8"), "x" : 3 }
1. From the mongo shell, insert new documents into the testData collection using the following for loop. If
the testData collection does not exist, MongoDB creates the collection implicitly.
for (var i = 1; i <= 25; i++) db.testData.insert( { x : i } )
The mongo shell displays the first 20 documents in the collection. Your ObjectId (page 129) values will be
different:
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
{
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
"_id"
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
ObjectId("51a7dc7b2cacf40b79990be6"),
ObjectId("51a7dc7b2cacf40b79990be7"),
ObjectId("51a7dc7b2cacf40b79990be8"),
ObjectId("51a7dc7b2cacf40b79990be9"),
ObjectId("51a7dc7b2cacf40b79990bea"),
ObjectId("51a7dc7b2cacf40b79990beb"),
ObjectId("51a7dc7b2cacf40b79990bec"),
ObjectId("51a7dc7b2cacf40b79990bed"),
ObjectId("51a7dc7b2cacf40b79990bee"),
ObjectId("51a7dc7b2cacf40b79990bef"),
ObjectId("51a7dc7b2cacf40b79990bf0"),
ObjectId("51a7dc7b2cacf40b79990bf1"),
ObjectId("51a7dc7b2cacf40b79990bf2"),
ObjectId("51a7dc7b2cacf40b79990bf3"),
ObjectId("51a7dc7b2cacf40b79990bf4"),
ObjectId("51a7dc7b2cacf40b79990bf5"),
ObjectId("51a7dc7b2cacf40b79990bf6"),
ObjectId("51a7dc7b2cacf40b79990bf7"),
ObjectId("51a7dc7b2cacf40b79990bf8"),
ObjectId("51a7dc7b2cacf40b79990bf9"),
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
"x"
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
:
1 }
2 }
3 }
4 }
5 }
6 }
7 }
8 }
9 }
10 }
11 }
12 }
13 }
14 }
15 }
16 }
17 }
18 }
19 }
20 }
1. The find() returns a cursor. To iterate the cursor and return more documents use the it operation in the
mongo shell. The mongo shell will exhaust the cursor, and return the following documents:
{ "_id" : ObjectId("51a7dce92cacf40b79990bfc"), "x" : 21 }
{ "_id" : ObjectId("51a7dce92cacf40b79990bfd"), "x" : 22 }
{ "_id" : ObjectId("51a7dce92cacf40b79990bfe"), "x" : 23 }
{ "_id" : ObjectId("51a7dce92cacf40b79990bff"), "x" : 24 }
{ "_id" : ObjectId("51a7dce92cacf40b79990c00"), "x" : 25 }
The insertData() function takes three parameters: a database, a new or existing collection, and the number of
documents to create. The function creates documents with an x field that is set to an incremented integer, as in the
following example documents:
{ "_id" : ObjectId("51a4da9b292904caffcff6eb"), "x" : 0 }
{ "_id" : ObjectId("51a4da9b292904caffcff6ec"), "x" : 1 }
{ "_id" : ObjectId("51a4da9b292904caffcff6ed"), "x" : 2 }
Store the function in your .mongorc.js file. The mongo shell loads the function for you every time you start a session.
Example
Specify database name, collection name, and the number of documents to insert as arguments to insertData().
insertData("test", "testData", 400)
This operation inserts 400 documents into the testData collection in the test database. If the collection and
database do not exist, MongoDB creates them implicitly before inserting documents.
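A function matching the description above could look like the following. Treat it as a minimal sketch rather than the definitive implementation: it assumes the shell's global db object and the getSiblingDB() and getCollection() methods, and it returns the resulting document count, which is a choice made for this illustration.

```javascript
// Sketch of an insertData() helper for the mongo shell.
// Relies on the shell's global `db` object.
function insertData(dbName, colName, num) {
  // getSiblingDB() returns another database without changing the current one.
  var col = db.getSiblingDB(dbName).getCollection(colName);
  for (var i = 0; i < num; i++) {
    col.insert({ x: i });   // x takes the values 0, 1, 2, ... num - 1
  }
  return col.count();       // number of documents now in the collection
}
```

Starting the counter at 0 matches the example documents shown above, where the first document has x set to 0.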
See also:
MongoDB CRUD Concepts (page 29) and Data Models (page 97).
CHAPTER 2
MongoDB provides rich semantics for reading and manipulating data. CRUD stands for create, read, update, and
delete. These terms are the foundation for all interactions with the database.
MongoDB CRUD Introduction (page 27) An introduction to the MongoDB data model as well as queries and data
manipulations.
MongoDB CRUD Concepts (page 29) The core documentation of query and data manipulation.
MongoDB CRUD Tutorials (page 58) Examples of basic query and data modification operations.
MongoDB CRUD Reference (page 82) Reference material for the query and data manipulation interfaces.
Figure 2.3: The stages of a MongoDB query with a query criteria and a sort modifier.
Read Preference
For replica sets and sharded clusters with replica set components, applications specify read preferences (page 406). A
read preference determines how the client directs read operations to the set.
Write Concern
Applications can also control the behavior of write operations using write concern (page 47). Particularly useful
for deployments with replica sets, the write concern semantics allow clients to specify the assurance that MongoDB
provides when reporting on the success of a write operation.
Aggregation
In addition to the basic queries, MongoDB provides several data aggregation features. For example, MongoDB can
return counts of the number of documents that match a query, or return the number of distinct values for a field, or
process a collection of documents using a versatile stage-based data processing pipeline or map-reduce operations.
Distributed Queries (page 38) Describes how sharded clusters and replica sets affect the performance of read
operations.
Write Operations (page 42) Introduces data create and modify operations, their behavior, and their performance.
Write Concern (page 47) Describes the kind of guarantee MongoDB provides when reporting on the success
of a write operation.
Distributed Write Operations (page 51) Describes how MongoDB directs write operations on sharded clusters
and replica sets and the performance characteristics of these operations.
For query operations, MongoDB provides a db.collection.find() method. The method accepts both the
query criteria and projections and returns a cursor (page 35) to the matching documents. You can optionally modify
the query to impose limits, skips, and sort orders.
The following diagram highlights the components of a MongoDB query operation:
This query selects the documents in the users collection that match the condition that age is greater than 18. To specify
the greater than condition, the query criteria uses the greater than (i.e. $gt) query selection operator. The query returns
at most 5 matching documents (or more precisely, a cursor to those documents). The matching documents will return
with only the _id, name and address fields. See Projections (page 33) for details.
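The query described above can be sketched in the mongo shell as follows. This is a reconstruction from the description, not a copy of the original diagram: findAdultUsers is a hypothetical helper name, and the function takes the shell's db object as its argument.

```javascript
// Hypothetical helper built from the description of the query.
function findAdultUsers(db) {
  return db.users
    .find(
      { age: { $gt: 18 } },      // query criteria: age greater than 18
      { name: 1, address: 1 }    // projection: _id is included implicitly
    )
    .limit(5);                   // cursor modifier: at most 5 documents
}
```

In the shell, findAdultUsers( db ) returns a cursor, which the shell iterates automatically when you do not assign it to a variable.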
See
SQL to MongoDB Mapping Chart (page 85) for additional examples of MongoDB queries and the corresponding SQL
statements.
Query Behavior
Consider the following diagram of the query process that specifies a query criteria and a sort modifier:
In the diagram, the query selects documents from the users collection. Using a query selection operator
to define the conditions for matching documents, the query selects documents that have age greater than (i.e. $gt)
18. Then the sort() modifier sorts the results by age in ascending order.
For additional examples of queries, see Query Documents (page 60).
Figure 2.7: The stages of a MongoDB query with a query criteria and a sort modifier.
Projections
Queries in MongoDB return all fields in all matching documents by default. To limit the amount of data that MongoDB
sends to applications, include a projection in the queries. By projecting results with a subset of fields, applications
reduce their network overhead and processing requirements.
Projections, which are the second argument to the find() method, may either specify a list of fields to return or
list fields to exclude in the result documents.
Important: Except for excluding the _id field in inclusive projections, you cannot mix exclusive and inclusive
projections.
Consider the following diagram of the query process that specifies a query criteria and a projection:
In the diagram, the query selects from the users collection. The criteria matches the documents that have age equal
to 18. Then the projection specifies that only the name field should return in the matching documents.
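To make the inclusive and exclusive rules concrete, the following plain JavaScript sketch imitates how a projection shapes one result document. It is an illustration only, limited to top-level fields, and is not how the server implements projections; applyProjection is a hypothetical name.

```javascript
// Illustration only: imitate a projection on the top-level fields of one document.
function applyProjection(doc, projection) {
  var others = Object.keys(projection).filter(function (k) { return k !== "_id"; });
  var included = others.filter(function (k) { return projection[k]; });
  // Apart from _id, every listed field must be included or every one excluded.
  if (included.length > 0 && included.length < others.length) {
    throw new Error("cannot mix inclusive and exclusive projections");
  }
  var inclusive = included.length > 0 ||
    (others.length === 0 && Boolean(projection._id));
  // _id is returned unless the projection explicitly excludes it.
  var keepId = !("_id" in projection) || Boolean(projection._id);
  var result = {};
  Object.keys(doc).forEach(function (field) {
    if (field === "_id") {
      if (keepId) result._id = doc._id;
    } else if (inclusive ? projection[field] : !(field in projection)) {
      result[field] = doc[field];
    }
  });
  return result;
}
```

For example, applying { _id: 0, name: 1 } to a document keeps only its name field, while applying { name: 1, history: 0 } throws because it mixes the two styles.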
Projection Examples
Exclude One Field From a Result Set
db.records.find( { "user_id": { $lt: 42} }, { history: 0} )
This query selects a number of documents in the records collection that match the query { "user_id": {
$lt: 42} }, but excludes the history field.
Figure 2.8: The stages of a MongoDB query with a query criteria and projection. MongoDB only transmits the
projected data to the clients.
Return Two Fields and the _id Field
db.records.find( { "user_id": { $lt: 42} }, { "name": 1, "email": 1 } )
This query selects a number of documents in the records collection that match the query { "user_id": {
$lt: 42} }, but returns documents that have the _id field (implicitly included) as well as the name and email
fields.
Return Two Fields and Exclude _id
db.records.find( { "user_id": { $lt: 42} }, { "_id": 0, "name": 1 , "email": 1 } )
This query selects a number of documents in the records collection that match the query { "user_id": {
$lt: 42} }, but only returns the name and email fields.
See
Limit Fields to Return from a Query (page 64) for more examples of queries with projection statements.
Cursors
In the mongo shell, the primary method for the read operation is the db.collection.find() method. This
method queries a collection and returns a cursor to the matching documents.
To access the documents, you need to iterate the cursor. However, in the mongo shell, if the returned cursor is not
assigned to a variable using the var keyword, then the cursor is automatically iterated up to 20 times [1] to print up to
the first 20 documents in the results.
For example, in the mongo shell, the following read operation queries the inventory collection for documents that
have type equal to food and automatically prints up to the first 20 matching documents:
db.inventory.find( { type: 'food' } );
To manually iterate the cursor to access the documents, see Iterate a Cursor in the mongo Shell (page 65).
Cursor Behaviors
Closure of Inactive Cursors By default, the server will automatically close the cursor after 10 minutes of inactivity
or if the client has exhausted the cursor. To override this behavior, you can specify the noTimeout wire protocol flag [2]
in your query; however, you should then either close the cursor manually or exhaust the cursor. In the mongo shell, you
can set the noTimeout flag:
var myCursor = db.inventory.find().addOption(DBQuery.Option.noTimeout);
See your driver (page 95) documentation for information on setting the noTimeout flag. For the mongo shell, see
cursor.addOption() for a complete list of available cursor flags.
Cursor Isolation Because the cursor is not isolated during its lifetime, intervening write operations on a document
may result in a cursor that returns a document more than once if that document has changed. To handle this situation,
see the information on snapshot mode (page 565).
Cursor Batches The MongoDB server returns the query results in batches. Batch size will not exceed the maximum
BSON document size. For most queries, the first batch returns 101 documents or just enough documents to exceed 1
megabyte. Subsequent batch size is 4 megabytes. To override the default size of the batch, see batchSize() and
limit().
For queries that include a sort operation without an index, the server must load all the documents in memory to perform
the sort and will return all documents in the first batch.
As you iterate through the cursor and reach the end of the returned batch, if there are more results, cursor.next()
will perform a getmore operation to retrieve the next batch. To see how many documents remain in the batch
as you iterate the cursor, you can use the objsLeftInBatch() method, as in the following example:
var myCursor = db.inventory.find();
var myFirstDocument = myCursor.hasNext() ? myCursor.next() : null;
myCursor.objsLeftInBatch();
[1] You can use DBQuery.shellBatchSize to change the number of iterations from the default value of 20. See Executing Queries
(page 213) for more information.
[2] http://docs.mongodb.org/meta-driver/latest/legacy/mongodb-wire-protocol
Cursor Information
You can use the command cursorInfo to retrieve the following information on cursors:
total number of open cursors
size of the client cursors in current use
number of timed out cursors since the last server restart
Consider the following example:
db.runCommand( { cursorInfo: 1 } )
Query Optimization
Indexes improve the efficiency of read operations by reducing the amount of data that query operations need to process.
This simplifies the work associated with fulfilling queries within MongoDB.
Create an Index to Support Read Operations
If your application queries a collection on a particular field or fields, then an index on the queried field or fields can
prevent the query from scanning the whole collection to find and return the query results. For more information about
indexes, see the complete documentation of indexes in MongoDB (page 318).
Example
An application queries the inventory collection on the type field. The value of the type field is user-driven.
var typeValue = <someUserInput>;
db.inventory.find( { type: typeValue } );
To improve the performance of this query, add an ascending, or a descending, index to the inventory collection
on the type field. [3] In the mongo shell, you can create indexes using the db.collection.ensureIndex()
method:
db.inventory.ensureIndex( { type: 1 } )
This index can prevent the above query on type from scanning the whole collection to return the results.
To analyze the performance of the query with an index, see Analyze Query Performance (page 66).
In addition to optimizing read operations, indexes can support sort operations and allow for more efficient storage
utilization. See db.collection.ensureIndex() and Indexing Tutorials (page 339) for more information about
index creation.
[3] For single-field indexes, the selection between ascending and descending order is immaterial. For compound indexes, the selection is important.
See indexing order (page 323) for more details.
Query Selectivity
Some query operations are not selective. These operations cannot use indexes effectively or cannot use indexes at all.
The inequality operators $nin and $ne are not very selective, as they often match a large portion of the index. As a
result, in most cases, a $nin or $ne query with an index may perform no better than a $nin or $ne query that must
scan all documents in a collection.
Queries that specify regular expressions, with inline JavaScript regular expressions or $regex operator expressions,
cannot use an index, with one exception: queries that specify a regular expression anchored at the beginning of a
string can use an index.
Covering a Query
This index will cover the following query on the type and item fields, which returns only the item field:
db.inventory.find( { type: "food", item:/^c/ },
{ item: 1, _id: 0 } )
However, the index will not cover the following query, which returns the item field and the _id field:
db.inventory.find( { type: "food", item:/^c/ },
{ item: 1 } )
See Create Indexes that Support Covered Queries (page 369) for more information on the behavior and use of covered
queries.
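One way to check whether a query is covered is to inspect the output of explain(). The sketch below assumes that a compound index on the type and item fields already exists, for example one created with db.inventory.ensureIndex( { type: 1, item: 1 } ), and that the explain() output in this release reports an indexOnly field. The isCovered name is hypothetical.

```javascript
// Hypothetical helper: report whether the index alone satisfied the query.
// Assumes an existing index such as { type: 1, item: 1 } on the collection.
function isCovered(collection, projection) {
  var plan = collection
    .find({ type: "food", item: /^c/ }, projection)
    .explain();
  // indexOnly is true when the query did not need to read the documents.
  return plan.indexOnly === true;
}
```

Following the text above, isCovered( db.inventory, { item: 1, _id: 0 } ) should report a covered query, while isCovered( db.inventory, { item: 1 } ) should not, because the _id field is not part of the assumed index.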
Query Plans
The MongoDB query optimizer processes queries and chooses the most efficient query plan for a query given the available indexes. The query system then uses this query plan each time the query runs. The query optimizer occasionally
reevaluates query plans as the content of the collection changes to ensure optimal query plans.
You can use the explain() method to view statistics about the query plan for a given query. This information can
help as you develop indexing strategies (page 368).
As collections change over time, the query optimizer deletes the query plan and re-evaluates after any of the following
events:
The collection receives 1,000 write operations.
The reIndex rebuilds the index.
You add or drop an index.
The mongod process restarts.
Distributed Queries
Read Operations to Sharded Clusters
Sharded clusters allow you to partition a data set among a cluster of mongod instances in a way that is nearly transparent to the application. For an overview of sharded clusters, see the Sharding (page 479) section of this manual.
For a sharded cluster, applications issue operations to one of the mongos instances associated with the cluster.
Read operations on sharded clusters are most efficient when directed to a specific shard. Queries to sharded collections
should include the collection's shard key (page 492). When a query includes a shard key, the mongos can use cluster
metadata from the config database (page 488) to route the queries to shards.
If a query does not include the shard key, the mongos must direct the query to all shards in the cluster. These scatter
gather queries can be inefficient. On larger clusters, scatter gather queries are infeasible for routine operations.
For more information on read operations in sharded clusters, see the Sharded Cluster Query Routing (page 496) and
Shard Keys (page 492) sections.
Figure 2.10: Read operations to a sharded cluster. Query criteria includes the shard key. The query router mongos
can target the query to the appropriate shard or shards.
Figure 2.11: Read operations to a sharded cluster. Query criteria does not include the shard key. The query router
mongos must broadcast query to all shards for the collection.
Replica sets use read preferences to determine where and how to route read operations to members of the replica set.
By default, MongoDB always reads data from a replica set's primary. You can modify that behavior by changing the
read preference mode (page 476).
You can configure the read preference mode (page 476) on a per-connection or per-operation basis to allow reads from
secondaries to:
reduce latency in multi-data-center deployments,
improve read throughput by distributing high read-volumes (relative to write volume),
perform backup operations, and/or
allow reads during failover (page 397) situations.
Figure 2.12: Read operations to a replica set. Default read preference routes the read to the primary. Read preference
of nearest routes the read to the nearest member.
Read operations from secondary members of replica sets are not guaranteed to reflect the current state of the primary,
and the state of secondaries will trail the primary by some amount of time. Often, applications do not rely on this kind
of strict consistency, but application developers should always consider the needs of their application before setting
read preference.
For more information on read preference or on the read preference modes, see Read Preference (page 406) and Read
Preference Modes (page 476).
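As a per-operation illustration, the sketch below sets a read preference on a single query. It assumes that the mongo shell in this release provides the cursor.readPref() method and that the session is connected to a replica set; readFromNearest is a hypothetical helper name.

```javascript
// Hypothetical helper: route one query to the nearest replica set member.
function readFromNearest(collection, criteria) {
  // "nearest" is one of the read preference modes; results may trail the primary.
  return collection.find(criteria).readPref("nearest");
}
```

In the shell, readFromNearest( db.inventory, { type: "food" } ) returns a cursor that you iterate as usual. Drivers expose their own way of setting read preference, so consult your driver documentation for application code.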
Write Concern (page 47) Describes the kind of guarantee MongoDB provides when reporting on the success of a
write operation.
Distributed Write Operations (page 51) Describes how MongoDB directs write operations on sharded clusters and
replica sets and the performance characteristics of these operations.
Write Operation Performance (page 52) Introduces the performance constraints and factors for writing data to MongoDB deployments.
Bulk Inserts in MongoDB (page 56) Describes behaviors associated with inserting an array of documents.
Record Padding (page 57) When storing documents on disk, MongoDB reserves space to allow documents to grow
efficiently during subsequent updates.
Write Operations Overview
A write operation is any operation that creates or modifies data in the MongoDB instance. In MongoDB, write
operations target a single collection. All write operations in MongoDB are atomic on the level of a single document.
There are three classes of write operations in MongoDB: insert, update, and remove. Insert operations add new data to
a collection. Update operations modify existing data, and remove operations delete data from a collection. No insert,
update, or remove can affect more than one document atomically.
For the update and remove operations, you can specify criteria, or conditions, that identify the documents to update or
remove. These operations use the same query syntax to specify the criteria as read operations (page 31).
After issuing these modification operations, MongoDB allows applications to determine the level of acknowledgment
returned from the database. See Write Concern (page 47).
Create
Create operations add new documents to a collection. In MongoDB, the db.collection.insert() method
performs create operations.
The following diagram highlights the components of a MongoDB insert operation:
This operation inserts a new document into the users collection. The new document has four fields: name, age,
status, and an _id field. MongoDB always adds the _id field to a new document if the field does not exist.
For more information, see db.collection.insert() and Insert Documents (page 59).
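An insert of the kind described above might look like the following sketch. The field values are placeholders chosen for illustration, and addUser is a hypothetical helper that takes the shell's db object:

```javascript
// Hypothetical helper: insert one document into the users collection.
function addUser(db) {
  // No _id is given, so MongoDB adds one to the stored document.
  db.users.insert({ name: "sue", age: 26, status: "A" });
}
```

After calling addUser( db ) in the shell, db.users.findOne( { name: "sue" } ) should return a document with the three fields above plus a generated _id field.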
Some updates also create records. If an update operation specifies the upsert flag and there are no documents that
match the query portion of the update operation, then MongoDB will convert the update into an insert.
With an upsert, applications can decide between performing an update or an insert operation using just a single call.
Both the update() method and the save() method can perform an upsert. See update() and save() for
details on performing an upsert with these methods.
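The following sketch shows an upsert through the update() method with an options document. The setStatus helper and its field names are assumptions made for this example:

```javascript
// Hypothetical helper: update a user's status, or create the user if absent.
function setStatus(db, name, status) {
  db.users.update(
    { name: name },                 // query portion
    { $set: { status: status } },   // update portion
    { upsert: true }                // insert when no document matches the query
  );
}
```

If no document matches, the inserted document contains the name from the query portion together with the status set by the update portion.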
See
SQL to MongoDB Mapping Chart (page 85) for additional examples of MongoDB write operations and the corresponding SQL statements.
Insert Behavior If you add a new document without the _id field, the client library or the mongod instance adds an
_id field and populates the field with a unique ObjectId.
If you specify the _id field, the value must be unique within the collection. For operations with write concern
(page 47), if you try to create a document with a duplicate _id value, mongod returns a duplicate key exception.
Update
Example
db.users.update(
{ age: { $gt: 18 } },
{ $set: { status: "A" } },
{ multi: true }
)
This update operation on the users collection sets the status field to A for the documents that match the criteria
of age greater than 18.
For more information, see db.collection.update() and db.collection.save(), and Modify Documents (page 67) for examples.
Update Behavior By default, the db.collection.update() method updates a single document. However,
with the multi option, update() can update all documents in a collection that match a query.
The db.collection.update() method either updates specific fields in the existing document or replaces the
document. See db.collection.update() for details.
When performing update operations that increase the document size beyond the allocated space for that document, the
update operation relocates the document on disk and may reorder the document fields depending on the type of update.
The db.collection.save() method replaces a document and can only update a single document.
See db.collection.save() and Insert Documents (page 59) for more information.
Delete
Delete operations remove documents from a collection. In MongoDB, the db.collection.remove() method performs delete operations. The db.collection.remove() method can accept query criteria to determine which
documents to remove.
The following diagram highlights the components of a MongoDB remove operation:
Example
db.users.remove(
{ status: "D" }
)
This delete operation on the users collection removes all documents that match the criteria of status equal to D.
For more information, see db.collection.remove() method and Remove Documents (page 68).
Remove Behavior By default, the db.collection.remove() method removes all documents that match its query.
However, the method can accept a flag to limit the delete operation to a single document.
Isolation of Write Operations
The modification of a single document is always atomic, even if the write operation modifies multiple sub-documents
within that document. For write operations that modify multiple documents, the operation as a whole is not atomic,
and other operations may interleave.
No other operations are atomic. You can, however, attempt to isolate a write operation that affects multiple documents
using the isolation operator.
To isolate a sequence of write operations from other read and write operations, see Perform Two Phase Commits
(page 69).
Write Concern
Write concern describes the guarantee that MongoDB provides when reporting on the success of a write operation.
The strength of the write concern determines the level of guarantee. When inserts, updates and deletes have a weak
write concern, write operations return quickly. In some failure cases, write operations issued with weak write concerns
may not persist. With stronger write concerns, clients wait after sending a write operation for MongoDB to confirm
the write operations.
MongoDB provides different levels of write concern to better address the specific needs of applications. Clients
may adjust write concern to ensure that the most important operations persist successfully to an entire MongoDB
deployment. For other less critical operations, clients can adjust the write concern to ensure faster performance rather
than ensure persistence to the entire deployment.
See also:
Write Concern Reference (page 83) for a reference of specific write concern configuration. Also consider Write
Operations (page 42) for a general overview of write operations with MongoDB and Write Concern for Replica Sets
(page 402) for considerations specific to replica sets.
Note: The driver write concern (page 634) change created a new connection class in all of the MongoDB drivers.
The new class, called MongoClient, changed the default write concern. See the release notes (page 634) for this
change and the release notes for your driver.
Clients issue write operations with some level of write concern. MongoDB has the following levels of conceptual
write concern, listed from weakest to strongest:
Errors Ignored With an errors ignored write concern, MongoDB does not acknowledge write operations. With
this level of write concern, the client cannot detect failed write operations. These errors include connection errors and
mongod exceptions such as duplicate key exceptions for unique indexes (page 334). Although the errors ignored write
concern provides fast performance, this performance gain comes at the cost of significant risks for data persistence
and durability.
To set errors ignored write concern, specify w values of -1 to your driver.
Warning: Do not use errors ignored write concern in normal operation.
Unacknowledged With an unacknowledged write concern, MongoDB does not acknowledge the receipt of write
operations. Unacknowledged is similar to errors ignored; however, drivers attempt to receive and handle network
errors when possible. The driver's ability to detect network errors depends on the system's networking configuration.
To set unacknowledged write concern, specify w values of 0 to your driver.
Before the releases outlined in Default Write Concern Change (page 634), this was the default write concern.
Acknowledged With a receipt acknowledged write concern, the mongod confirms the receipt of the write operation.
Acknowledged write concern allows clients to catch network, duplicate key, and other errors.
To set acknowledged write concern, specify w values of 1 to your driver.
MongoDB uses acknowledged write concern by default, after the releases outlined in Default Write Concern Change
(page 634).
Figure 2.19: Write operation to a mongod instance with write concern of unacknowledged. The client does not
wait for any acknowledgment.
Figure 2.20: Write operation to a mongod instance with write concern of acknowledged. The client waits for
acknowledgment of success or exception.
Internally, the default write concern calls getLastError with no arguments. For replica sets, you can define the
default write concern settings in the getLastErrorDefaults (page 470). When getLastErrorDefaults
(page 470) does not define a default write concern setting, getLastError defaults to basic receipt acknowledgment.
Journaled With a journaled write concern, the mongod acknowledges the write operation only after committing
the data to the journal. This write concern ensures that MongoDB can recover the data following a shutdown or power
interruption.
To set a journaled write concern, specify w values of 1 and set the journal or j option to true for your driver. You
must have journaling enabled to use this write concern.
With a journaled write concern, write operations must wait for the next journal commit. To reduce latency
for these operations, you can increase the frequency that MongoDB commits operations to the journal. See
journalCommitInterval for more information.
Figure 2.21: Write operation to a mongod instance with write concern of journaled. The mongod sends acknowledgment after it commits the write operation to the journal.
Note: Requiring journaled write concern in a replica set only requires a journal commit of the write operation to the
primary of the set regardless of the level of replica acknowledged write concern.
Replica Acknowledged Replica sets add several considerations for write concern. Basic write concerns affect write
operations on only one mongod instance. The w argument to getLastError provides replica acknowledged write
concerns. With replica acknowledged you can guarantee that the write operation propagates to the members of a
replica set. See Write Concern Reference (page 83) document for the values for w and Write Concern for Replica Sets
(page 402) for more information.
To set replica acknowledged write concern, specify w values greater than 1 to your driver.
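The levels above can be summarized by the w and j values that the text gives for each one. The following is a minimal sketch in plain JavaScript; the writeConcernOptions name and the level labels used as keys are illustrative assumptions and are not part of any driver API.

```javascript
// Hypothetical helper: maps the conceptual write concern levels described
// above to the w and j values given in the text.
function writeConcernOptions(level) {
  var levels = {
    "errors ignored":       { w: -1 },
    "unacknowledged":       { w: 0 },
    "acknowledged":         { w: 1 },
    "journaled":            { w: 1, j: true },
    "replica acknowledged": { w: 2 }   // any value greater than 1 qualifies
  };
  if (!levels.hasOwnProperty(level)) {
    throw new Error("unknown write concern level: " + level);
  }
  return levels[level];
}
```

In the mongo shell, values such as these could be passed to getLastError, for example db.runCommand( { getLastError: 1, w: 2 } ). Drivers expose the same settings through their own write concern options.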
Figure 2.22: Write operation to a replica set with write concern level of w:2 or write to the primary and at least one
secondary.
For sharded collections in a sharded cluster, the mongos directs write operations from applications to the shards that
are responsible for the specific portion of the data set. The mongos uses the cluster metadata from the config database
(page 488) to route the write operation to the appropriate shards.
Figure 2.24: Diagram of the shard key value space segmented into smaller ranges or chunks.
If the value of the shard key increases or decreases with every insert, all insert operations target a single shard. As a
result, the capacity of a single shard becomes the limit for the insert capacity of the sharded cluster.
For more information, see Sharded Cluster Tutorials (page 506) and Bulk Inserts in MongoDB (page 56).
Write Operations on Replica Sets
In replica sets, all write operations go to the set's primary, which applies the write operation and then records the operation in the primary's operation log, or oplog. The oplog is a reproducible sequence of operations to the data set.
Secondary members of the set are continuously replicating the oplog and applying the operations to themselves in an
asynchronous process.
Large volumes of write operations, particularly bulk operations, may create situations where the secondary members
have difficulty applying the replicating operations from the primary at a sufficient rate: this can cause the secondary's
state to fall behind that of the primary. Secondaries that are significantly behind the primary present problems for
normal operation of the replica set, particularly failover (page 397) in the form of rollbacks (page 401) as well as
general read consistency (page 402).
To help avoid this issue, you can customize the write concern (page 47) to return confirmation of the write operation
to another member 4 of the replica set every 100 or 1,000 operations. This provides an opportunity for secondaries
to catch up with the primary. Write concern can slow the overall progress of write operations but ensures that the
secondaries can maintain a largely current state with respect to the primary.
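The intermittent confirmation described above could be sketched as follows. This is a plain JavaScript outline under stated assumptions: insertOne and confirmReplication are hypothetical callbacks supplied by the caller, standing in for an insert call and for a getLastError request with a w value of 2 or majority.

```javascript
// Inserts docs one at a time and requests replica acknowledgment once every
// `interval` operations. insertOne and confirmReplication are hypothetical
// callbacks: in the mongo shell they might wrap db.collection.insert() and
// db.runCommand( { getLastError: 1, w: 2 } ) respectively.
function insertWithPeriodicConfirm(docs, interval, insertOne, confirmReplication) {
  var confirmations = 0;
  for (var i = 0; i < docs.length; i++) {
    insertOne(docs[i]);
    if ((i + 1) % interval === 0) {
      confirmReplication();
      confirmations++;
    }
  }
  // Confirm the trailing partial batch so that no write is left unchecked.
  if (docs.length % interval !== 0) {
    confirmReplication();
    confirmations++;
  }
  return confirmations;
}
```

A larger interval favors throughput, while a smaller interval keeps the secondaries closer to the primary.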
For more information on replica sets and write operations, see Replica Acknowledged (page 49), Oplog Size (page 411),
and Change the Size of the Oplog (page 445).
Write Operation Performance
Indexes
After every insert, update, or delete operation, MongoDB must update every index associated with the collection in
addition to the data itself. Therefore, every index on a collection adds some amount of overhead for the performance
of write operations. 5
4 Calling getLastError intermittently with a w value of 2 or majority will slow the throughput of write traffic; however, this practice will
allow the secondaries to remain current with the state of the primary.
5 For inserts and updates to un-indexed fields, the overhead for sparse indexes (page 335) is less than for non-sparse indexes. Also for non-sparse
indexes, updates that do not change the record size have less indexing overhead.
Figure 2.25: Diagram of default routing of reads and writes to the primary.
Figure 2.26: Write operation to a replica set with write concern level of w:2 or write to the primary and at least one
secondary.
In general, the performance gains that indexes provide for read operations are worth the insertion penalty. However,
in order to optimize write performance when possible, be careful when creating new indexes and evaluate the existing
indexes to ensure that your queries actually use these indexes.
For indexes and queries, see Query Optimization (page 36). For more information on indexes, see Indexes (page 313)
and Indexing Strategies (page 368).
Document Growth
If an update operation causes a document to exceed the currently allocated record size, MongoDB relocates the document on disk with enough contiguous space to hold the document. These relocations take longer than in-place updates,
particularly if the collection has indexes. If a collection has indexes, MongoDB must update all index entries. Thus,
for a collection with many indexes, the move will impact the write throughput.
Some update operations, such as the $inc operation, do not cause an increase in document size. For these update
operations, MongoDB can apply the updates in-place. Other update operations, such as the $push operation, change
the size of the document.
In-place updates are significantly more efficient than updates that cause document growth. When possible, use data
models (page 99) that minimize the need for document growth.
See Record Padding (page 57) for more information.
Storage Performance
Hardware The capability of the storage system creates some important physical limits for the performance of MongoDB's write operations. Many unique factors related to the storage system of the drive affect write performance,
including random access patterns, disk caches, disk readahead and RAID configurations.
Solid state drives (SSDs) can outperform spinning hard disks (HDDs) by 100 times or more for random workloads.
See Production Notes (page 153) for recommendations regarding additional hardware and configuration options.
Journaling MongoDB uses write ahead logging to an on-disk journal to guarantee write operation (page 42) durability and to provide crash resiliency. Before applying a change to the data files, MongoDB writes the change operation
to the journal.
While the durability assurance provided by the journal typically outweighs the performance costs of the additional write
operations, consider the following interactions between the journal and performance:
If the journal and the data files reside on the same block device, the data files and the journal may have to contend
for a finite number of available write operations. Moving the journal to a separate device may increase the
capacity for write operations.
If applications specify a write concern (page 47) that includes journaled (page 49), mongod will decrease the
duration between journal commits, which can increase the overall write load.
The duration between journal commits is configurable using the journalCommitInterval run-time option.
Decreasing the period between journal commits will increase the number of write operations, which can limit
MongoDB's capacity for write operations. Increasing the amount of time between commits may decrease the
total number of write operations, but also increases the chance that the journal will not record a write operation
in the event of a failure.
For additional information on journaling, see Journaling Mechanics (page 234).
The insert() method, when passed an array of documents, performs a bulk insert, and inserts each document
atomically. Bulk inserts can significantly increase performance by amortizing write concern (page 47) costs.
New in version 2.2: insert() in the mongo shell gained support for bulk inserts in version 2.2.
In the drivers (page 95), you can configure write concern for batches rather than on a per-document level.
Drivers have a ContinueOnError option in their insert operation, so that the bulk operation will continue to insert
remaining documents in a batch even if an insert fails.
Note: If multiple errors occur during a bulk insert, clients only receive the last error generated.
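The effect of ContinueOnError, and of receiving only the last error, can be illustrated with a small in-memory model. This is a plain JavaScript sketch and not driver code: bulkInsert, existingIds, and the error message format are assumptions made for illustration, and _id values are compared as strings for simplicity.

```javascript
// In-memory model of a bulk insert. `existingIds` is a plain object used as
// a set of the _id values already present in a hypothetical collection.
function bulkInsert(existingIds, docs, continueOnError) {
  var result = { inserted: 0, lastError: null };
  for (var i = 0; i < docs.length; i++) {
    var id = String(docs[i]._id);
    if (existingIds.hasOwnProperty(id)) {
      // Only the most recent error is kept, mirroring the note above.
      result.lastError = "duplicate key: " + id;
      if (!continueOnError) { break; }
      continue;
    }
    existingIds[id] = true;
    result.inserted++;
  }
  return result;
}
```

With the option off, the batch stops at the first duplicate. With the option on, the remaining documents are still attempted, but an earlier error is overwritten by a later one.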
See also:
Driver documentation (page 95) for details on performing bulk inserts in your application. Also see Import and Export
MongoDB Data (page 150).
Bulk Inserts on Sharded Clusters
While ContinueOnError is optional on unsharded clusters, all bulk operations to a sharded collection run with
ContinueOnError, which cannot be disabled.
Large bulk insert operations, including initial data inserts or routine data import, can affect sharded cluster performance. For bulk inserts, consider the following strategies:
Pre-Split the Collection If the sharded collection is empty, then the collection has only one initial chunk, which
resides on a single shard. MongoDB must then take time to receive data, create splits, and distribute the split chunks
to the available shards. To avoid this performance cost, you can pre-split the collection, as described in Split Chunks
in a Sharded Cluster (page 537).
Insert to Multiple mongos To parallelize import processes, send insert operations to more than one mongos
instance. Pre-split empty collections first as described in Split Chunks in a Sharded Cluster (page 537).
Avoid Monotonic Throttling If your shard key increases monotonically during an insert, then all inserted data goes
to the last chunk in the collection, which will always end up on a single shard. Therefore, the insert capacity of the
cluster will never exceed the insert capacity of that single shard.
If your insert volume is larger than what a single shard can process, and if you cannot avoid a monotonically increasing
shard key, then consider the following modifications to your application:
Reverse the binary bits of the shard key. This preserves the information and avoids correlating insertion order
with increasing sequence of values.
Swap the first and last 16-bit words to shuffle the inserts.
Example
The following example, in C++, swaps the leading and trailing 16-bit word of BSON ObjectIds generated so that they
are no longer monotonically increasing.
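using namespace mongo;

// Generate an ObjectId, then swap its first and last 16-bit words so that
// successive ids are no longer monotonically increasing.
OID make_an_id() {
    OID x = OID::gen();
    const unsigned char *p = x.getData();
    // The casts drop const so that swap can modify the id bytes in place.
    swap( (unsigned short&) p[0], (unsigned short&) p[10] );
    return x;
}

void foo() {
    // create an object
    BSONObj o = BSON( "_id" << make_an_id() << "x" << 3 << "name" << "jane" );
    // now we may insert o into a sharded collection
}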
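The other strategy listed above, reversing the binary bits of the shard key, can be sketched for the simple case of a key that fits in a 32-bit unsigned integer. This is an illustrative plain JavaScript example; the reverseBits32 name and the 32-bit assumption are not from the manual.

```javascript
// Reverses the bit order of a 32-bit unsigned integer, so that consecutive
// input values map to widely separated output values.
function reverseBits32(n) {
  var result = 0;
  for (var i = 0; i < 32; i++) {
    result = (result << 1) | (n & 1); // shift left, then append the lowest bit of n
    n >>>= 1;                         // move on to the next bit of n
  }
  return result >>> 0;                // reinterpret the result as unsigned
}
```

Because reversing twice yields the original value, the application can recover the original sequence number from a stored key.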
See also:
Shard Keys (page 492) for information on choosing a shard key. Also see Shard Key Internals (page 492) (in
particular, Choosing a Shard Key (page 511)).
Record Padding
Update operations can increase the size of the document. 6 If a document outgrows its current allocated record space,
MongoDB must allocate a new space and move the document to this new location.
To reduce the number of moves, MongoDB includes a small amount of extra space, or padding, when allocating the
record space. This padding reduces the likelihood that a slight increase in document size will cause the document to
exceed its allocated record size.
See also:
Write Operation Performance (page 52).
Padding Factor
To minimize document movements and their impact, MongoDB employs padding. MongoDB adaptively adjusts the
size of record allocations in a collection by adding a paddingFactor so that the documents have room to grow.
The paddingFactor indicates the padding for new inserts and moves.
To check the current paddingFactor on a collection, you can run the db.collection.stats() operation in
the mongo shell, as in the following example:
db.myCollection.stats()
Since MongoDB writes each document at a different point in time, the padding for each document will not be the
same. You can calculate the padding size by subtracting 1 from the paddingFactor, for example:
padding size = (paddingFactor - 1) * <document size>.
For example, a paddingFactor of 1.0 specifies no padding whereas a paddingFactor of 1.5 specifies a padding
size of 0.5 or 50 percent (50%) of the document size.
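The formula above can be written as a one-line function. This is a minimal sketch in plain JavaScript; paddingSize is a hypothetical helper name, and the document size is assumed to be given in bytes.

```javascript
// padding size = (paddingFactor - 1) * <document size>
function paddingSize(paddingFactor, documentSize) {
  return (paddingFactor - 1) * documentSize;
}
```

For a 1000-byte document, a paddingFactor of 1.5 gives 500 bytes of padding and a paddingFactor of 1.0 gives none. The paddingFactor argument could be taken from the output of db.myCollection.stats().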
6 Documents in MongoDB can grow up to the full maximum BSON document size.
Because the paddingFactor is relative to the size of each document, you cannot calculate the exact amount of
padding for a collection based on the average document size and padding factor.
If an update operation causes the document to decrease in size, for instance if you perform an $unset or a $pop
update, the document remains in place and effectively has more padding. If the document remains this size, the space
is not reclaimed until you perform a compact or a repairDatabase operation.
Operations That Remove Padding
The following operations remove padding: compact, repairDatabase, and initial replica sync operations. However, with the compact command, you can run the command with a paddingFactor or a paddingBytes
parameter. See compact command for details.
Padding is also removed if you use mongoexport to export a collection. If you use mongoimport into a new collection, mongoimport will not add padding. If you use mongoimport with an existing collection with padding,
mongoimport will not affect the existing padding.
When a database operation removes padding from a collection, subsequent updates to the collection that increase the
record size will have reduced throughput until the collection's padding factor grows. However, the collection will
require less storage.
Record Allocation Strategies
Isolate Sequence of Operations (page 76) Use the $isolated operator to isolate a single write
operation that affects multiple documents, preventing other operations from interrupting the sequence of write
operations.
Create an Auto-Incrementing Sequence Field (page 78) Describes how to create an incrementing sequence number
for the _id field using a Counters Collection or an Optimistic Loop.
Limit Number of Elements in an Array after an Update (page 81) Use $push with various modifiers to sort and
maintain an array of fixed size after update.
In the example, the document has a user-specified _id field value of 10. The value must be unique within the
inventory collection.
For more examples, see insert().
Insert a Document with update() Method
Call the update() method with the upsert flag to create a new document if no document matches the update's
query criteria. 7
The following example creates a new document if no document in the inventory collection contains { type:
"book", item : "journal" }:
db.inventory.update(
{ type: "book", item : "journal" },
{ $set : { qty: 10 } },
{ upsert : true }
)
MongoDB adds the _id field and assigns as its value a unique ObjectId. The new document includes the item and
type fields from the <query> criteria and the qty field from the <update> parameter.
{ "_id" : ObjectId("51e8636953dbe31d5f34a38a"), "item" : "journal", "qty" : 10, "type" : "book" }
MongoDB adds the _id field and assigns as its value a unique ObjectId.
{ "_id" : ObjectId("51e866e48737f72b32ae4fbc"), "type" : "book", "item" : "notebook", "qty" : 40 }
The
This tutorial provides examples of read operations using the db.collection.find() method in the mongo
shell. In these examples, the retrieved documents contain all their fields. To restrict the fields to return in the retrieved
documents, see Limit Fields to Return from a Query (page 64).
Select All Documents in a Collection
An empty query document ({}) selects all documents in the collection:
db.inventory.find( {} )
Not specifying a query document to the find() method is equivalent to specifying an empty query document. Therefore the
following operation is equivalent to the previous operation:
db.inventory.find()
The following example retrieves from the inventory collection all documents where the type field has the value
snacks:
db.inventory.find( { type: "snacks" } )
Internally, the
Although you can express this query using the $or operator, use the $in operator rather than the $or operator when
performing equality checks on the same field.
Refer to the http://docs.mongodb.org/manual/reference/operator document for the complete list
of query operators.
Specify AND Conditions
A compound query can specify conditions for more than one field in the collection's documents. Implicitly, a logical
AND conjunction connects the clauses of a compound query so that the query selects the documents in the collection
that match all the conditions.
In the following example, the query document specifies an equality match on the field type and a less than ($lt)
comparison match on the field price:
db.inventory.find( { type: 'food', price: { $lt: 9.95 } } )
This query selects all documents where the type field has the value food and the value of the price field is less
than 9.95. See comparison operators for other comparison operators.
Specify OR Conditions
Using the $or operator, you can specify a compound query that joins each clause with a logical OR conjunction so
that the query selects the documents in the collection that match at least one condition.
In the following example, the query document selects all documents in the collection where the field qty has a value
greater than ($gt) 100 or the value of the price field is less than ($lt) 9.95:
db.inventory.find(
   { $or: [
      { qty: { $gt: 100 } },
      { price: { $lt: 9.95 } }
   ] }
)
Subdocuments
When the field holds an embedded document (i.e. subdocument), you can either specify the entire subdocument as
the value of a field, or reach into the subdocument using dot notation, to specify values for individual fields in the
subdocument:
To specify an equality match on the whole subdocument, use the query document { <field>: <value> }
where <value> is the subdocument to match. Equality matches on a subdocument require that the subdocument
field match exactly the specified <value>, including the field order.
In the following example, the query matches all documents where the value of the field producer is a subdocument that contains only the field company with the value ABC123 and the field address with the value 123
Street, in the exact order:
db.inventory.find(
   {
     producer: {
        company: 'ABC123',
        address: '123 Street'
     }
   }
)
Equality matches on specific fields within a subdocument select the documents in the collection where the
subdocument contains a field that matches the specified value.
In the following example, the query uses the dot notation to match all documents where the value of the field
producer is a subdocument that contains a field company with the value ABC123 and may contain other fields:
db.inventory.find( { 'producer.company': 'ABC123' } )
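The difference between the two subdocument forms can be modeled in plain JavaScript. This is a simplified sketch and not how the server evaluates queries: both function names are hypothetical, and the whole-subdocument comparison relies on JSON.stringify preserving key order for simple string-valued subdocuments.

```javascript
// Exact subdocument match: same fields, same values, same field order.
function matchesWholeSubdocument(doc, field, value) {
  return JSON.stringify(doc[field]) === JSON.stringify(value);
}

// Dot-notation match on one field; the subdocument may hold other fields.
function matchesSubdocumentField(doc, field, subfield, value) {
  var sub = doc[field];
  return sub !== null && typeof sub === "object" && sub[subfield] === value;
}
```

A document whose producer subdocument lists address before company fails the exact match against { company: 'ABC123', address: '123 Street' } but still satisfies the dot-notation match on company.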
Arrays
When the field holds an array, you can query for an exact array match or for specific values in the array. If the array
holds sub-documents, you can query for specific fields within the sub-documents using dot notation:
Exact Match on an Array
To specify equality match on an array, use the query document { <field>: <value> } where <value> is
the array to match. Equality matches on the array require that the array field match exactly the specified <value>,
including the element order.
In the following example, the query matches all documents where the value of the field tags is an array that holds
exactly three elements, fruit, food, and citrus, in this order:
db.inventory.find( { tags: [ 'fruit', 'food', 'citrus' ] } )
Equality matches can specify a single element in the array to match. These specifications match if the array contains
at least one element with the specified value.
In the following example, the query matches all documents where the value of the field tags is an array that contains
fruit as one of its elements:
Queries can also specify an equality match for an element at a particular index or position of the array.
In the following example, the query uses the dot notation to match all documents where the value of the tags field is
an array whose first element equals fruit:
db.inventory.find( { 'tags.0' : 'fruit' } )
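The three kinds of array match described in this section can be contrasted with a small plain JavaScript model. This is a sketch under simplifying assumptions: the function names are hypothetical, elements are compared with strict equality, and only array-valued fields are considered.

```javascript
// Exact match: the same elements in the same order.
function arrayEquals(arr, expected) {
  if (!Array.isArray(arr) || arr.length !== expected.length) { return false; }
  for (var i = 0; i < arr.length; i++) {
    if (arr[i] !== expected[i]) { return false; }
  }
  return true;
}

// Element match: at least one element equals the value.
function arrayContains(arr, value) {
  return Array.isArray(arr) && arr.indexOf(value) !== -1;
}

// Positional match: the element at `index` equals the value,
// as with the 'tags.0' form above.
function arrayElementAt(arr, index, value) {
  return Array.isArray(arr) && arr[index] === value;
}
```

For the array [ 'food', 'fruit', 'citrus' ], the exact match against [ 'fruit', 'food', 'citrus' ] fails because the order differs, the element match on fruit succeeds, and the positional match on index 0 fails.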
Array of Subdocuments
Match a Field in the Subdocument Using the Array Index If you know the array index of the subdocument, you
can specify the document using the subdocument's position.
The following example selects all documents where the memos field contains an array whose first element (i.e. index is 0)
is a subdocument with the field by with the value shipping:
db.inventory.find( { 'memos.0.by': 'shipping' } )
Match a Field Without Specifying Array Index If you do not know the index position of the subdocument, concatenate the name of the field that contains the array, with a dot (.) and the name of the field in the subdocument.
The following example selects all documents where the memos field contains an array that contains at least one
subdocument with the field by with the value shipping:
db.inventory.find( { 'memos.by': 'shipping' } )
Match Multiple Fields To match by multiple fields in the subdocument, you can use either dot notation or the
$elemMatch operator:
The following example uses dot notation to query for documents where the value of the memos field is an array that has
at least one subdocument whose memo field equals on time and at least one subdocument, not necessarily the same one, whose by field equals shipping:
db.inventory.find(
{
'memos.memo': 'on time',
'memos.by': 'shipping'
}
)
The following example uses $elemMatch to query for documents where the value of the memos field is an array that has at least one subdocument that contains the field memo equal to on time and the field by equal to
shipping:
db.inventory.find(
   {
     memos: {
        $elemMatch: {
           memo: 'on time',
           by: 'shipping'
        }
     }
   }
)
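The dot notation and $elemMatch forms differ in one respect that a small model makes visible: with dot notation, each condition may be satisfied by a different element of the array, whereas $elemMatch requires one element to satisfy all conditions. The following plain JavaScript sketch uses hypothetical function names and simple equality conditions only.

```javascript
// Dot notation: each condition may be satisfied by a different array element.
function matchesDotNotation(memos, conditions) {
  return Object.keys(conditions).every(function (key) {
    return memos.some(function (m) { return m[key] === conditions[key]; });
  });
}

// $elemMatch: a single array element must satisfy every condition.
function matchesElemMatch(memos, conditions) {
  return memos.some(function (m) {
    return Object.keys(conditions).every(function (key) {
      return m[key] === conditions[key];
    });
  });
}
```

For an array such as [ { memo: 'on time', by: 'payment' }, { memo: 'approved', by: 'shipping' } ], the dot-notation model matches the conditions memo equal to on time and by equal to shipping, while the $elemMatch model does not.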
This operation will return all documents in the inventory collection where the value of the type field is food.
The returned documents contain all their fields.
Return the Specified Fields and the _id Field Only
A projection can explicitly include several fields. In the following operation, the find() method returns all documents
that match the query. In the result set, only the item and qty fields and, by default, the _id field return in the
matching documents.
db.inventory.find( { type: 'food' }, { item: 1, qty: 1 } )
This operation returns all documents that match the query. In the result set, the matching documents contain only the
item and qty fields, plus the _id field unless the projection excludes it with _id: 0.
Return All But the Excluded Field
To exclude a single field or group of fields you can use a projection in the following form:
db.inventory.find( { type: 'food' }, { type:0 } )
This operation returns all documents where the value of the type field is food. In the result set, the type field does
not return in the matching documents.
With the exception of the _id field, you cannot combine inclusion and exclusion statements in projection documents.
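The projection rules in this section can be modeled for top-level fields with a short plain JavaScript function. This is an illustrative sketch only: project is a hypothetical name, nested fields and projection operators are out of scope, and the error message is invented.

```javascript
// Applies a simple top-level projection document to one document.
function project(doc, projection) {
  var keys = Object.keys(projection).filter(function (k) { return k !== "_id"; });
  // A projection that names only _id: 1 is treated as an inclusion.
  var including = keys.length > 0
    ? keys.some(function (k) { return projection[k]; })
    : Boolean(projection._id);
  var excluding = keys.some(function (k) { return !projection[k]; });
  if (including && excluding) {
    throw new Error("cannot mix inclusion and exclusion in a projection");
  }
  var out = {};
  Object.keys(doc).forEach(function (k) {
    if (k === "_id") {
      // _id returns by default; it is dropped only when explicitly excluded.
      if (projection._id !== 0 && projection._id !== false) { out._id = doc._id; }
    } else if (including ? projection[k] : !(k in projection)) {
      out[k] = doc[k];
    }
  });
  return out;
}
```

Under this model, { item: 1, qty: 1 } and { type: 0 } both reduce a document with the fields _id, item, qty, and type to _id, item, and qty, while { item: 1, type: 0 } is rejected.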
You can also use the cursor method next() to access the documents, as in the following example:
var myCursor = db.inventory.find( { type: 'food' } );
var myDocument = myCursor.hasNext() ? myCursor.next() : null;
if (myDocument) {
var myItem = myDocument.item;
print(tojson(myItem));
}
As an alternative print operation, consider the printjson() helper method to replace print(tojson()):
if (myDocument) {
var myItem = myDocument.item;
printjson(myItem);
}
You can use the cursor method forEach() to iterate the cursor and access the documents, as in the following
example:
var myCursor = db.inventory.find( { type: 'food' } );
myCursor.forEach(printjson);
See JavaScript cursor methods and your driver (page 95) documentation for more information on cursor methods.
9 You can use the DBQuery.shellBatchSize to change the number of iterations from the default value 20. See Executing Queries
(page 213) for more information.
Iterator Index
In the mongo shell, you can use the toArray() method to iterate the cursor and return the documents in an array,
as in the following:
var myCursor = db.inventory.find( { type: 'food' } );
var documentArray = myCursor.toArray();
var myDocument = documentArray[3];
The toArray() method loads into RAM all documents returned by the cursor; the toArray() method exhausts
the cursor.
Additionally, some drivers (page 95) provide access to the documents by using an index on the cursor (i.e.
cursor[index]). This is a shortcut for first calling the toArray() method and then using an index on the
resulting array.
Consider the following example:
var myCursor = db.inventory.find( { type: 'food' } );
var myDocument = myCursor[3];
"indexBounds" : { "type" : [
[ "food",
"food" ]
] },
"server" : "mongodbo0.example.net:27017" }
The BtreeCursor value of the cursor field indicates that the query used an index.
This query returned 5 documents, as indicated by the n field.
To return these 5 documents, the query scanned 5 documents from the index, as indicated by the nscanned field,
and then read 5 full documents from the collection, as indicated by the nscannedObjects field.
Without the index, the query would have scanned the whole collection to return the 5 documents.
See explain-results method for full details on the output.
These return the statistics regarding the execution of the query using the respective index.
Note: If you run explain() without including hint(), the query optimizer reevaluates the query and runs against
multiple indexes before returning the query statistics.
For more detail on the explain output, see explain-results.
This shows the syntax for MongoDB 2.2 and later. For syntax for versions prior to 2.2, see update().
db.inventory.update(
{ type : "book" },
{ $inc : { qty : -1 } },
{ multi: true }
)
To remove all documents from a collection, it may be more efficient to use the drop() method to drop the entire
collection, including the indexes, and then recreate the collection and rebuild the indexes.
Remove Documents that Match a Condition
To remove the documents that match a deletion criterion, call the remove() method with the <query> parameter.
The following example removes all documents from the inventory collection where the type field equals food:
db.inventory.remove( { type : "food" } )
For large deletion operations, it may be more efficient to copy the documents that you want to keep to a new collection
and then use drop() on the original collection.
To delete a single document sorted by some specified order, use the findAndModify() method.
Pattern
Overview
The most common example of a transaction is the reliable transfer of funds from account A to account B, and this pattern
uses this operation as an example. In a relational database system, this operation would encapsulate subtracting funds
from the source (A) account and adding them to the destination (B) within a single atomic transaction. For MongoDB,
you can use a two-phase commit in these situations to achieve a compatible response.
All of the examples in this document use the mongo shell to interact with the database, and assume that you have two
collections: a collection named accounts that stores data about accounts, with one account per document,
and a collection named transactions that stores the transactions themselves.
Begin by creating two accounts named A and B, with the following commands:
db.accounts.save({name: "A", balance: 1000, pendingTransactions: []})
db.accounts.save({name: "B", balance: 1000, pendingTransactions: []})
Transaction Description
Set Transaction State to Initial Create the transactions collection by inserting the following document. The
transaction document holds the source and destination, which refer to the name fields of the accounts
collection, as well as the value field that represents the amount of change to the balance field. Finally, the
state field reflects the current state of the transaction.
db.transactions.save({source: "A", destination: "B", value: 100, state: "initial"})
Switch Transaction State to Pending Before modifying either record in the accounts collection, set the transaction state from initial to pending.
Set the local variable t in your shell session to the transaction document using findOne():
t = db.transactions.findOne({state: "initial"})
After you assign the variable t, the shell returns the value of t, and you will see the following output:
{
"_id" : ObjectId("4d7bc7a8b8a04f5126961522"),
"source" : "A",
"destination" : "B",
"value" : 100,
"state" : "initial"
}
The find() operation will return the contents of the transactions collection, which should resemble the following:
Apply Transaction to Both Accounts Continue by applying the transaction to both accounts. The update()
query will prevent you from applying the transaction if the transaction is not already marked as pending. Use the
following update() operation:
The find() operation will return the contents of the accounts collection, which should now resemble the following:
The find() operation will return the contents of the transactions collection, which should now resemble the
following:
Remove Pending Transaction Use the following update() operations to remove the pending transaction from
the documents in the accounts collection:
db.accounts.update({name: t.source}, {$pull: {pendingTransactions: t._id}})
db.accounts.update({name: t.destination}, {$pull: {pendingTransactions: t._id}})
db.accounts.find()
The find() operation will return the contents of the accounts collection, which should now resemble the following:
Set Transaction State to Done Complete the transaction by setting the state of the transaction document to done:
db.transactions.update({_id: t._id}, {$set: {state: "done"}})
db.transactions.find()
The find() operation will return the contents of the transactions collection, which should now resemble the
following:
The most important part of the transaction procedure is not the prototypical example above, but rather the possibility
of recovering from the various failure scenarios when transactions do not complete as intended. This section
provides an overview of possible failures and methods to recover from these kinds of events.
There are two classes of failures:
all failures that occur after the first step (i.e. setting the transaction state to initial (page 70)) but before the third
step (i.e. applying the transaction to both accounts (page 71)).
To recover, applications should get a list of transactions in the pending state and resume from the second step
(i.e. switching the transaction state to pending (page 70)).
all failures that occur after the third step (i.e. applying the transaction to both accounts (page 71)) but before
the fifth step (i.e. setting the transaction state to done (page 71)).
To recover, applications should get a list of transactions in the committed state and resume from the fourth
step (i.e. removing the pending transaction (page 71)).
Thus, the application will always be able to resume the transaction and eventually arrive at a consistent state. Run
the following recovery operations every time the application starts to catch any unfinished transactions. You may also
wish to run the recovery operation at regular intervals to ensure that your data remains in a consistent state.
The time required to reach a consistent state depends on how long the application needs to recover each transaction.
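The two recovery cases can be sketched with plain in-memory JavaScript objects standing in for the accounts and transactions collections. This is a simplified model under stated assumptions, not the shell operations themselves: the names accounts, applyTo, clearPending and recover are hypothetical, and each step is written so that repeating it is harmless because it first checks whether the transaction is already listed in pendingTransactions.

```javascript
// In-memory stand-in for the accounts collection (hypothetical data).
var accounts = {
  A: { balance: 1000, pendingTransactions: [] },
  B: { balance: 1000, pendingTransactions: [] }
};

// Apply the transfer to one account unless this transaction is already
// listed as pending there, so that re-running the step changes nothing.
function applyTo(account, txnId, delta) {
  if (account.pendingTransactions.indexOf(txnId) !== -1) return;
  account.balance += delta;
  account.pendingTransactions.push(txnId);
}

// Remove the transaction marker from one account (no-op if already removed).
function clearPending(account, txnId) {
  var i = account.pendingTransactions.indexOf(txnId);
  if (i !== -1) account.pendingTransactions.splice(i, 1);
}

// Resume one transaction from whatever state it was left in.
function recover(txn) {
  var src = accounts[txn.source];
  var dst = accounts[txn.destination];
  if (txn.state === "pending") {      // failed before both accounts were updated
    applyTo(src, txn.id, -txn.value);
    applyTo(dst, txn.id, txn.value);
    txn.state = "committed";
  }
  if (txn.state === "committed") {    // failed before the transaction was marked done
    clearPending(src, txn.id);
    clearPending(dst, txn.id);
    txn.state = "done";
  }
  return txn.state;
}

// Simulate a crash after the source was debited but before the destination was credited.
var t1 = { id: 1, source: "A", destination: "B", value: 100, state: "pending" };
applyTo(accounts.A, t1.id, -t1.value);
recover(t1);
```

Calling recover() a second time on the same transaction leaves the balances unchanged, which is the property that makes it safe to run recovery at every application start.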
Rollback In some cases you may need to roll back or undo a transaction, either because the application needs to cancel
the transaction or because it can never recover, as in cases where one of the accounts doesn't exist or stops existing
during the transaction.
There are two possible rollback operations:
1. After you apply the transaction (page 71) (i.e. the third step), you have fully committed the transaction and you
should not roll back the transaction. Instead, create a new transaction and switch the values in the source and
destination fields.
2. After you create the transaction (page 70) (i.e. the first step), but before you apply the transaction (page 71) (i.e.
the third step), use the following process:
Set Transaction State to Canceling Begin by setting the transaction's state to canceling with the following
update() operation:
Use the following sequence of operations to undo the transaction operation from both accounts:
The find() operation will return the contents of the accounts collection, which should resemble the following:
Set Transaction State to Canceled Finally, use the following update() operation to set the transaction's state to
canceled:
db.transactions.update({_id: t._id}, {$set: {state: "canceled"}})
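The rollback sequence as a whole can be modeled in memory as follows. This is only a sketch under stated assumptions: ledger, undoOn and cancel are hypothetical names, the accounts are plain objects rather than documents, and the undo touches an account only while the transaction is still listed in its pendingTransactions array, so repeating the rollback does not move funds twice.

```javascript
// Undo one side of a transfer, but only if the transaction is still listed
// as pending on that account; this keeps the undo safe to repeat.
function undoOn(account, txnId, delta) {
  var i = account.pendingTransactions.indexOf(txnId);
  if (i === -1) return false;
  account.balance -= delta;                   // reverse the earlier change
  account.pendingTransactions.splice(i, 1);   // drop the pending marker
  return true;
}

// Walk a transaction through canceling and on to canceled.
function cancel(txn, accountsByName) {
  txn.state = "canceling";
  undoOn(accountsByName[txn.source], txn.id, -txn.value);
  undoOn(accountsByName[txn.destination], txn.id, txn.value);
  txn.state = "canceled";
  return txn;
}

// Hypothetical situation: only the source account had been debited
// when the transfer was abandoned.
var ledger = {
  A: { balance: 900, pendingTransactions: [7] },
  B: { balance: 1000, pendingTransactions: [] }
};
var canceled = cancel(
  { id: 7, source: "A", destination: "B", value: 100, state: "pending" },
  ledger
);
```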
Multiple Applications Transactions exist, in part, so that several applications can create and run operations concurrently without causing data inconsistency or conflicts. As a result, it is crucial that only one application can handle
a given transaction at any point in time.
Consider the following example, with a single transaction (i.e. T1) and two applications (i.e. A1 and A2). If both
applications begin processing the transaction which is still in the initial state (i.e. step 1 (page 70)), then:
A1 can apply the entire transaction before A2 starts.
A2 will then apply T1 for the second time, because the transaction does not appear as pending in the accounts
documents.
To handle multiple applications, create a marker in the transaction document itself to identify the application that is
handling the transaction. Use the findAndModify() method to modify the transaction:
t = db.transactions.findAndModify({query: {state: "initial", application: {$exists: 0}},
update: {$set: {state: "pending", application: "A1"}},
new: true})
When you modify and reassign the local shell variable t, the mongo shell will return the t object, which should
resemble the following:
{
"_id" : ObjectId("4d7be8af2c10315c0847fc85"),
"application" : "A1",
"destination" : "B",
"source" : "A",
"state" : "pending",
"value" : 150
}
Amend the transaction operations to ensure that only the application whose identifier matches the value of the
application field applies the transaction.
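The claim-then-check idea can be illustrated with a small in-memory model. The functions claimTransaction and ownedBy and the queue array are hypothetical; in MongoDB the find and the modification happen atomically inside findAndModify(), which this single-threaded sketch only imitates.

```javascript
// In-memory imitation of a findAndModify() call that matches a transaction
// in the initial state with no application field, then marks it as claimed.
function claimTransaction(transactions, appName) {
  for (var i = 0; i < transactions.length; i++) {
    var txn = transactions[i];
    if (txn.state === "initial" && !("application" in txn)) {
      txn.state = "pending";
      txn.application = appName;
      return txn;
    }
  }
  return null; // nothing left to claim
}

// Guard to call before applying any step of the transaction.
function ownedBy(txn, appName) {
  return txn !== null && txn.application === appName;
}

var queue = [
  { id: 1, source: "A", destination: "B", value: 150, state: "initial" }
];
var firstClaim = claimTransaction(queue, "A1");   // A1 wins the claim
var secondClaim = claimTransaction(queue, "A2");  // A2 finds nothing to claim
```

Because the second caller receives no document, it has nothing to apply, which avoids the double application described above.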
If the application A1 fails during transaction execution, you can use the recovery procedures (page 72), but applications
should ensure that they own the transaction before applying it. For example, to resume pending jobs,
use a query that resembles the following:
db.transactions.find({application: "A1", state: "pending"})
This will (or may) return a document from the transactions collection that resembles the following:
when your application switches the transaction state to pending (page 70) (i.e. step 2) it would also make sure
that the account has sufficient funds for the transaction. During this update operation, the application would also
modify the values of the credits and debits as well as adding the transaction as pending.
when your application removes the pending transaction (page 71) (i.e. step 4) the application would apply the
transaction on balance, modify the credits and debits as well as removing the transaction from the pending
field, all in one update.
Because all of the changes in the above two operations occur within a single update() operation, these changes are
all atomic.
Additionally, for most important transactions, ensure that:
the database interface (i.e. client library or driver) has a reasonable write concern configured to ensure that
operations return a response on the success or failure of a write operation.
your mongod instance has journaling enabled to ensure that your data is always in a recoverable state, in the
event of an unclean mongod shutdown.
C++ Example
The tail function uses a tailable cursor to output the results from a query to a capped collection:
The function handles the case of the dead cursor by having the query be inside a loop.
To periodically check for new data, the cursor->more() statement is also inside a loop.
#include "client/dbclient.h"
using namespace mongo;
/*
 * Example of a tailable cursor.
 * The function "tails" the capped collection (ns) and output elements as they are added.
 * The function also handles the possibility of a dead cursor by tracking the field 'insertDate'.
 * New documents are added with increasing values of 'insertDate'.
 */
void tail(DBClientBase& conn, const char *ns) {
    BSONElement lastValue = minKey.firstElement();
    Query query = Query().hint( BSON( "$natural" << 1 ) );
    while ( 1 ) {
        auto_ptr<DBClientCursor> c =
            conn.query(ns, query, 0, 0, 0,
                       QueryOption_CursorTailable | QueryOption_AwaitData );
        while ( 1 ) {
            if ( !c->more() ) {
                if ( c->isDead() ) {
                    break;
                }
                continue;
            }
            BSONObj o = c->next();
            lastValue = o["insertDate"];
            cout << o.toString() << endl;
        }
        query = QUERY( "insertDate" << GT << lastValue ).hint( BSON( "$natural" << 1 ) );
    }
}
auto_ptr<DBClientCursor> c =
conn.query(ns, query, 0, 0, 0,
QueryOption_CursorTailable | QueryOption_AwaitData );
* Loop through the outer while (1) loop to re-query with the new query condition and repeat.
See also:
Detailed blog post on tailable cursor12
Create a unique index (page 334), to ensure that a key doesn't exist when you insert it.
Update if Current
In this pattern, you will:
query for a document,
modify the fields in that document
and update the fields of a document only if the fields have not changed in the collection since the query.
Consider the following example in JavaScript which attempts to update the qty field of a document in the products
collection:
var myCollection = db.products;
var myDocument = myCollection.findOne( { sku: 'abc123' } );
if (myDocument) {
    var oldQty = myDocument.qty;
    if (myDocument.qty < 10) {
        myDocument.qty *= 4;
    } else if ( myDocument.qty < 20 ) {
        myDocument.qty *= 3;
    } else {
        myDocument.qty *= 2;
    }
    myCollection.update(
        {
            _id: myDocument._id,
            qty: oldQty
        },
        {
            $set: { qty: myDocument.qty }
        }
    )
    var err = db.getLastErrorObj();
    if ( err && err.code ) {
        print("unexpected error updating document: " + tojson( err ));
    } else if ( err.n == 0 ) {
        print("No update: no matching document for { _id: " + myDocument._id + ", qty: " + oldQty + " }")
    }
}
Your application may require some modifications of this pattern, such as:
Use the entire document as the query in the update() operation, to generalize the operation and guarantee
that the original document was not modified, rather than ensuring that a single field was not changed.
Add a version variable to the document that applications increment upon each update operation to the documents.
Use this version variable in the query expression. You must be able to ensure that all clients that connect to your
database obey this constraint.
Use $set in the update expression to modify only your fields and prevent overriding other fields.
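The version-variable variation can be sketched against an in-memory object. The names products, updateIfCurrent and the version field are assumptions made for illustration; in the shell, the equivalent check would be part of the query document passed to update().

```javascript
// Apply changes only when the stored version still equals the version the
// caller read earlier, then increment the version for the next writer.
function updateIfCurrent(store, id, expectedVersion, changes) {
  var doc = store[id];
  if (!doc || doc.version !== expectedVersion) {
    return false; // missing, or modified by someone else since the read
  }
  for (var key in changes) {
    if (Object.prototype.hasOwnProperty.call(changes, key)) {
      doc[key] = changes[key]; // touch only the named fields, like $set
    }
  }
  doc.version += 1;
  return true;
}

// Hypothetical in-memory store keyed by sku.
var products = { abc123: { qty: 5, version: 1 } };

var firstWrite = updateIfCurrent(products, "abc123", 1, { qty: 20 });  // succeeds
var staleWrite = updateIfCurrent(products, "abc123", 1, { qty: 99 });  // rejected: version is now 2
```

A caller whose write is rejected would typically re-read the document and retry with the new version.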
Use one of the methods described in Create an Auto-Incrementing Sequence Field (page 78).
A Counters Collection
Use a separate counters collection to track the last number sequence used. The _id field contains the sequence
name and the seq field contains the last value of the sequence.
1. Insert into the counters collection, the initial value for the userid:
db.counters.insert(
{
_id: "userid",
seq: 0
}
)
2. Create a getNextSequence function that accepts a name of the sequence. The function uses the
findAndModify() method to atomically increment the seq value and return this new value:
function getNextSequence(name) {
var ret = db.counters.findAndModify(
{
query: { _id: name },
update: { $inc: { seq: 1 } },
new: true
}
);
return ret.seq;
}
db.users.insert(
{
_id: getNextSequence("userid"),
name: "Bob D."
}
)
Note: When findAndModify() includes the upsert: true option and the query field(s) is not uniquely
indexed, the method could insert a document multiple times in certain circumstances. For instance, if multiple clients
each invoke the method with the same query condition and these methods complete the find phase before any of
the methods perform the modify phase, these methods could insert the same document.
In the counters collection example, the query field is the _id field, which always has a unique index. Consider
that the findAndModify() includes the upsert: true option, as in the following modified example:
function getNextSequence(name) {
var ret = db.counters.findAndModify(
{
query: { _id: name },
update: { $inc: { seq: 1 } },
new: true,
upsert: true
}
);
return ret.seq;
}
If multiple clients were to invoke the getNextSequence() method with the same name parameter, then the
methods would observe one of the following behaviors:
Exactly one findAndModify() would successfully insert a new document.
Zero or more findAndModify() methods would update the newly inserted document.
Zero or more findAndModify() methods would fail when they attempted to insert a duplicate.
If the method fails due to a unique index constraint violation, retry the method. Absent a delete of the document, the
retry should not fail.
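The retry advice can be sketched as a small wrapper. Everything here is an assumption made for illustration: retryOnDuplicate and flakyNextSequence are hypothetical names, the failure is modeled as a thrown error carrying a code property, and 11000 is used as the duplicate key code because that is the value the optimistic loop example in this document checks for.

```javascript
var DUPLICATE_KEY = 11000; // duplicate key error code

// Call operation() again when it fails with a duplicate key error,
// up to maxAttempts times; any other error is rethrown immediately.
function retryOnDuplicate(operation, maxAttempts) {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return operation();
    } catch (err) {
      if (err.code !== DUPLICATE_KEY || attempt === maxAttempts) {
        throw err;
      }
    }
  }
}

// Stand-in for getNextSequence("userid") that loses the insert race once.
var calls = 0;
function flakyNextSequence() {
  calls += 1;
  if (calls === 1) {
    var err = new Error("duplicate key");
    err.code = DUPLICATE_KEY;
    throw err;
  }
  return 1;
}

var seq = retryOnDuplicate(flakyNextSequence, 3);
```

The attempt limit is a design choice of this sketch; it keeps an unexpected persistent failure from looping forever.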
Optimistic Loop
In this pattern, an Optimistic Loop calculates the incremented _id value and attempts to insert a document with the
calculated _id value. If the insert is successful, the loop ends. Otherwise, the loop will iterate through possible _id
values until the insert is successful.
1. Create a function named insertDocument that performs the "insert if not present" loop. The function wraps
the insert() method and takes doc and targetCollection arguments.
function insertDocument(doc, targetCollection) {
    while (1) {
        var cursor = targetCollection.find( {}, { _id: 1 } ).sort( { _id: -1 } ).limit(1);
        var seq = cursor.hasNext() ? cursor.next()._id + 1 : 1;
        doc._id = seq;
        targetCollection.insert(doc);
        var err = db.getLastErrorObj();
        if( err && err.code ) {
            if( err.code == 11000 /* dup key */ )
                continue;
            else
                print( "unexpected error inserting data: " + tojson( err ) );
        }
        break;
    }
}
insertDocument(
{
name: "Ted R."
},
myCollection
)
The while loop may iterate many times in collections with larger insert volumes.
Pattern
Consider the following document in the collection students:
{
_id: 1,
scores: [
{ attempt: 1, score: 10 },
{ attempt: 2 , score:8 }
]
}
the $slice modifier to keep the last 3 elements of the ordered array.
db.students.update(
{ _id: 1 },
{ $push: { scores: { $each : [
{ attempt: 3, score: 7 },
{ attempt: 4, score: 4 }
],
$sort: { score: 1 },
$slice: -3
}
}
}
)
Note: When using the $sort modifier on the array element, access the field in the subdocument element directly
instead of using the dot notation on the array field.
After the operation, the document contains only the top 3 scores in the scores array:
{
"_id" : 1,
"scores" : [
{ "attempt" : 3, "score" : 7 },
{ "attempt" : 2, "score" : 8 },
{ "attempt" : 1, "score" : 10 }
]
}
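To see why these three elements remain, the combined effect of $each, $sort and $slice can be reproduced on plain arrays. The function pushSortSlice is a hypothetical stand-in that only covers an ascending sort on one numeric field and a "keep the last N" slice, which is the case used above.

```javascript
// Append the new elements, sort ascending by one field, keep the last N.
function pushSortSlice(existing, additions, sortField, keepLast) {
  var combined = existing.concat(additions);          // $each
  combined.sort(function (a, b) {                     // $sort: { <field>: 1 }
    return a[sortField] - b[sortField];
  });
  var start = Math.max(0, combined.length - keepLast);
  return combined.slice(start);                       // $slice: -keepLast
}

var scores = pushSortSlice(
  [ { attempt: 1, score: 10 }, { attempt: 2, score: 8 } ],
  [ { attempt: 3, score: 7 }, { attempt: 4, score: 4 } ],
  "score",
  3
);
// scores now holds attempts 3, 2 and 1, with scores 7, 8 and 10
```

Sorting in ascending order before keeping the last three elements is what drops the lowest score, 4, rather than the oldest attempt.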
See also:
$push operator,
$each modifier,
$sort modifier, and
$slice modifier.
Write concern describes the guarantee that MongoDB provides when reporting on the success of a write operation.
The strength of the write concern determines the level of guarantee. When inserts, updates and deletes have a weak
write concern, write operations return quickly. In some failure cases, write operations issued with weak write concerns
may not persist. With stronger write concerns, clients wait after sending a write operation for MongoDB to confirm
the write operations.
MongoDB provides different levels of write concern to better address the specific needs of applications. Clients
may adjust write concern to ensure that the most important operations persist successfully to an entire MongoDB
deployment. For other less critical operations, clients can adjust the write concern to ensure faster performance rather
than ensure persistence to the entire deployment.
See also:
Write Concern (page 47) for an introduction to write concern in MongoDB.
Available Write Concern
To provide write concern, drivers (page 95) issue the getLastError command after a write operation and receive
a document with information about the last operation. This document's err field contains either:
null, which indicates the write operations have completed successfully, or
If you set journal to true, and the mongod does not have journaling enabled, as with nojournal, then
getLastError will provide basic receipt acknowledgment, and will include a jnote field in its return
document.
w option
This option provides the ability to disable write concern entirely and also specifies the write concern
for replica sets. See Write Concern Considerations (page 47) for an introduction to the fundamental concepts
of write concern. By default, the w option is set to 1, which provides basic receipt acknowledgment on a single
mongod instance or on the primary in a replica set.
The w option takes the following values:
-1:
Disables all acknowledgment of write operations, and suppresses all errors, including network and socket
errors.
0:
Disables basic acknowledgment of write operations, but returns information about socket exceptions and
networking errors to the application.
Note: If you disable basic write operation acknowledgment but require journal commit acknowledgment,
the journal commit prevails, and the driver will require that mongod acknowledge the write operation.
1:
Provides acknowledgment of write operations on a standalone mongod or the primary in a replica set.
A number greater than 1:
Guarantees that write operations have propagated successfully to the specified number of replica set members including the primary. If you set w to a number that is greater than the number of set members that
hold data, MongoDB waits for the non-existent members to become available, which means MongoDB
blocks indefinitely.
majority:
Confirms that write operations have propagated to the majority of the configured replica set: a majority of the
set's configured members must acknowledge the write operation before it succeeds. This allows you to
avoid hard coding assumptions about the size of your replica set into your application.
A tag set:
By specifying a tag set (page 450) you can have fine-grained control over which replica set members must
acknowledge a write operation to satisfy the required level of write concern.
getLastError also supports a wtimeout setting which allows clients to specify a timeout for the write concern:
if you don't specify wtimeout, or if you give it a value of 0, and the mongod cannot fulfill the write concern, then
getLastError will block, potentially forever.
For more information on write concern and replica sets, see Write Concern for Replica Sets (page 49).
In sharded clusters, mongos instances will pass write concern on to the shard mongod instances.
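The arithmetic behind the w values can be made concrete with a short sketch. It is a simplified model, not driver code: requiredAcknowledgments and wouldBlockForever are hypothetical helpers, tag sets are deliberately left out, and majority is taken to mean more than half of the configured members.

```javascript
// Number of members that must acknowledge a write for a given w value.
function requiredAcknowledgments(w, configuredMembers) {
  if (w === "majority") {
    return Math.floor(configuredMembers / 2) + 1; // more than half of the set
  }
  if (typeof w === "number") {
    return Math.max(0, w); // -1 and 0 wait for no acknowledgment
  }
  throw new Error("tag sets are not modeled in this sketch");
}

// True when the write concern can never be met and no timeout was given:
// w exceeds the number of data-bearing members and wtimeout is 0 or unset.
function wouldBlockForever(w, configuredMembers, dataBearingMembers, wtimeout) {
  var needed = requiredAcknowledgments(w, configuredMembers);
  return needed > dataBearingMembers && !wtimeout;
}

var majorityOfFive = requiredAcknowledgments("majority", 5); // 3
var blocks = wouldBlockForever(4, 3, 3, 0);                  // true
var timesOut = wouldBlockForever(4, 3, 3, 5000);             // false: wtimeout bounds the wait
```

Using majority instead of a fixed number keeps the required count in step with the size of the set, which is the reason given above for preferring it.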
SQL to MongoDB Mapping Chart
In addition to the charts that follow, you might want to consider the Frequently Asked Questions (page 555) section for
a selection of common questions about MongoDB.
Terminology and Concepts
The following table presents the various SQL terminology and concepts and the corresponding MongoDB terminology
and concepts.
SQL Terms/Concepts                  MongoDB Terms/Concepts
database                            database
table                               collection
row                                 document or BSON document
column                              field
index                               index
table joins                         embedded documents and linking
primary key                         primary key
Specify any unique column or        In MongoDB, the primary key is
column combination as primary key.  automatically set to the _id field.
aggregation (e.g. group by)         aggregation pipeline
                                    See the SQL to Aggregation Mapping Chart (page 310).
Executables
The following table presents the MySQL/Oracle executables and the corresponding MongoDB executables.
                    MySQL/Oracle      MongoDB
Database Server     mysqld/oracle     mongod
Database Client     mysql/sqlplus     mongo
Examples
The following table presents the various SQL statements and the corresponding MongoDB statements. The examples
in the table assume the following conditions:
The SQL examples assume a table named users.
The MongoDB examples assume a collection named users that contains documents of the following prototype:
{
_id: ObjectId("509a8fb2f3f4948bd2f983a0"),
user_id: "abc123",
age: 55,
status: 'A'
}
Create and Alter The following table presents the various SQL statements related to table-level actions and the
corresponding MongoDB statements.
db.users.drop()
Insert The following table presents the various SQL statements related to inserting records into tables and the corresponding MongoDB statements.
SQL INSERT Statements
db.users.insert( {
user_id: "bcd001",
age: 45,
status: "A"
} )
Reference
See insert() for more information.
Select The following table presents the various SQL statements related to reading records from tables and the corresponding MongoDB statements.
SELECT *
FROM users
db.users.find()
Reference
See find() for more information.
SELECT *
FROM users
WHERE status = "A"
db.users.find(
{ status: "A" }
)
SELECT *
FROM users
WHERE status != "A"
db.users.find(
{ status: { $ne: "A" } }
)
SELECT *
FROM users
WHERE status = "A"
AND age = 50
db.users.find(
{ status: "A",
age: 50 }
)
SELECT *
FROM users
WHERE status = "A"
OR age = 50
SELECT *
FROM users
WHERE age > 25
db.users.find(
{ age: { $gt: 25 } }
)
SELECT *
FROM users
WHERE age < 25
db.users.find(
{ age: { $lt: 25 } }
)
SELECT *
FROM users
WHERE age > 25
AND
age <= 50
SELECT
*
FROM users
WHERE user_id like "%bc%"
db.users.find(
{ user_id: /bc/ }
)
Update Records The following table presents the various SQL statements related to updating existing records in
tables and the corresponding MongoDB statements.
SQL Update Statements
UPDATE users
SET status = "C"
WHERE age > 25
db.users.update(
{ age: { $gt: 25 } },
{ $set: { status: "C" } },
{ multi: true }
)
UPDATE users
SET age = age + 3
WHERE status = "A"
db.users.update(
{ status: "A" } ,
{ $inc: { age: 3 } },
{ multi: true }
)
Reference
See update(), $gt, and $set for
more information.
Delete Records The following table presents the various SQL statements related to deleting records from tables and
the corresponding MongoDB statements.
SQL Delete Statements
DELETE FROM users
WHERE status = "D"
db.users.remove( { status: "D" } )
DELETE FROM users
db.users.remove( )
Reference
See remove() for more information.
},
{
"award" : "National Medal of Science",
"year" : 1975,
"by" : "National Science Foundation"
},
{
"award" : "Turing Award",
"year" : 1977,
"by" : "ACM"
},
{
"award" : "Draper Prize",
"year" : 1993,
"by" : "National Academy of Engineering"
}
]
}
{
"_id" : ObjectId("51df07b094c6acd67e492f41"),
"name" : {
"first" : "John",
"last" : "McCarthy"
},
"birth" : ISODate("1927-09-04T04:00:00Z"),
"death" : ISODate("2011-12-24T05:00:00Z"),
"contribs" : [
"Lisp",
"Artificial Intelligence",
"ALGOL"
],
"awards" : [
{
"award" : "Turing Award",
"year" : 1971,
"by" : "ACM"
},
{
"award" : "Kyoto Prize",
"year" : 1988,
"by" : "Inamori Foundation"
},
{
"award" : "National Medal of Science",
"year" : 1990,
"by" : "National Science Foundation"
}
]
}
{
"_id" : 3,
"name" : {
"first" : "Grace",
"last" : "Hopper"
},
"title" : "Rear Admiral",
"birth" : ISODate("1906-12-09T05:00:00Z"),
"death" : ISODate("1992-01-01T05:00:00Z"),
"contribs" : [
"UNIVAC",
"compiler",
"FLOW-MATIC",
"COBOL"
],
"awards" : [
{
"award" : "Computer Sciences Man of the Year",
"year" : 1969,
"by" : "Data Processing Management Association"
},
{
"award" : "Distinguished Fellow",
"year" : 1973,
"by" : " British Computer Society"
},
{
"award" : "W. W. McDowell Award",
"year" : 1976,
"by" : "IEEE Computer Society"
},
{
"award" : "National Medal of Technology",
"year" : 1991,
"by" : "United States"
}
]
}
{
"_id" : 4,
"name" : {
"first" : "Kristen",
"last" : "Nygaard"
},
"birth" : ISODate("1926-08-27T04:00:00Z"),
"death" : ISODate("2002-08-10T04:00:00Z"),
"contribs" : [
"OOP",
"Simula"
],
"awards" : [
{
"award" : "Rosing Prize",
"year" : 1999,
"by" : "Norwegian Data Association"
},
{
"award" : "Turing Award",
"year" : 2001,
"by" : "ACM"
},
{
"award" : "IEEE John von Neumann Medal",
"year" : 2001,
"by" : "IEEE"
}
]
}
{
"_id" : 5,
"name" : {
"first" : "Ole-Johan",
"last" : "Dahl"
},
"birth" : ISODate("1931-10-12T04:00:00Z"),
"death" : ISODate("2002-06-29T04:00:00Z"),
"contribs" : [
"OOP",
"Simula"
],
"awards" : [
{
"award" : "Rosing Prize",
"year" : 1999,
"by" : "Norwegian Data Association"
},
{
"award" : "Turing Award",
"year" : 2001,
"by" : "ACM"
},
{
"award" : "IEEE John von Neumann Medal",
"year" : 2001,
"by" : "IEEE"
}
]
}
{
"_id" : 6,
"name" : {
"first" : "Guido",
"last" : "van Rossum"
},
"birth" : ISODate("1956-01-31T05:00:00Z"),
"contribs" : [
"Python"
],
"awards" : [
{
"award" : "Award for the Advancement of Free Software",
"year" : 2001,
"by" : "Free Software Foundation"
},
{
"award" : "NLUUG Award",
"year" : 2003,
"by" : "NLUUG"
}
]
}
{
"_id" : ObjectId("51e062189c6ae665454e301d"),
"name" : {
"first" : "Dennis",
"last" : "Ritchie"
},
"birth" : ISODate("1941-09-09T04:00:00Z"),
"death" : ISODate("2011-10-12T04:00:00Z"),
"contribs" : [
"UNIX",
"C"
],
"awards" : [
{
"award" : "Turing Award",
"year" : 1983,
"by" : "ACM"
},
{
"award" : "National Medal of Technology",
"year" : 1998,
"by" : "United States"
},
{
"award" : "Japan Prize",
"year" : 2011,
"by" : "The Japan Prize Foundation"
}
]
}
{
"_id" : 8,
"name" : {
"first" : "Yukihiro",
"aka" : "Matz",
"last" : "Matsumoto"
},
"birth" : ISODate("1965-04-14T04:00:00Z"),
"contribs" : [
"Ruby"
],
"awards" : [
{
"award" : "Award for the Advancement of Free Software",
"year" : "2011",
"by" : "Free Software Foundation"
}
]
}
{
"_id" : 9,
"name" : {
"first" : "James",
"last" : "Gosling"
},
"birth" : ISODate("1955-05-19T04:00:00Z"),
"contribs" : [
"Java"
],
"awards" : [
{
"award" : "The Economist Innovation Award",
"year" : 2002,
"by" : "The Economist"
},
{
"award" : "Officer of the Order of Canada",
"year" : 2007,
"by" : "Canada"
}
]
}
{
"_id" : 10,
"name" : {
"first" : "Martin",
"last" : "Odersky"
},
"contribs" : [
"Scala"
]
}
See the following pages for more information about the MongoDB drivers14 :
JavaScript (Language Center15 , docs16 )
Python (Language Center17 , docs18 )
Ruby (Language Center19 , docs20 )
PHP (Language Center21 , docs22 )
13 http://docs.mongodb.org/ecosystem/drivers
14 http://docs.mongodb.org/ecosystem/drivers
15 http://docs.mongodb.org/ecosystem/drivers/javascript
16 http://api.mongodb.org/js/current
17 http://docs.mongodb.org/ecosystem/drivers/python
18 http://api.mongodb.org/python/current
19 http://docs.mongodb.org/ecosystem/drivers/ruby
20 http://api.mongodb.org/ruby/current
21 http://docs.mongodb.org/ecosystem/drivers/php
22 http://php.net/mongo/
Driver version numbers use the semantic versioning39 or "major.minor.patch" versioning system. The first number is the
major version, the second the minor version, and the third indicates a patch.
Example
Driver version numbers.
If your driver has a version number of 2.9.1, 2 is the major version, 9 is minor, and 1 is the patch.
The numbering scheme for drivers differs from the scheme for the MongoDB server. For more information on server
versioning, see MongoDB Version Numbers (page 634).
23 http://docs.mongodb.org/ecosystem/drivers/perl
24 http://api.mongodb.org/perl/current/
25 http://docs.mongodb.org/ecosystem/drivers/java
26 http://api.mongodb.org/java/current
27 http://docs.mongodb.org/ecosystem/drivers/scala
28 http://api.mongodb.org/scala/casbah/current/
29 http://docs.mongodb.org/ecosystem/drivers/csharp
30 http://api.mongodb.org/csharp/current/
31 http://docs.mongodb.org/ecosystem/drivers/c
32 http://api.mongodb.org/c/current/
33 http://docs.mongodb.org/ecosystem/drivers/cpp
34 http://api.mongodb.org/cplusplus/current/
35 http://hackage.haskell.org/package/mongoDB
36 http://api.mongodb.org/haskell/mongodb
37 http://docs.mongodb.org/ecosystem/drivers/erlang
38 http://api.mongodb.org/erlang/mongodb
39 http://semver.org/
CHAPTER 3
Data Models
Data in MongoDB has a flexible schema. Collections do not enforce document structure. This flexibility gives you
data-modeling choices to match your application and its performance requirements.
Read the Data Modeling Introduction (page 97) document for a high level introduction to data modeling, and proceed
to the documents in the Data Modeling Concepts (page 99) section for additional documentation of the data model
design process. The Data Model Examples and Patterns (page 106) documents provide examples of different data
models. In addition, the MongoDB Use Case Studies1 provide overviews of application design and include example
data models with MongoDB.
Data Modeling Introduction (page 97) An introduction to data modeling in MongoDB.
Data Modeling Concepts (page 99) The core documentation detailing the decisions you must make when determining a data model, and discussing considerations that should be taken into account.
Data Model Examples and Patterns (page 106) Examples of possible data models that you can use to structure your
MongoDB documents.
Data Model Reference (page 121) Reference material for data modeling for developers of MongoDB applications.
References
References store the relationships between data by including links or references from one document to another. Applications can resolve these references (page 124) to access the related data. Broadly, these are normalized data models.
Figure 3.1: Data model using references to link documents. Both the contact document and the access document
contain a reference to the user document.
See Normalized Data Models (page 101) for the strengths and weaknesses of using references.
Embedded Data
Embedded documents capture relationships between data by storing related data in a single document structure. MongoDB documents make it possible to embed document structures as sub-documents in a field or array within a document. These denormalized data models allow applications to retrieve and manipulate related data in a single database
operation.
See Embedded Data Models (page 100) for the strengths and weaknesses of embedding sub-documents.
Figure 3.2: Data model with embedded fields that contain all related information.
However, schemas that facilitate atomic writes may limit ways that applications can use the data or may limit ways to
modify applications. The Atomicity Considerations (page 102) documentation describes the challenge of designing a
schema that balances flexibility and atomicity.
For a general introduction to data modeling in MongoDB, see the Data Modeling Introduction (page 97). For example
data models, see Data Modeling Examples and Patterns (page 106).
Data Model Design (page 100) Presents the different strategies that you can choose from when determining your data
model, their strengths and their weaknesses.
Operational Factors and Data Models (page 102) Details features you should keep in mind when designing your
data model, such as lifecycle management, indexing, horizontal scalability, and document growth.
GridFS (page 104) GridFS is a specification for storing documents that exceed the BSON-document size limit of
16MB.
Figure 3.3: Data model with embedded fields that contain all related information.
Embedded data models allow applications to store related pieces of information in the same database record. As a
result, applications may need to issue fewer queries and updates to complete common operations.
In general, use embedded data models when:
you have "contains" relationships between entities. See Model One-to-One Relationships with Embedded Documents (page 106).
you have one-to-many relationships between entities. In these relationships the "many" or child documents
always appear with or are viewed in the context of the "one" or parent documents. See Model One-to-Many
Relationships with Embedded Documents (page 107).
In general, embedding provides better performance for read operations, as well as the ability to request and retrieve
related data in a single database operation. Embedded data models make it possible to update related data in a single
atomic write operation.
However, embedding related data in documents may lead to situations where documents grow after creation. Document growth can impact write performance and lead to data fragmentation. See Document Growth (page 102) for
details. Furthermore, documents in MongoDB must be smaller than the maximum BSON document size. For
bulk binary data, consider GridFS (page 104).
To interact with embedded documents, use dot notation to reach into embedded documents. See query for data
in arrays (page 62) and query data in sub-documents (page 61) for more examples on accessing data in arrays and
embedded documents.
Normalized Data Models
Normalized data models describe relationships using references (page 124) between documents.
Figure 3.4: Data model using references to link documents. Both the contact document and the access document
contain a reference to the user document.
In general, use normalized data models:
when embedding would result in duplication of data but would not provide sufficient read performance advantages to outweigh the implications of the duplication.
to represent more complex many-to-many relationships.
Document-level atomic operations include all operations within a single MongoDB document record: operations that affect multiple subdocuments within that single record are still atomic.
To distribute data and application traffic in a sharded collection, MongoDB uses the shard key (page 492). Selecting
the proper shard key (page 492) has significant implications for performance, and can enable or prevent query isolation
and increased write capacity. It is important to consider carefully the field or fields to use as the shard key.
See Sharding Introduction (page 479) and Shard Keys (page 492) for more information.
Indexes
Use indexes to improve performance for common queries. Build indexes on fields that appear often in queries and for
all operations that return sorted results. MongoDB automatically creates a unique index on the _id field.
As you create indexes, consider the following behaviors of indexes:
Each index requires at least 8KB of data space.
Adding an index has some negative performance impact for write operations. For collections with a high write-to-read ratio, indexes are expensive since each insert must also update any indexes.
Collections with high read-to-write ratio often benefit from additional indexes. Indexes do not affect un-indexed
read operations.
When active, each index consumes disk space and memory. This usage can be significant and should be tracked
for capacity planning, especially for concerns over working set size.
See Indexing Strategies (page 368) for more information on indexes as well as Analyze Query Performance (page 66).
Additionally, the MongoDB database profiler (page 174) may help identify inefficient queries.
Large Number of Collections
In certain situations, you might choose to store related information in several collections rather than in a single collection.
Consider a sample collection logs that stores log documents for various environments and applications. The logs
collection contains documents of the following form:
{ log: "dev", ts: ..., info: ... }
{ log: "debug", ts: ..., info: ...}
If the total number of documents is low, you may group documents into collections by type. For logs, consider maintaining distinct log collections, such as logs.dev and logs.debug. The logs.dev collection would contain
only the documents related to the dev environment.
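As a rough illustration of grouping by type, the following plain JavaScript sketch derives a target collection name from each document's log field. It runs without a server; the helper name collectionNameFor and the sample ts and info values are assumptions made for illustration and are not part of MongoDB.

```javascript
// Hypothetical helper: choose a collection name from the document's "log" field.
function collectionNameFor(doc) {
  return "logs." + doc.log;
}

// Sample documents shaped like the ones above (ts and info values are made up).
const docs = [
  { log: "dev", ts: 1, info: "started" },
  { log: "debug", ts: 2, info: "cache miss" },
  { log: "dev", ts: 3, info: "stopped" }
];

// Group the documents by target collection, as an application might before inserting.
const byCollection = {};
for (const doc of docs) {
  const name = collectionNameFor(doc);
  (byCollection[name] = byCollection[name] || []).push(doc);
}

console.log(Object.keys(byCollection)); // [ 'logs.dev', 'logs.debug' ]
```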
Generally, having a large number of collections has no significant performance penalty and results in very good
performance. Distinct collections are very important for high-throughput batch processing.
When using models that have a large number of collections, consider the following behaviors:
Each collection has a certain minimum overhead of a few kilobytes.
Each index, including the index on _id, requires at least 8KB of data space.
For each database, a single namespace file (i.e. <database>.ns) stores all meta-data for that database, and
each index and collection has its own entry in the namespace file. MongoDB places limits on the size
of namespace files.
MongoDB has limits on the number of namespaces. You may wish to know the current number
of namespaces in order to determine how many additional namespaces the database can support. To get the
current number of namespaces, run the following in the mongo shell:
db.system.namespaces.count()
The limit on the number of namespaces depends on the <database>.ns size. The namespace file defaults to
16 MB.
To change the size of the new namespace file, start the server with the option --nssize <new size MB>.
For existing databases, after starting up the server with --nssize, run the db.repairDatabase() command from the mongo shell. For impacts and considerations on running db.repairDatabase(), see
repairDatabase.
Data Lifecycle Management
Data modeling decisions should take data lifecycle management into consideration.
The Time to Live or TTL feature (page 163) of collections expires documents after a period of time. Consider using
the TTL feature if your application requires some data to persist in the database for a limited period of time.
Additionally, if your application only uses recently inserted documents, consider Capped Collections (page 161).
Capped collections provide first-in-first-out (FIFO) management of inserted documents and efficiently support operations that insert and read documents based on insertion order.
3.2.3 GridFS
GridFS is a specification for storing and retrieving files that exceed the BSON-document size limit of 16MB.
Instead of storing a file in a single document, GridFS divides a file into parts, or chunks, 4 and stores each of those
chunks as a separate document. By default GridFS limits chunk size to 256k. GridFS uses two collections to store
files. One collection stores the file chunks, and the other stores file metadata.
When you query a GridFS store for a file, the driver or client will reassemble the chunks as needed. You can perform
range queries on files stored through GridFS. You also can access information from arbitrary sections of files, which
allows you to skip into the middle of a video or audio file.
GridFS is useful not only for storing files that exceed 16MB but also for storing any files for which you want access
without having to load the entire file into memory. For more information on the indications of GridFS, see When
should I use GridFS? (page 561).
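The chunking idea can be sketched in plain JavaScript. This is a minimal in-memory model and not a driver API: the function names toChunks, fromChunks, and chunkIndexFor are hypothetical, the file identifier is a made-up string, and the chunk size is assumed to be 256 * 1024 bytes to match the default stated above.

```javascript
// Assumed chunk size: 256 kB, matching the default stated above.
const CHUNK_SIZE = 256 * 1024;

// Split a Buffer into chunk documents shaped like those in the chunks collection.
function toChunks(filesId, data, chunkSize = CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0, n = 0; offset < data.length; offset += chunkSize, n++) {
    chunks.push({ files_id: filesId, n: n, data: data.subarray(offset, offset + chunkSize) });
  }
  return chunks;
}

// Reassemble a file by ordering its chunks on n and concatenating the payloads.
function fromChunks(chunks) {
  const ordered = chunks.slice().sort((a, b) => a.n - b.n);
  return Buffer.concat(ordered.map((c) => c.data));
}

// Index of the chunk that holds a given byte offset, useful for skipping into a file.
function chunkIndexFor(byteOffset, chunkSize = CHUNK_SIZE) {
  return Math.floor(byteOffset / chunkSize);
}

const file = Buffer.alloc(600 * 1024, 7); // 600 kB of sample data
const chunks = toChunks("file-1", file);
console.log(chunks.length); // 3
```

Because each chunk carries its sequence number n, a reader that wants bytes from the middle of a file only needs the chunks whose index covers the requested range.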
Implement GridFS
To store and retrieve files using GridFS, use either of the following:
A MongoDB driver. See the drivers (page 95) documentation for information on using GridFS with your driver.
The mongofiles command-line tool. See http://docs.mongodb.org/manualreference/program/mongofiles.
GridFS Collections
GridFS stores files in two collections:
chunks stores the binary chunks. For details, see The chunks Collection (page 127).
files stores the file's metadata. For details, see The files Collection (page 128).
4 The use of the term chunks in the context of GridFS is not related to the use of the term chunks in the context of sharding.
GridFS places the collections in a common bucket by prefixing each with the bucket name. By default, GridFS uses
two collections with names prefixed by fs bucket:
fs.files
fs.chunks
You can choose a different bucket name than fs, and create multiple buckets in a single database.
Each document in the chunks collection represents a distinct chunk of a file as represented in the GridFS store. Each
chunk is identified by its unique ObjectId stored in its _id field.
For descriptions of all fields in the chunks and files collections, see GridFS Reference (page 127).
GridFS Index
GridFS uses a unique, compound index on the chunks collection for the files_id and n fields. The files_id
field contains the _id of the chunk's parent document. The n field contains the sequence number of the chunk.
GridFS numbers all chunks, starting with 0. For descriptions of the documents and fields in the chunks collection,
see GridFS Reference (page 127).
The GridFS index allows efficient retrieval of chunks using the files_id and n values, as shown in the following
example:
cursor = db.fs.chunks.find({files_id: myFileID}).sort({n:1});
See the relevant driver (page 95) documentation for the specific behavior of your GridFS application. If your driver
does not create this index, issue the following operation using the mongo shell:
db.fs.chunks.ensureIndex( { files_id: 1, n: 1 }, { unique: true } );
Example Interface
The following is an example of the GridFS interface in Java. The example is for demonstration purposes only. For
API specifics, see the relevant driver (page 95) documentation.
By default, the interface must support the default GridFS bucket, named fs, as in the following:
// returns default GridFS bucket (i.e. "fs" collection)
GridFS myFS = new GridFS(myDatabase);
// saves the file to "fs" GridFS bucket
myFS.createFile(new File("/tmp/largething.mpg"));
Optionally, interfaces may support additional GridFS buckets, as in the following example:
// returns GridFS bucket named "contracts"
GridFS myContracts = new GridFS(myDatabase, "contracts");
// retrieve GridFS object "smithco"
GridFSDBFile file = myContracts.findOne("smithco");
// saves the GridFS file to the file system
file.writeTo(new File("/tmp/smithco.pdf"));
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 99) for
a full high level overview of data modeling in MongoDB.
This document describes a data model that uses embedded (page 100) documents to describe relationships between
connected data.
Pattern
Consider the following example that maps patron and address relationships. The example illustrates the advantage of
embedding over referencing if you need to view one data entity in context of the other. In this one-to-one relationship
between patron and address data, the address belongs to the patron.
In the normalized data model, the address document contains a reference to the patron document.
{
_id: "joe",
name: "Joe Bookreader"
}
{
patron_id: "joe",
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: 12345
}
If the address data is frequently retrieved with the name information, then with referencing, your application needs
to issue multiple queries to resolve the reference. The better data model would be to embed the address data in the
patron data, as in the following document:
{
_id: "joe",
name: "Joe Bookreader",
address: {
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: 12345
}
}
With the embedded data model, your application can retrieve the complete patron information with one query.
Model One-to-Many Relationships with Embedded Documents
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 99) for
a full high level overview of data modeling in MongoDB.
This document describes a data model that uses embedded (page 100) documents to describe relationships between
connected data.
Pattern
Consider the following example that maps patron and multiple address relationships. The example illustrates the
advantage of embedding over referencing if you need to view many data entities in context of another. In this one-to-many relationship between patron and address data, the patron has multiple address entities.
In the normalized data model, the address documents contain a reference to the patron document.
{
_id: "joe",
name: "Joe Bookreader"
}
{
patron_id: "joe",
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: 12345
}
{
patron_id: "joe",
street: "1 Some Other Street",
city: "Boston",
state: "MA",
zip: 12345
}
If your application frequently retrieves the address data with the name information, then your application needs
to issue multiple queries to resolve the references. A more optimal schema would be to embed the address data
entities in the patron data, as in the following document:
{
_id: "joe",
name: "Joe Bookreader",
addresses: [
{
street: "123 Fake Street",
city: "Faketon",
state: "MA",
zip: 12345
},
{
street: "1 Some Other Street",
city: "Boston",
state: "MA",
zip: 12345
}
]
}
With the embedded data model, your application can retrieve the complete patron information with one query.
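To make the relationship between the two forms concrete, the following plain JavaScript sketch converts the normalized documents above into the embedded form. The helper name embedAddresses is an assumption for illustration; it operates on in-memory objects and does not talk to a database.

```javascript
// Normalized input: one patron document and address documents that reference it.
const patron = { _id: "joe", name: "Joe Bookreader" };
const addresses = [
  { patron_id: "joe", street: "123 Fake Street", city: "Faketon", state: "MA", zip: 12345 },
  { patron_id: "joe", street: "1 Some Other Street", city: "Boston", state: "MA", zip: 12345 }
];

// Hypothetical helper: build the embedded form by copying matching addresses
// into an "addresses" array and dropping the now-redundant patron_id field.
function embedAddresses(patronDoc, addressDocs) {
  const embedded = addressDocs
    .filter((a) => a.patron_id === patronDoc._id)
    .map(({ patron_id, ...rest }) => rest);
  return { ...patronDoc, addresses: embedded };
}

const embeddedPatron = embedAddresses(patron, addresses);
console.log(embeddedPatron.addresses.length); // 2
```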
Model One-to-Many Relationships with Document References
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 99) for
a full high level overview of data modeling in MongoDB.
This document describes a data model that uses references (page 101) between documents to describe relationships
between connected data.
Pattern
Consider the following example that maps publisher and book relationships. The example illustrates the advantage of
referencing over embedding to avoid repetition of the publisher information.
Embedding the publisher document inside the book document would lead to repetition of the publisher data, as the
following documents show:
{
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English",
publisher: {
name: "O'Reilly Media",
founded: 1980,
location: "CA"
}
}
{
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English",
publisher: {
name: "O'Reilly Media",
founded: 1980,
location: "CA"
}
}
To avoid repetition of the publisher data, use references and keep the publisher information in a separate collection
from the book collection.
When using references, the growth of the relationships determines where to store the reference. If the number of books
per publisher is small with limited growth, storing the book reference inside the publisher document may sometimes
be useful. Otherwise, if the number of books per publisher is unbounded, this data model would lead to mutable,
growing arrays, as in the following example:
{
name: "O'Reilly Media",
founded: 1980,
location: "CA",
books: [123456789, 234567890, ...]
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English"
}
To avoid mutable, growing arrays, store the publisher reference inside the book document:
{
_id: "oreilly",
name: "O'Reilly Media",
founded: 1980,
location: "CA"
}
{
_id: 123456789,
title: "MongoDB: The Definitive Guide",
author: [ "Kristina Chodorow", "Mike Dirolf" ],
published_date: ISODate("2010-09-24"),
pages: 216,
language: "English",
publisher_id: "oreilly"
}
{
_id: 234567890,
title: "50 Tips and Tricks for MongoDB Developer",
author: "Kristina Chodorow",
published_date: ISODate("2011-05-06"),
pages: 68,
language: "English",
publisher_id: "oreilly"
}
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 99) for
a full high level overview of data modeling in MongoDB.
This document describes a data model that describes a tree-like structure in MongoDB documents by storing references
(page 101) to parent nodes in child nodes.
Pattern
The Parent References pattern stores each tree node in a document; in addition to the tree node, the document stores
the id of the node's parent.
Consider the following hierarchy of categories:
You can create an index on the field parent to enable fast search by the parent node:
db.categories.ensureIndex( { parent: 1 } )
You can query by the parent field to find its immediate child nodes:
db.categories.find( { parent: "Databases" } )
The Parent References pattern provides a simple solution to tree storage but requires multiple queries to retrieve subtrees.
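The cost of retrieving a subtree can be illustrated with a plain JavaScript sketch. The in-memory array stands in for the categories collection, the sample hierarchy is an assumption, and the hypothetical findChildren function plays the role of one query on the parent field per call.

```javascript
// In-memory stand-in for the categories collection (assumed sample data).
const categories = [
  { _id: "MongoDB", parent: "Databases" },
  { _id: "dbm", parent: "Databases" },
  { _id: "Databases", parent: "Programming" },
  { _id: "Languages", parent: "Programming" },
  { _id: "Programming", parent: "Books" },
  { _id: "Books", parent: null }
];

// Stand-in for db.categories.find( { parent: id } ); counts each call as one query.
let queryCount = 0;
function findChildren(parentId) {
  queryCount++;
  return categories.filter((doc) => doc.parent === parentId);
}

// Collect every descendant of a node by querying one node at a time.
function findDescendants(rootId) {
  const result = [];
  const queue = [rootId];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const child of findChildren(current)) {
      result.push(child._id);
      queue.push(child._id);
    }
  }
  return result;
}

const descendants = findDescendants("Programming");
console.log(descendants, queryCount);
```

In this sketch the number of lookups grows with the number of nodes in the subtree, which is the trade-off the pattern accepts in exchange for simple storage.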
Model Tree Structures with Child References
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 99) for
a full high level overview of data modeling in MongoDB.
This document describes a data model that describes a tree-like structure in MongoDB documents by storing references
(page 101) in the parent nodes to child nodes.
Pattern
The Child References pattern stores each tree node in a document; in addition to the tree node, the document stores in an
array the id(s) of the node's children.
Consider the following hierarchy of categories:
The following example models the tree using Child References, storing the reference to the node's children in the field
children:
db.categories.insert( { _id: "MongoDB", children: [] } )
db.categories.insert( { _id: "dbm", children: [] } )
db.categories.insert( { _id: "Databases", children: [ "MongoDB", "dbm" ] } )
db.categories.insert( { _id: "Languages", children: [] } )
db.categories.insert( { _id: "Programming", children: [ "Databases", "Languages" ] } )
db.categories.insert( { _id: "Books", children: [ "Programming" ] } )
The query to retrieve the immediate children of a node is fast and straightforward:
db.categories.findOne( { _id: "Databases" } ).children
You can create an index on the field children to enable fast search by the child nodes:
db.categories.ensureIndex( { children: 1 } )
You can query for a node in the children field to find its parent node as well as its siblings:
db.categories.find( { children: "MongoDB" } )
The Child References pattern provides a suitable solution to tree storage as long as no operations on subtrees are
necessary. This pattern may also provide a suitable solution for storing graphs where a node may have multiple
parents.
Model Tree Structures with an Array of Ancestors
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 99) for
a full high level overview of data modeling in MongoDB.
This document describes a data model that describes a tree-like structure in MongoDB documents using references
(page 101) to parent nodes and an array that stores all ancestors.
Pattern
The Array of Ancestors pattern stores each tree node in a document; in addition to the tree node, the document stores in
an array the id(s) of the node's ancestors or path.
Consider the following hierarchy of categories:
The following example models the tree using Array of Ancestors. In addition to the ancestors field, these documents also store the reference to the immediate parent category in the parent field:
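db.categories.insert( { _id: "MongoDB", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } )
db.categories.insert( { _id: "dbm", ancestors: [ "Books", "Programming", "Databases" ], parent: "Databases" } )
db.categories.insert( { _id: "Databases", ancestors: [ "Books", "Programming" ], parent: "Programming" } )
db.categories.insert( { _id: "Languages", ancestors: [ "Books", "Programming" ], parent: "Programming" } )
db.categories.insert( { _id: "Programming", ancestors: [ "Books" ], parent: "Books" } )
db.categories.insert( { _id: "Books", ancestors: [ ], parent: null } )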
The query to retrieve the ancestors or path of a node is fast and straightforward:
db.categories.findOne( { _id: "MongoDB" } ).ancestors
You can create an index on the field ancestors to enable fast search by the ancestor nodes:
db.categories.ensureIndex( { ancestors: 1 } )
You can query by the field ancestors to find all descendants of a node:
db.categories.find( { ancestors: "Programming" } )
The Array of Ancestors pattern provides a fast and efficient solution to find the descendants and the ancestors of a node
by creating an index on the elements of the ancestors field. This makes Array of Ancestors a good choice for working
with subtrees.
The Array of Ancestors pattern is slightly slower than the Materialized Paths (page 116) pattern but is more straightforward to use.
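As a minimal sketch of how an application might maintain the ancestors field, the following plain JavaScript derives it by walking parent links. The parentOf map, the helper names ancestorsOf and toAncestorDoc, and the sample hierarchy are assumptions for illustration; nothing here calls MongoDB.

```javascript
// Parent links for the sample hierarchy (assumed data; null marks the root).
const parentOf = {
  Books: null,
  Programming: "Books",
  Databases: "Programming",
  Languages: "Programming",
  MongoDB: "Databases",
  dbm: "Databases"
};

// Walk up the parent links and return the path from the root down to the parent.
function ancestorsOf(id) {
  const path = [];
  for (let p = parentOf[id]; p !== null && p !== undefined; p = parentOf[p]) {
    path.unshift(p);
  }
  return path;
}

// Build a document in the shape used by the Array of Ancestors pattern.
function toAncestorDoc(id) {
  return { _id: id, ancestors: ancestorsOf(id), parent: parentOf[id] };
}

// In-memory equivalent of find( { ancestors: "Programming" } ).
const docs = Object.keys(parentOf).map(toAncestorDoc);
const descendants = docs.filter((d) => d.ancestors.includes("Programming")).map((d) => d._id);

console.log(toAncestorDoc("MongoDB").ancestors); // [ 'Books', 'Programming', 'Databases' ]
```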
Model Tree Structures with Materialized Paths
Overview
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 99) for
a full high level overview of data modeling in MongoDB.
This document describes a data model that describes a tree-like structure in MongoDB documents by storing full
relationship paths between documents.
Pattern
The Materialized Paths pattern stores each tree node in a document; in addition to the tree node, the document stores as
a string the id(s) of the node's ancestors or path. Although the Materialized Paths pattern requires additional steps of
working with strings and regular expressions, the pattern also provides more flexibility in working with the path, such
as finding nodes by partial paths.
Consider the following hierarchy of categories:
The following example models the tree using Materialized Paths, storing the path in the field path; the path string
uses the comma , as a delimiter:
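db.categories.insert( { _id: "Books", path: null } )
db.categories.insert( { _id: "Programming", path: ",Books," } )
db.categories.insert( { _id: "Databases", path: ",Books,Programming," } )
db.categories.insert( { _id: "Languages", path: ",Books,Programming," } )
db.categories.insert( { _id: "MongoDB", path: ",Books,Programming,Databases," } )
db.categories.insert( { _id: "dbm", path: ",Books,Programming,Databases," } )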
You can query to retrieve the whole tree, sorting by the field path:
db.categories.find().sort( { path: 1 } )
You can use regular expressions on the path field to find the descendants of Programming:
db.categories.find( { path: /,Programming,/ } )
You can also retrieve the descendants of Books where Books is at the topmost level of the hierarchy:
db.categories.find( { path: /^,Books,/ } )
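The effect of these regular expressions can be checked without a server. The following plain JavaScript sketch applies them to an in-memory array; the sample documents (including a null path for the root) and the helper names childPath and findByPath are assumptions for illustration.

```javascript
// Sample documents in the Materialized Paths shape (assumed data for illustration).
const categories = [
  { _id: "Books", path: null },
  { _id: "Programming", path: ",Books," },
  { _id: "Databases", path: ",Books,Programming," },
  { _id: "Languages", path: ",Books,Programming," },
  { _id: "MongoDB", path: ",Books,Programming,Databases," },
  { _id: "dbm", path: ",Books,Programming,Databases," }
];

// Hypothetical helper: build the path stored on a child from its parent's document.
function childPath(parentDoc) {
  return (parentDoc.path || ",") + parentDoc._id + ",";
}

// In-memory equivalent of find( { path: <regex> } ); documents with no path never match.
function findByPath(regex) {
  return categories.filter((doc) => typeof doc.path === "string" && regex.test(doc.path));
}

const underProgramming = findByPath(/,Programming,/).map((doc) => doc._id);
console.log(underProgramming); // [ 'Databases', 'Languages', 'MongoDB', 'dbm' ]
```

Wrapping every id in delimiters on both sides is what lets the unanchored expression match a whole node name rather than a substring of a longer one.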
Data in MongoDB has a flexible schema. Collections do not enforce document structure. Decisions that affect how
you model data can affect application performance and database capacity. See Data Modeling Concepts (page 99) for
a full high level overview of data modeling in MongoDB.
This document describes a data model that describes a tree-like structure that optimizes discovering subtrees at the
expense of tree mutability.
Pattern
The Nested Sets pattern identifies each node in the tree as stops in a round-trip traversal of the tree. The application
visits each node in the tree twice; first during the initial trip, and second during the return trip. The Nested Sets pattern
stores each tree node in a document; in addition to the tree node, the document stores the id of the node's parent, the node's
initial stop in the left field, and its return stop in the right field.
Consider the following hierarchy of categories:
The following example models the tree using Nested Sets:
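db.categories.insert( { _id: "Books", parent: 0, left: 1, right: 12 } )
db.categories.insert( { _id: "Programming", parent: "Books", left: 2, right: 11 } )
db.categories.insert( { _id: "Languages", parent: "Programming", left: 3, right: 4 } )
db.categories.insert( { _id: "Databases", parent: "Programming", left: 5, right: 10 } )
db.categories.insert( { _id: "MongoDB", parent: "Databases", left: 6, right: 7 } )
db.categories.insert( { _id: "dbm", parent: "Databases", left: 8, right: 9 } )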
Figure 3.10: Example of hierarchical data. The numbers identify the stops at nodes during a roundtrip traversal of a
tree.
The Nested Sets pattern provides a fast and efficient solution for finding subtrees but is inefficient for modifying the
tree structure. As such, this pattern is best for static trees that do not change.
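The round-trip numbering can be sketched in plain JavaScript. The child lists below are assumed sample data, the root's parent is represented as null for simplicity, and the function names numberTree and descendantsOf are hypothetical. The sketch assigns left on the way down and right on the way back, then finds a subtree by comparing stops.

```javascript
// Sample tree as child lists (assumed data; child order fixes the traversal order).
const children = {
  Books: ["Programming"],
  Programming: ["Languages", "Databases"],
  Languages: [],
  Databases: ["MongoDB", "dbm"],
  MongoDB: [],
  dbm: []
};

// Assign left on the way down and right on the way back up, one stop per visit.
function numberTree(rootId) {
  const docs = [];
  let stop = 0;
  function visit(id, parent) {
    const doc = { _id: id, parent: parent, left: ++stop, right: 0 };
    docs.push(doc);
    for (const child of children[id]) visit(child, id);
    doc.right = ++stop;
  }
  visit(rootId, null);
  return docs;
}

// A node's descendants are the nodes whose stops fall strictly inside its own.
function descendantsOf(docs, id) {
  const node = docs.find((d) => d._id === id);
  return docs.filter((d) => d.left > node.left && d.right < node.right).map((d) => d._id);
}

const docs = numberTree("Books");
console.log(descendantsOf(docs, "Databases")); // [ 'MongoDB', 'dbm' ]
```

Inserting or moving a node changes the stops of every node visited after it, which is why the whole tree has to be renumbered on modification.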
Consider the following example that keeps a library book and its checkout information. The example illustrates how
embedding fields related to an atomic update within the same document ensures that the fields are in sync.
Consider the following book document that stores the number of available copies for checkout and the current checkout information:
book = {
    _id: 123456789,
    title: "MongoDB: The Definitive Guide",
    ...
    available: ...,
    checkout: [ ... ]
}
You can use the db.collection.findAndModify() method to atomically determine if a book is available for
checkout and update with the new checkout information. Embedding the available field and the checkout field
within the same document ensures that the updates to these fields are in sync:
db.books.findAndModify ( {
query: {
_id: 123456789,
available: { $gt: 0 }
},
update: {
$inc: { available: -1 },
$push: { checkout: { by: "abc", date: new Date() } }
}
} )
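The logic of this operation can be modelled in plain JavaScript. The sketch below only mirrors the match-then-update rule on an in-memory object; it does not reproduce the server's atomicity guarantee, and the function name checkOut and the sample values are assumptions for illustration.

```javascript
// In-memory stand-in for one book document (values are assumptions for illustration).
const book = { _id: 123456789, available: 1, checkout: [] };

// Mimic the query-then-update above as a single step: the update applies only
// when the document matches the query, so both fields change together or not at all.
function checkOut(doc, patron) {
  if (doc.available > 0) {
    doc.available -= 1;                                   // $inc: { available: -1 }
    doc.checkout.push({ by: patron, date: new Date() });  // $push onto checkout
    return true;
  }
  return false; // no match: nothing is modified
}

const first = checkOut(book, "abc");  // succeeds, the last copy is taken
const second = checkOut(book, "xyz"); // fails, no copies remain
console.log(first, second, book.available, book.checkout.length); // true false 0 1
```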
To add structures to your document to support keyword-based queries, create an array field in your documents and add
the keywords as strings in the array. You can then create a multi-key index (page 324) on the array and create queries
that select values from the array.
Example
Suppose you have a collection of library volumes for which you want to provide topic-based search. For each volume, you add the array
topics, and you add as many keywords as needed for a given volume.
For the Moby-Dick volume you might have the following document:
{ title : "Moby-Dick" ,
  author : "Herman Melville" ,
  published : 1851 ,
  ISBN : "0451526996" ,
  topics : [ "whaling" , "allegory" , "voyage" , ... ]
}
The multi-key index creates separate index entries for each keyword in the topics array. For example the index
contains one entry for whaling and another for allegory.
You then query based on the keywords. For example:
db.volumes.findOne( { topics : "voyage" }, { title: 1 } )
Note: An array with a large number of elements, such as one with several hundreds or thousands of keywords, will
incur greater indexing costs on insertion.
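The per-keyword index entries can be pictured with a small plain JavaScript sketch. It builds an in-memory map from keyword to titles, which is similar in spirit to a multi-key index but is not how the server stores indexes; the second volume, its topics, and the helper name buildKeywordIndex are assumptions for illustration.

```javascript
// Sample volumes (the topic lists are assumptions for illustration).
const volumes = [
  { title: "Moby-Dick", topics: ["whaling", "allegory", "voyage"] },
  { title: "Treasure Island", topics: ["pirates", "voyage"] }
];

// Build one index entry per keyword per document.
function buildKeywordIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const topic of doc.topics) {
      if (!index.has(topic)) index.set(topic, []);
      index.get(topic).push(doc.title);
    }
  }
  return index;
}

const index = buildKeywordIndex(volumes);
console.log(index.get("voyage")); // [ 'Moby-Dick', 'Treasure Island' ]
```

The sketch also shows why long keyword arrays raise insertion cost: every element of the array adds one more entry to maintain.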
MongoDB can support keyword searches using specific data models and multi-key indexes (page 324); however, these
keyword indexes are not sufficient or comparable to full-text products in the following respects:
Stemming. Keyword queries in MongoDB cannot parse keywords for root or related words.
Synonyms. Keyword-based search features must provide support for synonym or related queries in the application layer.
Ranking. The keyword look ups described in this document do not provide a way to weight results.
Asynchronous Indexing. MongoDB builds indexes synchronously, which means that the indexes used for keyword indexes are always current and can operate in real-time. However, asynchronous bulk indexes may be
more efficient for some kinds of content and workloads.
3.4.1 Documents
MongoDB stores all data in documents, which are JSON-style data structures composed of field-and-value pairs:
{ "item": "pencil", "qty": 500, "type": "no.2" }
In general, documents have the following structure:
{
   field1: value1,
   field2: value2,
   field3: value3,
   ...
   fieldN: valueN
}
The value of a field can be any of the BSON data types (page 131), including other documents, arrays, and arrays of
documents. The following document contains values of varying types:
var mydoc = {
_id: ObjectId("5099803df3f4948bd2f98391"),
name: { first: "Alan", last: "Turing" },
birth: new Date('Jun 23, 1912'),
death: new Date('Jun 07, 1954'),
contribs: [ "Turing machine", "Turing test", "Turingery" ],
views : NumberLong(1250000)
}
Dot Notation
MongoDB uses the dot notation to access the elements of an array and to access the fields of a subdocument.
To access an element of an array by the zero-based index position, concatenate the array name with the dot (.) and
zero-based index position, and enclose in quotes:
'<array>.<index>'
To access a field of a subdocument with dot-notation, concatenate the subdocument name with the dot (.) and the
field name, and enclose in quotes:
'<subdocument>.<field>'
See also:
Subdocuments (page 61) for dot notation examples with subdocuments.
Arrays (page 62) for dot notation examples with arrays.
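To show how a dot-notation string maps onto a document, the following plain JavaScript sketch resolves a path against an in-memory object. The helper getByPath is hypothetical and covers only the two cases described above (a field of a subdocument and an array element by position); it does not reproduce the full matching rules that MongoDB queries apply to arrays.

```javascript
// Hypothetical helper: resolve a dot-notation path against a plain object.
// Numeric segments index into arrays; a missing step yields undefined.
function getByPath(doc, path) {
  let current = doc;
  for (const key of path.split(".")) {
    if (current === null || current === undefined) return undefined;
    current = current[key];
  }
  return current;
}

const mydoc = {
  name: { first: "Alan", last: "Turing" },
  contribs: ["Turing machine", "Turing test", "Turingery"]
};

console.log(getByPath(mydoc, "name.last"));  // Turing
console.log(getByPath(mydoc, "contribs.1")); // Turing test
```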
queries to return the referenced documents. Many drivers (page 95) have helper methods that form the query
for the DBRef automatically. The drivers 8 do not automatically resolve DBRefs into documents.
Use a DBRef when you need to embed documents from multiple collections in documents from one collection.
DBRefs also provide a common format and type to represent these relationships among documents. The DBRef
format provides common semantics for representing links between documents if your database must interact
with multiple frameworks and tools.
Unless you have a compelling reason for using a DBRef, use manual references.
Manual References
Background
Manual references refers to the practice of including one document's _id field in another document. The application
can then issue a second query to resolve the referenced fields as needed.
Process
Consider the following operation to insert two documents, using the _id field of the first document as a reference in
the second document:
original_id = ObjectId()
db.places.insert({
"_id": original_id,
"name": "Broadway Center",
"url": "bc.example.net"
})
db.people.insert({
"name": "Erin",
"places_id": original_id,
"url": "bc.example.net/Erin"
})
Then, when a query returns the document from the people collection you can, if needed, make a second query for
the document referenced by the places_id field in the places collection.
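The two-step lookup can be sketched in plain JavaScript. The Map and array below are in-memory stand-ins for the places and people collections, string ids replace ObjectId values so that the sketch runs without a driver, and the function name findPersonWithPlace is an assumption for illustration.

```javascript
// In-memory stand-ins for the places and people collections. String ids replace
// ObjectId values so that the sketch runs without a driver (an assumption).
const places = new Map([
  ["place-1", { _id: "place-1", name: "Broadway Center", url: "bc.example.net" }]
]);
const people = [
  { name: "Erin", places_id: "place-1", url: "bc.example.net/Erin" }
];

// First "query": fetch the person. Second "query": look up the referenced place.
function findPersonWithPlace(name) {
  const person = people.find((p) => p.name === name);
  if (!person) return null;
  const place = places.get(person.places_id) || null;
  return { person: person, place: place };
}

const result = findPersonWithPlace("Erin");
console.log(result.place.name); // Broadway Center
```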
Use
For nearly every case where you want to store a relationship between two documents, use manual references
(page 125). The references are simple to create and your application can resolve references as needed.
The only limitation of manual linking is that these references do not convey the database and collection name. If you
have documents in a single collection that relate to documents in more than one collection, you may need to consider
using DBRefs (page 126).
8 Some community supported drivers may have alternate behavior and may resolve a DBRef into a document automatically.
DBRefs
Background
DBRefs are a convention for representing a document, rather than a specific reference type. They include the name of
the collection, and in some cases the database, in addition to the value from the _id field.
Format
The DBRef in this example, points to a document in the creators collection of the users database that has
ObjectId("5126bc054aed4daf9e2ab772") in its _id field.
Note: The order of fields in the DBRef matters, and you must use the above sequence when using a DBRef.
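By convention a DBRef carries the fields $ref (the collection name), $id (the value of the referenced _id), and optionally $db (the database name), in that order. The following plain JavaScript sketch builds such a subdocument for the reference described above. The helper name makeDBRef is hypothetical, and the identifier is kept as a plain string so that the sketch runs outside the mongo shell.

```javascript
// Hypothetical helper: build a DBRef-style subdocument with the fields in the
// conventional order: $ref (collection), $id (the _id value), then optional $db.
function makeDBRef(collection, id, database) {
  const ref = { $ref: collection, $id: id };
  if (database !== undefined) ref.$db = database;
  return ref;
}

// The id is kept as a plain string here; in the mongo shell it would be an ObjectId.
const creator = makeDBRef("creators", "5126bc054aed4daf9e2ab772", "users");
console.log(Object.keys(creator)); // [ '$ref', '$id', '$db' ]
```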
Support
C++ The C++ driver contains no support for DBRefs. You can traverse references manually.
C# The C# driver provides access to DBRef objects with the MongoDBRef Class9 and supplies the FetchDBRef
Method10 for accessing these objects.
9 http://api.mongodb.org/csharp/current/html/46c356d3-ed06-a6f8-42fa-e0909ab64ce2.htm
10 http://api.mongodb.org/csharp/current/html/1b0b8f48-ba98-1367-0a7d-6e01c8df436f.htm
Java The DBRef11 class provides support for DBRefs from Java.
JavaScript The mongo shell's JavaScript interface provides a DBRef.
Perl The Perl driver contains no support for DBRefs. You can traverse references manually or use the MongoDBx::AutoDeref12 CPAN module.
PHP The PHP driver supports DBRefs, including the optional $db reference, through the MongoDBRef class13.
Python The Python driver provides the DBRef class14 and the dereference method15 for interacting with DBRefs.
Ruby The Ruby driver supports DBRefs using the DBRef class16 and the dereference method17.
Use
In most cases you should use the manual reference (page 125) method for connecting two or more related documents.
However, if you need to reference documents from multiple collections, consider a DBRef.
chunks._id
The unique ObjectId of the chunk.
chunks.files_id
The _id of the parent document, as specified in the files collection.
chunks.n
The sequence number of the chunk. GridFS numbers all chunks, starting with 0.
chunks.data
The chunk's payload as a BSON binary type.
The chunks collection uses a compound index on files_id and n, as described in GridFS Index (page 105).
The files Collection
Each document in the files collection represents a file in the GridFS store. Consider the following prototype of a
document in the files collection:
{
    "_id" : <ObjectId>,
    "length" : <num>,
    "chunkSize" : <num>,
    "uploadDate" : <timestamp>,
    "md5" : <hash>,
    "filename" : <string>,
    "contentType" : <string>,
    "aliases" : <string array>,
    "metadata" : <dataObject>
}
Documents in the files collection contain some or all of the following fields. Applications may create additional
arbitrary fields:
files._id
The unique ID for this document. The _id is of the data type you chose for the original document. The default
type for MongoDB documents is BSON ObjectId.
files.length
The size of the document in bytes.
files.chunkSize
The size of each chunk. GridFS divides the document into chunks of the size specified here. The default size is
256 kilobytes.
files.uploadDate
The date the document was first stored by GridFS. This value has the Date type.
files.md5
An MD5 hash returned from the filemd5 API. This value has the String type.
files.filename
Optional. A human-readable name for the document.
files.contentType
Optional. A valid MIME type for the document.
files.aliases
Optional. An array of alias strings.
files.metadata
Optional. Any additional information you want to store.
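The relationship between files.length, files.chunkSize, and chunks.n can be worked through with a little arithmetic. The sketch below assumes that the 256 kilobyte default means 256 × 1024 bytes; the chunkLayout function and the sample file length are illustrative only.

```javascript
// Default chunk size, assuming 256 kilobytes means 256 * 1024 bytes.
var DEFAULT_CHUNK_SIZE = 256 * 1024;

// Describe the chunks needed for a file of `length` bytes: one entry per
// chunk with its sequence number n (starting at 0) and its size in bytes.
function chunkLayout(length, chunkSize) {
  var size = chunkSize || DEFAULT_CHUNK_SIZE;
  var layout = [];
  for (var offset = 0, n = 0; offset < length; offset += size, n++) {
    layout.push({ n: n, bytes: Math.min(size, length - offset) });
  }
  return layout;
}

// A hypothetical 600,000-byte file.
var layout = chunkLayout(600000);
```

Every chunk except the last is exactly chunkSize bytes, and the last chunk holds the remainder.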
3.4.4 ObjectId
Overview
ObjectId is a 12-byte BSON type, constructed using:
a 4-byte value representing the seconds since the Unix epoch,
a 3-byte machine identifier,
a 2-byte process id, and
a 3-byte counter, starting with a random value.
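The layout above can be made concrete by splitting an ObjectId's hexadecimal string into its four parts. The following is a sketch in plain JavaScript, not the shell's own implementation; the parseObjectId function name is an assumption, and the timestamp is read as a big-endian integer.

```javascript
// Split a 24-character hexadecimal ObjectId string into the four parts
// listed above. Offsets are in hex characters (two per byte).
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hexadecimal string");
  }
  var seconds = parseInt(hex.substring(0, 8), 16);
  return {
    seconds: seconds,                                // 4-byte timestamp
    date: new Date(seconds * 1000),                  // same value as a Date
    machine: hex.substring(8, 14),                   // 3-byte machine identifier
    processId: parseInt(hex.substring(14, 18), 16),  // 2-byte process id
    counter: parseInt(hex.substring(18, 24), 16)     // 3-byte counter
  };
}

var parts = parseObjectId("507f191e810c19729de860ea");
```

Because the timestamp occupies the leading bytes, comparing two ObjectId values compares their creation seconds first, which is what makes sorting on _id roughly follow creation time.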
In MongoDB, documents stored in a collection require a unique _id field that acts as a primary key. Because ObjectIds
are small, most likely unique, and fast to generate, MongoDB uses ObjectIds as the default value for the _id field if
the _id field is not specified. MongoDB clients should add an _id field with a unique ObjectId. However, if a client
does not add an _id field, mongod will add an _id field that holds an ObjectId.
Using ObjectIds for the _id field provides the following additional benefits:
in the mongo shell, you can access the creation time of the ObjectId, using the getTimestamp() method.
sorting on an _id field that stores ObjectId values is roughly equivalent to sorting by creation time.
Important: The relationship between the order of ObjectId values and generation time is not strict within a
single second. If multiple systems, or multiple processes or threads on a single system, generate values within a
single second, ObjectId values do not represent a strict insertion order. Clock skew between clients can also
result in non-strict ordering, because client drivers generate ObjectId values, not the mongod
process.
Also consider the Documents (page 122) section for related information on MongoDB's document orientation.
ObjectId()
The mongo shell provides the ObjectId() wrapper class to generate a new ObjectId, and to provide the following
helper attribute and methods:
str
The hexadecimal string value of the ObjectId() object.
getTimestamp()
Returns the timestamp portion of the ObjectId() object as a Date.
toString()
Returns the string representation of the ObjectId() object. The returned string literal has the
format ObjectId(...).
Changed in version 2.2: In previous versions toString() returns the value of the ObjectId as a
hexadecimal string.
valueOf()
Returns the value of the ObjectId() object as a hexadecimal string. The returned string is the
str attribute.
Changed in version 2.2: In previous versions valueOf() returns the ObjectId() object.
Examples
Consider the following uses of the ObjectId() class in the mongo shell:
Generate a new ObjectId
To generate a new ObjectId using the ObjectId() constructor with a unique hexadecimal string:
y = ObjectId("507f191e810c19729de860ea")
Convert an ObjectId into a Timestamp
To return the timestamp of an ObjectId() object, use the getTimestamp() method as follows:
ObjectId("507f191e810c19729de860ea").getTimestamp()
To return the value of an ObjectId() object as a hexadecimal string, use the valueOf() method as follows:
ObjectId("507f191e810c19729de860ea").valueOf()
This operation returns the following output:
507f191e810c19729de860ea
To return the string representation of an ObjectId() object, use the toString() method as follows:
ObjectId("507f191e810c19729de860ea").toString()
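Type | Number
Double | 1
String | 2
Object | 3
Array | 4
Binary data | 5
Object id | 7
Boolean | 8
Date | 9
Null | 10
Regular Expression | 11
JavaScript | 13
Symbol | 14
JavaScript (with scope) | 15
32-bit integer | 16
Timestamp | 17
64-bit integer | 18
Min key | 255
Max key | 127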
When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest to
highest:
1. MinKey (internal type)
2. Null
3. Numbers (ints, longs, doubles)
4. Symbol, String
5. Object
6. Array
7. BinData
8. ObjectID
9. Boolean
18 http://bsonspec.org/
Changed in version 2.1: mongo shell displays the Timestamp value with the wrapper:
Timestamp(<time_t>, <ordinal>)
Prior to version 2.1, the mongo shell displayed the Timestamp value as a document:
{ t : <time_t>, i : <ordinal> }
Date
BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). The
official BSON specification21 refers to the BSON Date type as the UTC datetime.
Changed in version 2.0: BSON Date type is signed. 22
Example
Construct a Date using the new Date() constructor in the mongo shell:
var mydate1 = new Date()
Example
Construct a Date using the ISODate() constructor in the mongo shell:
var mydate2 = ISODate()
Example
Return the Date value as string:
mydate1.toString()
Example
Return the month portion of the Date value; months are zero-indexed, so that January is month 0:
mydate1.getMonth()
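Because the shell's Date is the standard JavaScript Date, the millisecond representation and the zero-indexed months can be checked directly. The following sketch uses only standard JavaScript, and the sample dates are arbitrary.

```javascript
// A Date is a count of milliseconds relative to the Unix epoch.
var epoch = new Date(0);

// Dates before 1970 have negative millisecond values, which is why the
// signed interpretation matters for sorting and range queries.
var beforeEpoch = new Date("1969-12-31T23:59:59Z");
var millisBefore = beforeEpoch.getTime();

// Months are zero-indexed: January is 0 and December is 11.
// getUTCMonth() is used here so the result does not depend on the local
// time zone; getMonth() reports the month in local time.
var january = new Date("2014-01-15T00:00:00Z").getUTCMonth();
var december = new Date("2014-12-15T00:00:00Z").getUTCMonth();
```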
21 http://bsonspec.org/#/specification
22 Prior to version 2.0, Date values were incorrectly interpreted as unsigned integers, which affected sorts, range queries, and indexes on Date
fields. Because indexes are not recreated when upgrading, please re-index if you created an index on Date values with an earlier version, and dates
before 1970 are relevant to your application.
CHAPTER 4
Administration
The administration documentation addresses the ongoing operation and maintenance of MongoDB instances and deployments. This documentation includes both high level overviews of these concerns as well as tutorials that cover
specific procedures and processes for operating MongoDB.
Administration Concepts (page 135) Core conceptual documentation of operational practices for managing MongoDB deployments and systems.
MongoDB Backup Methods (page 136) Describes approaches and considerations for backing up a MongoDB
database.
Data Center Awareness (page 159) Presents the MongoDB features that allow application developers and
database administrators to configure their deployments to be more data center aware or allow operational
and location-based separation.
Monitoring for MongoDB (page 138) An overview of monitoring tools, diagnostic strategies, and approaches
to monitoring replica sets and sharded clusters.
Administration Tutorials (page 169) Tutorials that describe common administrative procedures and practices for operations for MongoDB instances and deployments.
Configuration, Maintenance, and Analysis (page 170) Describes routine management operations, including
configuration and performance analysis.
Backup and Recovery (page 190) Outlines procedures for data backup and restoration with mongod instances
and deployments.
Administration Reference (page 223) Reference and documentation of internal mechanics of administrative features,
systems, functions, and operations.
See also:
The MongoDB Manual contains administrative documentation and tutorials throughout several sections. See Replica
Set Tutorials (page 419) and Sharded Cluster Tutorials (page 506) for additional tutorials and information.
Monitoring for MongoDB (page 138) An overview of monitoring tools, diagnostic strategies, and approaches
to monitoring replica sets and sharded clusters.
Run-time Database Configuration (page 146) Outlines common MongoDB configurations and examples of
best-practice configurations for common use cases.
Data Management (page 159) Core documentation that addresses issues in data management, organization, maintenance, and lifecycle management.
Data Center Awareness (page 159) Presents the MongoDB features that allow application developers and
database administrators to configure their deployments to be more data center aware or allow operational
and location-based separation.
Expire Data from Collections by Setting TTL (page 163) TTL collections make it possible to automatically
remove data from a collection based on the value of a timestamp and are useful for managing data like
machine generated event data that are only useful for a limited period of time.
Capped Collections (page 161) Capped collections provide a special type of size-constrained collections that
preserve insertion order and can support high volume inserts.
Optimization Strategies for MongoDB (page 165) Techniques for optimizing application performance with MongoDB.
Backups with the MongoDB Management Service (MMS) The MongoDB Management Service1 supports backup
and restore for MongoDB deployments.
MMS continually backs up MongoDB replica sets and sharded systems by reading the oplog data from your MongoDB
cluster.
MMS Backup offers point in time recovery of MongoDB replica sets and a consistent snapshot of sharded systems.
MMS achieves point in time recovery by storing oplog data so that it can create a restore for any moment in time in
the last 24 hours for a particular replica set.
For sharded systems, MMS does not provide restores for arbitrary moments in time. MMS does provide periodic consistent snapshots of the entire sharded cluster. Sharded cluster snapshots are difficult to achieve with other MongoDB
backup methods.
To restore a MongoDB cluster from an MMS Backup snapshot, you download a compressed archive of your MongoDB
data files and distribute those files before restarting the mongod processes.
To get started with MMS Backup, sign up for MMS2. For the complete documentation of MMS, see the MMS
Manual3.
Backup by Copying Underlying Data Files You can create a backup by copying MongoDB's underlying data files.
If the volume where MongoDB stores data files supports point in time snapshots, you can use these snapshots to create
backups of a MongoDB system at an exact moment in time.
File system snapshots are an operating system volume manager feature, and are not specific to MongoDB. The
mechanics of snapshots depend on the underlying storage system. For example, Amazon's EBS storage
system for EC2 supports snapshots. On Linux, the LVM manager can create a snapshot.
To get a correct snapshot of a running mongod process, you must have journaling enabled and the journal must reside
on the same logical volume as the other MongoDB data files. Without journaling enabled, there is no guarantee that
the snapshot will be consistent or valid.
To get a consistent snapshot of a sharded system, you must disable the balancer and capture a snapshot from every
shard and a config server at approximately the same moment in time.
If your storage system does not support snapshots, you can copy the files directly using cp, rsync, or a similar tool.
Since copying multiple files is not an atomic operation, you must stop all writes to the mongod before copying the
files. Otherwise, you will copy the files in an invalid state.
Backups produced by copying the underlying data do not support point in time recovery for replica sets and are
difficult to manage for larger sharded clusters. Additionally, these backups are larger because they include the indexes
and duplicate underlying storage padding and fragmentation. mongodump, by contrast, creates smaller backups.
For complete instructions on using LVM to create snapshots, see the Backup and Restore with Filesystem Snapshots
(page 190) and Backup a Sharded Cluster with Filesystem Snapshots (page 199) documents. Also
see Back up and Restore Processes for MongoDB on Amazon EC24.
Backup with mongodump The mongodump tool reads data from a MongoDB database and creates high fidelity
BSON files. The mongorestore tool can populate a MongoDB database with the data from these BSON files.
These tools are simple and efficient for backing up small MongoDB deployments, but are not ideal for capturing
backups of larger systems.
mongodump and mongorestore can operate against a running mongod process, and can manipulate the underlying data files directly. By default, mongodump does not capture the contents of the local database (page 472).
1 https://mms.10gen.com/?pk_campaign=MongoDB-Org&pk_kwd=Backup-Docs
2 http://mms.mongodb.com
3 https://mms.mongodb.com/help/
4 http://docs.mongodb.org/ecosystem/tutorial/backup-and-restore-mongodb-on-amazon-ec2
mongodump only captures the documents in the database. The resulting backup is space efficient, but
mongorestore or mongod must rebuild the indexes after restoring data.
When connected to a MongoDB instance, mongodump can adversely affect mongod performance. If your data is
larger than system memory, the queries will push the working set out of memory.
To mitigate the impact of mongodump on the performance of the replica set, use mongodump to capture backups from a secondary (page 382) member of a replica set. Alternatively, you can shut down a secondary and use
mongodump with the data files directly. If you shut down a secondary to capture data with mongodump, ensure that
the operation can complete before its oplog becomes too stale to continue replicating.
For replica sets, mongodump also supports a point in time feature with the --oplog option. Applications may
continue modifying data while mongodump captures the output. To restore a point in time backup created with
--oplog, use mongorestore with the --oplogReplay option.
If applications modify data while mongodump is creating a backup, mongodump will compete for resources with
those applications.
See Back Up and Restore with MongoDB Tools (page 195), Backup a Small Sharded Cluster with mongodump
(page 198), and Backup a Sharded Cluster with Database Dumps (page 200) for more information.
Further Reading
Backup and Restore with Filesystem Snapshots (page 190) An outline of procedures for creating MongoDB data set
backups using system-level file snapshot tools, such as LVM or native storage appliance tools.
Restore a Replica Set from MongoDB Backups (page 194) Describes the procedure for restoring a replica set from an
archived backup such as a mongodump or MMS Backup5 file.
Back Up and Restore with MongoDB Tools (page 195) The procedure for writing the contents of a database to a
BSON (i.e. binary) dump file for backing up MongoDB databases.
Backup and Restore Sharded Clusters (page 198) Detailed procedures and considerations for backing up sharded
clusters and single shards.
Recover Data after an Unexpected Shutdown (page 203) Recover data from MongoDB data files that were not properly closed or have an invalid state.
Monitoring for MongoDB
Monitoring is a critical component of all database administration. A firm grasp of MongoDB's reporting will allow you
to assess the state of your database and maintain your deployment without crisis. Additionally, a sense of MongoDB's
normal operational parameters will allow you to diagnose problems before they escalate to failures.
This document presents an overview of the available monitoring utilities and the reporting statistics available in MongoDB. It also introduces diagnostic strategies and suggestions for monitoring replica sets and sharded clusters.
Note: MongoDB Management Service (MMS)6 is a hosted monitoring service which collects and aggregates data
to provide insight into the performance and operation of MongoDB deployments. See the MMS documentation7 for
more information.
5 http://mms.mongodb.com
6 https://mms.mongodb.com/?pk_campaign=mongodb-org&pk_kwd=monitoring
7 http://mms.mongodb.com/help/
Monitoring Strategies
There are three methods for collecting data about the state of a running MongoDB instance:
First, there is a set of utilities distributed with MongoDB that provides real-time reporting of database activities.
Second, database commands return statistics regarding the current database state with greater fidelity.
Third, MMS Monitoring Service8 collects data from running MongoDB deployments and provides visualization
and alerts based on that data. MMS is a free service provided by MongoDB.
Each strategy can help answer different questions and is useful in different contexts. These methods are complementary.
MongoDB Reporting Tools
This section provides an overview of the reporting methods distributed with MongoDB. It also offers examples of the
kinds of questions that each method is best suited to help you address.
Utilities The MongoDB distribution includes a number of utilities that quickly return statistics about instance
performance and activity. Typically, these are most useful for diagnosing issues and assessing normal operation.
mongostat mongostat captures and returns the counts of database operations by type (e.g. insert, query, update,
delete, etc.). These counts report on the load distribution on the server.
Use mongostat to understand the distribution of operation types and to inform capacity planning. See the
mongostat manual for details.
mongotop mongotop tracks and reports the current read and write activity of a MongoDB instance, and reports
these statistics on a per collection basis.
Use mongotop to check if your database activity and use match your expectations. See the mongotop manual
for details.
REST Interface MongoDB provides a simple REST interface that can be useful for configuring monitoring and
alert scripts, and for other administrative tasks.
To enable, configure mongod to use REST, either by starting mongod with the --rest option, or by setting the
rest setting to true in a configuration file.
For more information on using the REST Interface, see the Simple REST Interface9 documentation.
HTTP Console MongoDB provides a web interface that exposes diagnostic and monitoring information in a simple
web page. The web interface is accessible at localhost:<port>, where the <port> number is 1000 more than
the mongod port.
For example, if a locally running mongod is using the default port 27017, access the HTTP console at
http://localhost:28017.
8 https://mms.mongodb.com/?pk_campaign=mongodb-org&pk_kwd=monitoring
9 http://docs.mongodb.org/ecosystem/tools/http-interfaces
Commands MongoDB includes a number of commands that report on the state of the database.
These data may provide a finer level of granularity than the utilities discussed above. Consider using their output
in scripts and programs to develop custom alerts, or to modify the behavior of your application in response to the
activity of your instance. The db.currentOp method is another useful tool for identifying the database instance's
in-progress operations.
serverStatus The serverStatus command, or db.serverStatus() from the shell, returns a general
overview of the status of the database, detailing disk usage, memory use, connection, journaling, and index access.
The command returns quickly and does not impact MongoDB performance.
serverStatus outputs an account of the state of a MongoDB instance. This command is rarely run directly. In
most cases, the data is more meaningful when aggregated, as one would see with monitoring tools including MMS10 .
Nevertheless, all administrators should be familiar with the data provided by serverStatus.
dbStats The dbStats command, or db.stats() from the shell, returns a document that addresses storage use
and data volumes. The dbStats reflect the amount of storage used, the quantity of data contained in the database,
and object, collection, and index counters.
Use this data to monitor the state and storage capacity of a specific database. This output also allows you to compare
use between databases and to determine the average document size in a database.
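As a sketch of the average document size calculation, the following plain JavaScript works on an object shaped like dbStats output. The numbers are invented for illustration, and the field names used here (objects and dataSize) are assumed to match what dbStats reports; check them against the output of your own deployment.

```javascript
// Sample document shaped like db.stats() output (all values hypothetical).
var stats = {
  db: "test",
  collections: 3,
  objects: 2000,
  dataSize: 512000 // bytes of data held in the database
};

// Average document size in bytes: total data size divided by document count.
function averageDocumentSize(s) {
  return s.objects > 0 ? s.dataSize / s.objects : 0;
}

var avg = averageDocumentSize(stats);
```

Running the same function over the statistics of several databases gives a simple basis for comparing them.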
collStats The collStats command provides statistics that resemble dbStats on the collection level, including a count
of the objects in the collection, the size of the collection, the amount of disk space used by the collection, and information about its indexes.
replSetGetStatus The replSetGetStatus command (rs.status() from the shell) returns an
overview of your replica set's status. The replSetGetStatus document details the state and configuration of
the replica set and statistics about its members.
Use this data to ensure that replication is properly configured, and to check the connections between the current host
and the other members of the replica set.
Third Party Tools A number of third party monitoring tools have support for MongoDB, either directly, or through
their own plugins.
Self Hosted Monitoring Tools These are monitoring tools that you must install, configure and maintain on your
own servers. Most are open source.
10 http://mms.mongodb.com
Tool | Plugin | Description
Ganglia24 | mongodb-ganglia25 | Python script to report operations per second, memory usage, btree statistics, master/slave status and current connections.
Ganglia | gmond_python_modules26 | Parses output from the serverStatus and replSetGetStatus commands.
Motop27 | None | Realtime monitoring tool for MongoDB servers. Shows current operations ordered by durations every second.
mtop28 | None | A top like tool.
Munin29 | mongo-munin30 | Retrieves server statistics.
Munin | mongomon31 | Retrieves collection statistics (sizes, index sizes, and each (configured) collection count for one DB).
Munin | munin-plugins Ubuntu PPA32 | Some additional munin plugins not in the main distribution.
Nagios33 | nagios-plugin-mongodb34 | A simple Nagios check script, written in Python.
Zabbix35 | mikoomi-mongodb36 | Monitors availability, resource utilization, health, performance and other important metrics.
Also consider dex37 , an index and query analyzing tool for MongoDB that compares MongoDB log files and indexes
to make indexing recommendations.
As part of MongoDB Enterprise38 , you can run MMS On-Prem39 , which offers the features of MMS in a package that
runs within your infrastructure.
Hosted (SaaS) Monitoring Tools These are monitoring tools provided as a hosted service, usually through a paid
subscription.
24 http://sourceforge.net/apps/trac/ganglia/wiki
25 https://github.com/quiiver/mongodb-ganglia
26 https://github.com/ganglia/gmond_python_modules
27 https://github.com/tart/motop
28 https://github.com/beaufour/mtop
29 http://munin-monitoring.org/
30 https://github.com/erh/mongo-munin
31 https://github.com/pcdummy/mongomon
32 https://launchpad.net/~chris-lea/+archive/munin-plugins
33 http://www.nagios.org/
34 https://github.com/mzupan/nagios-plugin-mongodb
35 http://www.zabbix.com/
36 https://code.google.com/p/mikoomi/wiki/03
37 https://github.com/mongolab/dex
38 http://www.mongodb.com/products/mongodb-enterprise
39 http://mms.mongodb.com
Name | Notes
MongoDB Management Service47 | MMS is a cloud-based suite of services for managing MongoDB deployments. MMS provides monitoring and backup functionality.
Scout48 | Several plugins, including MongoDB Monitoring49, MongoDB Slow Queries50, and MongoDB Replica Set Monitoring51.
Server Density52 | Dashboard for MongoDB53, MongoDB specific alerts, replication failover timeline and iPhone, iPad and Android mobile apps.
Process Logging
During normal operation, mongod and mongos instances report a live account of all server activity and operations to
either standard output or a log file. The following runtime settings control these options.
quiet. Limits the amount of information written to the log or output.
verbose. Increases the amount of information written to the log or output.
You can also specify this as v (as in -v). For higher levels of verbosity, set multiple v, as in vvvv = True.
You can also change the verbosity of a running mongod or mongos instance with the setParameter command.
logpath. Enables logging to a file, rather than the standard output. You must specify the full path to the log
file when adjusting this setting.
logappend. Adds information to a log file instead of overwriting the file.
Note: You can specify these configuration options as command line arguments to mongod or mongos.
For example:
mongod -v --logpath /var/log/mongodb/server1.log --logappend
This command starts mongod in verbose mode, appending data to the log file at /var/log/mongodb/server1.log.
Degraded performance in MongoDB is typically a function of the relationship between the quantity of data stored
in the database, the amount of system RAM, the number of connections to the database, and the amount of time the
database spends in a locked state.
In some cases performance issues may be transient and related to traffic load, data access patterns, or the availability
of hardware on the host system for virtualized environments. Some users also experience performance limitations as a
result of inadequate or inappropriate indexing strategies, or as a consequence of poor schema design patterns. In other
situations, performance issues may indicate that the database may be operating at capacity and that it is time to add
additional capacity to the database.
The following are some causes of degraded performance in MongoDB.
Locks MongoDB uses a locking system to ensure data set validity. However, if certain operations are long-running,
or a queue forms, performance will slow as requests and operations wait for the lock. Lock-related slowdowns can
be intermittent. To see if the lock has been affecting your performance, look to the data in the globalLock section of
the serverStatus output. If globalLock.currentQueue.total is consistently high, then there is a chance
that a large number of requests are waiting for a lock. This indicates a possible concurrency issue that may be affecting
performance.
If globalLock.totalTime is high relative to uptime, the database has existed in a lock state for a significant
amount of time. If globalLock.ratio is also high, MongoDB has likely been processing a large number of
long running queries. Long queries are often the result of a number of factors: ineffective use of indexes, nonoptimal schema design, poor query structure, system architecture issues, or insufficient RAM resulting in page faults
(page 143) and disk reads.
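One way to turn "consistently high" into a check is to sample serverStatus several times and test every sample. The sketch below operates on plain objects shaped like serverStatus output; the threshold, the sample values, and the queueConsistentlyHigh function are all assumptions chosen for illustration.

```javascript
// Hypothetical threshold: what counts as "high" depends on the deployment.
var QUEUE_THRESHOLD = 50;

// Given serverStatus documents sampled over time, report whether
// globalLock.currentQueue.total stayed above the threshold in every sample.
function queueConsistentlyHigh(samples, threshold) {
  if (samples.length === 0) {
    return false;
  }
  return samples.every(function (status) {
    return status.globalLock.currentQueue.total > threshold;
  });
}

// Invented samples containing only the field this check needs.
var samples = [
  { globalLock: { currentQueue: { total: 72 } } },
  { globalLock: { currentQueue: { total: 65 } } },
  { globalLock: { currentQueue: { total: 80 } } }
];

var lockPressure = queueConsistentlyHigh(samples, QUEUE_THRESHOLD);
```

Requiring every sample to exceed the threshold keeps a single transient spike from being reported as a concurrency problem.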
Memory Usage MongoDB uses memory mapped files to store data. Given a data set of sufficient size, the MongoDB
process will allocate all available memory on the system for its use. While this is part of the design, and affords
MongoDB superior performance, the memory mapped files make it difficult to determine if the amount of RAM is
sufficient for the data set.
The memory usage metrics of the serverStatus output can provide insight into MongoDB's memory use.
Check the resident memory use (i.e. mem.resident): if this exceeds the amount of system memory and there is a
significant amount of data on disk that isn't in RAM, you may have exceeded the capacity of your system.
You should also check the amount of mapped memory (i.e. mem.mapped). If this value is greater than the amount of
system memory, some operations will require disk access (page faults) to read data from virtual memory, which negatively
affects performance.
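The two comparisons described above can be expressed as a small function over a serverStatus-shaped object. This is a sketch under stated assumptions: the memoryPressure function is hypothetical, the sample values are invented, and all three quantities are assumed to be in megabytes.

```javascript
// Compare mem.resident and mem.mapped from a serverStatus-shaped document
// with the host's physical memory. Assumption: all three values use the
// same unit (megabytes here).
function memoryPressure(status, systemMemoryMB) {
  return {
    residentExceedsRam: status.mem.resident > systemMemoryMB,
    mappedExceedsRam: status.mem.mapped > systemMemoryMB
  };
}

// Invented values: 12 GB resident, 40 GB mapped, on a 16 GB host.
var report = memoryPressure({ mem: { resident: 12288, mapped: 40960 } }, 16384);
```

In this invented case only the mapped value exceeds system memory, so some reads would be expected to go to disk.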
Page Faults A page fault occurs when MongoDB requires data not located in physical memory, and must read from
virtual memory. To check for page faults, see the extra_info.page_faults value in the serverStatus
output. This data is only available on Linux systems.
A single page fault completes quickly and is not problematic. However, in aggregate, large volumes of page faults
typically indicate that MongoDB is reading too much data from disk. In many situations, MongoDB's read locks will
yield after a page fault to allow other processes to read and avoid blocking while waiting for the next page to read
into memory. This approach improves concurrency, and also improves overall throughput in high volume systems.
Increasing the amount of RAM accessible to MongoDB may help reduce the number of page faults. If this is not
possible, you may want to consider deploying a sharded cluster and/or adding shards to your deployment to distribute
load among mongod instances.
Number of Connections In some cases, the number of connections between the application layer (i.e. clients) and
the database can overwhelm the ability of the server to handle requests. This can produce performance irregularities.
The following fields in the serverStatus document can provide insight:
globalLock.activeClients contains a counter of the total number of clients with active operations in
progress or queued.
connections is a container for the following two fields:
current the total number of current clients that connect to the database instance.
available the total number of unused collections available for new clients.
Note: Unless constrained by system-wide limits, MongoDB has a hard connection limit of 20,000 connections. You
can modify system limits using the ulimit command, or by editing your system's /etc/sysctl file.
If requests are high because there are numerous concurrent application requests, the database may have trouble keeping
up with demand. If this is the case, then you will need to increase the capacity of your deployment. For read-heavy
applications, increase the size of your replica set and distribute read operations to secondary members. For write-heavy
applications, deploy sharding and add one or more shards to a sharded cluster to distribute load among mongod
instances.
Spikes in the number of connections can also be the result of application or driver errors. All of the officially supported
MongoDB drivers implement connection pooling, which allows clients to use and reuse connections more efficiently.
Extremely high numbers of connections, particularly without a corresponding workload, are often indicative of a driver or
other configuration error.
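To make the connections fields easier to watch, consider the following sketch, which turns current and available into a single utilization figure. The helper name connectionUtilization, the sample numbers, and the 80% warning threshold are assumptions chosen for illustration, not values recommended by this manual.

```javascript
// Hypothetical helper: fraction of the connection budget in use,
// given the `connections` sub-document of the serverStatus output.
function connectionUtilization(connections) {
  var total = connections.current + connections.available;
  return total === 0 ? 0 : connections.current / total;
}

// Made-up values, for illustration only.
var conns = { current: 205, available: 614 };
var used = connectionUtilization(conns);
if (used > 0.8) { // arbitrary threshold, an assumption
  console.log("warning: more than 80% of connections are in use");
} else {
  console.log("connection use: " + (used * 100).toFixed(1) + "%");
}
```

In the mongo shell, db.serverStatus().connections would supply the input document.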
Database Profiling MongoDB's Profiler is a database profiling system that can help identify inefficient queries
and operations.
The following profiling levels are available:
Level    Setting
0        Off. No profiling
1        On. Only includes slow operations
2        On. Includes all operations
Enable the profiler by setting the profile value using the following command in the mongo shell:
db.setProfilingLevel(1)
The slowms setting defines what constitutes a slow operation. To set the threshold above which the profiler considers operations slow (and thus includes them in the level 1 profiling data), you can configure slowms at runtime as an
argument to the db.setProfilingLevel() operation.
See the documentation of db.setProfilingLevel() for more information about this command.
By default, mongod records all slow queries to its log, as defined by slowms. Unlike log data, the data in
system.profile does not persist between mongod restarts.
Note: Because the database profiler can negatively impact performance, only enable profiling for strategic intervals
and as minimally as possible on production systems.
You may enable profiling on a per-mongod basis. This setting will not propagate across a replica set or sharded
cluster.
Chapter 4. Administration
You can view the output of the profiler in the system.profile collection of your database by issuing the show
profile command in the mongo shell, or with the following operation:
db.system.profile.find( { millis : { $gt : 100 } } )
This returns all operations that lasted longer than 100 milliseconds. Ensure that the value specified here (100, in this
example) is above the slowms threshold.
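When profiler entries have already been fetched into an array, the same threshold filter can be applied in plain JavaScript. The following sketch assumes entries shaped like system.profile documents with op, ns, and millis fields; the helper name slowOperations and the sample entries are hypothetical.

```javascript
// Hypothetical helper: keep profiler entries slower than thresholdMs,
// slowest first. Mirrors the { millis : { $gt : ... } } filter above.
function slowOperations(profileDocs, thresholdMs) {
  return profileDocs
    .filter(function (doc) { return doc.millis > thresholdMs; })
    .sort(function (a, b) { return b.millis - a.millis; });
}

// Made-up profiler entries, for illustration only.
var entries = [
  { op: "query", ns: "sales.contacts", millis: 340 },
  { op: "update", ns: "sales.orders", millis: 45 },
  { op: "query", ns: "sales.orders", millis: 1210 }
];
slowOperations(entries, 100).forEach(function (doc) {
  console.log(doc.millis + " ms  " + doc.op + "  " + doc.ns);
});
```

Sorting by millis puts the most expensive operations first, which is often where optimization effort pays off soonest.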
See also:
Optimization Strategies for MongoDB (page 165) addresses strategies that may improve the performance of your
database queries and operations.
Replication and Monitoring
Beyond the basic monitoring requirements for any MongoDB instance, for replica sets, administrators must monitor
replication lag. Replication lag refers to the amount of time that it takes to copy (i.e. replicate) a write operation
on the primary to a secondary. Some small delay period may be acceptable, but two significant problems emerge as
replication lag grows:
First, operations that occurred during the period of lag are not replicated to one or more secondaries. If you're
using replication to ensure data persistence, exceptionally long delays may impact the integrity of your data set.
Second, if the replication lag exceeds the length of the operation log (oplog) then MongoDB will have to perform
an initial sync on the secondary, copying all data from the primary and rebuilding all indexes. This is uncommon
under normal circumstances, but if you configure the oplog to be smaller than the default, the issue can arise.
Note: The size of the oplog is only configurable during the first run using the --oplogSize argument to the
mongod command, or preferably, the oplogSize in the MongoDB configuration file. If you do not specify
this on the command line before running with the --replSet option, mongod will create a default sized
oplog.
By default, the oplog is 5 percent of total available disk space on 64-bit systems. For more information about
changing the oplog size, see Change the Size of the Oplog (page 445).
For causes of replication lag, see Replication Lag (page 461).
Replication issues are most often the result of network connectivity issues between members, or the result of a primary
that does not have the resources to support application and replication traffic. To check the status of a replica set, use the
replSetGetStatus command or the following helper in the shell:
rs.status()
The http://docs.mongodb.org/manual/reference/command/replSetGetStatus document provides a more in-depth overview of this output. In general, watch the value of optimeDate, and pay particular
attention to the time difference between the primary and the secondary members.
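The comparison of optimeDate values can be sketched as follows. This is a minimal illustration in plain JavaScript that operates on an array shaped like the members field of the rs.status() output; the helper name replicationLagSeconds, the host names, and the dates are hypothetical.

```javascript
// Hypothetical helper: seconds each secondary trails the primary,
// based on the optimeDate of every member.
function replicationLagSeconds(members) {
  var primary = null;
  members.forEach(function (m) {
    if (m.stateStr === "PRIMARY") { primary = m; }
  });
  if (primary === null) {
    throw new Error("no primary in member list");
  }
  var lag = {};
  members.forEach(function (m) {
    if (m.stateStr === "SECONDARY") {
      lag[m.name] = (primary.optimeDate - m.optimeDate) / 1000;
    }
  });
  return lag;
}

// Made-up member list, for illustration only.
var members = [
  { name: "db0.example.net:27017", stateStr: "PRIMARY",
    optimeDate: new Date("2014-01-01T00:00:10Z") },
  { name: "db1.example.net:27017", stateStr: "SECONDARY",
    optimeDate: new Date("2014-01-01T00:00:07Z") },
  { name: "db2.example.net:27017", stateStr: "SECONDARY",
    optimeDate: new Date("2014-01-01T00:00:10Z") }
];
console.log(JSON.stringify(replicationLagSeconds(members)));
```

In the mongo shell, rs.status().members would supply the input. A lag value that grows steadily, rather than fluctuating around a small number, deserves attention before it approaches the length of the oplog.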
Sharding and Monitoring
In most cases, the components of sharded clusters benefit from the same monitoring and analysis as all other MongoDB
instances. In addition, clusters require further monitoring to ensure that data is effectively distributed among nodes
and that sharding operations are functioning appropriately.
See also:
See the Sharding Concepts (page 484) documentation for more information.
Config Servers The config database maintains a map identifying which documents are on which shards. The cluster
updates this map as chunks move between shards. When a configuration server becomes inaccessible, certain sharding
operations become unavailable, such as moving chunks and starting mongos instances. However, clusters remain
accessible from already-running mongos instances.
Because inaccessible configuration servers can seriously impact the availability of a sharded cluster, you should monitor your configuration servers to ensure that the cluster remains well balanced and that mongos instances can restart.
MMS Monitoring [54] monitors config servers and can create notifications if a config server becomes inaccessible.
Balancing and Chunk Distribution The most effective sharded cluster deployments evenly balance chunks among
the shards. To facilitate this, MongoDB has a background balancer process that distributes data to ensure that chunks
are always optimally distributed among the shards.
Issue the db.printShardingStatus() or sh.status() command to the mongos by way of the mongo
shell. This returns an overview of the entire cluster including the database name, and a list of the chunks.
Stale Locks In nearly every case, all locks used by the balancer are automatically released when they become stale.
However, because any long-lasting lock can block future balancing, it's important to ensure that all locks are legitimate.
To check the lock status of the database, connect to a mongos instance using the mongo shell. Issue the following
command sequence to switch to the config database and display all outstanding locks on the shard database:
use config
db.locks.find()
For active deployments, the above query can provide insights. The balancing process, which originates on a randomly
selected mongos, takes a special balancer lock that prevents other balancing activity from transpiring. Use the
following command, also against the config database, to check the status of the balancer lock:
db.locks.find( { _id : "balancer" } )
If this lock exists, make sure that the balancer process is actively using this lock.
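One way to spot locks that deserve a closer look is to filter the lock documents by age. The sketch below is an illustration only: it assumes each document carries a numeric state field, where a non-zero value means the lock is taken or being taken, and a when field holding the acquisition time. Verify both assumptions against the documents your deployment actually returns. The helper name longHeldLocks, the sample documents, and the one-hour threshold are hypothetical.

```javascript
// Hypothetical helper: locks that appear held for longer than maxAgeMs.
// Assumption: a non-zero `state` means the lock is taken (or being taken),
// and `when` is the time it was acquired.
function longHeldLocks(locks, now, maxAgeMs) {
  return locks.filter(function (lock) {
    return lock.state !== 0 && (now - lock.when) > maxAgeMs;
  });
}

// Made-up lock documents, for illustration only.
var now = new Date("2014-01-01T12:00:00Z");
var locks = [
  { _id: "balancer", state: 2, when: new Date("2014-01-01T09:00:00Z") },
  { _id: "sales.contacts", state: 0, when: new Date("2014-01-01T08:00:00Z") },
  { _id: "sales.orders", state: 2, when: new Date("2014-01-01T11:59:30Z") }
];
var oneHour = 60 * 60 * 1000; // arbitrary threshold, an assumption
longHeldLocks(locks, now, oneHour).forEach(function (lock) {
  console.log("check lock: " + lock._id);
});
```

A lock flagged this way is not necessarily stale; it only identifies candidates to compare against current balancer activity.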
Run-time Database Configuration
The command line and configuration file interfaces provide MongoDB administrators with a large number of options and settings for controlling the operation of the database system. This document provides an overview
of common configurations and examples of best-practice configurations for common use cases.
While both interfaces provide access to the same collection of options and settings, this document primarily uses the
configuration file interface. If you run MongoDB using a control script or installed from a package for your operating
system, you likely already have a configuration file located at /etc/mongodb.conf. Confirm this by checking the
contents of the /etc/init.d/mongod or /etc/rc.d/mongod script to ensure that the control scripts start the
mongod with the appropriate configuration file (see below).
To start a MongoDB instance using this configuration, issue a command in one of the following forms:
mongod --config /etc/mongodb.conf
mongod -f /etc/mongodb.conf
Modify the values in the /etc/mongodb.conf file on your system to control the configuration of your database
instance.
[54] http://mms.mongodb.com
For most standalone servers, a base configuration with the following settings is sufficient. It makes several
assumptions, so consider the following explanation of each setting:
fork is true, which enables a daemon mode for mongod, which detaches (i.e. forks) MongoDB from
the current session and allows you to run the database as a conventional server.
bind_ip is 127.0.0.1, which forces the server to only listen for requests on the localhost IP. Only bind to
secure interfaces that the application-level systems can access with access control provided by system network
filtering (i.e. firewall).
port is 27017, which is the default MongoDB port for database instances. MongoDB can bind to any port.
You can also filter access based on port using network filtering tools.
Note: UNIX-like systems require superuser privileges to attach processes to ports lower than 1024.
quiet is true. This disables all but the most critical entries in the output/log file. In normal operation this is
the preferable setting to avoid log noise. In diagnostic or testing situations, set this value to false. Use
setParameter to modify this setting during run time.
dbpath is /srv/mongodb, which specifies where MongoDB will store its data files. /srv/mongodb and
/var/lib/mongodb are popular locations. The user account that mongod runs under will need read and
write access to this directory.
logpath is /var/log/mongodb/mongod.log, which is where mongod will write its output. If you do
not set this value, mongod writes all output to standard output (i.e. stdout).
logappend is true, which ensures that mongod does not overwrite an existing log file following the server
start operation.
journal is true, which enables journaling. Journaling ensures single instance write-durability. 64-bit builds
of mongod enable journaling by default. Thus, this setting may be redundant.
Given the default configuration, some of these values may be redundant. However, in many situations explicitly stating
the configuration increases overall system intelligibility.
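The settings explained above can be collected into the name = value form used by the configuration file. The following sketch does this in plain JavaScript; the setting names and values are taken from the list above, while the helper name renderConfig is hypothetical and not part of MongoDB.

```javascript
// Settings and values as described in the list above.
var baseConfig = {
  fork: true,
  bind_ip: "127.0.0.1",
  port: 27017,
  quiet: true,
  dbpath: "/srv/mongodb",
  logpath: "/var/log/mongodb/mongod.log",
  logappend: true,
  journal: true
};

// Hypothetical helper: render settings as "name = value" lines,
// one per setting, in the order they were declared.
function renderConfig(settings) {
  return Object.keys(settings)
    .map(function (name) { return name + " = " + settings[name]; })
    .join("\n");
}

console.log(renderConfig(baseConfig));
```

The printed lines have the same shape as the other configuration listings in this document, such as bind_ip = 127.0.0.1, and could be saved as the contents of /etc/mongodb.conf.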
Security Considerations
The following collection of configuration options are useful for limiting access to a mongod instance. Consider the
following:
bind_ip = 127.0.0.1,10.8.0.10,192.168.4.24
nounixsocket = true
auth = true
bind_ip has three values: 127.0.0.1, the localhost interface; 10.8.0.10, a private IP address typically
used for local networks and VPN interfaces; and 192.168.4.24, a private network interface typically used
for local networks.
Because production MongoDB instances need to be accessible from multiple database servers, it is important
to bind MongoDB to multiple interfaces that are accessible from your application servers. At the same time it's
important to limit these interfaces to interfaces controlled and protected at the network layer.
nounixsocket is true, which disables the UNIX socket, which is otherwise enabled by default. This limits
access on the local system. This is desirable when running MongoDB on systems with shared access, but in
most situations has minimal impact.
auth is true, which enables the authentication system within MongoDB. If enabled, you will need to log in by
connecting over the localhost interface for the first time to create user credentials.
See also:
Security Concepts (page 239)
Replication and Sharding Configuration
Replication Configuration Replica set configuration is straightforward, and only requires that the replSet have
a value that is consistent among all members of the set. Consider the following:
replSet = set0
Use descriptive names for sets. Once configured, use the mongo shell to add hosts to the replica set.
See also:
Replica set reconfiguration (page 470).
To enable authentication for the replica set, add the following option:
keyFile = /srv/mongodb/keyfile
New in version 1.8: for replica sets, and 1.9.1 for sharded replica sets.
Setting keyFile enables authentication and specifies a key file for the replica set members to use when authenticating
to each other. The content of the key file is arbitrary, but must be the same on all members of the replica set and
mongos instances that connect to the set. The key file must be less than one kilobyte in size and may only contain
characters in the base64 set. The file must not have group or world permissions on UNIX systems.
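The key file constraints above lend themselves to a quick pre-flight check. The following sketch is illustrative only and is not how mongod validates the file: the helper name keyFileProblems is hypothetical, and ignoring surrounding whitespace such as a trailing newline is an assumption.

```javascript
// Hypothetical helper: list the ways a key file breaks the constraints above.
// `content` is the file text and `mode` is the numeric UNIX file mode.
// Assumption: surrounding whitespace (such as a trailing newline) is ignored.
function keyFileProblems(content, mode) {
  var problems = [];
  var key = content.trim();
  if (content.length >= 1024) {
    problems.push("file is one kilobyte or larger");
  }
  if (!/^[A-Za-z0-9+\/=]*$/.test(key)) {
    problems.push("key has characters outside the base64 set");
  }
  // Octal 077 covers the group and world permission bits.
  if ((mode & parseInt("077", 8)) !== 0) {
    problems.push("file has group or world permissions");
  }
  return problems;
}

// Made-up key content, for illustration only.
console.log(keyFileProblems("c2VjcmV0a2V5\n", parseInt("600", 8)));
console.log(keyFileProblems("not base64!\n", parseInt("644", 8)));
```

Under Node.js, the content and mode could come from fs.readFileSync(path, "utf8") and fs.statSync(path).mode. An empty result means none of the three constraints was violated.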
See also:
The Replica Set Reconfiguration (page 470) section for information regarding the process for changing a replica set's
configuration during operation.
Additionally, consider the Replica Set Security (page 240) section for information on configuring authentication with
replica sets.
Finally, see the Replication (page 377) document for more information on replication in MongoDB and replica set
configuration in general.
Sharding Configuration Sharding requires a number of mongod instances with different configurations. The config servers store the cluster's metadata, while the cluster distributes data among one or more shard servers.
Note: Config servers are not replica sets.
Set up one or three config server instances as normal (page 147) mongod instances, and then add the following
configuration options:
configsvr = true
bind_ip = 10.8.0.12
port = 27001
This creates a config server running on the private IP address 10.8.0.12 on port 27001. Make sure that there are
no port conflicts, and that your config server is accessible from all of your mongos and mongod instances.
To set up shards, configure two or more mongod instances using your base configuration (page 147), adding the
shardsvr setting:
shardsvr = true
Finally, to establish the cluster, configure at least one mongos process with the following settings:
configdb = 10.8.0.12:27001
chunkSize = 64
You can specify multiple configdb instances by specifying hostnames and ports in the form of a comma separated
list. In general, avoid modifying the chunkSize from the default value of 64 [55], and ensure this setting is
consistent among all mongos instances.
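As a small illustration of the comma separated form, the following sketch builds a configdb value from a list of config servers and rejects any count other than one or three, matching the set-up described above. The helper name configdbValue is hypothetical, and the second and third addresses are made up for the example.

```javascript
// Hypothetical helper: build the configdb value from config server entries.
function configdbValue(servers) {
  if (servers.length !== 1 && servers.length !== 3) {
    throw new Error("expected one or three config servers");
  }
  return servers
    .map(function (s) { return s.host + ":" + s.port; })
    .join(",");
}

// 10.8.0.12 appears above; the other two addresses are made up.
var configServers = [
  { host: "10.8.0.12", port: 27001 },
  { host: "10.8.0.13", port: 27001 },
  { host: "10.8.0.14", port: 27001 }
];
console.log("configdb = " + configdbValue(configServers));
```

Generating the value from one shared list is one way to keep the setting identical across all mongos instances.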
See also:
The Sharding (page 479) section of the manual for more information on sharding and cluster configuration.
Run Multiple Database Instances on the Same System
In many cases, running multiple instances of mongod on a single system is not recommended. On some types of
deployments [56] and for testing purposes, you may need to run more than one mongod on a single system.
In these cases, use a base configuration (page 147) for each instance, but consider the following configuration values:
dbpath = /srv/mongodb/db0/
pidfilepath = /srv/mongodb/db0.pid
The dbpath value controls the location of the mongod instance's data directory. Ensure that each database has a
distinct and well labeled data directory. The pidfilepath controls where the mongod process places its process id
file. As this file tracks a specific mongod process, it is crucial that the file be unique and well labeled to make it easy to start
and stop these processes.
Create additional control scripts and/or adjust your existing MongoDB configuration and control script as needed to
control these processes.
Diagnostic Configurations
The following configuration options control various mongod behaviors for diagnostic purposes. The following settings have default values that are tuned for general production purposes:
[55] Chunk size is 64 megabytes by default, which provides the ideal balance between the most even distribution of data, for which smaller chunk
sizes are best, and minimizing chunk migration, for which larger chunk sizes are optimal.
[56] Single-tenant systems with SSD or other high performance disks may provide acceptable performance levels for multiple mongod instances.
Additionally, you may find that multiple databases with small working sets may function acceptably on a single system.
slowms = 50
profile = 2
verbose = true
diaglog = 3
objcheck = true
cpu = true
Use the base configuration (page 147) and add these options if you are experiencing some unknown issue or performance problem as needed:
slowms configures the threshold for the database profiler to consider a query slow. The default value is
100 milliseconds. Set a lower value if the database profiler does not return useful results. See Optimization
Strategies for MongoDB (page 165) for more information on optimizing operations in MongoDB.
profile sets the database profiler level. The profiler is not active by default because of the possible impact
of the profiler itself on performance. Unless this setting has a value, queries are not profiled.
verbose enables a verbose logging mode that modifies mongod output and increases logging to include a
greater number of events. Only use this option if you are experiencing an issue that is not reflected in the normal
logging level. If you require additional verbosity, consider the following options:
v = true
vv = true
vvv = true
vvvv = true
vvvvv = true
Each additional level v adds additional verbosity to the logging. The verbose option is equal to v = true.
diaglog enables diagnostic logging. Level 3 logs all read and write operations.
objcheck forces mongod to validate all requests from clients upon receipt. Use this option to ensure that
invalid requests are not causing errors, particularly when running a database with untrusted clients. This option
may affect database performance.
cpu forces mongod to report the percentage of the last interval spent in write lock. The interval is typically 4
seconds, and each output line in the log includes both the actual interval since the last report and the percentage
of time spent in write lock.
Import and Export MongoDB Data
This document provides an overview of the import and export programs included in the MongoDB distribution. These
tools are useful when you want to back up or export a portion of your data without capturing the state of the entire
database, or for simple data ingestion cases. For more complex data migration tasks, you may want to write your own
import and export scripts using a client driver to interact with the database itself. For disaster recovery protection and
routine database backup operation, use full database instance backups (page 136).
Warning: Because these tools primarily operate by interacting with a running mongod instance, they can impact
the performance of your running database.
Not only do these processes create traffic for a running database instance, they also force the database to read all
data through memory. When MongoDB reads infrequently used data, it can supplant more frequently accessed
data, causing a deterioration in performance for the database's regular workload.
mongoimport and mongoexport do not reliably preserve all rich BSON data types, because BSON is a superset of JSON. Thus, mongoimport and mongoexport cannot represent BSON data accurately in JSON. As
a result, data exported or imported with these tools may lose some measure of fidelity. See MongoDB Extended
JSON (page 227) for more information about MongoDB Extended JSON.
See also:
See the MongoDB Backup Methods (page 136) document or the MMS Backup Manual [57] for more information on
backing up MongoDB instances. Additionally, consider the following references for commands addressed in this
document:
http://docs.mongodb.org/manual/reference/program/mongoexport
http://docs.mongodb.org/manual/reference/program/mongorestore
http://docs.mongodb.org/manual/reference/program/mongodump
If you want to transform and process data once you've imported it into MongoDB, consider the documents in the Aggregation (page 277) section, including:
Map-Reduce (page 284) and
Aggregation Concepts (page 281).
Data Type Fidelity
JSON does not have the following data types that exist in BSON documents: data_binary, data_date,
data_timestamp, data_regex, data_oid and data_ref. As a result, any tool that decodes BSON
documents into JSON will suffer some loss of fidelity.
If maintaining type fidelity is important, consider writing a data import and export system that does not force BSON
documents into JSON form as part of the process. The following list of types contains examples of how MongoDB
represents these BSON types in JSON.
data_binary
{ "$binary" : "<bindata>", "$type" : "<t>" }
<bindata> is the base64 representation of a binary string. <t> is the hexadecimal representation of a single
byte indicating the data type.
data_date
Date( <date> )
<date> is the JSON representation of a 64-bit signed integer for milliseconds since epoch.
data_timestamp
Timestamp( <t>, <i> )
<t> is the JSON representation of a 32-bit unsigned integer for seconds since epoch. <i> is a 32-bit
unsigned integer for the increment.
data_regex
/<jRegex>/<jOptions>
<jRegex> is a string that may contain valid JSON characters and unescaped double quote (i.e. ") characters,
but may not contain unescaped forward slash (i.e. /) characters.
<jOptions> is a string that may contain only the characters g, i, m, and s.
data_oid
ObjectId( "<id>" )
[57] https://mms.mongodb.com/help/backup
<id> is a 24 character hexadecimal string. These representations require that data_oid values have an
associated field named _id.
data_ref
DBRef( "<name>", "<id>" )
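The loss of type information is easy to observe with the standard JSON functions of plain JavaScript. The sketch below is an analogy rather than a demonstration of mongoexport itself: the JavaScript Date and RegExp values stand in for the BSON date and regular expression types, and the variable names are hypothetical.

```javascript
// Round-trip a document through plain JSON and compare the types.
var original = { created: new Date(0), pattern: /^abc/i };
var roundTripped = JSON.parse(JSON.stringify(original));

console.log(original.created instanceof Date);      // true
console.log(typeof roundTripped.created);           // "string"
console.log(roundTripped.created);                  // "1970-01-01T00:00:00.000Z"
console.log(JSON.stringify(roundTripped.pattern));  // "{}"
```

After the round trip the date is only a string and the regular expression has become an empty object, so a program reading the JSON cannot recover the original types without extra conventions such as the representations listed above.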
For resilient and non-disruptive backups, use a file system or block-level disk snapshot function, such as the methods described in the MongoDB Backup Methods (page 136) document. The tools and operations discussed here provide
functionality that's useful for some kinds of backups.
By contrast, use import and export tools to back up a small subset of your data or to move data to or from a third-party
system. These backups may capture a small crucial set of data or a frequently modified section of data, for extra
insurance, or for ease of access. No matter how you decide to import or export your data, consider the following
guidelines:
Label files so that you can identify what point in time the export or backup reflects.
Labeling should describe the contents of the backup, and reflect the subset of the data corpus, captured in the
backup or export.
Do not create or apply exports if the backup process itself will have an adverse effect on a production system.
Make sure that they reflect a consistent data state. Export or backup processes can impact data integrity (i.e.
type fidelity) and consistency if updates continue during the backup process.
Test backups and exports by restoring and importing to ensure that the backups are useful.
Human Intelligible Import/Export Formats
This section describes a process to import/export your database, or a portion thereof, to a file in a JSON or CSV format.
See also:
The http://docs.mongodb.org/manual/reference/program/mongoimport and
http://docs.mongodb.org/manual/reference/program/mongoexport documents contain complete
documentation of these tools. If you have questions about the function and parameters of these tools not covered here,
please refer to these documents.
If you want to simply copy a database or collection from one instance to another, consider using the copydb,
clone, or cloneCollection commands, which may be more suited to this task. The mongo shell provides
the db.copyDatabase() method.
These tools may also be useful for importing data into a MongoDB database from third party applications.
Collection Export with mongoexport With the mongoexport utility you can create a backup file. In the most
simple invocation, the command takes the following form:
mongoexport --collection collection --out collection.json
This will export all documents in the collection named collection into the file collection.json. Without
the output specification (i.e. --out collection.json), mongoexport writes output to standard output (i.e.
stdout). You can further narrow the results by supplying a query filter using the --query option and limit results to a
single database using the --db option. For instance:
mongoexport --db sales --collection contacts --query '{"field": 1}'
This command returns all documents in the sales database's contacts collection that have a field named field with
a value of 1. Enclose the query in single quotes (i.e. ') to ensure that it does not interact with your shell environment.
The resulting documents will return on standard output.
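When a script rather than an interactive shell launches mongoexport, the quoting concern can be sidestepped by passing the arguments as a list. The following sketch builds such a list using only the options mentioned above (--db, --collection, --query, and --out); the helper name mongoexportArgs is hypothetical.

```javascript
// Hypothetical helper: build a mongoexport argument list.
// The query object is serialized to JSON, so no shell quoting is involved.
function mongoexportArgs(options) {
  var args = ["--db", options.db, "--collection", options.collection];
  if (options.query) {
    args.push("--query", JSON.stringify(options.query));
  }
  if (options.out) {
    args.push("--out", options.out);
  }
  return args;
}

var exportArgs = mongoexportArgs({
  db: "sales",
  collection: "contacts",
  query: { field: 1 }
});
console.log(exportArgs.join(" "));
```

Under Node.js, a list like this could be handed to child_process.execFile("mongoexport", exportArgs, callback), which starts the program without passing the arguments through a shell.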
By default, mongoexport returns one JSON document per MongoDB document. Specify the --jsonArray
argument to return the export as a single JSON array. Use the --csv option to return the result in CSV (comma
separated values) format.
If your mongod instance is not running, you can use the --dbpath option to specify the location of your MongoDB instance's database files. See the following example:
mongoexport --db sales --collection contacts --dbpath /srv/MongoDB/
This reads the data files directly. This locks the data directory to prevent conflicting writes. The mongod process must
not be running or attached to these data files when you run mongoexport in this configuration.
The --host and --port options allow you to specify a non-local host to connect to capture the export. Consider
the following example:
mongoexport --host mongodb1.example.net --port 37017 --username user --password pass --collection con
On any mongoexport command you may specify username and password credentials, as above.
Collection Import with mongoimport To restore a backup taken with mongoexport, use mongoimport. Most of the arguments
to mongoexport also exist for mongoimport. Consider the following command:
mongoimport --collection collection --file collection.json
This imports the contents of the file collection.json into the collection named collection. If you do not
specify a file with the --file option, mongoimport accepts input over standard input (i.e. stdin).
If you specify the --upsert option, all of mongoimport's operations will attempt to update existing documents
in the database and insert other documents. This option will cause some performance impact depending on your
configuration.
You can specify the database option --db to import these documents to a particular database. If your MongoDB
instance is not running, use the --dbpath option to specify the location of your MongoDB instance's database
files. Consider using the --journal option to ensure that mongoimport records its operations in the journal. The mongod process must not be running or attached to these data files when you run mongoimport in this
configuration.
Use the --ignoreBlanks option to ignore blank fields. For CSV and TSV imports, this option provides the
desired functionality in most cases: it avoids inserting blank fields in MongoDB documents.
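The effect of ignoring blank fields can be pictured with a short sketch. This is an illustration of the outcome described above, not mongoimport's implementation; the helper name withoutBlankFields and the sample row are hypothetical, and treating only the empty string as blank is an assumption.

```javascript
// Hypothetical helper: drop fields whose value is an empty string,
// as a parsed CSV row might contain for an empty column.
function withoutBlankFields(row) {
  var doc = {};
  Object.keys(row).forEach(function (name) {
    if (row[name] !== "") {
      doc[name] = row[name];
    }
  });
  return doc;
}

// Made-up CSV row, for illustration only.
var row = { name: "Ada", phone: "", city: "London" };
console.log(JSON.stringify(withoutBlankFields(row)));
```

The resulting document simply lacks the phone field, rather than storing it with an empty value.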
Production Notes
This page details system configurations that affect MongoDB, especially in production.
Note: MongoDB Management Service (MMS) [58] is a hosted monitoring service which collects and aggregates diagnostic
data to provide insight into the performance and operation of MongoDB deployments. See the MMS Website [59]
and the MMS documentation [60] for more information.
Packages
MongoDB Be sure you have the latest stable release. All releases are available on the Downloads [61] page. This is a
good place to verify what is current, even if you then choose to install via a package manager.
Always use 64-bit builds for production. The 32-bit build MongoDB offers for test and development environments is
not suitable for production deployments as it can store no more than 2GB of data. See the 32-bit limitations (page 558)
for more information.
32-bit builds exist to support use on development machines.
Operating Systems MongoDB distributions are currently available for Mac OS X, Linux, Windows Server 2008 R2
64-bit, Windows 7 (32-bit and 64-bit), Windows Vista, and Solaris platforms.
Note: MongoDB uses the GNU C Library [62] (glibc) if available on a system. MongoDB requires at least version
glibc-2.12-1.2.el6 to avoid a known bug with earlier versions. For best results use at least version 2.13.
Concurrency
In earlier versions of MongoDB, all write operations contended for a single readers-writer lock on the MongoDB
instance. As of version 2.2, each database has a readers-writer lock that allows concurrent read access to a database,
but gives exclusive access to a single write operation per database. See the Concurrency (page 569) page for more
information.
Journaling
MongoDB uses write ahead logging to an on-disk journal to guarantee that MongoDB is able to quickly recover the
write operations (page 42) following a crash or other serious failure.
In order to ensure that mongod will be able to recover and remain in a valid state following a crash, you should leave
journaling enabled. See Journaling (page 234) for more information.
Networking
Use Trusted Networking Environments Always run MongoDB in a trusted environment, with network rules that
prevent access from all unknown machines, systems, and networks. As with any sensitive system dependent on
network access, your MongoDB deployment should only be accessible to specific systems that require access, such as
application servers, monitoring services, and other MongoDB components.
Note: By default, auth is not enabled and mongod assumes a trusted environment. You can enable security/auth
(page 239) mode if you need it.
[58] http://mms.mongodb.com
[59] http://mms.mongodb.com/
[60] http://mms.mongodb.com/help/
[61] http://www.mongodb.org/downloads
[62] http://www.gnu.org/software/libc/
See documents in the Security (page 237) section for additional information, specifically:
Configuration Options (page 242)
Firewalls (page 243)
Configure Linux iptables Firewall for MongoDB (page 245)
Configure Windows netsh Firewall for MongoDB (page 249)
For Windows users, consider the Windows Server Technet Article on TCP Configuration [63] when deploying MongoDB
on Windows.
Connection Pools To avoid overloading the connection resources of a single mongod or mongos instance, ensure
that clients maintain reasonable connection pool sizes.
The connPoolStats database command returns information regarding the number of open connections to the
current database for mongos instances and mongod instances in sharded clusters.
Hardware Considerations
MongoDB is designed specifically with commodity hardware in mind and has few hardware requirements or limitations. MongoDB's core components run on little-endian hardware, primarily x86/x86_64 processors. Client libraries
(i.e. drivers) can run on big or little endian systems.
Hardware Requirements and Limitations The hardware for the most effective MongoDB deployments has the
following properties:
Allocate Sufficient RAM and CPU As with all software, more RAM and a faster CPU clock speed are important
for performance.
In general, databases are not CPU bound. As such, increasing the number of cores can help, but does not provide
significant marginal return.
Use Solid State Disks (SSDs) MongoDB has good results and a good price-performance ratio with SATA SSD
(Solid State Disk).
Use SSD if available and economical. Spinning disks can be performant, but SSDs' capacity for random I/O operations
works well with the update model of mongod.
Commodity (SATA) spinning drives are often a good option, as the increase to random I/O for more expensive drives
is not that dramatic (only on the order of 2x). Using SSDs or increasing RAM may be more effective in increasing I/O
throughput.
Avoid Remote File Systems
Remote file storage can create performance problems in MongoDB. See Remote Filesystems (page 156) for
more information about storage and MongoDB.
63 http://technet.microsoft.com/en-us/library/dd349797.aspx
Then, disable zone reclaim in the proc settings using the following command:
echo 0 > /proc/sys/vm/zone_reclaim_mode
To fully disable NUMA, you must perform both operations. For more information, see the Documentation for
/proc/sys/vm/*64 .
See the MySQL swap insanity problem and the effects of NUMA65 post, which describes the effects of NUMA
on databases. This blog post addresses the impact of NUMA for MySQL, but the issues for MongoDB are similar. The
post introduces NUMA and its goals, and illustrates how these goals are not compatible with production databases.
Disk and Storage Systems
Swap Assign swap space for your systems. Allocating swap space can avoid issues with memory contention and
can prevent the OOM Killer on Linux systems from killing mongod.
The method mongod uses to map memory files to memory ensures that the operating system will never store MongoDB data in swap space.
RAID Most MongoDB deployments should use disks backed by RAID-10.
RAID-5 and RAID-6 do not typically provide sufficient performance to support a MongoDB deployment.
Avoid RAID-0 with MongoDB deployments. While RAID-0 provides good write performance, it also provides limited
availability and can lead to reduced performance on read operations, particularly when using Amazon's EBS volumes.
Remote Filesystems The Network File System protocol (NFS) is not recommended for use with MongoDB as some
versions perform poorly.
Performance problems arise when both the data files and the journal files are hosted on NFS. You may experience
better performance if you place the journal on local or iSCSI volumes. If you must use NFS, add the following NFS
options to your /etc/fstab file: bg, nolock, and noatime.
64 http://www.kernel.org/doc/Documentation/sysctl/vm.txt
65 http://jcole.us/blog/archives/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/
Separate Components onto Different Storage Devices For improved performance, consider separating your
database's data, journal, and logs onto different storage devices, based on your application's access and write pattern.
Note: This will affect your ability to create snapshot-style backups of your data, since the files will be on different
devices and volumes.
Architecture
Write Concern Write concern describes the guarantee that MongoDB provides when reporting on the success of
a write operation. The strength of the write concern determines the level of guarantee. When inserts, updates and
deletes have a weak write concern, write operations return quickly. In some failure cases, write operations issued with
weak write concerns may not persist. With stronger write concerns, clients wait after sending a write operation for
MongoDB to confirm the write operations.
MongoDB provides different levels of write concern to better address the specific needs of applications. Clients
may adjust write concern to ensure that the most important operations persist successfully to an entire MongoDB
deployment. For other less critical operations, clients can adjust the write concern to ensure faster performance rather
than ensure persistence to the entire deployment.
See Write Concern (page 47) for more information about choosing an appropriate write concern level for your deployment.
Replica Sets See Replica Set Deployment Architectures (page 390) for an overview of architectural considerations
for replica set deployments.
Sharded Clusters See Production Cluster Architecture (page 490) for an overview of recommended sharded cluster
architectures for production deployments.
Platforms
MongoDB on Linux
Important: The following discussion only applies to Linux, and therefore does not affect deployments where
mongod instances run on other UNIX-like systems or on Windows.
Kernel and File Systems When running MongoDB in production on Linux, it is recommended that you use Linux
kernel version 2.6.36 or later.
MongoDB preallocates its database files before using them and often creates large files. As such, you should use the
Ext4 or XFS file systems:
In general, if you use the Ext4 file system, use at least version 2.6.23 of the Linux Kernel.
In general, if you use the XFS file system, use at least version 2.6.25 of the Linux Kernel.
Some Linux distributions require different versions of the kernel to support using ext4 and/or xfs:
Linux Distribution                 Filesystem   Kernel Version
CentOS 5.5                         ext4, xfs    2.6.18-194.el5
CentOS 5.6                         ext4, xfs    2.6.18-238.el5
CentOS 5.8                         ext4, xfs    2.6.18-308.8.2.el5
CentOS 6.1                         ext4, xfs    2.6.32-131.0.15.el6.x86_64
RHEL 5.6                           ext4         2.6.18-238
RHEL 6.0                           xfs          2.6.32-71
Ubuntu 10.04.4 LTS                 ext4, xfs    2.6.32-38-server
Amazon Linux AMI release 2012.03   ext4         3.2.12-3.2.4.amzn1.x86_64
Important: MongoDB requires a filesystem that supports fsync() on directories. For example, HGFS and VirtualBox's
shared folders do not support this operation.
Recommended Configuration
Turn off atime for the storage volume containing the database files.
Set the file descriptor limit, -n, and the user process limit (ulimit), -u, above 20,000, according to the suggestions in the UNIX ulimit Settings (page 223). A low ulimit will affect MongoDB when under heavy use and can
produce errors and lead to failed connections to MongoDB processes and loss of service.
Disable transparent huge pages as MongoDB performs better with normal (4096 bytes) virtual memory pages.
Disable NUMA in your BIOS. If that is not possible see MongoDB on NUMA Hardware (page 156).
Ensure that readahead settings for the block devices that store the database files are appropriate. For random
access use patterns, set low readahead values. A readahead of 32 (16 KB) often works well.
Use the Network Time Protocol (NTP) to synchronize time among your hosts. This is especially important in
sharded clusters.
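The readahead figure in the list above is counted in 512-byte sectors, the unit that blockdev --getra and --setra use on Linux, which is why 32 corresponds to 16 KB. A minimal sketch of that conversion in plain JavaScript follows; the helper name is made up for illustration and is not part of MongoDB or any system tool.

```javascript
// Convert a block-device readahead value, counted in 512-byte sectors,
// to kilobytes. The helper name is hypothetical, for illustration only.
var SECTOR_BYTES = 512;

function readaheadSectorsToKB(sectors) {
  return (sectors * SECTOR_BYTES) / 1024;
}

var kb = readaheadSectorsToKB(32); // 32 * 512 / 1024 = 16
```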
MongoDB on Virtual Environments This section describes considerations when running MongoDB in some of the
more common virtual environments.
EC2 MongoDB is compatible with EC2 and requires no configuration changes specific to the environment.
You may alternately choose to obtain a set of Amazon Machine Images (AMI) that bundle together MongoDB and
Amazon's Provisioned IOPS storage volumes. Provisioned IOPS can greatly increase MongoDB's performance and
ease of use. For more information, see this blog post66 .
VMWare MongoDB is compatible with VMWare. As some users have run into issues with VMWare's memory
overcommit feature, disabling the feature is recommended.
It is possible to clone a virtual machine running MongoDB. You might use this function to spin up a new virtual host
to add as a member of a replica set. If you clone a VM with journaling enabled, the clone snapshot will be valid. If
not using journaling, first stop mongod, then clone the VM, and finally, restart mongod.
OpenVZ Some users have had issues when running MongoDB on some older versions of OpenVZ due to its handling
of virtual memory, as with VMWare.
This issue seems to have been resolved in the more recent versions of OpenVZ.
66 http://www.mongodb.com/blog/post/provisioned-iops-aws-marketplace-significantly-boosts-mongodb-performance-ease-use
Performance Monitoring
iostat On Linux, use the iostat command to check if disk I/O is a bottleneck for your database. Specify a number
of seconds when running iostat to avoid displaying stats covering the time since server boot.
For example, the following command will display extended statistics and the time for each displayed report, with
traffic in MB/s, at one second intervals:
iostat -xmt 1
To make backups of your MongoDB database, please refer to MongoDB Backup Methods (page 136).
Operational Segregation in MongoDB Deployments (page 160) MongoDB lets you specify that certain application
operations use certain mongod instances.
Tag Aware Sharding (page 540) Tags associate specific ranges of shard key values with specific shards for use in
managing deployment patterns.
Manage Shard Tags (page 541) Use tags to associate specific ranges of shard key values with specific shards.
Operational Segregation in MongoDB Deployments
Operational Overview MongoDB includes a number of features that allow database administrators and developers
to segregate application operations to MongoDB deployments by functional or geographical groupings.
This capability provides data center awareness, which allows applications to target MongoDB deployments with
consideration of the physical location of the mongod instances. MongoDB supports segmentation of operations
across different dimensions, which may include multiple data centers and geographical regions in multi-data center
deployments, racks, networks, or power circuits in single data center deployments.
MongoDB also supports segregation of database operations based on functional or operational parameters, to ensure
that certain mongod instances are only used for reporting workloads or that certain high-frequency portions of a
sharded collection only exist on specific shards.
Specifically, with MongoDB, you can:
ensure write operations propagate to specific members of a replica set, or to specific members of replica sets.
ensure that specific members of a replica set respond to queries.
ensure that specific ranges of your shard key balance onto and reside on specific shards.
combine the above features in a single distributed deployment, on a per-operation (for read and write operations)
and collection (for chunk distribution in sharded clusters) basis.
For full documentation of these features, see the following documentation in the MongoDB Manual:
Read Preferences (page 406), which controls how drivers help applications target read operations to members
of a replica set.
Write Concerns (page 47), which controls how MongoDB ensures that write operations propagate to members
of a replica set.
Replica Set Tags (page 450), which control how applications create and interact with custom groupings of replica
set members to create custom application-specific read preferences and write concerns.
Tag Aware Sharding (page 540), which allows MongoDB administrators to define an application-specific balancing policy, to control how documents belonging to specific ranges of a shard key distribute to shards in the
sharded cluster.
See also:
Before adding operational segregation features to your application and MongoDB deployment, become familiar with
all documentation of replication (page 377) and sharding.
Further Reading
The Write Concern (page 47) and Read Preference (page 406) documents, which address capabilities related to
data center awareness.
Deploy a Geographically Redundant Replica Set (page 425).
Capped Collections
Capped collections are fixed-size collections that support high-throughput operations that insert, retrieve, and delete
documents based on insertion order. Capped collections work in a way similar to circular buffers: once a collection
fills its allocated space, it makes room for new documents by overwriting the oldest documents in the collection.
See createCollection() or create for more information on creating capped collections.
Capped collections have the following behaviors:
Capped collections guarantee preservation of the insertion order. As a result, queries do not need an index to
return documents in insertion order. Without this indexing overhead, they can support higher insertion throughput.
Capped collections guarantee that insertion order is identical to the order on disk (natural order) and do so
by prohibiting updates that increase document size. Capped collections only allow updates that fit the original
document size, which ensures a document does not change its location on disk.
Capped collections automatically remove the oldest documents in the collection without requiring scripts or
explicit remove operations.
For example, the oplog.rs collection that stores a log of the operations in a replica set uses a capped collection.
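The circular-buffer behavior described above can be illustrated with a small model in plain JavaScript. This is only a sketch under simplifying assumptions: it bounds the model by a document count rather than by a size in bytes, and the CappedModel name and its methods are invented for the example. It does not reflect how mongod stores data.

```javascript
// Plain JavaScript model of a capped collection: a fixed-capacity,
// insertion-ordered store that overwrites its oldest entries.
// Illustration only; CappedModel is a hypothetical name.
function CappedModel(maxDocs) {
  this.maxDocs = maxDocs;
  this.docs = [];
}

// Insert a document, evicting the oldest one when the model is full.
CappedModel.prototype.insert = function (doc) {
  if (this.docs.length === this.maxDocs) {
    this.docs.shift(); // drop the oldest document
  }
  this.docs.push(doc);
};

// Return documents in insertion (natural) order, or reversed on request.
CappedModel.prototype.find = function (reverse) {
  var copy = this.docs.slice();
  return reverse ? copy.reverse() : copy;
};

var log = new CappedModel(3);
for (var i = 1; i <= 5; i++) {
  log.insert({ seq: i });
}
// log.find() now holds seq 3, 4, 5; seq 1 and 2 were overwritten.
```

No index is involved in either read path of the model, which mirrors why insertion-order reads on a capped collection do not need one.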
Consider the following potential use cases for capped collections:
Store log information generated by high-volume systems. Inserting documents in a capped collection without
an index is close to the speed of writing log information directly to a file system. Furthermore, the built-in
first-in-first-out property maintains the order of events, while managing storage use.
Cache small amounts of data in a capped collection. Since caches are read rather than write heavy, you would
either need to ensure that this collection always remains in the working set (i.e. in RAM) or accept some write
penalty for the required index or indexes.
Recommendations and Restrictions
You can update documents in a collection after inserting them. However, these updates cannot cause the documents to grow. If the update operation causes the document to grow beyond its original size, the update
operation will fail.
If you plan to update documents in a capped collection, create an index so that these update operations do not
require a table scan.
You cannot delete documents from a capped collection. To remove all records from a capped collection, use the
emptycapped command. To remove the collection entirely, use the drop() method.
You cannot shard a capped collection.
Capped collections created after 2.2 have an _id field and an index on the _id field by default. Capped
collections created before 2.2 do not have an index on the _id field by default. If you are using capped
collections with replication prior to 2.2, you should explicitly create an index on the _id field.
Warning: If you have a capped collection in a replica set outside of the local database, before 2.2,
you should create a unique index on _id. Ensure uniqueness using the unique: true option to
the ensureIndex() method or by using an ObjectId for the _id field. Alternately, you can pass the
autoIndexId option to create when creating the capped collection, as in the Query a Capped Collection (page 162) procedure.
Use natural ordering to retrieve the most recently inserted elements from the collection efficiently. This is
(somewhat) analogous to tail on a log file.
Procedures
Create a Capped Collection You must create capped collections explicitly using the createCollection()
method, which is a helper in the mongo shell for the create command. When creating a capped collection you must
specify the maximum size of the collection in bytes, which MongoDB will pre-allocate for the collection. The size of
the capped collection includes a small amount of space for internal overhead.
db.createCollection( "log", { capped: true, size: 100000 } )
You may also specify a maximum number of documents for the collection using the max field as in the
following document:
db.createCollection("log", { capped : true, size : 5242880, max : 5000 } )
Important: The size argument is always required, even when you specify the max number of documents. MongoDB
will remove older documents if a collection reaches the maximum size limit before it reaches the maximum document
count.
See createCollection() and create.
Query a Capped Collection If you perform a find() on a capped collection with no ordering specified, MongoDB
guarantees that the ordering of results is the same as the insertion order.
To retrieve documents in reverse insertion order, issue find() along with the sort() method with the $natural
parameter set to -1, as shown in the following example:
db.cappedCollection.find().sort( { $natural: -1 } )
Check if a Collection is Capped Use the isCapped() method to determine if a collection is capped, as follows:
db.collection.isCapped()
Convert a Collection to Capped You can convert a non-capped collection to a capped collection with the
convertToCapped command:
db.runCommand({"convertToCapped": "mycoll", size: 100000});
The size parameter specifies the size of the capped collection in bytes.
Warning: This command obtains a global write lock and will block other operations until it has completed.
Changed in version 2.2: Before 2.2, capped collections did not have an index on _id unless you specified
autoIndexId to the create command. After 2.2, this became the default.
Automatically Remove Data After a Specified Period of Time For additional flexibility when expiring data, consider MongoDB's TTL indexes, as described in Expire Data from Collections by Setting TTL (page 163). These indexes
allow you to expire and remove data from normal collections using a special type, based on the value of a date-typed
field and a TTL value for the index.
TTL Collections (page 163) are not compatible with capped collections.
Tailable Cursor You can use a tailable cursor with capped collections. Similar to the Unix tail -f command,
the tailable cursor tails the end of a capped collection. As new documents are inserted into the capped collection,
you can use the tailable cursor to continue retrieving documents.
See Create Tailable Cursor (page 74) for information on creating a tailable cursor.
Expire Data from Collections by Setting TTL
New in version 2.2.
This document provides an introduction to MongoDB's time to live or TTL collection feature. TTL collections
make it possible to store data in MongoDB and have the mongod automatically remove data after a specified number
of seconds or at a specific clock time.
Data expiration is useful for some classes of information, including machine generated event data, logs, and session
information that only need to persist for a limited period of time.
A special index type supports the implementation of TTL collections. TTL relies on a background thread in mongod
that reads the date-typed values in the index and removes expired documents from the collection.
Considerations
When you build a TTL index in the background (page 336), the TTL thread can begin deleting documents
while the index is building. If you build a TTL index in the foreground, MongoDB begins removing expired
documents as soon as the index finishes building.
When the TTL thread is active, you will see delete (page 42) operations in the output of db.currentOp() or in the
data collected by the database profiler (page 174).
When using TTL indexes on replica sets, the TTL background thread only deletes documents on primary members.
However, the TTL background thread does run on secondaries. Secondary members replicate deletion operations from
the primary.
The TTL index does not guarantee that expired data will be deleted immediately. There may be a delay between the
time a document expires and the time that MongoDB removes the document from the database.
The background task that removes expired documents runs every 60 seconds. As a result, documents may remain in a
collection after they expire but before the background task runs or completes.
The duration of the removal operation depends on the workload of your mongod instance. Therefore, expired data
may exist for some time beyond the 60 second period between runs of the background task.
All collections with an index using the expireAfterSeconds option have usePowerOf2Sizes enabled. Users
cannot modify this setting. As a result of enabling usePowerOf2Sizes, MongoDB must allocate more disk space
relative to data size. This approach helps mitigate the possibility of storage fragmentation caused by frequent delete
operations and leads to more predictable storage use patterns.
Procedures
To enable TTL for a collection, use the ensureIndex() method to create a TTL index, as shown in the examples
below.
With the exception of the background thread, a TTL index supports queries in the same way normal indexes do. You
can use TTL indexes to expire documents in one of two ways, either:
remove documents a certain number of seconds after creation. The index will support queries for the creation
time of the documents. Alternately,
specify an explicit expiration time. The index will support queries for the expiration-time of the document.
Expire Documents after a Certain Number of Seconds To expire data after a certain number of seconds, create
a TTL index on a field that holds values of BSON date type or an array of BSON date-typed objects and specify a
positive non-zero value in the expireAfterSeconds field. A document will expire when the number of seconds
in the expireAfterSeconds field has passed since the time specified in its indexed field. 68
For example, the following operation creates an index on the log.events collection's createdAt field and specifies the expireAfterSeconds value of 3600 to set the expiration time to be one hour after the time specified by
createdAt.
db.log.events.ensureIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } )
When adding documents to the log.events collection, set the createdAt field to the current time:
db.log.events.insert( {
"createdAt": new Date(),
"logEvent": 2,
"logMessage": "Success!"
} )
MongoDB will automatically delete documents from the log.events collection when the documents' createdAt
value is older than the number of seconds specified in expireAfterSeconds.
Expire Documents at a Certain Clock Time To expire documents at a certain clock time, begin by creating a
TTL index on a field that holds values of BSON date type or an array of BSON date-typed objects and specify an
expireAfterSeconds value of 0. For each document in the collection, set the indexed date field to a value
corresponding to the time the document should expire. If the indexed date field contains a date in the past, MongoDB
considers the document expired.
For example, the following operation creates an index on the app.events collection's expireAt field and specifies
the expireAfterSeconds value of 0:
db.app.events.ensureIndex( { "expireAt": 1 }, { expireAfterSeconds: 0 } )
For each document, set the value of expireAt to correspond to the time the document should expire. For instance,
the following insert() operation adds a document that should expire at July 22, 2013 14:00:00.
68 If the field contains an array of BSON date-typed objects, data expires if at least one of the BSON date-typed objects is older than the number of
seconds specified in expireAfterSeconds.
db.app.events.insert( {
"expireAt": new Date('July 22, 2013 14:00:00'),
"logEvent": 2,
"logMessage": "Success!"
} )
MongoDB will automatically delete documents from the app.events collection when the documents' expireAt
value is older than the number of seconds specified in expireAfterSeconds, i.e. 0 seconds older in this case. As
such, the data expires at the specified expireAt value.
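Both expiry modes follow the same rule: a document becomes eligible for removal once the indexed date is older than expireAfterSeconds. The following plain JavaScript function is a sketch of that rule only. The isExpired name is invented for the example, the treatment of non-date values is an assumption of the model, and the real check runs inside the mongod background thread, which may remove documents some time after they become eligible.

```javascript
// Model of the TTL eligibility rule (illustration only; isExpired is a
// hypothetical helper, not a MongoDB function).
// `value` is the indexed field: a Date or an array of Dates.
function isExpired(value, expireAfterSeconds, now) {
  var dates = Array.isArray(value) ? value : [value];
  var cutoff = now.getTime() - expireAfterSeconds * 1000;
  for (var i = 0; i < dates.length; i++) {
    // Assumption of this model: non-date values never cause expiry.
    if (dates[i] instanceof Date && dates[i].getTime() < cutoff) {
      return true; // at least one date is older than the cutoff
    }
  }
  return false;
}

var now = new Date("2013-07-22T14:00:00Z");
var created = new Date("2013-07-22T12:30:00Z");

// expireAfterSeconds: 3600 with a createdAt that is 90 minutes old.
var expiredAfterHour = isExpired(created, 3600, now);
// expireAfterSeconds: 0 with an expireAt that is already in the past.
var expiredAtClock = isExpired(created, 0, now);
// A document created 30 minutes ago is not yet eligible.
var notYet = isExpired(new Date("2013-07-22T13:30:00Z"), 3600, now);
```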
MongoDB provides a database profiler that shows performance characteristics of each operation against the database.
Use the profiler to locate any queries or write operations that are running slow. You can use this information, for
example, to determine what indexes to create.
For more information, see Database Profiling (page 144).
Use db.currentOp() to Evaluate mongod Operations
The explain() method returns statistics on a query, and reports the index MongoDB selected to fulfill the query, as
well as information about the internal operation of the query.
Example
To use explain() on a query for documents matching the expression { a: 1 }, in a collection named
records, use an operation that resembles the following in the mongo shell:
db.records.find( { a: 1 } ).explain()
Capped Collections (page 161) are circular, fixed-size collections that keep documents well-ordered, even without the
use of an index. This means that capped collections can receive very high-speed writes and sequential reads.
These collections are particularly useful for keeping log files but are not limited to that purpose. Use capped collections
where appropriate.
Use Natural Order for Fast Reads
To return documents in the order they exist on disk, use sort operations with the $natural operator. On a
capped collection, this also returns the documents in the order in which they were written.
Natural order does not use indexes but can be fast for operations when you want to select the first or last items on disk.
See also:
sort() and limit().
Optimize Query Performance
Create Indexes to Support Queries
For commonly issued queries, create indexes (page 313). If a query searches multiple fields, create a compound index
(page 322). Scanning an index is much faster than scanning a collection. The index structures are smaller than the
documents they reference, and store references in order.
Example
If you have a posts collection containing blog posts, and if you regularly issue a query that sorts on the
author_name field, then you can optimize the query by creating an index on the author_name field:
db.posts.ensureIndex( { author_name : 1 } )
Indexes also improve efficiency on queries that routinely sort on a given field.
Example
If you regularly issue a query that sorts on the timestamp field, then you can optimize the query by creating an
index on the timestamp field:
db.posts.ensureIndex( { timestamp : 1 } )
Because MongoDB can read indexes in both ascending and descending order, the direction of a single-key index does
not matter.
Indexes support queries, update operations, and some phases of the aggregation pipeline (page 283).
Index keys that are of the BinData type are more efficiently stored in the index if:
the binary subtype value is in the range of 0-7 or 128-135, and
the length of the byte array is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, or 32.
Limit the Number of Query Results to Reduce Network Demand
MongoDB cursors return results in groups of multiple documents. If you know the number of results you want, you
can reduce the demand on network resources by issuing the limit() method.
This is typically used in conjunction with sort operations. For example, if you need only 10 results from your query to
the posts collection, you would issue the following command:
db.posts.find().sort( { timestamp : -1 } ).limit(10)
When you need only a subset of fields from documents, you can achieve better performance by returning only the
fields you need:
For example, if in your query to the posts collection, you need only the timestamp, title, author, and
abstract fields, you would issue the following command:
db.posts.find( {}, { timestamp : 1 , title : 1 , author : 1 , abstract : 1} ).sort( { timestamp : -1 } ).limit(10)
For more information on using projections, see Limit Fields to Return from a Query (page 64).
Use $hint to Select a Particular Index
In most cases the query optimizer (page 37) selects the optimal index for a specific operation; however, you can force
MongoDB to use a specific index using the hint() method. Use hint() to support performance testing, or on
some queries where you must select a field or fields included in several indexes.
Use the Increment Operator to Perform Operations Server-Side
Use MongoDB's $inc operator to increment or decrement values in documents. The operator increments the value
of the field on the server side, as an alternative to selecting a document, making simple modifications in the client
and then writing the entire document to the server. The $inc operator can also help avoid race conditions, which
would result when two application instances queried for a document, manually incremented a field, and saved the
entire document back at the same time.
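As a sketch of the server-side approach, the helper below wraps a single update() call that uses $inc. The incrementViews name and the views field are hypothetical and chosen only for this example; the collection argument is assumed to be a mongo shell collection object such as db.posts.

```javascript
// Hypothetical helper: increment a counter server-side with $inc instead of
// reading the document, changing it in the client, and saving it back.
// `collection` is assumed to be a mongo shell collection object such as
// db.posts; the `views` field name is made up for this example.
function incrementViews(collection, postId, amount) {
  return collection.update(
    { _id: postId },            // select the document to change
    { $inc: { views: amount } } // let the server apply the increment
  );
}
```

In the mongo shell, a call such as incrementViews(db.posts, postId, 1) would send one update operation, and a negative amount decrements the field. Because the client never reads the current value, two concurrent callers cannot overwrite each other's increments.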
Design Notes
This page details features of MongoDB that may be important to bear in mind when designing your applications.
Schema Considerations
Dynamic Schema Data in MongoDB has a dynamic schema. Collections do not enforce document structure. This
facilitates iterative development and polymorphism. Nevertheless, collections often hold documents with highly homogeneous structures. See Data Modeling Concepts (page 99) for more information.
Some operational considerations include:
the exact set of collections to be used;
the indexes to be used: with the exception of the _id index, all indexes must be created explicitly;
shard key declarations: choosing a good shard key is very important as the shard key cannot be changed once
set.
Avoid importing unmodified data directly from a relational database. In general, you will want to roll up certain
data into richer documents that take advantage of MongoDB's support for sub-documents and nested arrays.
Case Sensitive Strings MongoDB strings are case sensitive. So a search for "joe" will not find "Joe".
Consider:
storing data in a normalized case format, or
using regular expressions ending with /i, and/or
using $toLower or $toUpper in the aggregation framework (page 281).
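For the regular expression option, one possible approach is to build an anchored pattern with the i flag from the search string. The sketch below is plain JavaScript; both helper names are made up for illustration, and escaping the input is an added precaution rather than something the text above requires.

```javascript
// Build an anchored, case-insensitive regular expression for an exact
// match on a user-supplied string. Helper names are illustrative only.
function escapeRegExp(text) {
  // Escape characters that have special meaning in a regular expression.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function caseInsensitiveExact(text) {
  return new RegExp("^" + escapeRegExp(text) + "$", "i");
}

var pattern = caseInsensitiveExact("joe");
// pattern.test("Joe") is true; pattern.test("joey") is false.
```

The resulting pattern could then be used as a query value, for example db.users.find( { name: pattern } ), where the users collection and name field are hypothetical.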
Type Sensitive Fields MongoDB data is stored in the BSON69 format, a binary encoded serialization of JSON-like
documents. BSON encodes additional type information. See bsonspec.org70 for more information.
Consider the following document which has a field x with the string value "123":
{ x : "123" }
Then the following query which looks for a number value 123 will not return that document:
db.mycollection.find( { x : 123 } )
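Because matching is type sensitive, an application that receives values as strings may want to convert them before building a query against a numeric field. The helper below is a sketch of that idea in plain JavaScript; numericEquality is a hypothetical name, and rejecting blank or non-numeric input is a design choice of the example.

```javascript
// Hypothetical helper: coerce a raw input string to a number before using
// it in a query document, so that it can match numeric field values.
function numericEquality(field, rawValue) {
  var text = String(rawValue).trim();
  var value = Number(text);
  if (text === "" || isNaN(value)) {
    throw new Error("not a number: " + rawValue);
  }
  var query = {};
  query[field] = value;
  return query;
}

var q = numericEquality("x", "123"); // { x: 123 }, a number rather than "123"
```

The reverse also applies: if the stored field holds the string "123", the query value must be a string as well.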
General Considerations
By Default, Updates Affect one Document To update multiple documents that meet your query criteria, set the
update multi option to true or 1. See: Update Multiple Documents (page 45).
Prior to MongoDB 2.2, you would specify the upsert and multi options in the update method as positional
boolean options. See: the update method reference documentation.
BSON Document Size Limit The BSON Document Size limit is currently set at 16MB per document. If you
require larger documents, use GridFS (page 104).
No Fully Generalized Transactions MongoDB does not have fully generalized transactions (page 76). If you
model your data using rich documents that closely resemble your application's objects, each logical object will be in
one MongoDB document. MongoDB allows you to modify a document in a single atomic operation. This kind of
data modification pattern covers most common uses of transactions in other systems.
69 http://docs.mongodb.org/meta-driver/latest/legacy/bson/
70 http://bsonspec.org/#/specification
Use an Odd Number of Replica Set Members Replica sets (page 377) perform consensus elections. To ensure
that elections will proceed successfully, either use an odd number of members, typically three, or else use an arbiter
to ensure an odd number of votes.
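The preference for an odd number of votes follows from majority arithmetic: an election needs more than half of the votes, so adding one member to an odd-sized set raises the majority without raising the number of failures the set can tolerate. The plain JavaScript below is a sketch of that arithmetic; the function names are illustrative and it assumes every member holds one vote.

```javascript
// Votes needed to win an election: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Members that can be unavailable while a majority can still be formed.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

var summary = [3, 4, 5].map(function (n) {
  return { members: n, majority: majority(n), tolerates: faultTolerance(n) };
});
// 3 members: majority 2, tolerates 1
// 4 members: majority 3, tolerates 1 (no gain over 3 members)
// 5 members: majority 3, tolerates 2
```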
Keep Replica Set Members Up-to-Date MongoDB replica sets support automatic failover (page 397). It is important for your secondaries to be up-to-date. There are various strategies for assessing consistency:
1. Use monitoring tools to alert you to lag events. See Monitoring for MongoDB (page 138) for a detailed discussion of MongoDB's monitoring options.
2. Specify appropriate write concern.
3. If your application requires manual fail over, you can configure your secondaries as priority 0 (page 386).
Priority 0 secondaries require manual action for a failover. This may be practical for a small replica set, but
large deployments should fail over automatically.
See also:
replica set rollbacks (page 401).
Sharding Considerations
Pick your shard keys carefully. You cannot choose a new shard key for a collection that is already sharded.
Shard key values are immutable.
When enabling sharding on an existing collection, MongoDB imposes a maximum size on those collections to ensure that it is possible to create chunks. For a detailed explanation of this limit, see the documentation on the data size limit for sharding an existing collection.
To shard large amounts of data, create a new empty sharded collection, and ingest the data from the source
collection using an application level import operation.
Unique indexes are not enforced across shards except for the shard key itself. See Enforce Unique Keys for
Sharded Collections (page 542).
Consider pre-splitting (page 506) a sharded collection before a massive bulk import.
Backup and Restore Sharded Clusters (page 198) Detailed procedures and considerations for backing up
sharded clusters and single shards.
Recover Data after an Unexpected Shutdown (page 203) Recover data from MongoDB data files that were not
properly closed or have an invalid state.
MongoDB Scripting (page 205) An introduction to the scripting capabilities of the mongo shell and the scripting
capabilities embedded in MongoDB instances.
MongoDB Tutorials (page 186) A complete list of tutorials in the MongoDB Manual that address MongoDB operation and use.
You specify a command first by constructing a standard BSON document whose first key is the name of the command.
For example, specify the isMaster command using the following BSON document:
{ isMaster: 1 }
Issue Commands
The mongo shell provides a helper method for running commands called db.runCommand(). The following
operation in mongo runs the above command:
db.runCommand( { isMaster: 1 } )
Many drivers (page 95) provide an equivalent for the db.runCommand() method. Internally, running commands
with db.runCommand() is equivalent to a special query against the $cmd collection.
Many common commands have their own shell helpers or wrappers in the mongo shell and drivers, such as the
db.isMaster() method in the mongo JavaScript shell.
admin Database Commands
You must run some commands on the admin database. Normally, these operations resemble the following:
use admin
db.runCommand( {buildInfo: 1} )
However, there is also a command helper that automatically runs the command in the context of the admin database:
db._adminCommand( {buildInfo: 1} )
Command Responses
All commands return, at minimum, a document with an ok field indicating whether the command has succeeded:
{ 'ok': 1 }
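A script that issues commands can test the ok field before continuing. The following is a minimal sketch; the helper name commandSucceeded is hypothetical and is not a shell built-in. It only assumes what the text states: every response is a document with an ok field, and a value of 1 means success.

```javascript
// Hypothetical helper: returns true only when the response document
// carries an ok field equal to 1.
function commandSucceeded(response) {
  return response !== null &&
         typeof response === "object" &&
         Number(response.ok) === 1;
}
```

In the mongo shell this could wrap any command result, for example commandSucceeded( db.runCommand( { isMaster: 1 } ) ).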
Start mongod
By default, MongoDB stores data in the /data/db directory. On Windows, MongoDB stores data in C:\data\db.
On all platforms, MongoDB listens for connections from clients on port 27017.
To start MongoDB using all defaults, issue the following command at the system shell:
mongod
Specify a Data Directory If you want mongod to store data files at a path other than /data/db you can specify
a dbpath. The dbpath must exist before you start mongod. If it does not exist, create the directory and set the
permissions so that mongod can read and write data to this path. For more information on permissions, see the
security operations documentation (page 238).
To specify a dbpath for mongod to use as a data directory, use the --dbpath option. The following invocation
will start a mongod instance and store data in the /srv/mongodb path:
mongod --dbpath /srv/mongodb/
Specify a TCP Port Only a single process can listen for connections on a given port of a network interface at a time. If you run
multiple mongod processes on a single machine, or have other processes that must use the default port, you must assign each
a different port to listen on for client connections.
To specify a port to mongod, use the --port option on the command line. The following command starts mongod
listening on port 12345:
mongod --port 12345
Additional Configuration Options For an overview of common configurations and deployments for common use cases, see Run-time Database Configuration (page 146).
Stop mongod
In a clean shutdown a mongod completes all pending operations, flushes all data to data files, and closes all data files.
Other shutdowns are unclean and can compromise the validity of the data files.
To ensure a clean shutdown, always shut down mongod instances using one of the following methods:
Use shutdownServer() Shut down the mongod from the mongo shell using the db.shutdownServer()
method as follows:
use admin
db.shutdownServer()
Calling the same method from a control script accomplishes the same result.
For systems with auth enabled, users may only issue db.shutdownServer() when authenticated to the admin
database or via the localhost interface on systems without authentication enabled.
Use --shutdown From the Linux command line, shut down the mongod using the --shutdown option in the
following command:
mongod --shutdown
Use CTRL-C When running the mongod instance in interactive mode (i.e. without --fork), issue Control-C
to perform a clean shutdown.
Use kill From the Linux command line, shut down a specific mongod instance using the following command:
kill <mongod process ID>
Procedure If the mongod is the primary in a replica set, the shutdown process for this mongod instance has the
following steps:
1. Check how up-to-date the secondaries are.
2. If no secondary is within 10 seconds of the primary, mongod will return a message that it will not shut down.
You can pass the shutdown command a timeoutSecs argument to wait for a secondary to catch up.
3. If there is a secondary within 10 seconds of the primary, the primary will step down and wait for the secondary
to catch up.
4. After 60 seconds or once the secondary has caught up, the primary will shut down.
Force Replica Set Shutdown If there is no up-to-date secondary and you want the primary to shut down, issue the
shutdown command with the force argument, as in the following mongo shell operation:
db.adminCommand({shutdown : 1, force : true})
To keep checking the secondaries for a specified number of seconds if none are immediately up-to-date, issue
shutdown with the timeoutSecs argument. If any of the secondaries catch up within the allotted time, the
primary will shut down. If no secondaries catch up, it will not shut down.
The following command issues shutdown with timeoutSecs set to 5:
db.adminCommand({shutdown : 1, timeoutSecs : 5})
Alternately you can use the timeoutSecs argument with the db.shutdownServer() method:
db.shutdownServer({timeoutSecs : 5})
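The decision a primary makes when asked to shut down can be summarized as a small function. This is only a model of the procedure described above, not the server's implementation: the function name, the result strings, and the treatment of "within 10 seconds" as "lag of 10 seconds or less" are assumptions, and the model ignores the waiting behavior of timeoutSecs.

```javascript
// Lag window described in the procedure, in seconds.
var LAG_WINDOW_SECS = 10;

// secondaryLagSecs: array with the replication lag of each secondary, in seconds.
// force: true when the shutdown command is issued with force : true.
function primaryShutdownDecision(secondaryLagSecs, force) {
  var hasCloseSecondary = secondaryLagSecs.some(function (lag) {
    return lag <= LAG_WINDOW_SECS;
  });
  if (hasCloseSecondary) {
    // Steps 3 and 4: step down, wait for the secondary, then shut down.
    return "STEP_DOWN_THEN_SHUT_DOWN";
  }
  if (force === true) {
    // No up-to-date secondary, but the operator forced the shutdown.
    return "SHUT_DOWN_FORCED";
  }
  // Step 2: no secondary is close enough, so the primary refuses.
  return "REFUSE";
}
```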
You can enable database profiling from the mongo shell or through a driver using the profile command. This
section will describe how to do so from the mongo shell. See your driver documentation (page 95) if you want to
control the profiler from within your application.
When you enable profiling, you also set the profiling level (page 174). The profiler records data in the
system.profile (page 227) collection. MongoDB creates the system.profile (page 227) collection in a
database after you enable profiling for that database.
To enable profiling and set the profiling level, use the db.setProfilingLevel() helper in the mongo shell,
passing the profiling level as a parameter. For example, to enable profiling for all database operations, consider the
following operation in the mongo shell:
db.setProfilingLevel(2)
The shell returns a document showing the previous level of profiling. The "ok" : 1 key-value pair indicates that the
operation succeeded.
To verify the new setting, see the Check Profiling Level (page 175) section.
Specify the Threshold for Slow Operations The threshold for slow operations applies to the entire mongod instance. When you change the threshold, you change it for all databases on the instance.
Important: Changing the slow operation threshold for the database profiler also affects the profiling subsystem's
slow operation threshold for the entire mongod instance. Always set the threshold to the highest useful value.
By default the slow operation threshold is 100 milliseconds. Databases with a profiling level of 1 will log operations
slower than 100 milliseconds.
To change the threshold, pass two parameters to the db.setProfilingLevel() helper in the mongo shell. The
first parameter sets the profiling level for the current database, and the second sets the default slow operation threshold
for the entire mongod instance.
For example, the following command sets the profiling level for the current database to 0, which disables profiling,
and sets the slow-operation threshold for the mongod instance to 20 milliseconds. Any database on the instance with
a profiling level of 1 will use this threshold:
db.setProfilingLevel(0,20)
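How the level and the threshold combine can be modeled in a few lines. The function below is an illustrative sketch with a hypothetical name, not server code. It assumes, as the surrounding sections describe, that level 0 records nothing, level 1 records only operations slower than the threshold, and level 2 records all operations.

```javascript
// level: profiling level of the database (0, 1, or 2).
// slowms: instance-wide slow operation threshold, in milliseconds.
// opMillis: duration of the operation, in milliseconds.
function isProfiled(level, slowms, opMillis) {
  if (level === 2) {
    return true;              // all operations are recorded
  }
  if (level === 1) {
    return opMillis > slowms; // only operations slower than the threshold
  }
  return false;               // level 0: profiling disabled
}
```

With the command above, a database left at level 0 records nothing, while any other database on the instance at level 1 records operations slower than 20 milliseconds.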
Check Profiling Level To view the profiling level (page 174), issue the following from the mongo shell:
db.getProfilingStatus()
Disable Profiling To disable profiling, use the following helper in the mongo shell:
db.setProfilingLevel(0)
Enable Profiling for an Entire mongod Instance For development purposes in testing environments, you can
enable database profiling for an entire mongod instance. The profiling level applies to all databases provided by the
mongod instance.
To enable profiling for a mongod instance, pass the following parameters to mongod at startup or within the
configuration file:
mongod --profile=1 --slowms=15
This sets the profiling level to 1, which collects profiling data for slow operations only, and defines slow operations as
those that last longer than 15 milliseconds.
See also:
profile and slowms.
Database Profiling and Sharding You cannot enable profiling on a mongos instance. To enable profiling in a
sharded cluster, you must enable profiling for each mongod instance in the cluster.
View Profiler Data
The database profiler logs information about database operations in the system.profile (page 227) collection.
To view profiling information, query the system.profile (page 227) collection. To view example queries, see
Example Profiler Data Queries (page 176).
For an explanation of the output data, see Database Profiler Output (page 230).
Example Profiler Data Queries This section displays example queries to the system.profile (page 227) collection. For an explanation of the query output, see Database Profiler Output (page 230).
To return the most recent 10 log entries in the system.profile (page 227) collection, run a query similar to the
following:
db.system.profile.find().limit(10).sort( { ts : -1 } ).pretty()
To return all operations except command operations ($cmd), run a query similar to the following:
db.system.profile.find( { op: { $ne : 'command' } } ).pretty()
To return operations for a particular collection, run a query similar to the following. This example returns operations
in the mydb database's test collection:
db.system.profile.find( { ns : 'mydb.test' } ).pretty()
To return operations slower than 5 milliseconds, run a query similar to the following:
db.system.profile.find( { millis : { $gt : 5 } } ).pretty()
To return information from a certain time range, run a query similar to the following:
db.system.profile.find(
  {
    ts : {
      $gt : new ISODate("2012-12-09T03:00:00Z") ,
      $lt : new ISODate("2012-12-09T03:40:00Z")
    }
  }
).pretty()
The following example looks at the time range, suppresses the user field from the output to make it easier to read,
and sorts the results by how long each operation took to run:
db.system.profile.find(
  {
    ts : {
      $gt : new ISODate("2011-07-12T03:00:00Z") ,
      $lt : new ISODate("2011-07-12T03:40:00Z")
    }
  },
  { user : 0 }
).sort( { millis : -1 } )
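When the same kinds of filters are used repeatedly, the query document can be built by a small helper. The sketch below is hypothetical (the name profileQuery and its parameters are not part of the shell); it combines the time-range and millis filters shown in the examples above and uses plain JavaScript Date objects in place of ISODate.

```javascript
// Builds a query document for system.profile.
// startIso, endIso: ISO 8601 timestamps bounding the ts field (exclusive).
// minMillis: optional; when given, only operations slower than this match.
function profileQuery(startIso, endIso, minMillis) {
  var query = {
    ts: { $gt: new Date(startIso), $lt: new Date(endIso) }
  };
  if (typeof minMillis === "number") {
    query.millis = { $gt: minMillis };
  }
  return query;
}
```

In the mongo shell the result could be passed directly to find(), for example db.system.profile.find( profileQuery( "2012-12-09T03:00:00Z", "2012-12-09T03:40:00Z", 5 ) ).sort( { millis : -1 } ).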
Show the Five Most Recent Events On a database that has profiling enabled, the show profile helper in the
mongo shell displays the 5 most recent operations that took at least 1 millisecond to execute. Issue show profile
from the mongo shell, as follows:
show profile
Profiler Overhead
When enabled, profiling has a minor effect on performance. The system.profile (page 227) collection is a
capped collection with a default size of 1 megabyte. A collection of this size can typically store several thousand
profile documents, but some applications may use more or less profiling data per operation.
To change the size of the system.profile (page 227) collection, you must:
1. Disable profiling.
2. Drop the system.profile (page 227) collection.
3. Create a new system.profile (page 227) collection.
4. Re-enable profiling.
For example, to create a new system.profile (page 227) collection that is 4000000 bytes, use the following
sequence of operations in the mongo shell:
db.setProfilingLevel(0)
db.system.profile.drop()
db.createCollection( "system.profile", { capped: true, size:4000000 } )
db.setProfilingLevel(1)
To change the size of the system.profile (page 227) collection on a secondary, you
must stop the secondary, run it as a standalone, and then perform the steps above.
When done, restart the standalone as a member of the replica set.
For more information, see
http://docs.mongodb.org/manual/tutorial/perform-maintence-on-replica-set-members.
Monitor MongoDB with SNMP
New in version 2.2.
Enterprise Feature
This feature is only available in MongoDB Enterprise.
This document outlines the use and operation of MongoDB's SNMP extension, which is only available in MongoDB
Enterprise71 .
Prerequisites
Red Hat Enterprise Linux 6.x series and Amazon Linux AMI require libssl, net-snmp,
net-snmp-libs, and net-snmp-utils. Issue a command such as the following to install these packages:
sudo yum install openssl net-snmp net-snmp-libs net-snmp-utils
Configure SNMP
Start Up You can control MongoDB Enterprise using default or custom control scripts, just as with any other
mongod:
Use the following command to view all SNMP options available in your MongoDB:
mongod --help | grep snmp
--logpath /var/log/mongodb/1.log
The command should return output that includes the following line. This indicates that the proper mongod instance is
running:
systemuser 31415 10260 0 Jul13 pts/16
Test SNMP Check for the snmp agent process listening on port 1161 with the following command:
sudo lsof -i :1161
FD
10u
0 127.0.0.1:1161
0.0.0.0:*
9238/<path>/mongod
Run snmpwalk Locally snmpwalk provides tools for retrieving and parsing the SNMP data according to the
MIB. If you installed all of the required packages above, your system will have snmpwalk.
Issue the following command to collect data from mongod using SNMP:
snmpwalk -m MONGO-MIB -v 2c -c mongodb 127.0.0.1:1161 1.3.6.1.4.1.37601
You may also choose to specify the path to the MIB file:
snmpwalk -m /usr/share/snmp/mibs/MONGO-MIB -v 2c -c mongodb 127.0.0.1:1161 1.3.6.1.4.1.37601
Use this command only to ensure that you can retrieve and validate SNMP data from MongoDB.
Troubleshooting
Always check the logs for errors if something does not run as expected; see the log at /var/log/mongodb/1.log.
The presence of the following line indicates that the mongod cannot read the /etc/snmp/mongod.conf file:
Log rotation using MongoDB's standard approach archives the current log file and starts a new one. To do this, the
mongod or mongos instance renames the current log file by appending a UTC (GMT) timestamp to the filename, in
ISODate format. It then opens a new log file, closes the old log file, and sends all new log entries to the new log file.
MongoDB's standard approach to log rotation only rotates logs in response to the logRotate command, or when
the mongod or mongos process receives a SIGUSR1 signal from the operating system.
Alternately, you may configure mongod to send log data to syslog. In this case, you can take advantage of alternate
log rotation tools.
See also:
For information on logging, see the Process Logging (page 142) section.
Log Rotation With MongoDB
This is the only available method to rotate log files on Windows systems.
For Linux systems, rotate logs for a single process by issuing the following command:
kill -SIGUSR1 <mongod process id>
The results resemble the following. The timestamps will be different.
server1.log
server1.log.2011-11-24T23-30-00
The example results indicate a log rotation performed at exactly 11:30 pm on November 24th, 2011
UTC, which is the local time offset by the local time zone. The original log file is the one with the timestamp.
The new log is the server1.log file.
If you issue a second logRotate command an hour later, then an additional file would appear when listing
matching files, as in the following example:
server1.log
server1.log.2011-11-24T23-30-00
server1.log.2011-11-25T00-30-00
This operation does not modify the server1.log.2011-11-24T23-30-00 file created earlier, while
server1.log.2011-11-25T00-30-00 is the previous server1.log file, renamed. server1.log
is a new, empty file that receives all new log output.
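The naming scheme in these listings can be reproduced in a few lines, which may help when a script needs to predict or match archived file names. This is a sketch based only on the example names above; the function name rotatedLogName is hypothetical, and it assumes the suffix is the UTC time to the second with colons replaced by hyphens.

```javascript
// Returns the archive name for a log file rotated at the given moment.
// logPath: current log file name, e.g. "server1.log".
// when: a Date object; the suffix is derived from its UTC representation.
function rotatedLogName(logPath, when) {
  // toISOString() yields e.g. "2011-11-24T23:30:00.000Z";
  // keep the first 19 characters and replace ":" with "-".
  var stamp = when.toISOString().slice(0, 19).replace(/:/g, "-");
  return logPath + "." + stamp;
}
```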
Syslog Log Rotation
Manage Journaling
MongoDB uses write ahead logging to an on-disk journal to guarantee write operation (page 42) durability and to
provide crash resiliency. Before applying a change to the data files, MongoDB writes the change operation to the
journal. If MongoDB should terminate or encounter an error before it can write the changes from the journal to the
data files, MongoDB can re-apply the write operation and maintain a consistent state.
Without a journal, if mongod exits unexpectedly, you must assume your data is in an inconsistent state, and you must
run either repair (page 203) or, preferably, resync (page 449) from a clean member of the replica set.
With journaling enabled, if mongod stops unexpectedly, the program can recover everything written to the journal,
and the data remains in a consistent state. By default, the greatest extent of lost writes, i.e., those not made to the
journal, is the writes made in the last 100 milliseconds. See journalCommitInterval for more information on the
default.
With journaling, if you want a data set to reside entirely in RAM, you need enough RAM to hold the data set plus
the write working set. The write working set is the amount of unique data you expect to see written between
re-mappings of the private view. For information on views, see Storage Views used in Journaling (page 234).
Important: Changed in version 2.0: For 64-bit builds of mongod, journaling is enabled by default. For other
platforms, see journal.
Procedures
Enable Journaling Changed in version 2.0: For 64-bit builds of mongod, journaling is enabled by default.
To enable journaling, start mongod with the --journal command line option.
If no journal files exist, when mongod starts, it must preallocate new journal files. During this operation, the mongod
is not listening for connections until preallocation completes: for some systems this may take several minutes.
During this period your applications and the mongo shell are not available.
Disable Journaling
Warning: Do not disable journaling on production systems. If your mongod instance stops without shutting
down cleanly for any reason (e.g. power failure) and you are not running with journaling, then you
must recover from an unaffected replica set member or backup, as described in repair (page 203).
To disable journaling, start mongod with the --nojournal command line option.
Get Commit Acknowledgment You can get commit acknowledgment with the getLastError command and the
j option. For details, see Write Concern Reference (page 83).
Avoid Preallocation Lag To avoid preallocation lag (page 234), you can preallocate files in the journal directory by
copying them from another instance of mongod.
Preallocated files do not contain data. It is safe to later remove them. But if you restart mongod with journaling,
mongod will create them again.
Example
The following sequence preallocates journal files for an instance of mongod running on port 27017 with a database
path of /data/db.
For demonstration purposes, the sequence starts by creating a set of journal files in the usual way.
1. Create a temporary directory into which to create a set of journal files:
mkdir ~/tmpDbpath
2. Create a set of journal files by starting a mongod instance that uses the temporary directory:
mongod --port 10000 --dbpath ~/tmpDbpath --journal
3. When you see the following log output, indicating mongod has the files, press CONTROL+C to stop the
mongod instance:
[initandlisten] waiting for connections on port 10000
4. Preallocate journal files for the new instance of mongod by moving the journal files from the data directory of
the existing instance to the data directory of the new instance:
mv ~/tmpDbpath/journal /data/db/
Monitor Journal Status Use the following commands and methods to monitor journal status:
serverStatus
The serverStatus command returns database status information that is useful for assessing performance.
journalLatencyTest
Use journalLatencyTest to measure how long it takes on your volume to write to the disk in an append-only fashion. You can run this command on an idle system to get a baseline sync time for journaling. You can
also run this command on a busy system to see the sync time on a busy system, which may be higher if the
journal directory is on the same volume as the data files.
The journalLatencyTest command also provides a way to check if your disk drive is buffering writes in
its local cache. If the number is very low (i.e., less than 2 milliseconds) and the drive is non-SSD, the drive
is probably buffering writes. In that case, enable cache write-through for the device in your operating system,
unless you have a disk controller card with battery backed RAM.
Change the Group Commit Interval Changed in version 2.0.
You can set the group commit interval using the --journalCommitInterval command line option. The allowed
range is 2 to 300 milliseconds.
Lower values increase the durability of the journal at the expense of disk performance.
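A deployment script can check a proposed interval against the allowed range before passing it to mongod. The helper below is a hypothetical sketch, not part of MongoDB; the range of 2 to 300 milliseconds comes from the text, while the requirement that the value be a whole number is an assumption.

```javascript
// Returns true when ms falls in the documented range for
// --journalCommitInterval (2 to 300 milliseconds, inclusive).
// Assumption: the option takes a whole number of milliseconds.
function isValidJournalCommitInterval(ms) {
  return typeof ms === "number" &&
         isFinite(ms) &&
         Math.floor(ms) === ms &&
         ms >= 2 &&
         ms <= 300;
}
```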
Recover Data After Unexpected Shutdown On a restart after a crash, MongoDB replays all journal files in the
journal directory before the server becomes available. If MongoDB must replay journal files, mongod notes these
events in the log output.
There is no reason to run repairDatabase in these situations.
Store a JavaScript Function on the Server
Note: Avoid using server-side stored functions if possible.
There is a special system collection named system.js that can store JavaScript functions for reuse.
To store a function, you can use the db.collection.save() method, as in the following example:
db.system.js.save(
  {
    _id : "myAddFunction" ,
    value : function (x, y){ return x + y; }
  }
);
The _id field holds the name of the function and is unique per database.
The value field holds the function definition.
Once you save a function in the system.js collection, you can use the function from any JavaScript context (e.g.
eval command or the mongo shell method db.eval(), $where operator, mapReduce or mongo shell method
db.collection.mapReduce()).
Consider the following example from the mongo shell that first saves a function named echoFunction to the
system.js collection and calls the function using db.eval() method:
db.system.js.save(
{ _id: "echoFunction",
value : function(x) { return x; }
}
)
db.eval( "echoFunction( 'test' )" )
Ensure you have an up-to-date backup of your data set. See MongoDB Backup Methods (page 136).
Consult the following documents for any special considerations or compatibility issues specific to your MongoDB release:
The release notes, located at Release Notes (page 591).
The documentation for your driver. See MongoDB Drivers and Client Libraries (page 95).
If your installation includes replica sets, plan the upgrade during a predefined maintenance window.
Before you upgrade a production environment, use the procedures in this document to upgrade a staging environment that reproduces your production environment, to ensure that your production configuration is compatible
with all changes.
Upgrade Procedure
To upgrade a replica set, upgrade each member individually, starting with the secondaries and finishing with the
primary. Plan the upgrade during a predefined maintenance window.
Upgrade Secondaries Upgrade each secondary separately as follows:
1. Upgrade the secondary's mongod binary by following the instructions below in Upgrade a MongoDB Instance
(page 185).
2. After upgrading a secondary, wait for the secondary to recover to the SECONDARY state before upgrading the
next instance. To check the member's state, issue rs.status() in the mongo shell.
The secondary may briefly go into STARTUP2 or RECOVERING. This is normal. Make sure to wait for the
secondary to fully recover to SECONDARY before you continue the upgrade.
Upgrade the Primary
1. Step down the primary to initiate the normal failover (page 397) procedure, using one of the following:
The rs.stepDown() helper in the mongo shell.
The replSetStepDown database command.
During failover, the set cannot accept writes. Typically this takes 10-20 seconds. Plan the upgrade during a
predefined maintenance window.
Note: Stepping down the primary is preferable to directly shutting down the primary. Stepping down expedites
the failover procedure.
2. Once the primary has stepped down, call the rs.status() method from the mongo shell until you see that
another member has assumed the PRIMARY state.
3. Shut down the original primary and upgrade its instance by following the instructions below in Upgrade a
MongoDB Instance (page 185).
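The ordering rule above (secondaries first, primary last) can be expressed as a small planning helper. This is a sketch with hypothetical function names; it assumes a members array shaped like the one rs.status() returns, where each member has a name and a stateStr field, and it treats every non-primary member as a secondary.

```javascript
// Returns member names in the order they should be upgraded:
// every non-primary member first, then the primary.
function upgradeOrder(members) {
  var others = [];
  var primaries = [];
  members.forEach(function (m) {
    if (m.stateStr === "PRIMARY") {
      primaries.push(m.name);
    } else {
      others.push(m.name);
    }
  });
  return others.concat(primaries);
}

// True once an upgraded member has fully recovered, i.e. it is no longer
// in STARTUP2 or RECOVERING and reports the SECONDARY state.
function readyForNextUpgrade(member) {
  return member.stateStr === "SECONDARY";
}
```

In the mongo shell, upgradeOrder( rs.status().members ) would produce the plan for the current set.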
MongoDB Tutorials
This page lists the tutorials available as part of the MongoDB Manual. In addition to these documents, you can refer
to the introductory MongoDB Tutorial (page 19). If there is a process or pattern that you would like to see included
here, please open a Jira Case73 .
73 https://jira.mongodb.org/browse/DOCS
Getting Started
Replica Sets
Deploy a Replica Set (page 420)
Convert a Standalone to a Replica Set (page 432)
Add Members to a Replica Set (page 433)
Remove Members from Replica Set (page 435)
Replace a Replica Set Member (page 437)
Adjust Priority for Replica Set Member (page 437)
Resync a Member of a Replica Set (page 449)
Deploy a Geographically Redundant Replica Set (page 425)
Change the Size of the Oplog (page 445)
Force a Member to Become Primary (page 447)
Change Hostnames in a Replica Set (page 457)
Add an Arbiter to Replica Set (page 431)
Convert a Secondary to an Arbiter (page 443)
Configure a Secondary's Sync Target (page 461)
Configure a Delayed Replica Set Member (page 441)
Configure a Hidden Replica Set Member (page 439)
Configure Non-Voting Replica Set Member (page 442)
Prevent Secondary from Becoming Primary (page 438)
Configure Replica Set Tag Sets (page 450)
Manage Chained Replication (page 456)
Reconfigure a Replica Set with Unavailable Members (page 454)
Recover Data after an Unexpected Shutdown (page 203)
Troubleshoot Replica Sets (page 461)
Sharding
Deploy a Sharded Cluster (page 507)
Convert a Replica Set to a Replicated Sharded Cluster (page 515)
Add Shards to a Cluster (page 514)
Remove Shards from an Existing Sharded Cluster (page 534)
Deploy Three Config Servers for Production Deployments (page 515)
Migrate Config Servers with the Same Hostname (page 524)
Migrate Config Servers with Different Hostnames (page 524)
Replace a Config Server (page 525)
Migrate a Sharded Cluster to Different Hardware (page 526)
Backup Cluster Metadata (page 529)
Backup a Small Sharded Cluster with mongodump (page 198)
Backup a Sharded Cluster with Filesystem Snapshots (page 199)
Backup a Sharded Cluster with Database Dumps (page 200)
Restore a Single Shard (page 202)
Restore a Sharded Cluster (page 202)
Schedule Backup Window for Sharded Clusters (page 202)
Manage Shard Tags (page 541)
Basic Operations
Use Database Commands (page 170)
Recover Data after an Unexpected Shutdown (page 203)
Expire Data from Collections by Setting TTL (page 163)
Analyze Performance of Database Operations (page 174)
Rotate Log Files (page 180)
Build Old Style Indexes (page 346)
Manage mongod Processes (page 171)
Back Up and Restore with MongoDB Tools (page 195)
Backup and Restore with Filesystem Snapshots (page 190)
Security
Configure Linux iptables Firewall for MongoDB (page 245)
Configure Windows netsh Firewall for MongoDB (page 249)
Enable Authentication (page 257)
Create a User Administrator (page 258)
Add a User to a Database (page 259)
Generate a Key File (page 261)
Snapshots work by creating pointers between the live data and a special snapshot volume. These pointers are theoretically equivalent to hard links. As the working data diverges from the snapshot, the snapshot process uses a
copy-on-write strategy. As a result the snapshot only stores modified data.
After making the snapshot, you mount the snapshot image on your file system and copy data from the snapshot. The
resulting backup contains a full copy of all data.
Snapshots have the following limitations:
The database must be valid when the snapshot takes place. This means that all writes accepted by the database
need to be fully written to disk: either to the journal or to data files.
If all writes are not on disk when the backup occurs, the backup will not reflect these changes. If writes are in
progress when the backup occurs, the data files will reflect an inconsistent state. With journaling all data-file
states resulting from in-progress writes are recoverable; without journaling you must flush all pending writes
to disk before running the backup operation and must ensure that no writes occur during the entire backup
procedure.
If you do use journaling, the journal must reside on the same volume as the data.
Snapshots create an image of an entire disk. Unless you need to back up your entire system, consider
isolating your MongoDB data files, journal (if applicable), and configuration on one logical disk that does not
contain any other data.
74 http://mms.mongodb.com
Alternately, store all MongoDB data files on a dedicated device so that you can make backups without duplicating extraneous data.
Copy data from snapshots onto other systems to ensure that the data is safe from site failures.
Although different snapshot methods provide different capabilities, the LVM method outlined below does not
provide any capacity for capturing incremental backups.
Snapshots With Journaling If your mongod instance has journaling enabled, then you can use any kind of file
system or volume/block level snapshot tool to create backups.
If you manage your own infrastructure on a Linux-based system, configure your system with LVM to provide your disk
packages and provide snapshot capability. You can also use LVM-based setups within a cloud/virtualized environment.
Note: Running LVM provides additional flexibility and enables the possibility of using snapshots to back up MongoDB.
Snapshots with Amazon EBS in a RAID 10 Configuration If your deployment depends on Amazon's Elastic
Block Storage (EBS) with RAID configured within your instance, it is impossible to get a consistent state across all
disks using the platform's snapshot tool. As an alternative, you can do one of the following:
Flush all writes to disk and create a write lock to ensure consistent state during the backup process.
If you choose this option see Create Backups on Instances that do not have Journaling Enabled (page 193).
Configure LVM to run and hold your MongoDB data files on top of the RAID within your system.
If you choose this option, perform the LVM backup operation described in Create a Snapshot (page 191).
Backup and Restore Using LVM on a Linux System
This section provides an overview of a simple backup process using LVM on a Linux system. While the tools, commands, and paths may be (slightly) different on your system the following steps provide a high level overview of the
backup operation.
Note: Only use the following procedure as a guideline for a backup system and infrastructure. Production backup
systems must consider a number of application specific requirements and factors unique to specific environments.
Create a Snapshot To create a snapshot with LVM, issue a command as root in the following format:
lvcreate --size 100M --snapshot --name mdb-snap01 /dev/vg0/mongodb
This command creates an LVM snapshot (with the --snapshot option) named mdb-snap01 of the mongodb
volume in the vg0 volume group.
Warning: Ensure that you create snapshots with enough space to account for data growth, particularly for the
period of time that it takes to copy data out of the system or to a temporary image.
If your snapshot runs out of space, the snapshot image becomes unusable. Discard this logical volume and create
another.
The snapshot will exist when the command returns. You can restore directly from the snapshot at any time or by
creating a new logical volume and restoring from this snapshot to the alternate image.
While snapshots are great for creating high quality backups very quickly, they are not ideal as a format for storing
backup data. Snapshots typically depend and reside on the same storage infrastructure as the original disk images.
Therefore, it's crucial that you archive these snapshots and store them elsewhere.
Archive a Snapshot After creating a snapshot, mount the snapshot and move the data to separate storage. Your
system might try to compress the backup images as you move them offline. The following procedure fully archives the
data from the snapshot:
umount /dev/vg0/mdb-snap01
dd if=/dev/vg0/mdb-snap01 | gzip > mdb-snap01.gz
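A truncated or corrupted archive is often discovered only at restore time. The following is a minimal sketch, not part of the original procedure, that wraps the dd and gzip pipeline above in a hypothetical archive_image function and then checks the result with gzip -t. The function name and its two arguments are assumptions made for illustration.

```shell
#!/bin/sh
# archive_image SRC DEST (hypothetical helper, not a MongoDB or LVM tool):
# copy a block device or file into a gzip archive and verify the archive.
archive_image() {
    src=$1
    dest=$2
    # Refuse to continue when the source cannot be read; otherwise gzip
    # would write a valid but empty archive.
    [ -r "$src" ] || return 1
    # dd reads the raw image and gzip compresses the stream.
    dd if="$src" 2>/dev/null | gzip > "$dest" || return 1
    # gzip -t tests the integrity of the compressed file without extracting it.
    gzip -t "$dest"
}
```

For the snapshot created earlier, the call would look like archive_image /dev/vg0/mdb-snap01 mdb-snap01.gz, run as root. A passing gzip -t only shows that the compressed stream is intact; it does not prove that the snapshot itself was consistent.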
Restore a Snapshot To restore a snapshot created with the above method, issue the following sequence of commands:
umount /dev/vg0/mdb-snap01
lvcreate --size 1G --name mdb-new vg0
dd if=/dev/vg0/mdb-snap01 of=/dev/vg0/mdb-new
mount /dev/vg0/mdb-new /srv/mongodb
You can implement off-system backups using the combined process (page 193) and SSH.
This sequence is identical to procedures explained above, except that it archives and compresses the backup on a
remote system using SSH.
Consider the following procedure:
umount /dev/vg0/mdb-snap01
dd if=/dev/vg0/mdb-snap01 | ssh username@example.com "gzip > /opt/backup/mdb-snap01.gz"
lvcreate --size 1G --name mdb-new vg0
ssh username@example.com gzip -d -c /opt/backup/mdb-snap01.gz | dd of=/dev/vg0/mdb-new
mount /dev/vg0/mdb-new /srv/mongodb
Create Backups on Instances that do not have Journaling Enabled
If your mongod instance does not run with journaling enabled, or if your journal is on a separate volume, obtaining a
functional backup of a consistent state is more complicated. As described in this section, you must flush all writes to
disk and lock the database to prevent writes during the backup process. If you have a replica set configuration, then
for your backup use a secondary that is not receiving reads (i.e. a hidden member).
1. To flush writes to disk and to lock the database (to prevent further writes), issue the db.fsyncLock()
method in the mongo shell:
db.fsyncLock();
Note: Changed in version 2.0: MongoDB 2.0 added db.fsyncLock() and db.fsyncUnlock() helpers
to the mongo shell. Prior to this version, use the fsync command with the lock option, as follows:
db.runCommand( { fsync: 1, lock: true } );
db.runCommand( { fsync: 1, lock: false } );
The database cannot be locked with db.fsyncLock() while profiling is enabled. You must disable profiling
before locking the database with db.fsyncLock(). Disable profiling using db.setProfilingLevel()
as follows in the mongo shell:
db.setProfilingLevel(0)
Warning: Changed in version 2.2: When used in combination with fsync or db.fsyncLock(),
mongod may block some reads, including those from mongodump, when a queued write operation waits
behind the fsync lock.
1. Obtain backup MongoDB database files. These files may come from a file system snapshot. The
MongoDB Management Service (MMS)75 produces MongoDB database files for stored snapshots76 and point
in time snapshots77 . You can also use mongorestore to restore database files using data created with
mongodump. See Back Up and Restore with MongoDB Tools (page 195) for more information.
2. Start a mongod using data files from the backup as the dbpath. In the following example, /data/db is the
dbpath to the data files:
mongod --dbpath /data/db
3. Convert your standalone mongod process to a single node replica set by shutting down the mongod instance,
and restarting it with the --replSet option, as in the following example:
mongod --dbpath /data/db --replSet <replName>
Optional
Consider explicitly setting an oplogSize to control the size of the oplog created for this replica set member.
4. Connect to the mongod instance.
5. Use rs.initiate() to initiate the new replica set.
Add Members to the Replica Set
MongoDB provides two options for restoring secondary members of a replica set:
1. Manually copy the database files to each data directory.
2. Allow initial sync (page 412) to distribute data automatically.
The following sections outline both approaches.
Note: If your database is large, initial sync can take a long time to complete. For large databases, it might be
preferable to copy the database files onto each host.
Copy Database Files and Restart mongod Instance Use the following sequence of operations to seed additional
members of the replica set with the restored data by copying MongoDB data files directly.
1. Shut down the mongod instance that you restored. Use --shutdown or db.shutdownServer() to
ensure a clean shutdown.
75 https://mms.mongodb.com/?pk_campaign=mongodb-docs-restore-rs-tutorial
76 https://mms.mongodb.com/help/backup/tutorial/restore-from-snapshot/
77 https://mms.mongodb.com/help/backup/tutorial/restore-from-point-in-time-snapshot/
2. Copy the primary's data directory into the dbpath of the other members of the replica set. The dbpath is
/data/db by default.
3. Start the mongod instance that you restored.
4. In a mongo shell connected to the primary, add the secondaries to the replica set using rs.add(). See Deploy
a Replica Set (page 420) for more information about deploying a replica set.
Update Secondaries using Initial Sync Use the following sequence of operations to seed additional members of
the replica set with the restored data using the default initial sync operation.
1. Ensure that the data directories on the prospective replica set members are empty.
2. Add each prospective member to the replica set. Initial Sync (page 412) will copy the data from the primary to
the other members of the replica set.
Back Up and Restore with MongoDB Tools
This document describes the process for writing and restoring backups to files in binary format with the mongodump
and mongorestore tools.
Use these tools for backups if other backup methods, such as the MMS Backup Service78 or file system snapshots
(page 190) are unavailable.
See also:
MongoDB Backup Methods (page 136), http://docs.mongodb.org/manual/reference/program/mongodump,
and http://docs.mongodb.org/manual/reference/program/mongorestore.
Backup a Database with mongodump
Important: mongodump does not dump the content of the local database.
Basic mongodump Operations The mongodump utility can back up data by either:
connecting to a running mongod or mongos instance, or
accessing data files without an active instance.
The utility can create a backup for an entire server, database or collection, or can use a query to back up just part of a
collection.
When you run mongodump without any arguments, the command connects to the MongoDB instance on the local
system (e.g. 127.0.0.1 or localhost) on port 27017 and creates a database backup named dump/ in the
current directory.
To back up data from a mongod or mongos instance running on the same machine and on the default port of 27017,
use the following command:
mongodump
Warning: The data format used by mongodump from version 2.2 or later is incompatible with earlier versions
of mongod. Do not use recent versions of mongodump to back up older data stores.
78 https://mms.mongodb.com/?pk_campaign=mongodb-docs-tools
To write the dump to a specific directory, use the --out option. The following example reads the data files in /data/db/ directly with --dbpath and writes the dump to /data/backup/:
mongodump --dbpath /data/db/ --out /data/backup/
To connect to an instance on a specific host and port, use the --host and --port options:
mongodump --host mongodb.example.net --port 27017
mongodump will write BSON files that hold a copy of data accessible via the mongod listening on port 27017 of
the mongodb.example.net host.
To limit the amount of data included in the database dump, you can specify --db and --collection as options to
the mongodump command. For example:
mongodump --collection collection --db test
This command creates a dump of the collection named collection from the database test in a dump/ subdirectory of the current working directory.
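Regular backups are easier to manage when each dump goes to its own dated directory. The following is a minimal sketch that only prints a mongodump command line and does not run it. The function names dump_dir_for and print_dump_command, and the dump-YYYY-MM-DD naming scheme, are assumptions made for illustration. Only the --db, --collection, and --out options described above are used.

```shell
#!/bin/sh
# dump_dir_for DATE (hypothetical helper): name a dump directory after a
# YYYY-MM-DD date, for example dump-2012-10-25.
dump_dir_for() {
    printf 'dump-%s\n' "$1"
}

# print_dump_command DB [COLLECTION] (hypothetical helper): print, but do
# not run, a mongodump invocation that writes to today's dump directory.
print_dump_command() {
    db=$1
    collection=$2
    out=$(dump_dir_for "$(date +%Y-%m-%d)")
    if [ -n "$collection" ]; then
        # Limit the dump to a single collection of the database.
        printf 'mongodump --db %s --collection %s --out %s\n' "$db" "$collection" "$out"
    else
        # Dump every collection of the database.
        printf 'mongodump --db %s --out %s\n' "$db" "$out"
    fi
}
```

Printing the command first makes it possible to review the target directory before any data is written.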
Point in Time Operation Using Oplogs Use the --oplog option with mongodump to collect the oplog entries to
build a point-in-time snapshot of a database within a replica set. With --oplog, mongodump copies all the data from
the source database as well as all of the oplog entries from the beginning of the backup procedure until the backup
procedure completes. This backup procedure, in conjunction with mongorestore --oplogReplay, allows you
to restore a backup that reflects the specific moment in time that corresponds to when mongodump completed creating
the dump file.
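The two halves of this point-in-time procedure can be sketched as a pair of shell functions. This is an illustrative sketch under stated assumptions: the function names, the run helper, and the DRY_RUN variable are hypothetical, and by default the commands are only printed. The --oplog, --out, and --oplogReplay options are the ones named in the text.

```shell
#!/bin/sh
# Commands are only printed unless DRY_RUN=0 is set in the environment.
DRY_RUN=${DRY_RUN:-1}

# run CMD ARGS... (hypothetical helper): print the command in dry-run
# mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# backup_with_oplog OUTDIR: dump all databases together with the oplog
# entries recorded while the dump runs.
backup_with_oplog() {
    run mongodump --oplog --out "$1"
}

# restore_with_oplog DUMPDIR: restore the dump, then replay the captured
# oplog entries.
restore_with_oplog() {
    run mongorestore --oplogReplay "$1"
}
```

Calling backup_with_oplog /opt/backup/dump and later restore_with_oplog /opt/backup/dump with the same directory keeps the dump and its oplog entries together; the path is a placeholder.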
Create Backups Without a Running mongod Instance If your MongoDB instance is not running, you can use
the --dbpath option to specify the location of your MongoDB instance's database files. mongodump reads from
the data files directly with this operation. This locks the data directory to prevent conflicting writes. The mongod
process must not be running or attached to these data files when you run mongodump in this configuration. Consider
the following example:
Example
Backup a MongoDB Instance Without a Running mongod
Given a MongoDB instance that contains the customers, products, and suppliers databases, the following mongodump operation backs up the databases using the --dbpath option, which specifies the location of the
database files on the host:
mongodump --dbpath /data -o dataout
The --out option allows you to specify the directory where mongodump will save the backup. mongodump creates
a separate backup directory for each of the backed up databases: dataout/customers, dataout/products,
and dataout/suppliers.
Create Backups from Non-Local mongod Instances The --host and --port options for mongodump allow
you to connect to and backup from a remote host. Consider the following example:
mongodump --host mongodb1.example.net --port 3017 --username user --password pass --out /opt/backup/m
On any mongodump command you may, as above, supply username and password credentials for database
authentication.
Restore a Database with mongorestore
The mongorestore utility restores a binary backup created by mongodump. By default, mongorestore looks
for a database backup in the dump/ directory.
To use mongorestore to write to data files without using a running mongod, use a command with the following
prototype form:
mongorestore --dbpath <database path> <path to the backup>
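When you instead run mongorestore against a running instance and pass dump-2012-10-25/ as the path to the backup, mongorestore imports the database backup in the dump-2012-10-25 directory to the mongod instance
running on the localhost interface.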
Restore Point in Time Oplog Backup If you created your database dump using the --oplog option to ensure a
point-in-time snapshot, call mongorestore with the --oplogReplay option, as in the following example:
mongorestore --oplogReplay
You may also consider using the mongorestore --objcheck option to check the integrity of objects while
inserting them into the database, or you may consider the mongorestore --drop option to drop each collection
from the database before restoring from backups.
Restore a Subset of data from a Binary Database Dump mongorestore also includes the ability to apply a filter to
all input before inserting it into the new database. Consider the following example:
mongorestore --filter '{"field": 1}'
Here, mongorestore only adds documents to the database from the dump located in the dump/ folder if the
documents have a field named field that holds a value of 1. Enclose the filter in single quotes (') to prevent the
filter from interacting with your shell environment.
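The effect of the quoting style is easiest to see with a filter that contains a dollar sign, as MongoDB query operators such as $gt do. The following sketch does not call mongorestore; it only shows what the shell would hand to the program in each case. The filter value is a hypothetical example.

```shell
#!/bin/sh
# Make sure no shell variable named "gt" exists, so its expansion is empty.
unset gt

# Single quotes: the shell passes the text through untouched.
single_quoted='{"field": {"$gt": 1}}'

# Double quotes: the shell expands $gt (empty here) before the program runs.
double_quoted="{\"field\": {\"$gt\": 1}}"

printf '%s\n' "$single_quoted"   # prints {"field": {"$gt": 1}}
printf '%s\n' "$double_quoted"   # prints {"field": {"": 1}}
```

With double quotes the operator name silently disappears, so the filter that reaches the program is not the one that was typed.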
Restore Without a Running mongod mongorestore can write data to MongoDB data files without needing to
connect to a mongod directly.
Example
Restore a Database Without a Running mongod
Given a set of backed up databases in the /data/backup/ directory:
/data/backup/customers,
/data/backup/products, and
/data/backup/suppliers
The following mongorestore command restores the products database. The command uses the --dbpath
option to specify the path to the MongoDB data files:
mongorestore --dbpath <database path> --journal /data/backup/products
The mongorestore imports the database backup in the /data/backup/products directory to the mongod
instance that runs on the localhost interface. The mongorestore operation imports the backup even if the mongod
is not running.
The --journal option ensures that mongorestore records all operations in the durability journal. The journal
prevents data file corruption if anything (e.g. power failure, disk failure, etc.) interrupts the restore operation.
See also:
http://docs.mongodb.org/manual/reference/program/mongodump and
http://docs.mongodb.org/manual/reference/program/mongorestore.
Restore Backups to Non-Local mongod Instances By default, mongorestore connects to a MongoDB instance
running on the localhost interface (e.g. 127.0.0.1) and on the default port (27017). If you want to restore to a
different host or port, use the --host and --port options.
Consider the following example:
mongorestore --host mongodb1.example.net --port 3017 --username user --password pass /opt/backup/mong
As above, you may specify username and password credentials if your mongod requires authentication.
Backup and Restore Sharded Clusters
The following tutorials describe backup and restoration for sharded clusters:
Backup a Small Sharded Cluster with mongodump (page 198) If your sharded cluster holds a small data set, you
can use mongodump to capture the entire backup in a reasonable amount of time.
Backup a Sharded Cluster with Filesystem Snapshots (page 199) Use file system snapshots to back up each component in the sharded cluster individually. The procedure involves stopping the cluster balancer. If your system
configuration allows file system backups, this might be more efficient than using MongoDB tools.
Backup a Sharded Cluster with Database Dumps (page 200) Create backups using mongodump to back up each
component in the cluster individually.
Schedule Backup Window for Sharded Clusters (page 202) Limit the operation of the cluster balancer to provide a
window for regular backup operations.
Restore a Single Shard (page 202) An outline of the procedure and considerations for restoring a single shard from a
backup.
Restore a Sharded Cluster (page 202) An outline of the procedure and considerations for restoring an entire sharded
cluster from backup.
Backup a Small Sharded Cluster with mongodump
Overview If your sharded cluster holds a small data set, you can connect to a mongos using mongodump. You can
create backups of your MongoDB cluster, if your backup infrastructure can capture the entire backup in a reasonable
amount of time and if you have a storage system that can hold the complete MongoDB data set.
See MongoDB Backup Methods (page 136) and Backup and Restore Sharded Clusters (page 198) for
complete information on backups in MongoDB and backups of sharded clusters in particular.
Important: By default mongodump issues its queries to the non-primary nodes.
Considerations If you use mongodump without specifying a database or collection, mongodump will capture
collection data and the cluster meta-data from the config servers (page 488).
You cannot use the --oplog option for mongodump when capturing data from mongos. As a result, if you need
to capture a backup that reflects a single moment in time, you must stop all writes to the cluster for the duration of the
backup operation.
Procedure
Capture Data You can perform a backup of a sharded cluster by connecting mongodump to a mongos. Use the
following operation at your system's prompt:
mongodump --host mongos3.example.net --port 27017
mongodump will write BSON files that hold a copy of data stored in the sharded cluster accessible via the mongos
listening on port 27017 of the mongos3.example.net host.
Restore Data Backups created with mongodump do not reflect the chunks or the distribution of data in the sharded
collection or collections. Like all mongodump output, these backups contain separate directories for each database
and BSON files for each collection in that database.
You can restore mongodump output to any MongoDB instance, including a standalone, a replica set, or a new sharded
cluster. When restoring data to a sharded cluster, you must deploy and configure sharding before restoring data from
the backup. See Deploy a Sharded Cluster (page 507) for more information.
Backup a Sharded Cluster with Filesystem Snapshots
Overview This document describes a procedure for taking a backup of all components of a sharded cluster. This procedure uses file system snapshots to capture a copy of the mongod instance. An alternate procedure uses mongodump
to create binary database dumps when file-system snapshots are not available. See Backup a Sharded Cluster with
Database Dumps (page 200) for the alternate procedure.
See MongoDB Backup Methods (page 136) and Backup and Restore Sharded Clusters (page 198) for
complete information on backups in MongoDB and backups of sharded clusters in particular.
Important: To capture a point-in-time backup from a sharded cluster you must stop all writes to the cluster. On a
running production system, you can only capture an approximation of a point-in-time snapshot.
Procedure In this procedure, you will stop the cluster balancer and take a backup of the config database, and
then take backups of each shard in the cluster using a file-system snapshot tool. If you need an exact moment-in-time
snapshot of the system, you will need to stop all application writes before taking the filesystem snapshots; otherwise
the snapshot will only approximate a moment in time.
For approximate point-in-time snapshots, you can improve the quality of the backup while minimizing impact on the
cluster by taking the backup from a secondary member of the replica set that provides each shard.
1. Disable the balancer process that equalizes the distribution of data among the shards. To disable the balancer,
use the sh.stopBalancer() method in the mongo shell. For example:
use config
sh.stopBalancer()
For more information, see the Disable the Balancer (page 533) procedure.
Warning: It is essential that you stop the balancer before creating backups. If the balancer remains active,
your resulting backups could have duplicate data or miss some data, as chunks may migrate while recording
backups.
2. Lock one secondary member of each replica set in each shard so that your backups reflect the state of your
database at the nearest possible approximation of a single moment in time. Lock these mongod instances within as
short an interval as possible.
To lock a secondary, connect through the mongo shell to the secondary member's mongod instance and issue
the db.fsyncLock() method.
3. Back up one of the config servers (page 488). Backing up a config server backs up the sharded cluster's metadata.
You need to back up only one config server, as they all hold the same data.
Do one of the following to back up one of the config servers:
Create a file-system snapshot of the config server. Use the procedure in Backup and Restore with Filesystem
Snapshots (page 190).
Important: This is only available if the config server has journaling enabled. Never use
db.fsyncLock() on config databases.
Use mongodump to back up the config server. Issue mongodump against one of the config mongod
instances or via the mongos.
If you are running MongoDB 2.4 or later with the --configsvr option, then include the --oplog
option when running mongodump to ensure that the dump includes a partial oplog containing operations
from the duration of the mongodump operation. For example:
mongodump --oplog --db config
4. Back up the replica set members of the shards that you locked. You may back up the shards in parallel. For each
shard, create a snapshot. Use the procedure in Backup and Restore with Filesystem Snapshots (page 190).
5. Unlock all locked replica set members of each shard using the db.fsyncUnlock() method in the mongo
shell.
6. Re-enable the balancer with the sh.setBalancerState() method.
Use the following command sequence when connected to the mongos with the mongo shell:
use config
sh.setBalancerState(true)
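The ordering of steps 1, 2, 4, 5, and 6 above can be sketched as a single shell function. This is an illustration under several assumptions and not a production script: host names are placeholders, the run helper and DRY_RUN variable are hypothetical, every shard host is assumed to use the LVM layout from the earlier lvcreate example, and the snapshot is assumed to be taken over ssh by a user allowed to run lvcreate. The config server backup (step 3) is left out. By default the commands are only printed.

```shell
#!/bin/sh
# Commands are only printed unless DRY_RUN=0 is set in the environment.
DRY_RUN=${DRY_RUN:-1}

# run CMD ARGS... (hypothetical helper): print in dry-run mode, else execute.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# snapshot_sharded_cluster MONGOS SECONDARY... (hypothetical helper):
# MONGOS is a mongos host; each SECONDARY is one secondary per shard.
snapshot_sharded_cluster() {
    mongos=$1
    shift
    # Step 1: stop the balancer so that no chunks migrate during the backup.
    run mongo --host "$mongos" --eval 'sh.stopBalancer()'
    # Step 2: lock every chosen secondary in as short an interval as possible.
    for host in "$@"; do
        run mongo --host "$host" --eval 'db.fsyncLock()'
    done
    # Step 4: snapshot the data volume on each locked secondary.
    for host in "$@"; do
        run ssh "$host" lvcreate --size 100M --snapshot --name mdb-snap01 /dev/vg0/mongodb
    done
    # Step 5: unlock the secondaries again.
    for host in "$@"; do
        run mongo --host "$host" --eval 'db.fsyncUnlock()'
    done
    # Step 6: re-enable the balancer.
    run mongo --host "$mongos" --eval 'sh.setBalancerState(true)'
}
```

A real script would also need error handling so that the secondaries are unlocked and the balancer is re-enabled even when a snapshot fails.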
Backup a Sharded Cluster with Database Dumps
Overview This document describes a procedure for taking a backup of all components of a sharded cluster. This
procedure uses mongodump to create dumps of the mongod instance. An alternate procedure uses file system snapshots to capture the backup data, and may be more efficient in some situations if your system configuration allows file
system backups. See Backup a Sharded Cluster with Filesystem Snapshots (page 199).
See MongoDB Backup Methods (page 136) and Backup and Restore Sharded Clusters (page 198) for
complete information on backups in MongoDB and backups of sharded clusters in particular.
Important: To capture a point-in-time backup from a sharded cluster you must stop all writes to the cluster. On a
running production system, you can only capture an approximation of a point-in-time snapshot.
Procedure In this procedure, you will stop the cluster balancer and take a backup of the config database, and then
take backups of each shard in the cluster using mongodump to capture the backup data. If you need an exact moment-in-time snapshot of the system, you will need to stop all application writes before taking the backups;
otherwise the snapshot will only approximate a moment in time.
For approximate point-in-time snapshots, you can improve the quality of the backup while minimizing impact on the
cluster by taking the backup from a secondary member of the replica set that provides each shard.
1. Disable the balancer process that equalizes the distribution of data among the shards. To disable the balancer,
use the sh.stopBalancer() method in the mongo shell. For example:
use config
sh.stopBalancer()
For more information, see the Disable the Balancer (page 533) procedure.
Warning: It is essential that you stop the balancer before creating backups. If the balancer remains active,
your resulting backups could have duplicate data or miss some data, as chunks migrate while recording
backups.
2. Lock one member of each replica set in each shard so that your backups reflect the state of your database at
the nearest possible approximation of a single moment in time. Lock these mongod instances within as short an
interval as possible.
To lock or freeze a sharded cluster, you shut down one member of each replica set. Ensure that the oplog has
sufficient capacity to allow these secondaries to catch up to the state of the primaries after finishing the backup
procedure. See Oplog Size (page 411) for more information.
3. Use mongodump to back up one of the config servers (page 488). This backs up the cluster's metadata. You
only need to back up one config server, as they all hold the same data.
Use the mongodump tool to capture the content of the config mongod instances.
Your config servers must run MongoDB 2.4 or later with the --configsvr option and the mongodump
command must include the --oplog option to capture a consistent copy of the config database:
mongodump --oplog --db config
4. Back up the replica set members of the shards that you shut down using mongodump and specifying the --dbpath
option. You may back up the shards in parallel. Consider the following invocation:
mongodump --journal --dbpath /data/db/ --out /data/backup/
You must run this command on the system where the mongod ran. This operation will use journaling and create
a dump of the entire mongod instance with data files stored in /data/db/. mongodump will write the output
of this dump to the /data/backup/ directory.
5. Restart all stopped replica set members of each shard as normal and allow them to catch up with the state of the
primary.
6. Re-enable the balancer with the sh.setBalancerState() method.
Use the following command sequence when connected to the mongos with the mongo shell:
use config
sh.setBalancerState(true)
Schedule Backup Window for Sharded Clusters
Overview In a sharded cluster, the balancer process is responsible for distributing sharded data around the cluster,
so that each shard has roughly the same amount of data.
However, when creating backups from a sharded cluster it is important that you disable the balancer while taking
backups to ensure that no chunk migrations affect the content of the backup captured by the backup procedure. Using
the procedure outlined in the section Disable the Balancer (page 533) you can manually stop the balancer process
temporarily. As an alternative you can use this procedure to define a balancing window so that the balancer is always
disabled during your automated backup operation.
Procedure If you have an automated backup schedule, you can disable all balancing operations for a period of time.
For instance, consider the following command:
use config
db.settings.update( { _id : "balancer" }, { $set : { activeWindow : { start : "6:00", stop : "23:00"
This operation configures the balancer to run between 6:00am and 11:00pm, server time. Schedule your backup
operation to run and complete outside of this time. Ensure that the backup can complete outside the window when
the balancer is running and that the balancer can effectively balance the collection among the shards in the window
allotted to each.
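Whether a planned backup start time falls outside the balancer's active window can be checked with simple clock arithmetic. The following is a minimal sketch; the function names are hypothetical, times are given as H:MM or HH:MM in server time, and the window is assumed not to wrap over midnight (start earlier than stop), as in the 6:00 to 23:00 example above.

```shell
#!/bin/sh
# to_minutes HH:MM (hypothetical helper): convert a clock time to minutes
# since midnight.
to_minutes() {
    h=${1%%:*}
    m=${1##*:}
    # Strip one leading zero so values such as 08 are not read as octal.
    h=${h#0}
    m=${m#0}
    echo $(( ${h:-0} * 60 + ${m:-0} ))
}

# outside_balancer_window TIME START STOP (hypothetical helper): succeed
# when TIME lies outside the half-open window [START, STOP).
outside_balancer_window() {
    t=$(to_minutes "$1")
    start=$(to_minutes "$2")
    stop=$(to_minutes "$3")
    [ "$t" -lt "$start" ] || [ "$t" -ge "$stop" ]
}
```

With the window from the example, outside_balancer_window 2:30 6:00 23:00 succeeds, while a start time of 12:00 fails the check. The backup must also finish before the window opens, which this sketch does not verify.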
Restore a Single Shard
Overview Restoring a single shard from backup with other unaffected shards requires a number of special considerations and practices. This document outlines the additional tasks you must perform when restoring a single shard.
Consider the following resources on backups in general as well as backup and restoration of sharded clusters specifically:
Backup and Restore Sharded Clusters (page 198)
Restore a Sharded Cluster (page 202)
MongoDB Backup Methods (page 136)
Procedure Always restore sharded clusters as a whole. When you restore a single shard, keep in mind that the
balancer process might have moved chunks to or from this shard since the last backup. If that's the case, you must
manually move those chunks, as described in this procedure.
1. Restore the shard as you would any other mongod instance. See MongoDB Backup Methods (page 136) for
overviews of these procedures.
2. For all chunks that migrate away from this shard, you do not need to do anything at this time. You do not
need to delete these documents from the shard because the chunks are automatically filtered out from queries by
mongos. You can remove these documents from the shard, if you like, at your leisure.
3. For chunks that migrate to this shard after the most recent backup, you must manually recover the chunks using
backups of other shards, or some other source. To determine what chunks have moved, view the changelog
collection in the Config Database (page 547).
Restore a Sharded Cluster
Overview The procedure outlined in this document addresses how to restore an entire sharded cluster. For information on related backup procedures consider the following tutorials which describe backup procedures in greater
detail:
Backup a Sharded Cluster with Filesystem Snapshots (page 199)
Backup a Sharded Cluster with Database Dumps (page 200)
The exact procedure used to restore a database depends on the method used to capture the backup. See the MongoDB
Backup Methods (page 136) document for an overview of backups with MongoDB and Backup and Restore Sharded
Clusters (page 198) for complete information on backups in MongoDB and backups of sharded clusters in particular.
Procedure
1. Stop all mongos and mongod processes, including all shards and all config servers.
2. If shard hostnames have changed, you must manually update the shards collection in the Config Database
(page 547) to use the new hostnames. Do the following:
(a) Start the three config servers (page 488) by issuing commands similar to the following, using values
appropriate to your configuration:
mongod --configsvr --dbpath /data/configdb --port 27019
(b) Restore the Config Database (page 547) on each config server.
(c) Start one mongos instance.
(d) Update the Config Database (page 547) collection named shards to reflect the new hostnames.
3. Restore the following:
Data files for each server in each shard. Because replica sets provide each production shard, restore all the
members of the replica set or use the other standard approaches for restoring a replica set from backup.
See the Restore a Snapshot (page 192) and Restore a Database with mongorestore (page 196) sections for
details on these procedures.
Data files for each config server (page 488), if you have not already done so in the previous step.
4. Restart all the mongos instances.
5. Restart all the shard mongod instances.
6. Restart all the config server mongod instances.
7. Connect to a mongos instance from a mongo shell and use the db.printShardingStatus() method to
ensure that the cluster is operational, as follows:
db.printShardingStatus()
show collections
To prevent data inconsistency and corruption, always shut down the database cleanly and use durability journaling.
MongoDB writes data to the journal, by default, every 100 milliseconds, such that MongoDB can always recover to a
consistent state even in the case of an unclean shutdown due to power loss or other system failure.
If you are not running as part of a replica set and do not have journaling enabled, use the following procedure to
recover data that may be in an inconsistent state. If you are running as part of a replica set, you should always restore
from a backup or restart the mongod instance with an empty dbpath and allow MongoDB to perform an initial sync
to restore the data.
See also:
The Administration (page 135) documents, including Replica Set Syncing (page 410), and the documentation on the
repair, repairpath, and journal settings.
Process
Indications When you are aware of a mongod instance running without journaling that stops unexpectedly and
you're not running with replication, you should always run the repair operation before starting MongoDB again. If
you're using replication, then restore from a backup and allow replication to perform an initial sync (page 410) to
restore data.
If the mongod.lock file in the data directory specified by dbpath, /data/db by default, is not a zero-byte file,
then mongod will refuse to start, and you will find a message that contains the following line in your MongoDB log
or output:
Unclean shutdown detected.
This indicates that you need to run mongod with the --repair option. If you run repair when the mongod.lock
file exists in your dbpath, or the optional --repairpath, you will see a message that contains the following line:
old lock file: /data/db/mongod.lock. probably means unclean shutdown
If you see this message, as a last resort you may remove the lockfile and run the repair operation before starting the
database normally, as in the following procedure:
Overview
There are two processes to repair data files that result from an unexpected shutdown:
1. Use the --repair option in conjunction with the --repairpath option. mongod will read the existing
data files, and write the existing data to new data files. This does not modify or alter the existing data files.
You do not need to remove the mongod.lock file before using this procedure.
2. Use the --repair option. mongod will read the existing data files, write the existing data to new files and
replace the existing, possibly corrupt, files with new files.
You must remove the mongod.lock file before using this procedure.
Note: --repair functionality is also available in the shell with the db.repairDatabase() helper for the
repairDatabase command.
Chapter 4. Administration
Procedures To repair your data files using the --repairpath option to preserve the original data files unmodified:
1. Start mongod using --repair to read the existing data files.
mongod --dbpath /data/db --repair --repairpath /data/db0
When this completes, the new repaired data files will be in the /data/db0 directory.
2. Start mongod using the following invocation to point the dbpath at /data/db0:
mongod --dbpath /data/db0
Once you confirm that the data files are operational you may delete or archive the data files in the /data/db
directory.
To repair your data files without preserving the original files, do not use the --repairpath option, as in the
following procedure:
1. Remove the stale lock file:
rm /data/db/mongod.lock
Replace /data/db with your dbpath where your MongoDB instance's data files reside.
Warning: After you remove the mongod.lock file you must run the --repair process before using
your database.
2. Start mongod using --repair to read the existing data files.
mongod --dbpath /data/db --repair
When this completes, the repaired data files will replace the original data files in the /data/db directory.
3. Start mongod using the following invocation to point the dbpath at /data/db:
mongod --dbpath /data/db
mongod.lock
In normal operation, you should never remove the mongod.lock file and start mongod. Instead, use one
of the above methods to recover the database and remove the lock files. In dire situations you can remove the lockfile,
start the database using the possibly corrupt files, and attempt to recover data from the database; however, it's
impossible to predict the state of the database in these situations.
If you are not running with journaling, and your database shuts down unexpectedly for any reason, you should always
proceed as if your database is in an inconsistent and likely corrupt state. If at all possible restore from backup
(page 136) or, if running as a replica set, restore by performing an initial sync using data from an intact member of the
set, as described in Resync a Member of a Replica Set (page 449).
MongoDB supports the execution of JavaScript code for the following server-side operations:
mapReduce and the corresponding mongo shell method db.collection.mapReduce(). See MapReduce (page 284) for more information.
eval command, and the corresponding mongo shell method db.eval()
$where operator
Running .js files via a mongo shell Instance on the Server (page 206)
JavaScript in MongoDB
Although the above operations use JavaScript, most interactions with MongoDB do not use JavaScript but use an
idiomatic driver (page 95) in the language of the interacting application.
See also:
Store a JavaScript Function on the Server (page 183)
You can disable all server-side execution of JavaScript by passing the --noscripting option on the command
line or by setting noscripting in a configuration file.
Running .js files via a mongo shell Instance on the Server
You can run a JavaScript (.js) file using a mongo shell instance on the server. This is a good technique for performing
batch administrative work. When you run mongo shell on the server, connecting via the localhost interface, the
connection is fast with low latency.
The command helpers (page 218) provided in the mongo shell are not available in JavaScript files because they are
not valid JavaScript. The following table maps the most common mongo shell helpers to their JavaScript equivalents.
Shell Helpers               JavaScript Equivalents
show dbs, show databases    db.adminCommand('listDatabases')
use <db>                    db = db.getSiblingDB('<db>')
show collections            db.getCollectionNames()
show users                  db.system.users.find()
show log <logname>          db.adminCommand({ 'getLog' : '<logname>' })
show logs                   db.adminCommand({ 'getLog' : '*' })
it                          cursor = db.collection.find()
                            if ( cursor.hasNext() ){
                               cursor.next();
                            }
Concurrency
Refer to the individual method or operator documentation for any concurrency information. See also the concurrency
table (page 570).
Data Types in the mongo Shell
MongoDB BSON provides support for more data types than JSON. Drivers (page 95) provide native support for
these data types in host languages and the mongo shell also provides several helper classes to support the use of these
data types in the mongo JavaScript shell. See MongoDB Extended JSON (page 227) for additional information.
Types
Date The mongo shell provides various options to return the date, either as a string or as an object:
Date() method which returns the current date as a string.
Date() constructor which returns an ISODate object when used with the new operator.
ISODate() constructor which returns an ISODate object when used with or without the new operator.
Consider the following examples:
To return the date as a string, use the Date() method, as in the following example:
var myDateString = Date();
To print the value of the variable, type the variable name in the shell, as in the following:
myDateString
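To get the date as an ISODate object, use the new operator with the Date() constructor and assign the result to a
variable, here myDateObject. To print the value of the variable, type the variable name in the shell, as in the following:
myDateObject
You can use the new operator with the ISODate() constructor as well, here assigning the result to myDateObject2.
To print the value of the variable, type the variable name in the shell, as in the following:
myDateObject2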
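The difference between the string form and the object forms can be checked with typeof and instanceof. The following is a minimal sketch in plain JavaScript. ISODate() is a mongo shell helper, so the sketch guards that call and falls back to new Date() in other JavaScript environments; that fallback is an assumption made only so the example runs outside the shell.

```javascript
// Date() called as a function returns the current date as a string.
var myDateString = Date();

// Date() used with the new operator returns a date object
// (the mongo shell displays it as an ISODate).
var myDateObject = new Date();

// ISODate() exists only in the mongo shell. Outside the shell this sketch
// falls back to new Date() so that the checks below can still run.
var myDateObject2 = (typeof ISODate === "function") ? new ISODate() : new Date();

var stringType = typeof myDateString;            // "string"
var objectIsDate = myDateObject instanceof Date;   // true
var object2IsDate = myDateObject2 instanceof Date; // true
```

A string returned by Date() cannot be compared or stored as a date value, so the object forms are usually the ones to insert into documents.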
See ObjectId (page 129) for full documentation of ObjectIds in MongoDB.
NumberLong By default, the mongo shell treats all numbers as floating-point values. The mongo shell provides
the NumberLong() class to handle 64-bit integers.
The NumberLong() constructor accepts the long as a string:
NumberLong("2090845886852")
The following examples use the NumberLong() class to write to the collection:
db.collection.insert( { _id: 10, calc: NumberLong("2090845886852") } )
db.collection.update( { _id: 10 },
{ $set: { calc: NumberLong("2555555000000") } } )
db.collection.update( { _id: 10 },
{ $inc: { calc: NumberLong(5) } } )
If you use $inc to increment the value of a field that contains a NumberLong object by a float, the data type
changes to a floating point value, as in the following example:
1. Use $inc to increment the calc field by 5, which the mongo shell treats as a float:
db.collection.update( { _id: 10 },
{ $inc: { calc: 5 } } )
In the updated document, the calc field contains a floating point value:
{ "_id" : 10, "calc" : 2555555000010 }
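One reason to pass the value as a string is that an ordinary JavaScript number is a double, which represents integers exactly only up to 2^53. The sketch below illustrates the rounding in plain JavaScript. It does not call NumberLong(); BigInt is used purely as a stand-in for an exact 64-bit integer type and assumes a modern JavaScript engine such as Node.js rather than the 2.4 shell.

```javascript
// The largest signed 64-bit integer, kept as a string so that nothing is
// rounded before it is parsed.
var maxLongText = "9223372036854775807";

// Parsed as an ordinary JavaScript number (a double), the value is rounded.
var asDouble = Number(maxLongText);

// BigInt is only a stand-in here for an exact 64-bit integer type.
var asExact = BigInt(maxLongText);

var doubleKeepsValue = (BigInt(asDouble) === asExact);        // false
var exactKeepsValue = (asExact.toString() === maxLongText);   // true

// The value from the $inc example above is small enough to survive as a double.
var exampleIsSafe = Number.isSafeInteger(2555555000010);      // true
```

This is why the calc value in the example still displays correctly after it becomes a floating point value, while larger 64-bit values would not.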
NumberInt By default, the mongo shell treats all numbers as floating-point values. The mongo shell provides the
NumberInt() constructor to explicitly specify 32-bit integers.
Check Types in the mongo Shell
To determine the type of fields, the mongo shell provides the following operators:
instanceof returns a boolean to test if a value has a specific type.
typeof returns the type of a field.
Example
Consider the following operations using instanceof and typeof:
The following operation tests whether the _id field is of type ObjectId:
mydoc._id instanceof ObjectId
The following operation returns the type of the _id field:
typeof mydoc._id
In this case typeof will return the more generic object type rather than ObjectId type.
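The two operators can be compared side by side. This is a minimal sketch: inside the mongo shell it uses the real ObjectId, and elsewhere it substitutes a hypothetical stand-in constructor (StandInId) so the checks can run. The mydoc document and its fields are illustrative.

```javascript
// ObjectId is defined by the mongo shell. Outside the shell, substitute a
// hypothetical stand-in constructor so the checks below can still run.
var IdType = (typeof ObjectId === "function") ? ObjectId : function StandInId() {};

var mydoc = { _id: new IdType(), name: "Joe", qty: 5 };

var isObjectId = mydoc._id instanceof IdType;  // true: tests the specific type
var idType = typeof mydoc._id;                 // "object": the generic type
var nameType = typeof mydoc.name;              // "string"
var qtyType = typeof mydoc.qty;                // "number"
```

instanceof answers a yes-or-no question about one specific type, while typeof only distinguishes the broad JavaScript categories.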
From the mongo shell or from a JavaScript file, you can instantiate database connections using the Mongo() constructor:
new Mongo()
new Mongo(<host>)
new Mongo(<host:port>)
Consider the following example that instantiates a new connection to the MongoDB instance running on localhost on
the default port and sets the global db variable to myDatabase using the getDB() method:
conn = new Mongo();
db = conn.getDB("myDatabase");
Additionally, you can use the connect() method to connect to the MongoDB instance. The following example
connects to the MongoDB instance that is running on localhost with the non-default port 27020 and sets the
global db variable:
db = connect("localhost:27020/myDatabase");
When writing scripts for the mongo shell, consider the following:
To set the db global variable, use the getDB() method or the connect() method. You can assign the
database reference to a variable other than db.
Inside the script, call db.getLastError() explicitly to wait for the result of write operations (page 42).
You cannot use any shell helper (e.g. use <dbname>, show dbs, etc.) inside the JavaScript file because
they are not valid JavaScript.
The following table maps the most common mongo shell helpers to their JavaScript equivalents.
Shell Helpers               JavaScript Equivalents
show dbs, show databases    db.adminCommand('listDatabases')
use <db>                    db = db.getSiblingDB('<db>')
show collections            db.getCollectionNames()
show users                  db.system.users.find()
show log <logname>          db.adminCommand({ 'getLog' : '<logname>' })
show logs                   db.adminCommand({ 'getLog' : '*' })
it                          cursor = db.collection.find()
                            if ( cursor.hasNext() ){
                               cursor.next();
                            }
In interactive mode, mongo prints the results of operations including the content of all cursors. In scripts, either
use the JavaScript print() function or the mongo specific printjson() function, which prints formatted
JSON.
Example
To print all items in a result cursor in mongo shell scripts, use the following idiom:
cursor = db.collection.find();
while ( cursor.hasNext() ) {
printjson( cursor.next() );
}
Scripting
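Passing a JavaScript fragment such as printjson(db.getCollectionNames()) to mongo with the --eval option
returns the output of db.getCollectionNames() using the mongo shell connected to the mongod or
mongos instance running on port 27017 on the localhost interface.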
Execute a JavaScript file You can specify a .js file to the mongo shell, and mongo will execute the JavaScript
directly. Consider the following example:
mongo localhost:27017/test myjsfile.js
This operation executes the myjsfile.js script in a mongo shell that connects to the test database on the
mongod instance accessible via the localhost interface on port 27017.
Alternatively, you can specify the MongoDB connection parameters inside of the JavaScript file using the Mongo()
constructor. See Opening New Connections (page 210) for more information.
You can execute a .js file from within the mongo shell, using the load() function, as in the following:
load("myjstest.js")
Note: There is no search path for the load() function. If the desired script is not in the current working directory
or the full specified path, mongo will not be able to access the file.
To start the mongo shell and connect to your MongoDB instance running on localhost with default port:
1. Go to your <mongodb installation dir>:
cd <mongodb installation dir>
2. Type ./bin/mongo to start mongo:
./bin/mongo
If you have added the <mongodb installation dir>/bin to the PATH environment variable, you can
just type mongo instead of ./bin/mongo.
3. To display the database you are using, type db:
db
The operation should return test, which is the default database. To switch databases, issue the use <db>
helper, as in the following example:
use <database>
To list the available databases, use the helper show dbs. See also How can I access different databases
temporarily? (page 568) to access a different database from the current database without switching your current
database context (i.e. db).
To start the mongo shell with other options, see examples of starting up mongo and mongo reference which
provides details on the available options.
Note: When starting, mongo checks the user's HOME directory for a JavaScript file named .mongorc.js. If found,
mongo interprets the content of .mongorc.js before displaying the prompt for the first time. If you use the shell
to evaluate a JavaScript file or expression, either by using the --eval option on the command line or by specifying
a .js file to mongo, mongo will read the .mongorc.js file after the JavaScript has finished processing.
Executing Queries
From the mongo shell, you can use the shell methods to run queries, as in the following example:
db.<collection>.find()
The find() method is the JavaScript method to retrieve documents from <collection>. The find()
method returns a cursor to the results; however, in the mongo shell, if the returned cursor is not assigned to a
variable using the var keyword, then the cursor is automatically iterated up to 20 times to print up to the first
20 documents that match the query. The mongo shell will prompt Type it to iterate another 20 times.
You can set the DBQuery.shellBatchSize attribute to change the number of iterations from the default
value 20, as in the following example which sets it to 10:
DBQuery.shellBatchSize = 10;
For more information and examples on cursor handling in the mongo shell, see Cursors (page 35).
See also Cursor Help (page 217) for a list of cursor help in the mongo shell.
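The batch-by-batch iteration described above can be modeled with the hasNext() and next() cursor methods. The sketch below is not the shell's implementation: makeCursor() is a hypothetical array-backed stand-in for a real cursor, and nextBatch() is an illustrative helper that pulls one batch of a given size.

```javascript
// Hypothetical array-backed cursor exposing hasNext() and next().
function makeCursor(docs) {
  var pos = 0;
  return {
    hasNext: function () { return pos < docs.length; },
    next: function () { return docs[pos++]; }
  };
}

// Pull at most batchSize documents from the cursor: one "page" of output.
function nextBatch(cursor, batchSize) {
  var batch = [];
  while (batch.length < batchSize && cursor.hasNext()) {
    batch.push(cursor.next());
  }
  return batch;
}

// 25 sample documents, read with a batch size of 10.
var docs = [];
for (var i = 0; i < 25; i++) { docs.push({ _id: i }); }

var cursor = makeCursor(docs);
var first = nextBatch(cursor, 10);   // documents 0 to 9
var second = nextBatch(cursor, 10);  // documents 10 to 19
var third = nextBatch(cursor, 10);   // documents 20 to 24, then exhausted
```

Each call corresponds to one round of output in the shell, with the later calls playing the role of typing it.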
For more documentation of basic MongoDB operations in the mongo shell, see:
Getting Started with MongoDB (page 19)
mongo Shell Quick Reference (page 217)
Read Operations (page 31)
Write Operations (page 42)
Indexing Tutorials (page 339)
Print
The mongo shell automatically prints the results of the find() method if the returned cursor is not assigned to
a variable using the var keyword. To format the result, you can add .pretty() to the operation, as in the
following:
db.<collection>.find().pretty()
In addition, you can use the following explicit print methods in the mongo shell:
print() to print without formatting
print(tojson(<obj>)) to print with JSON formatting and equivalent to printjson()
You may modify the content of the prompt by creating the variable prompt in the shell. The prompt variable can
hold strings as well as any arbitrary JavaScript. If prompt holds a function that returns a string, mongo can display
dynamic information in each prompt. Consider the following examples:
Example
To create a prompt that shows the number of operations issued in the current session, define the following variables:
cmdCount = 1;
prompt = function() {
return (cmdCount++) + "> ";
}
Example
To create a mongo shell prompt in the form of <database>@<hostname>$ define the following variables:
host = db.serverStatus().host;
prompt = function() {
return db+"@"+host+"$ ";
}
Example
To create a mongo shell prompt that contains the system up time and the number of documents in the current database,
define the following prompt variable:
prompt = function() {
return "Uptime:"+db.serverStatus().uptime+" Documents:"+db.stats().objects+" > ";
}
Note: As the mongo shell interprets code edited in an external editor, it may modify code in functions, depending on
the JavaScript compiler. For example, mongo may convert 1+1 to 2 or remove comments. The actual changes affect only the
appearance of the code and will vary based on the version of JavaScript used but will not affect the semantics of the
code.
To see the list of options and help for starting the mongo shell, use the --help option from the command line:
mongo --help
Shell Help
Database Help
To see the list of databases on the server, use the show dbs command:
show dbs
New in version 2.4: show databases is now an alias for show dbs
To see the list of help for methods you can use on the db object, call the db.help() method:
db.help()
To see the implementation of a method in the shell, type the db.<method name> without the parentheses
(()), as in the following example which will return the implementation of the method db.addUser():
db.addUser
Collection Help
To see the list of collections in the current database, use the show collections command:
show collections
To see the help for methods available on the collection objects (e.g. db.<collection>), use the
db.<collection>.help() method:
db.collection.help()
<collection> can be the name of a collection that exists, although you may specify a collection that doesn't
exist.
To see the collection method implementation, type the db.<collection>.<method> name without the
parentheses (()), as in the following example which will return the implementation of the save() method:
db.collection.save
Cursor Help
When you perform read operations (page 31) with the find() method in the mongo shell, you can use various
cursor methods to modify the find() behavior and various JavaScript methods to handle the cursor returned from
the find() method.
To list the available modifier and cursor handling methods, use the db.collection.find().help()
command:
db.collection.find().help()
<collection> can be the name of a collection that exists, although you may specify a collection that doesn't
exist.
To see the implementation of the cursor method, type the db.<collection>.find().<method> name
without the parentheses (()), as in the following example which will return the implementation of the
toArray() method:
db.collection.find().toArray
To get a list of the wrapper classes available in the mongo shell, such as BinData(), type help misc in the
mongo shell:
help misc
You can retrieve previous commands issued in the mongo shell with the up and down arrow keys. Command history
is stored in the ~/.dbshell file. See .dbshell for more information.
The mongo executable can be started with numerous options. See mongo executable page for details on all
available options.
The following table displays some common options for mongo:
Option     Description
--help     Show command line options.
--nodb     Start mongo shell without connecting to a database. To connect later, see Opening New Connections (page 210).
--shell    Used in conjunction with a JavaScript file (i.e. <file.js>) to continue in the mongo shell after running the JavaScript file. See JavaScript file (page 211) for an example.
Command Helpers
The mongo shell provides various help. The following table displays some common help methods and commands:
Help Methods and Commands    Description
help                         Show help.
db.help()                    Show help for database methods.
db.<collection>.help()       Show help on collection methods. The <collection> can be the name of an existing collection or a non-existing collection.
show dbs                     Print a list of all databases on the server.
use <db>                     Switch current database to <db>. The mongo shell variable db is set to the current database.
show collections             Print a list of all collections for current database.
show users                   Print a list of users for current database.
show profile                 Print the five most recent operations that took 1 millisecond or more. See documentation on the database profiler (page 174) for more information.
show databases               New in version 2.4: Print a list of all available databases.
load()                       Execute a JavaScript file. See Getting Started with the mongo Shell (page 212) for more information.
Basic Shell JavaScript Operations
Description
If running in secure mode, authenticate the user.
Set a specific collection in the current database to a variable coll, as in the following example:
coll = db.myCollection;
You can perform operations on the myCollection
using the variable, as in the following example:
coll.find();
find()
insert()
update()
save()
remove()
drop()
ensureIndex()
db.getSiblingDB()
Function
previous-history
next-history
beginning-of-line
end-of-line
autocomplete
Queries
In the mongo shell, perform read operations using the find() and findOne() methods.
The find() method returns a cursor object which the mongo shell iterates to print documents on screen. By default,
mongo prints the first 20. The mongo shell will prompt the user to Type it to continue iterating the next 20
results.
The following table provides some common read operations in the mongo shell:
Read Operations and Descriptions

db.collection.find(<query>)
Find the documents matching the <query> criteria in the collection. If the <query> criteria is not specified
or is empty (i.e. {}), the read operation selects all documents in the collection.
The following example selects the documents in the users collection with the name field equal to "Joe":
coll = db.users;
coll.find( { name: "Joe" } );
For more information on specifying the <query> criteria, see Query Documents (page 60).

db.collection.find( <query>, <projection> )
Find documents matching the <query> criteria and return just specific fields in the <projection>.
The following example selects all documents from the collection but returns only the name field and the _id
field. The _id is always returned unless explicitly specified to not return.
coll = db.users;
coll.find( { },
{ name: true }
);
For more information on specifying the <projection>, see Limit Fields to Return from a Query (page 64).

db.collection.find().sort( <sort order> )
Return results in the specified <sort order>.
The following example selects all documents from the collection and returns the results sorted by the name
field in ascending order (1). Use -1 for descending order:
coll = db.users;
coll.find().sort( { name: 1 } );

db.collection.find( <query> ).sort( <sort order> )
Return the documents matching the <query> criteria in the specified <sort order>.

db.collection.find( ... ).limit( <n> )
Limit result to <n> rows. Highly recommended if you need only a certain number of rows for best performance.

db.collection.find( ... ).skip( <n> )
Skip <n> results.

db.collection.count()
Returns total number of documents in the collection.

db.collection.find( <query> ).count()
Returns the total number of documents that match the query.
The count() ignores limit() and skip(). For example, if 100 records match but the limit is 10,
count() will return 100. This will be faster than iterating yourself, but still take time.

db.collection.findOne( <query> )
Find and return a single document. Returns null if not found.
The following example selects a single document in the users collection with the name field equal to "Joe":
coll = db.users;
coll.findOne( { name: "Joe" } );
Internally, the findOne() method is the find() method with a limit(1).
See Query Documents (page 60) and Read Operations (page 31) documentation for more information and examples.
See http://docs.mongodb.org/manual/reference/operator to specify other query operators.
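The note that count() ignores limit() and skip() can be pictured with an in-memory model. The sketch below is plain JavaScript operating on an array, not shell code, and it does not contact a server; the sample documents and the status field are illustrative assumptions.

```javascript
// 150 sample documents, of which the first 100 match the query.
var docs = [];
for (var i = 0; i < 150; i++) {
  docs.push({ _id: i, status: (i < 100) ? "A" : "B" });
}

// Model of find( { status: "A" } ): every matching document.
var matching = docs.filter(function (doc) { return doc.status === "A"; });

// Model of skip(20).limit(10): a window over the matching documents.
var page = matching.slice(20, 20 + 10);

// Model of count(): the number of matches, regardless of the window.
var count = matching.length;   // 100, even though page holds only 10
```

The window only affects which documents are returned, while the count reflects everything the query matches.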
Error Checking Methods
The mongo shell provides numerous administrative database methods, including error checking methods. These
methods are:
Error Checking Methods    Description
db.getLastError()         Returns error message from the last operation.
db.getLastErrorObj()      Returns the error document from the last operation.
The following table lists some common methods to support database administration:
JavaScript Database Administration Methods    Description
db.cloneDatabase(<host>)                      Clone the current database from the <host> specified. The <host> database instance must be in noauth mode.
db.copyDatabase(<from>, <to>, <host>)         Copy the <from> database from the <host> to the <to> database on the current server. The <host> database instance must be in noauth mode.
db.fromColl.renameCollection(<toColl>)        Rename collection from fromColl to <toColl>.
db.repairDatabase()                           Repair and compact the current database. This operation can be very slow on large databases.
db.addUser( <user>, <pwd> )                   Add user to current database.
db.getCollectionNames()                       Get the list of all collections in the current database.
db.dropDatabase()                             Drops the current database.
See also administrative database methods for a full list of methods.
Opening Additional Connections
Description
Open a new database connection.
Open a connection to a new server using new
Mongo().
Use getDB() method of the connection to select a
database.
See also Opening New Connections (page 210) for more information on the opening new connections from the mongo
shell.
Miscellaneous
Method                         Description
Object.bsonsize(<document>)    Prints the BSON size of a <document>
See the MongoDB JavaScript API Documentation82 for a full list of JavaScript methods.
Additional Resources
Consider the following reference material that addresses the mongo shell and its interface:
http://docs.mongodb.org/manual/reference/program/mongo
http://docs.mongodb.org/manual/reference/method
http://docs.mongodb.org/manual/reference/operator
http://docs.mongodb.org/manual/reference/command
Aggregation Reference (page 308)
Additionally, the MongoDB source code repository includes a jstests directory83 which contains numerous mongo
shell scripts.
See also:
The MongoDB Manual contains administrative documentation and tutorials throughout several sections. See Replica
Set Tutorials (page 419) and Sharded Cluster Tutorials (page 506) for additional tutorials and information.
and hard nproc values to increase the process limit. See /etc/security/limits.d/90-nproc.conf file
as an example.
Resource Utilization
mongod and mongos each use threads and file descriptors to track connections and manage internal operations. This
section outlines the general resource utilization patterns for MongoDB. Use these figures in combination with the
actual information about your deployment and its use to determine ideal ulimit settings.
Generally, all mongod and mongos instances:
track each incoming connection with a file descriptor and a thread.
track each internal thread or pthread as a system process.
mongod
1 file descriptor for each data file in use by the mongod instance.
1 file descriptor for each journal file used by the mongod instance when journal is true.
In replica sets, each mongod maintains a connection to all other members of the set.
mongod uses background threads for a number of internal processes, including TTL collections (page 163), replication, and replica set health checks, which may require a small number of additional resources.
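The per-connection and per-file rules above can be combined into a rough estimate. The following sketch is only a lower bound under stated assumptions: estimateDescriptors() and its parameter names are hypothetical, the sample numbers are invented for illustration, and descriptors used for internal operations are ignored.

```javascript
// Rough lower bound on the file descriptors one mongod needs, following
// the rules listed above. Internal overhead is deliberately ignored.
function estimateDescriptors(deployment) {
  // Connections to every other member of the replica set.
  var otherMembers = Math.max(deployment.replicaSetMembers - 1, 0);
  return (
    deployment.connections +   // one per incoming connection
    deployment.dataFiles +     // one per data file in use
    deployment.journalFiles +  // one per journal file (journaling enabled)
    otherMembers
  );
}

// Invented sample deployment: 1000 + 40 + 3 + 2 descriptors.
var estimate = estimateDescriptors({
  connections: 1000,
  dataFiles: 40,
  journalFiles: 3,
  replicaSetMembers: 3
});
```

Comparing such an estimate against the open files limit gives a first indication of whether the limit leaves enough headroom.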
mongos
In addition to the threads and file descriptors for client connections, mongos must maintain connections to all config
servers and all shards, which includes all members of all replica sets.
For mongos, consider the following behaviors:
mongos instances maintain a connection pool to each shard so that the mongos can reuse connections and
quickly fulfill requests without needing to create new connections.
You can limit the number of incoming connections using the maxConns run-time option.
By restricting the number of incoming connections you can prevent a cascade effect where the mongos creates
too many connections on the mongod instances.
Note: You cannot set maxConns to a value higher than 20000.
Note: Both the hard and the soft ulimit affect MongoDB's performance. The hard ulimit refers to the
maximum number of processes that a user can have active at any time. This is the ceiling: no non-root process can
increase the hard ulimit. In contrast, the soft ulimit is the limit that is actually enforced for a session or
process, but any process can increase it up to the hard ulimit maximum.
A low soft ulimit can cause "can't create new thread, closing connection" errors if the number
of connections grows too high. For this reason, it is extremely important to set both ulimit values to the recommended values.
You can use the ulimit command at the system prompt to check system limits, as in the following example:
$ ulimit -a
-t: cpu time (seconds)            unlimited
-f: file size (blocks)            unlimited
-d: data seg size (kbytes)        unlimited
-s: stack size (kbytes)           8192
-c: core file size (blocks)       0
-m: resident set size (kbytes)    unlimited
-u: processes                     192276
-n: file descriptors              21000
-l: locked-in-memory size (kb)    40000
-v: address space (kb)            unlimited
-x: file locks                    unlimited
-i: pending signals               192276
-q: bytes in POSIX msg queues     819200
-e: max nice                      30
-r: max rt priority               65
-N 15:                            unlimited
ulimit refers to the per-user limitations for various resources. Therefore, if your mongod instance executes as
a user that is also running multiple processes, or multiple mongod processes, you might see contention for these
resources. Also, be aware that the processes value (i.e. -u) refers to the combined number of distinct processes
and sub-process threads.
You can change ulimit settings by issuing a command in the following form:
ulimit -n <value>
For many distributions of Linux you can change values by substituting the -n option for any possible value in the
output of ulimit -a. On OS X, use the launchctl limit command. See your operating system documentation
for the precise procedure for changing system limits on running systems.
Note: After changing the ulimit settings, you must restart the process to take advantage of the modified settings.
You can use the /proc file system to see the current limitations on a
running process.
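On Linux, the limits of a running process are exposed as text in a per-process limits file under /proc. The sketch below parses text in that layout into an object. It is a minimal illustration: parseLimits() is a hypothetical helper, the row names in the sample follow the usual Linux layout and are an assumption here, and the sample is a hard-coded string rather than a file read from a live system.

```javascript
// Parse limits text (columns separated by two or more spaces) into an
// object keyed by limit name.
function parseLimits(text) {
  var limits = {};
  text.split("\n").forEach(function (line) {
    var cols = line.trim().split(/\s{2,}/);
    // Skip the header row and anything that is not a full data row.
    if (cols.length < 3 || cols[0] === "Limit") { return; }
    limits[cols[0]] = { soft: cols[1], hard: cols[2], units: cols[3] || "" };
  });
  return limits;
}

// Hard-coded sample in the assumed layout; some rows carry no units.
var sample = [
  "Limit                     Soft Limit           Hard Limit           Units",
  "Max processes             192276               192276               processes",
  "Max open files            1024                 4096                 files",
  "Max nice priority         30                   30"
].join("\n");

var limits = parseLimits(sample);
var openFiles = limits["Max open files"];  // soft "1024", hard "4096"
```

Reading the soft and hard values separately makes it easy to spot a process whose soft limit is well below its hard limit.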
Depending on your system's configuration and default settings, any change to system limits made using ulimit
may revert following a system restart. Check your distribution and operating system documentation for more
information.
You can copy and paste this function into a current shell session or load it as part of a script. Call the function with
one of the following invocations:
return-limits mongod
return-limits mongos
return-limits mongod mongos
Soft Limit    Hard Limit    Units
unlimited     unlimited     seconds
unlimited     unlimited     bytes
unlimited     unlimited     bytes
8720000       unlimited     bytes
0             unlimited     bytes
unlimited     unlimited     bytes
192276        192276        processes
1024          4096          files
40960000      40960000      bytes
unlimited     unlimited     bytes
unlimited     unlimited     locks
192276        192276        signals
819200        819200        bytes
30            30
65            65
unlimited     unlimited     us
Recommended Settings
Every deployment may have unique requirements and settings; however, the following thresholds and settings are
particularly important for mongod and mongos deployments:
-f (file size): unlimited
-t (cpu time): unlimited
-v (virtual memory): unlimited [84]
-n (open files): 64000
-m (memory size): unlimited [84]
-u (processes/threads): 32000
[84] If you limit virtual or resident memory size on a system running MongoDB the operating system will refuse to honor additional allocation
requests.
Always remember to restart your mongod and mongos instances after changing the ulimit settings to make sure
that the settings change takes effect.
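Current limits can be compared against the numeric recommendations above with a small check. This is a sketch under stated assumptions: findLowLimits() and the openFiles and processes property names are hypothetical, and the sample values are taken from the ulimit -a output shown earlier in this section.

```javascript
// Numeric thresholds from the recommended settings.
var recommended = { openFiles: 64000, processes: 32000 };

// Return the names of limits that are set below the recommended value.
// A limit reported as "unlimited" always passes.
function findLowLimits(current, thresholds) {
  return Object.keys(thresholds).filter(function (name) {
    var value = current[name];
    if (value === "unlimited") { return false; }
    return Number(value) < thresholds[name];
  });
}

// Sample values: -n 21000 and -u 192276.
var low = findLowLimits({ openFiles: 21000, processes: 192276 }, recommended);
// low contains "openFiles" only
```

With these sample values, only the open files limit would need to be raised before restarting the mongod or mongos instance.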
Binary
Strict Mode
{
"$binary": "<bindata>",
"$type": "<t>"
}
{
"$binary": "<bindata>",
"$type": "<t>"
}
Date
Strict Mode
{
"$date": <date>
}
<date> is the JSON representation of a 64-bit signed integer for milliseconds since epoch UTC (unsigned
before version 1.9.1).
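The strict mode form of a date can be produced and consumed with ordinary JSON functions. The following is a minimal sketch: toStrictDate() and fromStrictDate() are hypothetical helper names, and the sample date is arbitrary.

```javascript
// Wrap a JavaScript Date as a strict mode document holding the number of
// milliseconds since the epoch (UTC).
function toStrictDate(date) {
  return { "$date": date.getTime() };
}

// Rebuild a JavaScript Date from a strict mode document.
function fromStrictDate(doc) {
  return new Date(doc["$date"]);
}

// Arbitrary sample date, built in UTC so the result does not depend on
// the local time zone.
var original = new Date(Date.UTC(2012, 11, 19, 6, 1, 17, 171));

var strict = toStrictDate(original);
var text = JSON.stringify(strict);               // valid JSON text
var restored = fromStrictDate(JSON.parse(text));
var roundTrips = (restored.getTime() === original.getTime());  // true
```

Because the value is a plain integer, any JSON parser can read the document without knowing about BSON dates.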
Timestamp
Strict Mode
{
"$timestamp": {
"t": <t>,
"i": <i>
}
}
{
"$timestamp": {
"t": <t>,
"i": <i>
}
}
<t> is the JSON representation of a 32-bit unsigned integer for seconds since epoch.
<i> is a 32-bit unsigned integer for the increment.
Regular Expression
Strict Mode
{
"$regex": "<sRegex>",
"$options": "<sOptions>"
}
/<jRegex>/<jOptions>
OID
Strict Mode
{
"$oid": "<id>"
}
{
"$oid": "<id>"
}
ObjectId( "<id>" )
DB Reference
Strict Mode
{
"$ref": "<name>",
"$id": "<id>"
}
{
"$ref" : "<name>",
"$id" : "<id>"
}
DBRef("<name>", "<id>")
Undefined Type
Strict Mode
{
"$undefined": true
}
undefined
undefined
MinKey
Strict Mode
{
"$minKey": 1
}
{
"$minKey": 1
}
MinKey
The representation of the MinKey BSON data type that compares lower than all other types. See What is the
compare order for BSON types? (page 563) for more information on comparison order for BSON types.
MaxKey
Strict Mode
{
"$maxKey": 1
}
{
"$maxKey": 1
}
MaxKey
The representation of the MaxKey BSON data type that compares higher than all other types. See What is the
compare order for BSON types? (page 563) for more information on comparison order for BSON types.
Output Reference
For any single operation, the documents created by the database profiler will include a subset of the following fields.
The precise selection of fields in these documents depends on the type of operation.
system.profile.ts
The timestamp of the operation.
system.profile.op
The type of operation. The possible values are:
insert
query
update
remove
getmore
command
system.profile.ns
The namespace the operation targets. Namespaces in MongoDB take the form of the database, followed by a
dot (.), followed by the name of the collection.
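A small sketch of taking a namespace string apart, assuming that the database name itself never contains a dot while the collection name may (as in system.profile). The helper name splitNamespace is hypothetical.

```javascript
// Hypothetical helper: split "<database>.<collection>" at the first dot only,
// because collection names such as "system.profile" contain dots themselves.
function splitNamespace(ns) {
  const dot = ns.indexOf(".");
  if (dot === -1) {
    return { db: ns, collection: null };
  }
  return { db: ns.slice(0, dot), collection: ns.slice(dot + 1) };
}

const parts = splitNamespace("test.system.profile");
console.log(parts.db, parts.collection);
```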
system.profile.query
The query document (page 60) used.
system.profile.command
The command operation.
system.profile.updateobj
The <update> document passed in during an update (page 42) operation.
system.profile.cursorid
The ID of the cursor accessed by a getmore operation.
system.profile.ntoreturn
Changed in version 2.2: In 2.0, MongoDB includes this field for query and command operations. In 2.2,
MongoDB also includes this field for getmore operations.
The number of documents the operation specified to return. For example, the profile command would
return one document (a results document) so the ntoreturn (page 232) value would be 1. The limit(5)
command would return five documents so the ntoreturn (page 232) value would be 5.
If the ntoreturn (page 232) value is 0, the command did not specify a number of documents to return, as
would be the case with a simple find() command with no limit specified.
system.profile.ntoskip
New in version 2.2.
The number of documents the skip() method specified to skip.
system.profile.nscanned
The number of documents that MongoDB scans in the index (page 313) in order to carry out the operation.
In general, if nscanned (page 232) is much higher than nreturned (page 233), the database is scanning
many objects to find the target objects. Consider creating an index to improve this.
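The comparison between nscanned and nreturned can be sketched as a simple filter over profiler documents. The sample documents, the helper name inefficientOps, and the threshold of 100 are all made up for illustration; in practice the entries would come from the system.profile collection.

```javascript
// Made-up sample entries shaped like profiler documents.
const sampleProfile = [
  { op: "query", ns: "test.users", nscanned: 12000, nreturned: 3, millis: 140 },
  { op: "query", ns: "test.orders", nscanned: 20, nreturned: 20, millis: 1 },
  { op: "update", ns: "test.users", nscanned: 1, millis: 0 }
];

// Hypothetical helper: keep entries where nscanned exceeds nreturned
// by more than the given factor. Entries without both fields are skipped.
function inefficientOps(entries, factor) {
  return entries.filter(function (entry) {
    if (typeof entry.nscanned !== "number" || typeof entry.nreturned !== "number") {
      return false;
    }
    // Treat zero returned documents as one to avoid a zero threshold
    return entry.nscanned > factor * Math.max(entry.nreturned, 1);
  });
}

const suspects = inefficientOps(sampleProfile, 100);
console.log(suspects.map(function (entry) { return entry.ns; }));
```

Operations flagged this way are candidates for a new or better index, not proof that one is missing.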
system.profile.scanAndOrder
scanAndOrder (page 232) is a boolean that is true when a query cannot use the order of documents in the
index for returning sorted results: MongoDB must sort the documents after it receives the documents from a
cursor.
If scanAndOrder (page 232) is false, MongoDB can use the order of the documents in an index to return
sorted results.
system.profile.moved
This field appears with a value of true when an update operation moved one or more documents to a new
location on disk. If the operation did not result in a move, this field does not appear. Operations that result in a
move take more time than in-place updates and typically occur as a result of document growth.
system.profile.nmoved
New in version 2.2.
The number of documents the operation moved on disk. This field appears only if the operation resulted in a
move. The field's implicit value is zero, and the field is present only when non-zero.
system.profile.nupdated
New in version 2.2.
The number of documents updated by the operation.
system.profile.keyUpdates
New in version 2.2.
The number of index (page 313) keys the update changed in the operation. Changing an index key carries a
small performance cost because the database must remove the old key and insert a new key into the B-tree
index.
system.profile.numYield
New in version 2.2.
The number of times the operation yielded to allow other operations to complete. Typically, operations yield
when they need access to data that MongoDB has not yet fully read into memory. This allows other operations
that have data in memory to complete while MongoDB reads in data for the yielding operation. For more
information, see the FAQ on when operations yield (page 570).
system.profile.lockStats
New in version 2.2.
The time in microseconds the operation spent acquiring and holding locks. This field reports data for the
following lock types:
R - global read lock
W - global write lock
r - database-specific read lock
w - database-specific write lock
system.profile.lockStats.timeLockedMicros
The time in microseconds the operation held a specific lock. For operations that require more than one
lock, like those that lock the local database to update the oplog, this value may be longer than the total
length of the operation (i.e. millis (page 233).)
system.profile.lockStats.timeAcquiringMicros
The time in microseconds the operation spent waiting to acquire a specific lock.
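To make the relationship between the lock timings and millis concrete, the sketch below sums the per-lock-type values of a made-up profiler entry. The entry, its numbers, and the helper name sumMicros are illustrative assumptions; only the field names come from the reference above.

```javascript
// Made-up profiler entry using the lockStats field names described above.
const entry = {
  op: "update",
  millis: 1,
  lockStats: {
    timeLockedMicros: { r: 0, w: 1400 },
    timeAcquiringMicros: { r: 0, w: 7 }
  }
};

// Hypothetical helper: add up the microsecond values across all lock types.
function sumMicros(byLockType) {
  return Object.keys(byLockType || {}).reduce(function (total, lockType) {
    return total + byLockType[lockType];
  }, 0);
}

const lockedMicros = sumMicros(entry.lockStats.timeLockedMicros);
const acquiringMicros = sumMicros(entry.lockStats.timeAcquiringMicros);
// millis is in milliseconds, so convert before comparing
const lockedLongerThanOp = lockedMicros > entry.millis * 1000;
console.log(lockedMicros, acquiringMicros, lockedLongerThanOp);
```

A locked total larger than the operation time is not an error; as noted above, it can happen when an operation holds more than one lock.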
system.profile.nreturned
The number of documents returned by the operation.
system.profile.responseLength
The length in bytes of the operation's result document. A large responseLength (page 233) can affect
performance. To limit the size of the result document for a query operation, you can use any of the following:
Projections (page 64)
The limit() method
The batchSize() method
Note: When MongoDB writes query profile information to the log, the responseLength (page 233) value
is in a field named reslen.
system.profile.millis
The time in milliseconds from the perspective of the mongod from the beginning of the operation to the end of
the operation.
system.profile.client
The IP address or hostname of the client connection where the operation originates.
For some operations, such as db.eval(), the client is 0.0.0.0:0 instead of an actual client.
system.profile.user
The authenticated user who ran the operation.
map your existing on-disk data files to the shared view virtual memory view. The operating system maps the files
but does not load them. MongoDB later loads data files into the shared view as needed.
The private view stores data for use with read operations (page 31). The private view is the first place
MongoDB applies new write operations (page 42). Upon a journal commit, MongoDB copies the changes made in
the private view to the shared view, where they are then available for uploading to the database data files.
The journal is an on-disk view that stores new write operations after MongoDB applies the operation to the private
view but before applying them to the data files. The journal provides durability. If the mongod instance were to
crash without having applied the writes to the data files, the journal could replay the writes to the shared view for
eventual upload to the data files.
How Journaling Records Write Operations
MongoDB copies the write operations to the journal in batches called group commits. These group commits help
minimize the performance impact of journaling, since a group commit must block all writers during the commit. See
journalCommitInterval for information on the default commit interval.
Journaling stores raw operations that allow MongoDB to reconstruct the following:
document insertion/updates
index modifications
metadata changes to the namespace files
creation and dropping of databases and their associated data files
As write operations (page 42) occur, MongoDB writes the data to the private view in RAM and then copies the
write operations in batches to the journal. The journal stores the operations on disk to ensure durability. Each journal
entry describes the bytes the write operation changed in the data files.
MongoDB next applies the journal's write operations to the shared view. At this point, the shared view
becomes inconsistent with the data files.
At default intervals of 60 seconds, MongoDB asks the operating system to flush the shared view to disk. This
brings the data files up-to-date with the latest write operations. The operating system may choose to flush the shared
view to disk at a higher frequency than 60 seconds, particularly if the system is low on free memory.
When MongoDB flushes write operations to the data files, MongoDB notes which journal writes have been flushed.
Once a journal file contains only flushed writes, it is no longer needed for recovery, and MongoDB either deletes it or
recycles it for a new journal file.
As part of journaling, MongoDB routinely asks the operating system to remap the shared view to the private
view, in order to save physical RAM. Upon a new remapping, the operating system knows that physical memory
pages can be shared between the shared view and the private view mappings.
Note: The interaction between the shared view and the on-disk data files is similar to how MongoDB works
without journaling, which is that MongoDB asks the operating system to flush in-memory changes back to the data
files every 60 seconds.
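The sequence described in this section (private view, group commit to the journal, shared view, flush to the data files) can be pictured with a toy model. This is only a sketch of the ordering of steps under heavy simplification: the class ToyJournaledStore and its methods are hypothetical, plain objects stand in for memory-mapped files, and nothing here reflects MongoDB's actual implementation.

```javascript
// Toy model only: names are hypothetical, not MongoDB internals.
class ToyJournaledStore {
  constructor() {
    this.privateView = {}; // where new writes are applied first
    this.pending = [];     // writes applied to the private view, not yet journaled
    this.journal = [];     // durable log of writes not yet flushed to data files
    this.sharedView = {};  // in-memory mapping of the data files
    this.dataFiles = {};   // on-disk data files
  }

  write(key, value) {
    this.privateView[key] = value;
    this.pending.push({ key: key, value: value });
  }

  // Group commit: copy the batch to the journal, then apply it to the shared view.
  groupCommit() {
    const batch = this.pending;
    this.pending = [];
    for (const op of batch) {
      this.journal.push(op);
      this.sharedView[op.key] = op.value;
    }
  }

  // Flush: bring the data files up to date; flushed journal entries are no longer needed.
  flush() {
    Object.assign(this.dataFiles, this.sharedView);
    this.journal = [];
  }

  // Crash: memory is lost; only the journal and the data files survive.
  crashAndRecover() {
    this.pending = [];
    this.sharedView = Object.assign({}, this.dataFiles);
    for (const op of this.journal) {
      this.sharedView[op.key] = op.value; // replay journaled writes
    }
    this.privateView = Object.assign({}, this.sharedView);
  }
}

const store = new ToyJournaledStore();
store.write("a", 1);
store.groupCommit();     // "a" is now in the journal
store.write("b", 2);     // "b" exists only in the private view
store.crashAndRecover(); // "a" is replayed, "b" is lost
console.log(store.sharedView);
```

The model shows why a write is durable once it reaches the journal, and why writes made after the last group commit can be lost in a crash.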
2
The specified options are in error or are incompatible with other options.
3
Returned by mongod if there is a mismatch between hostnames specified on the command line and in the
local.sources (page 474) collection. mongod may also return this status if the oplog collection in the local
database is not readable.
4
The version of the database is different from the version supported by the mongod (or mongod.exe) instance.
The instance exits cleanly. Restart mongod with the --upgrade option to upgrade the database to the version
supported by this mongod instance.
5
Returned by mongod if a moveChunk operation fails to confirm a commit.
12
Returned by the mongod.exe process on Windows when it receives a Control-C, Close, Break or Shutdown
event.
14
Returned by MongoDB applications which encounter an unrecoverable error, an uncaught exception or uncaught
signal. The system exits without performing a clean shut down.
20
Message: ERROR: wsastartup failed <reason>
Returned by MongoDB applications on Windows following an error in the WSAStartup function.
Message: NT Service Error
Returned by MongoDB applications for Windows due to failures installing, starting or removing the NT Service
for the application.
45
Returned when a MongoDB application cannot open a file or cannot obtain a lock on a file.
47
MongoDB applications exit cleanly following a large clock skew (32768 milliseconds) event.
48
mongod exits cleanly if the server socket closes. The server socket is on port 27017 by default, or as specified
by the --port run-time option.
49
Returned by mongod.exe or mongos.exe on Windows when either receives a shutdown message from the
Windows Service Control Manager.
100
Returned by mongod when the process throws an uncaught exception.
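For scripts that supervise MongoDB processes, the codes listed above can be kept in a small lookup table. The sketch below paraphrases the descriptions from this section and covers only the codes that appear here; the names EXIT_CODES and describeExit are hypothetical.

```javascript
// Descriptions paraphrased from the list above; only codes shown there are included.
const EXIT_CODES = {
  2: "options are in error or incompatible with other options",
  3: "hostname mismatch with local.sources, or oplog collection not readable",
  4: "database version differs from the version supported by this mongod",
  5: "moveChunk operation failed to confirm a commit",
  12: "mongod.exe received a Control-C, Close, Break or Shutdown event",
  14: "unrecoverable error, uncaught exception or uncaught signal",
  20: "WSAStartup failure or NT Service error on Windows",
  45: "cannot open a file or cannot obtain a lock on a file",
  47: "clean exit following a large clock skew event",
  48: "clean exit because the server socket closed",
  49: "shutdown message from the Windows Service Control Manager",
  100: "mongod threw an uncaught exception"
};

// Hypothetical helper: look up a description, with a fallback for unlisted codes.
function describeExit(code) {
  return EXIT_CODES[code] || "not described in this list";
}

console.log(describeExit(48));
```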
CHAPTER 5
Security
This section outlines basic security and risk management strategies and access control. The included tutorials outline
specific tasks for configuring firewalls, authentication, and system privileges.
Security Introduction (page 237) A high-level introduction to security and MongoDB deployments.
Security Concepts (page 239) The core documentation of security.
Access Control (page 239) Control access to MongoDB instances using authentication and authorization.
Network Exposure and Security (page 242) Discusses potential security risks related to the network and strategies for decreasing possible network-based attack vectors for MongoDB.
Security and MongoDB API Interfaces (page 244) Discusses potential risks related to MongoDB's
JavaScript, HTTP and REST interfaces, including strategies to control those risks.
Sharded Cluster Security (page 241) MongoDB controls access to sharded clusters with key files.
Security Tutorials (page 245) Tutorials for enabling and configuring security features for MongoDB.
Create a Vulnerability Report (page 265) Report a vulnerability in MongoDB.
Network Security Tutorials (page 245) Ensure that the underlying network configuration supports a secure operating environment for MongoDB deployments, and appropriately limits access to MongoDB deployments.
Access Control Tutorials (page 257) MongoDB's access control system provides role-based access control for
limiting access to MongoDB deployments. These tutorials describe procedures relevant for the operation
and maintenance of this access control system.
Security Reference (page 267) Reference for security related functions.
The intent of a Defense In Depth approach is to ensure there are no exploitable points of failure in your deployment
that could allow an intruder or un-trusted party to access the data stored in the MongoDB database. The easiest and
most effective way to reduce the risk of exploitation is to run MongoDB in a trusted environment, limit access, follow
a system of least privilege, and follow best development and deployment practices.
Authorization
MongoDB provisions authorization, or access to databases and operations, on a per-database level. MongoDB uses
a role-based approach to authorization, storing each user's roles in a privilege document (page 267) in a database's
system.users (page 272) collection. For more information on privilege documents and available user roles, see
system.users Privilege Documents (page 272) and User Privilege Roles in MongoDB (page 267).
Important: The admin database provides roles that are unavailable in other databases, including a role that effectively makes a user a MongoDB system superuser. See Database Administration Roles (page 269) and Administrative
Roles (page 269).
To assign roles to users, you must be a user with an administrative role in the database. As such, you must first create an
administrative user. For details, see Create a User Administrator (page 258) and Add a User to a Database (page 259).
system.users Collection
A database's system.users (page 272) collection stores information for authentication and authorization to that
database. Specifically, the collection stores user credentials for authentication and user privilege information for
authorization. MongoDB requires authorization to access the system.users (page 272) collection in order to
prevent privilege escalation attacks. To access the collection, you must have either the userAdmin (page 269) or
the userAdminAnyDatabase (page 271) role.
Changed in version 2.4: The schema of system.users (page 272) changed to accommodate a more sophisticated
user privilege model for authorization, as defined in privilege documents (page 267).
keyFile = /srv/mongodb/keyfile
Note: You may choose to set these run-time configuration options using the --keyFile (or mongos --keyFile)
options on the command line.
Setting keyFile enables authentication and specifies a key file for the replica set members to use when authenticating
to each other. The content of the key file is arbitrary but must be the same on all members of the replica set and on all
mongos instances that connect to the set.
The key file must be between 6 and 1024 characters and may only contain characters in the base64 set. The key file
must not have group or world permissions on UNIX systems. See Generate a Key File (page 261) for instructions
on generating a key file.
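The length and character constraints above can be checked before deploying a key file. The following sketch tests a string against them; the function name isPlausibleKeyFileContent is hypothetical, and stripping whitespace before counting is an assumption made here for convenience rather than something stated in this section. It does not check file permissions.

```javascript
// Hypothetical check of key file content against the constraints above:
// between 6 and 1024 characters, all from the base64 set.
function isPlausibleKeyFileContent(text) {
  // Assumption: surrounding and embedded whitespace (such as a trailing
  // newline) is ignored before the content is measured.
  const key = text.replace(/\s+/g, "");
  if (key.length < 6 || key.length > 1024) {
    return false;
  }
  // base64 set: letters, digits, "+", "/", and "=" padding
  return /^[A-Za-z0-9+\/=]+$/.test(key);
}

console.log(isPlausibleKeyFileContent("c2VjcmV0a2V5\n"));
console.log(isPlausibleKeyFileContent("abc"));
console.log(isPlausibleKeyFileContent("bad key!"));
```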
The nohttpinterface setting for mongod and mongos instances disables the home status page, which would
run on port 28017 by default. The status interface is read-only by default. You may also specify this option on the
command line as mongod --nohttpinterface or mongos --nohttpinterface.
Authentication does not control or affect access to this interface.
Important: Disable this option for production deployments. If you do leave this interface enabled, you should only
allow trusted clients to access this port. See Firewalls (page 243).
rest
The rest setting for mongod enables a fully interactive administrative REST interface, which is disabled by default.
The status interface, which is enabled by default, is read-only. This configuration makes that interface fully interactive.
The REST interface does not support any authentication and you should always restrict access to this interface to only
allow trusted clients to connect to this port.
You may also enable this interface on the command line as mongod --rest.
Important: Disable this option for production deployments. If you do leave this interface enabled, you should only
allow trusted clients to access this port.
bind_ip
The bind_ip setting for mongod and mongos instances limits the network interfaces on which MongoDB programs
will listen for incoming connections. You can also specify a number of interfaces by passing bind_ip a comma
separated list of IP addresses. You can use the mongod --bind_ip and mongos --bind_ip options on the
command line at run time to limit the network accessibility of a MongoDB program.
Important: Make sure that your mongod and mongos instances are only accessible on trusted networks. If your
system has more than one network interface, bind MongoDB programs to the private or internal network interface.
port
The port setting for mongod and mongos instances changes the main port on which the mongod or mongos
instance listens for connections. The default port is 27017. Changing the port does not meaningfully reduce risk or
limit exposure. You may also specify this option on the command line as mongod --port or mongos --port.
Setting port also indirectly sets the port for the HTTP status interface, which is always available on the port numbered
1000 greater than the primary mongod port.
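Because the HTTP status port is derived from the main port, a firewall configuration has to account for both. The arithmetic is trivial, but a sketch makes the pairing explicit; the helper name httpStatusPort is hypothetical, and the ports 27018 and 27019 are taken from the shard and config server defaults mentioned elsewhere in this chapter.

```javascript
// Hypothetical helper: the HTTP status interface listens on the port
// numbered 1000 greater than the primary port.
function httpStatusPort(port) {
  return (port === undefined ? 27017 : port) + 1000;
}

[27017, 27018, 27019].forEach(function (port) {
  console.log(port + " -> " + httpStatusPort(port));
});
```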
Only allow trusted clients to connect to the port for the mongod and mongos instances. See Firewalls (page 243).
See also Security Considerations (page 147) and Default MongoDB Port (page 275).
Firewalls
Firewalls allow administrators to filter and control access to a system by providing granular control over network
communications. For administrators of MongoDB, the following capabilities are important: limiting incoming traffic
on a specific port to specific systems, and limiting incoming traffic from untrusted hosts.
On Linux systems, the iptables interface provides access to the underlying netfilter firewall. On Windows
systems, netsh command line interface provides access to the underlying Windows Firewall. For additional information about firewall configuration, see Configure Linux iptables Firewall for MongoDB (page 245) and Configure
Windows netsh Firewall for MongoDB (page 249).
For best results and to minimize overall exposure, ensure that only traffic from trusted sources can reach mongod and
mongos instances and that the mongod and mongos instances can only connect to trusted outputs.
See also:
For MongoDB deployments on Amazon's web services, see the Amazon EC2 page (footnote 7), which addresses Amazon's
Security Groups and other EC2-specific security features.
Virtual Private Networks
Virtual private networks, or VPNs, make it possible to link two networks over an encrypted and limited-access trusted
network. Typically, MongoDB users who use VPNs use SSL rather than IPSEC VPNs for performance reasons.
Depending on configuration and implementation, VPNs provide for certificate validation and a choice of encryption
protocols, which requires a rigorous level of authentication and identification of all clients. Furthermore, because
VPNs provide a secure tunnel, by using a VPN connection to control access to your MongoDB instance, you can
prevent tampering and man-in-the-middle attacks.
7 http://docs.mongodb.org/ecosystem/platforms/amazon-ec2
The mongo program can evaluate JavaScript expressions using the command line --eval option. Also, the mongo
program can evaluate a JavaScript file (.js) passed directly to it (e.g. mongo someFile.js).
Because the mongo program evaluates the JavaScript directly, inputs should only come from trusted sources.
.mongorc.js File
If a .mongorc.js file exists 8 , the mongo shell will evaluate a .mongorc.js file before starting. You can disable
this behavior by passing the mongo --norc option.
HTTP Status Interface
The HTTP status interface provides a web-based interface that includes a variety of operational data, logs, and status
reports regarding the mongod or mongos instance. The HTTP interface is always available on the port numbered
1000 greater than the primary mongod port. By default, the HTTP interface port is 28017, but is indirectly set using
the port option which allows you to configure the primary mongod port.
Without the rest setting, this interface is entirely read-only, and limited in scope; nevertheless, this interface
may represent an exposure. To disable the HTTP interface, set the nohttpinterface run time option or the
--nohttpinterface command line option. See also Configuration Options (page 242).
REST API
The REST API to MongoDB provides additional information and write access on top of the HTTP Status interface.
While the REST API does not provide any support for insert, update, or remove operations, it does provide administrative access, and its accessibility represents a vulnerability in a secure environment. The REST interface is disabled
by default, and is not recommended for production use.
If you must use the REST API, please control and limit access to the REST API. The REST API does not include any
support for authentication, even when running with auth enabled.
See the following documents for instructions on restricting access to the REST API interface:
Configure Linux iptables Firewall for MongoDB (page 245)
Configure Windows netsh Firewall for MongoDB (page 249)
8 On Linux and Unix systems, mongo reads the .mongorc.js file from $HOME/.mongorc.js (i.e. ~/.mongorc.js). On Windows,
mongo.exe reads the .mongorc.js file from %HOME%.mongorc.js or %HOMEDRIVE%%HOMEPATH%.mongorc.js.
For MongoDB deployments on Amazon's web services, see the Amazon EC2 page (footnote 9), which addresses Amazon's
Security Groups and other EC2-specific security features.
Overview
Rules in iptables configurations fall into chains, which describe the process for filtering and processing specific
streams of traffic. Chains have an order, and packets must pass through earlier rules in a chain to reach later rules.
This document addresses only the following two chains:
INPUT Controls all incoming traffic.
OUTPUT Controls all outgoing traffic.
Given the default ports (page 242) of all MongoDB processes, you must configure networking rules that permit only
required communication between your application and the appropriate mongod and mongos instances.
Be aware that the default policy of iptables is to allow all connections and traffic unless explicitly
disabled. The configuration changes outlined in this document will create rules that explicitly allow traffic from
specific addresses and on specific ports, using a default policy that drops all traffic that is not explicitly allowed. When
you have properly configured your iptables rules to allow only the traffic that you want to permit, you can Change
Default Policy to DROP (page 248).
Patterns
This section contains a number of patterns and examples for configuring iptables for use with MongoDB deployments. If you have configured different ports using the port configuration setting, you will need to modify the rules
accordingly.
Traffic to and from mongod Instances
This pattern is applicable to all mongod instances running as standalone
instances or as part of a replica set.
The goal of this pattern is to explicitly allow traffic to the mongod instance from the application server. In the
following examples, replace <ip-address> with the IP address of the application server:
The first rule allows all incoming traffic from <ip-address> on port 27017, which allows the application server to
connect to the mongod instance. The second rule allows outgoing traffic from the mongod to reach the application
server.
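The rule pair described above can be generated for a given application server address. The example commands themselves are not reproduced in this text, so the rule strings in the sketch below are an assumption about what such a pair might look like, built only from standard iptables options (-A, -s, -d, -p, --destination-port, --source-port, -m state, -j). The function name mongodRules is hypothetical; review any generated rule before applying it.

```javascript
// Hypothetical generator for an inbound/outbound rule pair. The exact rule
// text is an assumption, not a quotation from the manual.
function mongodRules(ipAddress, port) {
  const p = port === undefined ? 27017 : port;
  return [
    // Allow new and established connections from the application server
    "iptables -A INPUT -s " + ipAddress + " -p tcp --destination-port " + p +
      " -m state --state NEW,ESTABLISHED -j ACCEPT",
    // Allow replies on established connections back to the application server
    "iptables -A OUTPUT -d " + ipAddress + " -p tcp --source-port " + p +
      " -m state --state ESTABLISHED -j ACCEPT"
  ];
}

const rules = mongodRules("198.51.100.55");
rules.forEach(function (rule) { console.log(rule); });
```

Generating the strings rather than typing them makes it easier to keep the inbound and outbound rules consistent when the port or address changes.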
Optional
If you have only one application server, you can replace <ip-address> with the IP address itself, such as
198.51.100.55. You can also express this using CIDR notation as 198.51.100.55/32. If you want to permit
a larger block of possible IP addresses, you can allow traffic from a /24
using one of the following specifications for the <ip-address>:
10.10.10.10/24
10.10.10.10/255.255.255.0
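To see which addresses a CIDR specification such as 10.10.10.10/24 admits, the sketch below tests IPv4 addresses against a block. The helper names ipToInt and inCidr are hypothetical, and the code assumes well-formed dotted-quad input with a prefix length between 0 and 32.

```javascript
// Hypothetical helpers for IPv4 only; input validation is omitted.
function ipToInt(ip) {
  // Use arithmetic rather than bit shifts to avoid signed 32-bit overflow
  return ip.split(".").reduce(function (acc, octet) {
    return acc * 256 + Number(octet);
  }, 0);
}

function inCidr(ip, cidr) {
  const parts = cidr.split("/");
  const prefixBits = Number(parts[1]);
  // Number of addresses in the block, e.g. 256 for a /24
  const blockSize = Math.pow(2, 32 - prefixBits);
  return Math.floor(ipToInt(ip) / blockSize) ===
         Math.floor(ipToInt(parts[0]) / blockSize);
}

console.log(inCidr("10.10.10.200", "10.10.10.10/24"));
console.log(inCidr("10.10.11.1", "10.10.10.10/24"));
console.log(inCidr("198.51.100.55", "198.51.100.55/32"));
```

A /32 matches exactly one address, while a /24 matches every address that shares the first three octets.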
9 http://docs.mongodb.org/ecosystem/platforms/amazon-ec2
Traffic to and from mongos Instances
mongos instances provide query routing for sharded clusters. Clients
connect to mongos instances, which behave from the client's perspective as mongod instances. In turn, the mongos
connects to all mongod instances that are components of the sharded cluster.
Use the same iptables command to allow traffic to and from these instances as you would from the mongod
instances that are members of the replica set. Take the configuration outlined in the Traffic to and from mongod
Instances (page 246) section as an example.
Traffic to and from a MongoDB Config Server
Config servers host the config database that stores metadata
for sharded clusters. Each production cluster has three config servers, initiated using the mongod --configsvr
option. 10 Config servers listen for connections on port 27019. As a result, add the following iptables rules to the
config server to allow incoming and outgoing connections on port 27019, for connection to the other config servers.
Replace <ip-address> with the address or address space of all the mongod instances that act as config servers.
Additionally, config servers need to allow incoming connections from all of the mongos instances in the cluster and
all mongod instances in the cluster. Add rules that resemble the following:
Replace <ip-address> with the address of the mongos instances and the shard mongod instances.
Traffic to and from a MongoDB Shard Server
Shard servers run as mongod --shardsvr. 11 Because
the default port number when running with shardsvr is 27018, you must configure the following iptables rules
to allow traffic to and from each shard:
Replace the <ip-address> specification with the IP address of all mongod instances. This allows you to permit incoming
and outgoing traffic between all shards, including constituent replica set members, to:
all mongod instances in the shard's replica sets.
all mongod instances in other shards. 12
10 You can also run a config server by setting the configsvr option in a configuration file.
11 You can also specify the shard server option using the shardsvr setting in the configuration file. Shard members are also often conventional
replica sets using the default port.
12 All shards in a cluster need to be able to communicate with all other shards to facilitate chunk and balancing operations.
2. If your monitoring system needs access to the HTTP interface, insert the following rule to the chain:
Replace <ip-address> with the address of the instance that needs access to the HTTP or REST interface.
For all deployments, you should restrict access to this port to only the monitoring instance.
Optional
For shard server mongod instances running with shardsvr, the rule would resemble the following:
For config server mongod instances running with configsvr, the rule would resemble the following:
The default policy for iptables chains is to allow all traffic. After completing all iptables configuration changes,
you must change the default policy to DROP so that all traffic that isn't explicitly allowed as above will not be able to
reach components of the MongoDB deployment. Issue the following commands to change this policy:
iptables -P INPUT DROP
iptables -P OUTPUT DROP
This section contains a number of basic operations for managing and using iptables. There are various front end
tools that automate some aspects of iptables configuration, but at the core all iptables front ends provide the
same basic functionality:
Make all iptables Rules Persistent
By default all iptables rules are only stored in memory. When your
system restarts, your firewall rules will revert to their defaults. When you have tested a rule set and have guaranteed
that it effectively controls traffic, you should make the rule set persistent using one of the following operations.
On Red Hat Enterprise Linux, Fedora Linux, and related distributions you can issue the following command:
service iptables save
On Debian, Ubuntu, and related distributions, you can use the following command to dump the iptables rules to
the /etc/iptables.conf file:
iptables-save > /etc/iptables.conf
Place this command in your rc.local file, or in the /etc/network/if-up.d/iptables file with other
similar operations.
List all iptables Rules
To list all currently applied iptables rules, use the following operation at the system
shell.
iptables -L
Flush all iptables Rules
If you make a configuration mistake when entering iptables rules or simply need to
revert to the default rule set, you can use the following operation at the system shell to flush all rules:
iptables -F
If you've already made your iptables rules persistent, you will need to repeat the appropriate procedure in the
Make all iptables Rules Persistent (page 248) section.
Configure Windows netsh Firewall for MongoDB
On Windows Server systems, the netsh program provides methods for managing the Windows Firewall. These
firewall rules make it possible for administrators to control what hosts can connect to the system, and limit risk
exposure by limiting the hosts that can connect to a system.
This document outlines basic Windows Firewall configurations. Use these approaches as a starting point for your
larger networking organization. For a detailed overview of security practices and risk management for MongoDB, see
Security Concepts (page 239).
See also:
Windows Firewall documentation (footnote 13) from Microsoft.
Overview
Windows Firewall processes rules in an order determined by rule type, parsing them in the following order:
1. Windows Service Hardening
2. Connection security rules
3. Authenticated Bypass Rules
4. Block Rules
5. Allow Rules
6. Default Rules
By default, the policy in Windows Firewall allows all outbound connections and blocks all incoming connections.
Given the default ports (page 242) of all MongoDB processes, you must configure networking rules that permit only
required communication between your application and the appropriate mongod.exe and mongos.exe instances.
The configuration changes outlined in this document will create rules which explicitly allow traffic from specific
addresses and on specific ports, using a default policy that drops all traffic that is not explicitly allowed.
You can configure the Windows Firewall using the netsh command line tool or through a Windows application.
On Windows Server 2008 this application is Windows Firewall With Advanced Security in Administrative Tools. On
previous versions of Windows Server, access the Windows Firewall application in the System and Security control
panel.
The procedures in this document use the netsh command line tool.
13 http://technet.microsoft.com/en-us/network/bb545423.aspx
Patterns
This section contains a number of patterns and examples for configuring Windows Firewall for use with MongoDB
deployments. If you have configured different ports using the port configuration setting, you will need to modify the
rules accordingly.
Traffic to and from mongod.exe Instances
This pattern is applicable to all mongod.exe instances running as
standalone instances or as part of a replica set. The goal of this pattern is to explicitly allow traffic to the mongod.exe
instance from the application server.
netsh advfirewall firewall add rule name="Open mongod port 27017" dir=in action=allow protocol=TCP lo
This rule allows all incoming traffic to port 27017, which allows the application server to connect to the
mongod.exe instance.
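A port-based rule of this kind can be assembled programmatically when several ports need the same treatment. The command lines in this text are cut off, so the sketch below does not attempt to reproduce them; it is an assumption built from the standard netsh advfirewall parameters name, dir, action, protocol, and localport. The helper name netshAllowPort is hypothetical.

```javascript
// Hypothetical builder for an inbound allow rule on a single TCP port.
// The use of localport is an assumption; the manual's own command text
// is truncated and is not reproduced here.
function netshAllowPort(ruleName, port) {
  return 'netsh advfirewall firewall add rule name="' + ruleName + '"' +
    " dir=in action=allow protocol=TCP localport=" + port;
}

const portRules = [
  netshAllowPort("Open mongod port 27017", 27017),
  netshAllowPort("Open mongod shard port 27018", 27018),
  netshAllowPort("Open mongod config svr port 27019", 27019)
];
portRules.forEach(function (rule) { console.log(rule); });
```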
Windows Firewall also allows enabling network access for an entire application rather than to a specific port, as in the
following example:
netsh advfirewall firewall add rule name="Allowing mongod" dir=in action=allow program=" C:\mongodb\b
You can allow all access for a mongos.exe server, with the following invocation:
netsh advfirewall firewall add rule name="Allowing mongos" dir=in action=allow program=" C:\mongodb\b
Traffic to and from mongos.exe Instances
mongos.exe instances provide query routing for sharded clusters.
Clients connect to mongos.exe instances, which behave from the client's perspective as mongod.exe instances.
In turn, the mongos.exe connects to all mongod.exe instances that are components of the sharded cluster.
Use the same Windows Firewall command to allow traffic to and from these instances as you would from the
mongod.exe instances that are members of the replica set.
netsh advfirewall firewall add rule name="Open mongod shard port 27018" dir=in action=allow protocol=
Traffic to and from a MongoDB Config Server Configuration servers host the config database that stores metadata for sharded clusters. Each production cluster has three configuration servers, initiated using the mongod
--configsvr option. 14 Configuration servers listen for connections on port 27019. As a result, add the following Windows Firewall rules to the config server to allow incoming and outgoing connections on port 27019, for
connection to the other config servers.
netsh advfirewall firewall add rule name="Open mongod config svr port 27019" dir=in action=allow prot
Additionally, config servers need to allow incoming connections from all of the mongos.exe instances in the cluster
and all mongod.exe instances in the cluster. Add rules that resemble the following:
netsh advfirewall firewall add rule name="Open mongod config svr inbound" dir=in action=allow protoco
Replace <ip-address> with the addresses of the mongos.exe instances and the shard mongod.exe instances.
Traffic to and from a MongoDB Shard Server Shard servers run as mongod --shardsvr. 15 Because
the default port number when running with shardsvr is 27018, you must configure the following Windows Firewall
rules to allow traffic to and from each shard:
14 You can also run a config server by setting the configsvr option in a configuration file.
15 You can also specify the shard server option using the shardsvr setting in the configuration file. Shard members are also often conventional
replica sets using the default port.
netsh advfirewall firewall add rule name="Open mongod shardsvr inbound" dir=in action=allow protocol=
netsh advfirewall firewall add rule name="Open mongod shardsvr outbound" dir=out action=allow protoco
Replace the <ip-address> specification with the IP address of all mongod.exe instances. This allows you to
permit incoming and outgoing traffic between all shards, including constituent replica set members, to:
all mongod.exe instances in the shard's replica sets.
all mongod.exe instances in other shards. 16
netsh advfirewall firewall add rule name="Open mongod config svr outbound" dir=out action=allow proto
netsh advfirewall firewall add rule name="Open mongod HTTP monitoring inbound" dir=in action=all
Replace <ip-address> with the address of the instance that needs access to the HTTP or REST interface.
For all deployments, you should restrict access to this port to only the monitoring instance.
Optional
For shard server mongod.exe instances running with shardsvr, the rule would resemble the following:
netsh advfirewall firewall add rule name="Open mongos HTTP monitoring inbound" dir=in action=all
For config server mongod.exe instances running with configsvr, the rule would resemble the following:
netsh advfirewall firewall add rule name="Open mongod configsvr HTTP monitoring inbound" dir=in
This section contains a number of basic operations for managing and using netsh. While you can use the GUI front
ends to manage the Windows Firewall, all core functionality is accessible from netsh.
Delete all Windows Firewall Rules To delete the firewall rule allowing mongod.exe traffic:
netsh advfirewall firewall delete rule name="Open mongod port 27017" protocol=tcp localport=27017
netsh advfirewall firewall delete rule name="Open mongod shard port 27018" protocol=tcp localport=270
16 All shards in a cluster need to be able to communicate with all other shards to facilitate chunk and balancing operations.
List All Windows Firewall Rules To return a list of all Windows Firewall rules:
netsh advfirewall firewall show rule name=all
Backup and Restore Windows Firewall Rules To simplify administration of a larger collection of systems, you can
easily export firewall rules from one server and import them on other servers:
Export all firewall rules with the following command:
netsh advfirewall export "C:\temp\MongoDBfw.wfw"
Replace "C:\temp\MongoDBfw.wfw" with a path of your choosing. You can use a command in the following
form to import a file created using this operation:
netsh advfirewall import "C:\temp\MongoDBfw.wfw"
Combine SSL Certificate and Key File Before you can use SSL, you must have a .pem file that contains the public
key certificate and private key. MongoDB can use any valid SSL certificate. To generate a self-signed certificate and
private key, use a command that resembles the following:
cd /etc/ssl/
openssl req -new -x509 -days 365 -nodes -out mongodb-cert.crt -keyout mongodb-cert.key
This operation generates a new, self-signed certificate with no passphrase that is valid for 365 days. Once you have
the certificate, concatenate the certificate and private key to a .pem file, as in the following example:
cat mongodb-cert.key mongodb-cert.crt > mongodb.pem
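On systems without cat, the same concatenation can be done in a few lines of Python. This is a minimal sketch: combine_pem is a hypothetical helper, and restricting the file mode to the owner is an added precaution, not something the procedure above requires.

```python
import os


def combine_pem(key_path, cert_path, pem_path):
    """Concatenate a private key and a certificate into one .pem file."""
    with open(pem_path, "w") as pem:
        for part in (key_path, cert_path):
            with open(part) as source:
                text = source.read()
            # End each part with a newline so the PEM blocks stay separate.
            pem.write(text if text.endswith("\n") else text + "\n")
    # The file contains a private key, so limit access to the owner.
    os.chmod(pem_path, 0o600)
    return pem_path
```

For example, combine_pem("mongodb-cert.key", "mongodb-cert.crt", "mongodb.pem") writes the key first and the certificate second, matching the order of the cat command.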
Set Up mongod and mongos with SSL Certificate and Key To use SSL in your MongoDB deployment, include
the following run-time options with mongod and mongos:
sslOnNormalPorts
sslPEMKeyFile with the .pem file that contains the SSL certificate and key.
17 http://www.mongodb.org/downloads
18 http://www.mongodb.com/products/mongodb-enterprise
For example, given an SSL certificate located at /etc/ssl/mongodb.pem, configure mongod to use SSL encryption for all connections with the following command:
mongod --sslOnNormalPorts --sslPEMKeyFile /etc/ssl/mongodb.pem
Note:
Specify <pem> with the full path name to the certificate.
If the private key portion of the <pem> is encrypted, specify the encryption password with the
sslPEMKeyPassword option.
You may also specify these options in the configuration file, as in the following example:
sslOnNormalPorts = true
sslPEMKeyFile = /etc/ssl/mongodb.pem
To connect to mongod and mongos instances using SSL, the mongo shell and MongoDB tools must include the
--ssl option. See SSL Configuration for Clients (page 254) for more information on connecting to mongod and
mongos running with SSL.
Set Up mongod and mongos with Certificate Validation To set up mongod or mongos for SSL encryption
using an SSL certificate signed by a certificate authority, include the following run-time options during startup:
sslOnNormalPorts
sslPEMKeyFile with the name of the .pem file that contains the signed SSL certificate and key.
sslCAFile with the name of the .pem file that contains the root certificate chain from the Certificate Authority.
Consider the following syntax for mongod:
mongod --sslOnNormalPorts --sslPEMKeyFile <pem> --sslCAFile <ca>
For example, given a signed SSL certificate located at /etc/ssl/mongodb.pem and the certificate authority file
at /etc/ssl/ca.pem, you can configure mongod for SSL encryption as follows:
mongod --sslOnNormalPorts --sslPEMKeyFile /etc/ssl/mongodb.pem --sslCAFile /etc/ssl/ca.pem
Note:
Specify the <pem> file and the <ca> file with either the full path name or the relative path name.
If the <pem> is encrypted, specify the encryption password with the sslPEMKeyPassword option.
You may also specify these options in the configuration file, as in the following example:
sslOnNormalPorts = true
sslPEMKeyFile = /etc/ssl/mongodb.pem
sslCAFile = /etc/ssl/ca.pem
To connect to mongod and mongos instances using SSL, the mongo tools must include both the --ssl and
--sslPEMKeyFile options. See SSL Configuration for Clients (page 254) for more information on connecting to
mongod and mongos running with SSL.
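If you manage many hosts, the configuration-file form of these settings can be generated from a template. The sketch below renders the three settings used in this section; render_ssl_config is a hypothetical helper, and it covers only sslOnNormalPorts, sslPEMKeyFile and sslCAFile.

```python
def render_ssl_config(pem_key_file, ca_file=None):
    """Return configuration-file lines that enable SSL for mongod or mongos."""
    lines = [
        "sslOnNormalPorts = true",
        "sslPEMKeyFile = {0}".format(pem_key_file),
    ]
    # sslCAFile is only needed when validating certificates against a CA.
    if ca_file is not None:
        lines.append("sslCAFile = {0}".format(ca_file))
    return "\n".join(lines) + "\n"


print(render_ssl_config("/etc/ssl/mongodb.pem", "/etc/ssl/ca.pem"))
```

Omitting ca_file produces the two-line form used for SSL without certificate validation.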
Block Revoked Certificates for Clients To prevent clients with revoked certificates from connecting, include the
sslCRLFile to specify a .pem file that contains revoked certificates.
For example, when a mongod with SSL configuration sets sslCRLFile to /etc/ssl/ca-crl.pem, clients with revoked certificates listed in /etc/ssl/ca-crl.pem will not be able to connect to that mongod instance.
Validate Only if a Client Presents a Certificate In most cases it is important to ensure that clients present valid
certificates. However, if you have clients that cannot present a client certificate, or are transitioning to using a certificate
authority you may only want to validate certificates from clients that present a certificate.
If you want to bypass validation for clients that don't present certificates, include the
sslWeakCertificateValidation run-time option with mongod and mongos. If the client does not
present a certificate, no validation occurs. These connections, though not validated, are still encrypted using SSL.
For example, consider a mongod with an SSL configuration that includes the sslWeakCertificateValidation setting.
Then, clients can connect either with the option --ssl and no certificate or with the option --ssl and a valid
certificate. See SSL Configuration for Clients (page 254) for more information on SSL connections for clients.
Note: If the client presents a certificate, the certificate must be a valid certificate.
All connections, including those that have not presented certificates, are encrypted using SSL.
Run in FIPS Mode If your mongod or mongos is running on a system with an OpenSSL library configured with
the FIPS 140-2 module, you can run mongod or mongos in FIPS mode, with the sslFIPSMode setting.
SSL Configuration for Clients
Clients must have support for SSL to work with a mongod or a mongos instance that has SSL support enabled. The
current versions of the Python, Java, Ruby, Node.js, .NET, and C++ drivers have support for SSL, with full support
coming in future releases of other drivers.
mongo SSL Configuration For SSL connections, you must use the mongo shell built with SSL support or distributed with MongoDB Enterprise. To support SSL, mongo has the following settings:
--ssl
--sslPEMKeyFile with the name of the .pem file that contains the SSL certificate and key.
--sslCAFile with the name of the .pem file that contains the certificate from the Certificate Authority.
--sslPEMKeyPassword option if the client certificate-key file is encrypted.
Connect to MongoDB Instance with SSL Encryption To connect to a mongod or mongos instance that requires
only an SSL encryption mode (page 252), start the mongo shell with --ssl, as in the following:
mongo --ssl
Connect to MongoDB Instance that Requires Client Certificates To connect to a mongod or mongos that requires CA-signed client certificates (page 253), start the mongo shell with --ssl and the --sslPEMKeyFile
option to specify the signed certificate-key file, as in the following:
mongo --ssl --sslPEMKeyFile /etc/ssl/client.pem
Connect to MongoDB Instance that Validates when Presented with a Certificate To connect to a mongod or
mongos instance that only requires valid certificates when the client presents a certificate (page 254), start mongo
shell either with the --ssl option and no certificate or with the --ssl option and a valid signed certificate.
For example, if mongod is running with weak certificate validation, both of the following mongo shell clients can
connect to that mongod:
mongo --ssl
mongo --ssl --sslPEMKeyFile /etc/ssl/client.pem
MMS Monitoring Agent The Monitoring agent will also have to connect via SSL in order to gather its stats. Because the agent already utilizes SSL for its communications to the MMS servers, this is just a matter of enabling SSL
support in MMS itself on a per host basis.
Use the Edit host button (i.e. the pencil) on the Hosts page in the MMS console to enable SSL.
Please see the MMS documentation19 for more information about MMS configuration.
PyMongo Add the ssl=True parameter to a PyMongo MongoClient20 to create a MongoDB connection to
an SSL MongoDB instance:
from pymongo import MongoClient
c = MongoClient(host="mongodb.example.net", port=27017, ssl=True)
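Java The following example connects using the Java driver with the default SSL socket factory (the class name SSLApp is illustrative):
import com.mongodb.*;
import javax.net.ssl.SSLSocketFactory;
public class SSLApp {
    public static void main(String[] args) throws Exception {
        // Use the JVM's default SSL socket factory for all connections.
        MongoClientOptions o = new MongoClientOptions.Builder()
            .socketFactory(SSLSocketFactory.getDefault())
            .build();
        MongoClient m = new MongoClient("localhost", o);
        DB db = m.getDB( "test" );
        DBCollection c = db.getCollection( "foo" );
        System.out.println( c.findOne() );
    }
}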
Ruby The recent versions of the Ruby driver have support for connections to SSL servers. Install the latest version
of the driver with the following command:
gem install mongo
.NET As of release 1.6, the .NET driver supports SSL connections with mongod and mongos instances. To connect
using SSL, you must add an option to the connection string, specifying ssl=true as follows:
var connectionString = "mongodb://localhost/?ssl=true";
var server = MongoServer.Create(connectionString);
The .NET driver will validate the certificate against the local trusted certificate store, in addition to providing encryption of the connection to the server. This behavior may produce issues during testing if the server uses a self-signed certificate. If
you encounter this issue, add the sslverifycertificate=false option to the connection string to prevent the
.NET driver from validating the certificate, as follows:
var connectionString = "mongodb://localhost/?ssl=true&sslverifycertificate=false";
var server = MongoServer.Create(connectionString);
If you have the userAdmin (page 269) or userAdminAnyDatabase (page 271) role on a database, you can query
authenticated users in that database with the following operation:
db.system.users.find()
3. Add the user with either the userAdmin (page 269) role or userAdminAnyDatabase (page 271) role,
and only that role, by issuing a command similar to the following, where <username> is the username and
<password> is the password:
To authenticate as this user, you must authenticate against the admin database.
Authenticate with Full Administrative Access via Localhost
If there are no users for the admin database, you can connect with full administrative access via the localhost interface.
This bypass exists to support bootstrapping new deployments. This approach is useful, for example, if you want to run
mongod or mongos with authentication before creating your first user.
To authenticate via localhost, connect to the mongod or mongos from a client running on the same system. Your
connection will have full administrative access.
To disable the localhost bypass, set the enableLocalhostAuthBypass parameter using setParameter during startup:
mongod --setParameter enableLocalhostAuthBypass=0
Note: For versions of MongoDB 2.2 prior to 2.2.4, if mongos is running with keyFile, then all users connecting
over the localhost interface must authenticate, even if there aren't any users in the admin database. Connections on
localhost are not correctly granted full access on sharded systems that run those versions.
MongoDB 2.2.4 resolves this issue.
Note: In version 2.2, you cannot add the first user to a sharded cluster using the localhost connection. If you are
running a 2.2 sharded cluster and want to enable authentication, you must deploy the cluster and add the first user to
the admin database before restarting the cluster to run with keyFile.
The following creates a user named Alice in the products database and gives her readWrite and dbAdmin
privileges.
use products
db.addUser( { user: "Alice",
pwd: "Moon1234",
roles: [ "readWrite", "dbAdmin" ]
} )
Example
The following creates a user named Bob in the admin database. The privilege document (page 272) uses
Bob's credentials from the products database and assigns him userAdmin privileges.
use admin
db.addUser( { user: "Bob",
userSource: "products",
roles: [ "userAdmin" ]
} )
Example
The following creates a user named Carlos in the admin database and gives him readWrite access to the
config database, which lets him change certain settings for sharded clusters, such as to disable the balancer.
db = db.getSiblingDB('admin')
db.addUser( { user: "Carlos",
pwd: "Moon1234",
roles: [ "clusterAdmin" ],
otherDBRoles: { config: [ "readWrite" ]
} } )
Only the admin database supports the otherDBRoles (page 274) field.
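The three examples above share one document shape, which the following Python sketch makes explicit. The helper privilege_document is hypothetical. The check that exactly one of pwd or userSource is given is an assumption drawn from the examples, each of which supplies one or the other.

```python
def privilege_document(user, roles, pwd=None, user_source=None,
                       other_db_roles=None):
    """Build a privilege document like those passed to db.addUser()."""
    # Assumption: a user either has a password or delegates credentials
    # to another database through userSource, never both.
    if (pwd is None) == (user_source is None):
        raise ValueError("give exactly one of pwd or user_source")
    doc = {"user": user, "roles": list(roles)}
    if pwd is not None:
        doc["pwd"] = pwd
    else:
        doc["userSource"] = user_source
    if other_db_roles:
        # Only meaningful for users defined in the admin database.
        doc["otherDBRoles"] = dict(other_db_roles)
    return doc


bob = privilege_document("Bob", ["userAdmin"], user_source="products")
carlos = privilege_document("Carlos", ["clusterAdmin"], pwd="Moon1234",
                            other_db_roles={"config": ["readWrite"]})
```

The resulting dictionaries mirror the documents for Bob and Carlos above. The sketch only builds them and does not insert them into a database.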
To update a password, pass the user's username and the new desired password to the db.changeUserPassword() method.
Example
The following operation changes the reporting user's password to SOhSS3TbYhxusooLiW8ypJPxmt1oOfL:
db = db.getSiblingDB('records')
db.changeUserPassword("reporting", "SOhSS3TbYhxusooLiW8ypJPxmt1oOfL")
Note: In previous versions of MongoDB, you could change an existing user's password by calling db.addUser()
again with the user's username and their updated password. Anything specified in the addUser() method would
override the existing information for that user. In newer versions of MongoDB, this will result in a duplicate key error.
For more about changing a user's password prior to version 2.4, see: Add a User to a Database (page 259).
Use the following openssl command at the system shell to generate pseudo-random content for a key file:
openssl rand -base64 741
Be aware that MongoDB strips whitespace characters (e.g. x0d, x09, and x20) for cross-platform convenience. As
a result, key files whose contents differ only in whitespace produce identical keys.
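The whitespace rule can be illustrated with a short Python sketch. It is not MongoDB's implementation: generate_key and strip_whitespace are hypothetical helpers, and treating newlines (x0a) as stripped is an assumption that goes beyond the characters listed above.

```python
import base64
import os


def generate_key(num_bytes=741):
    """Return base64 text from random bytes, in the spirit of
    `openssl rand -base64 741` (without line wrapping)."""
    return base64.b64encode(os.urandom(num_bytes)).decode("ascii")


def strip_whitespace(key_text):
    """Remove carriage returns, tabs, spaces and (assumed) newlines."""
    return "".join(ch for ch in key_text if ch not in "\r\t \n")


# Two key files that differ only in whitespace reduce to the same key.
same = strip_whitespace("my secret key") == strip_whitespace(
    "my\r\nsecret\r\nkey\r\n")
print(same)
```

Because 741 is divisible by three, the base64 text from generate_key needs no padding characters.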
Process Overview
Generate and distribute keytab files for each MongoDB component (i.e. mongod and mongos) in your deployment. Ensure that you only transmit keytab files over secure channels.
Optional. Start the mongod instance without auth and create users inside of MongoDB that you can use to
bootstrap your deployment.
Start mongod and mongos with the KRB5_KTNAME environment variable as well as a number of required run
time options.
If you did not create Kerberos user accounts, you can use the localhost exception (page 259) to create users at
this point until you create the first user on the admin database.
Authenticate clients, including the mongo shell using Kerberos.
Operations
Create Users and Privilege Documents For every user that you want to be able to authenticate using Kerberos, you
must create corresponding privilege documents in the system.users (page 272) collection to provision access to
users. Consider the following document:
{
user: "application/[email protected]",
roles: ["read"],
userSource: "$external"
}
This grants the Kerberos user principal application/[email protected] read only access to a
database. The userSource (page 273) $external reference allows mongod to consult an external source (i.e.
Kerberos) to authenticate this user.
In the mongo shell you can pass the db.addUser() a user privilege document to provision access to users, as in
the following operation:
db = db.getSiblingDB("records")
db.addUser( {
"user": "application/[email protected]",
"roles": [ "read" ],
"userSource": "$external"
} )
This operation grants the Kerberos user application/[email protected] access to the records
database.
To remove a user's access, use the remove() method, as in the following example:
db.system.users.remove( { user: "application/[email protected]" } )
To modify a user document, use update (page 42) operations on documents in the system.users (page 272)
collection.
See also:
system.users Privilege Documents (page 272) and User Privilege Roles in MongoDB (page 267).
Start mongod with Kerberos Support Once you have provisioned privileges to users in the mongod, and obtained
a valid keytab file, you must start mongod using a command in the following form:
env KRB5_KTNAME=<path to keytab file> <mongod invocation>
For successful operation with mongod use the following run time options in addition to your normal default configuration options:
--setParameter with the authenticationMechanisms=GSSAPI argument to enable support for
Kerberos.
--auth to enable authentication.
--keyFile to allow components of a single MongoDB deployment to communicate with each other, if needed
to support replica set and sharded cluster operations. keyFile implies auth.
For example, consider the following invocation:
env KRB5_KTNAME=/opt/mongodb/mongod.keytab \
/opt/mongodb/bin/mongod --dbpath /opt/mongodb/data \
--fork --logpath /opt/mongodb/log/mongod.log \
--auth --setParameter authenticationMechanisms=GSSAPI
You can also specify these options using the configuration file, as in the following:
# /opt/mongodb/mongod.conf, Example configuration file.
fork = true
auth = true
dbpath = /opt/mongodb/data
logpath = /opt/mongodb/log/mongod.log
setParameter = authenticationMechanisms=GSSAPI
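When a supervisor script launches mongod, the keytab variable and the Kerberos options can be assembled in one place. The Python sketch below only builds the environment and argument list and does not start a process. kerberos_mongod_invocation is a hypothetical helper, and the default binary path is taken from the example above.

```python
import os


def kerberos_mongod_invocation(keytab, mongod="/opt/mongodb/bin/mongod",
                               extra_args=()):
    """Return (env, argv) for starting mongod with Kerberos support."""
    # Copy the current environment and point Kerberos at the keytab file.
    env = dict(os.environ)
    env["KRB5_KTNAME"] = keytab
    argv = [mongod, "--auth",
            "--setParameter", "authenticationMechanisms=GSSAPI"]
    argv.extend(extra_args)
    return env, argv


env, argv = kerberos_mongod_invocation(
    "/opt/mongodb/mongod.keytab",
    extra_args=["--dbpath", "/opt/mongodb/data"])
```

The pair can then be handed to subprocess.Popen(argv, env=env) on a host where MongoDB Enterprise is installed.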
To start a mongos instance using Kerberos, you must create a Kerberos service principal and deploy a keytab file for
this instance, and then start the mongos with the following invocation:
env KRB5_KTNAME=/opt/mongodb/mongos.keytab \
/opt/mongodb/bin/mongos \
--configdb shard0.example.net,shard1.example.net,shard2.example.net \
--setParameter authenticationMechanisms=GSSAPI \
--keyFile /opt/mongodb/mongos.keyfile
Tip
If you installed MongoDB Enterprise using one of the official .deb or .rpm packages and are controlling the
mongod instance using the included init/upstart scripts, you can set the KRB5_KTNAME variable in the default environment settings file. For .rpm packages this file is located at /etc/sysconfig/mongod. For .deb packages,
this file is /etc/default/mongodb. Set the value in a line that resembles the following:
export KRB5_KTNAME="<setting>"
If you encounter problems when trying to start mongod or mongos, please see the troubleshooting section (page 264)
for more information.
Important: Before users can authenticate to MongoDB using Kerberos you must create users (page 262) and grant
them privileges within MongoDB. If you have not created users when you start MongoDB with Kerberos you can
use the localhost authentication exception (page 259) to add users. See the Create Users and Privilege Documents
(page 262) section and the User Privilege Roles in MongoDB (page 267) document for more information.
Authenticate mongo Shell with Kerberos To connect to a mongod instance using the mongo shell you must begin
by using the kinit program to initialize and authenticate a Kerberos session. Then, start a mongo instance, and use
the db.auth() method, to authenticate against the special $external database, as in the following operation:
use $external
db.auth( { mechanism: "GSSAPI", user: "application/[email protected]" } )
Alternately, you can authenticate using command line options to mongo, as in the following equivalent example:
mongo --authenticationMechanism=GSSAPI \
--authenticationDatabase='$external' \
--username application/[email protected]
Kerberos Configuration Checklist If you're having trouble getting mongod to start with Kerberos, there are a
number of Kerberos-specific issues that can prevent successful authentication. As you begin troubleshooting your
Kerberos deployment, ensure that:
The mongod is from MongoDB Enterprise.
You are not using the HTTP Console 27. MongoDB Enterprise does not support Kerberos authentication over the
HTTP Console interface.
You have a valid keytab file specified in the environment running the mongod. For the mongod instance
running on the db0.example.net host, the service principal should be mongodb/db0.example.net.
DNS allows the mongod to resolve the components of the Kerberos infrastructure. You should have both A and
PTR records (i.e. forward and reverse DNS) for the system that runs the mongod instance.
The canonical system hostname of the system that runs the mongod instance is the resolvable fully qualified
domain for this host. Test system hostname resolution with the hostname -f command at the system prompt.
Both the Kerberos KDC and the system running the mongod instance must be able to resolve each other using DNS. 28
22 http://docs.mongodb.org/ecosystem/tutorial/authenticate-with-java-driver/
23 http://docs.mongodb.org/ecosystem/tutorial/authenticate-with-csharp-driver/
24 http://docs.mongodb.org/ecosystem/tutorial/authenticate-with-cpp-driver/
25 http://api.mongodb.org/python/current/examples/authentication.html
26 http://docs.mongodb.org/ecosystem/tools/http-interface/#http-console
27 http://docs.mongodb.org/ecosystem/tools/http-interface/#http-console
28 By default, Kerberos attempts to resolve hosts using the content of the /etc/krb5.conf before using DNS to resolve hosts.
The system clocks of the systems running the mongod instances and the Kerberos infrastructure are synchronized. Time differences greater than 5 minutes will prevent successful authentication.
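The five-minute tolerance amounts to a simple comparison. The sketch below assumes you already have the two clock readings (how you obtain them, for example through NTP queries, is not shown). skew_too_large is a hypothetical helper.

```python
from datetime import datetime, timedelta

# Tolerance stated in the checklist above.
MAX_SKEW = timedelta(minutes=5)


def skew_too_large(local_time, kdc_time, max_skew=MAX_SKEW):
    """Return True when two clock readings differ by more than max_skew."""
    return abs(local_time - kdc_time) > max_skew


mongod_clock = datetime(2014, 1, 1, 12, 0, 0)
kdc_clock = datetime(2014, 1, 1, 12, 6, 0)
print(skew_too_large(mongod_clock, kdc_clock))
```

A True result suggests correcting time synchronization before investigating other Kerberos settings.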
If you still encounter problems with Kerberos, you can start both mongod and mongo (or another client) with the
environment variable KRB5_TRACE set to different files to produce more verbose logging of the Kerberos process to
help further troubleshooting, as in the following example:
env KRB5_KTNAME=/opt/mongodb/mongod.keytab \
KRB5_TRACE=/opt/mongodb/log/mongodb-kerberos.log \
/opt/mongodb/bin/mongod --dbpath /opt/mongodb/data \
--fork --logpath /opt/mongodb/log/mongod.log \
--auth --setParameter authenticationMechanisms=GSSAPI
Common Error Messages In some situations, MongoDB will return error messages from the GSSAPI interface if
there is a problem with the Kerberos service.
GSSAPI error in client while negotiating security context.
This error occurs on the client and reflects insufficient credentials or a malicious attempt to authenticate.
If you receive this error, ensure that you're using the correct credentials and the correct fully qualified
domain name when connecting to the host.
GSSAPI error acquiring credentials.
This error only occurs when attempting to start the mongod or mongos and reflects improper configuration of system hostname or a missing or incorrectly configured keytab file. If you encounter this problem,
consider all the items in the Kerberos Configuration Checklist (page 264), in particular:
examine the keytab file, with the following command:
klist -k <keytab>
Ensure that the system hostname matches the name in the keytab file, or use the saslHostName to pass
MongoDB the correct hostname.
Enable the Traditional MongoDB Authentication Mechanism For testing and development purposes you can
enable the Kerberos (i.e. GSSAPI) authentication mechanism in combination with the traditional MongoDB
challenge/response authentication mechanism (i.e. MONGODB-CR), using the following setParameter run-time
option:
mongod --setParameter authenticationMechanisms=GSSAPI,MONGODB-CR
Warning: All keyFile internal authentication between members of a replica set or sharded cluster still uses
the MONGODB-CR authentication mechanism, even if MONGODB-CR is not enabled. All client authentication will
still use Kerberos.
To report an issue, we strongly suggest filing a ticket in our Security project in JIRA
(https://jira.mongodb.org/browse/SECURITY/). MongoDB, Inc. responds to vulnerability notifications within 48
hours.
Create the Report in JIRA
Submit a ticket in the Security29 project at: https://jira.mongodb.org/browse/SECURITY/. The ticket number will
become the reference identification for the issue for its lifetime. You can use this identifier for tracking purposes.
Information to Provide
All vulnerability reports should contain as much information as possible so MongoDB's developers can move quickly
to resolve the issue. In particular, please include the following:
The name of the product.
Common Vulnerability information, if applicable, including:
CVSS (Common Vulnerability Scoring System) Score.
CVE (Common Vulnerability and Exposures) Identifier.
Contact information, including an email address and/or phone number, if applicable.
Send the Report via Email
While JIRA is the preferred reporting method, you may also report vulnerabilities via email to [email protected] .
You may encrypt email using MongoDB's public key at http://docs.mongodb.org/10gen-security-gpg-key.asc.
MongoDB, Inc. responds to vulnerability reports sent via email with a response email that contains a reference number
for a JIRA ticket posted to the SECURITY31 project.
Evaluation of a Vulnerability Report
MongoDB, Inc. validates all submitted vulnerabilities and uses JIRA to track all communications regarding a vulnerability, including requests for clarification or additional information. If needed, MongoDB representatives set up a
conference call to exchange information regarding the vulnerability.
Disclosure
MongoDB, Inc. requests that you do not publicly disclose any information regarding the vulnerability or exploit the
issue until it has had the opportunity to analyze the vulnerability, to respond to the notification, and to notify key users,
customers, and partners.
The amount of time required to validate a reported vulnerability depends on the complexity and severity of the issue.
MongoDB, Inc. takes all reported vulnerabilities very seriously and will always ensure that there is a clear and open
channel of communication with the reporter.
29 https://jira.mongodb.org/browse/SECURITY
30 [email protected]
31 https://jira.mongodb.org/browse/SECURITY
After validating an issue, MongoDB, Inc. coordinates public disclosure of the issue with the reporter in a mutually
agreed timeframe and format. If required or requested, the reporter of a vulnerability will receive credit in the published
security bulletin.
Name Description
db.addUser() Adds a user to a database, and allows administrators to configure the user's privileges.
db.auth() Authenticates a user to a database.
db.changeUserPassword() Changes an existing user's password.
In general, you should set this option if your deployment does not need to support legacy user documents. Typically
legacy user documents are only useful during the upgrade process and while you migrate applications to the updated
privilege document form.
See privilege documents (page 272) and Delegated Credentials for MongoDB Authentication (page 274) for more
information about permissions and authentication in MongoDB.
Administrative Roles
clusterAdmin
clusterAdmin (page 269) grants access to several administration operations that affect or present information
about the whole system, rather than just a single database. These privileges include but are not limited to replica
set and sharded cluster administrative functions.
clusterAdmin (page 269) is only applicable on the admin database, and does not confer any access to the
local or config databases.
Specifically, users with the clusterAdmin (page 269) role have access to the following operations:
addShard
closeAllDatabases
connPoolStats
connPoolSync
_cpuProfilerStart
_cpuProfilerStop
cursorInfo
diagLogging
dropDatabase
enableSharding
flushRouterConfig
fsync
db.fsyncUnlock()
getCmdLineOpts
getLog
getParameter
getShardMap
getShardVersion
hostInfo
db.currentOp()
db.killOp()
listDatabases
listShards
logRotate
moveChunk
movePrimary
netstat
removeShard
repairDatabase
replSetFreeze
replSetGetStatus
replSetInitiate
replSetMaintenance
replSetReconfig
replSetStepDown
replSetSyncFrom
resync
serverStatus
setParameter
setShardVersion
shardCollection
shardingState
shutdown
splitChunk
splitVector
split
top
touch
unsetSharding
For some cluster administration operations, MongoDB requires read and write access to the local or config
databases. You must specify this access separately from clusterAdmin (page 269). See the Combined Access
(page 272) section for more information.
Any Database Roles
Note: You must specify the following any database roles on the admin database. These roles apply to all
databases in a mongod instance and are roughly equivalent to their single-database equivalents.
If you add any of these roles to a user privilege document (page 272) outside of the admin database, the privilege
will have no effect. However, only the specification of the roles must occur in the admin database. With delegated
authentication credentials (page 274), users can gain these privileges by authenticating to another database.
readAnyDatabase
readAnyDatabase (page 271) provides users with the same read-only permissions as read (page 268),
except it applies to all logical databases in the MongoDB environment.
readWriteAnyDatabase
readWriteAnyDatabase (page 271) provides users with the same read and write permissions as
readWrite (page 268), except it applies to all logical databases in the MongoDB environment.
userAdminAnyDatabase
userAdminAnyDatabase (page 271) provides users with the same access to user administration operations
as userAdmin (page 269), except it applies to all logical databases in the MongoDB environment.
Important: Because users with userAdminAnyDatabase (page 271) and userAdmin (page 269) have
the ability to create and modify permissions in addition to their own level of access, these roles are effectively the
MongoDB system superuser. However, userAdminAnyDatabase (page 271) and userAdmin (page 269)
do not explicitly authorize a user for any privileges beyond user administration.
dbAdminAnyDatabase
dbAdminAnyDatabase (page 271) provides users with the same access to database administration operations
as dbAdmin (page 269), except it applies to all logical databases in the MongoDB environment.
Combined Access
Some operations are only available to users that have multiple roles. Consider the following:
sh.status() Requires clusterAdmin (page 269) and read (page 268) access to the config (page 548)
database.
applyOps, eval Requires readWriteAnyDatabase (page 271), userAdminAnyDatabase (page 271),
dbAdminAnyDatabase (page 271) and clusterAdmin (page 269) on the admin database.
Some operations related to cluster administration are not available to users who only have the clusterAdmin
(page 269) role:
rs.conf() Requires read (page 268) on the local database.
sh.addShard() Requires readWrite (page 268) on the config database.
system.users Privilege Documents
Changed in version 2.4.
Overview
The documents in the <database>.system.users (page 272) collection store credentials and user privilege
information used by the authentication system to provision access to users in the MongoDB system. See User Privilege
Roles in MongoDB (page 267) for more information about access roles, and Security (page 237) for an overview of
security in MongoDB.
Data Model
<database>.system.users
Changed in version 2.4.
Documents in the <database>.system.users (page 272) collection store credentials and user roles
(page 267) for users who have access to the database. Consider the following prototypes of user privilege
documents:
{
user: "<username>",
pwd: "<hash>",
roles: []
}
{
user: "<username>",
userSource: "<database>",
roles: []
}
Note: The pwd (page 273) and userSource (page 273) fields are mutually exclusive. A single document
cannot contain both.
The following privilege document with the otherDBRoles (page 274) field is only supported on the admin
database:
{
user: "<username>",
userSource: "<database>",
otherDBRoles: {
<database0> : [],
<database1> : []
},
roles: []
}
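The two shape rules stated above (pwd and userSource are mutually exclusive, and otherDBRoles is only supported on the admin database) can be checked mechanically. The following is a minimal sketch in plain JavaScript. checkPrivilegeDocument is a hypothetical helper written for this illustration, not a MongoDB function, and it checks only these two rules.

```javascript
// Hypothetical helper, not part of MongoDB: reports violations of the two
// rules stated above for a privilege document stored in database `dbName`.
function checkPrivilegeDocument(dbName, doc) {
  var has = function (field) {
    return Object.prototype.hasOwnProperty.call(doc, field);
  };
  var problems = [];
  // Rule 1: pwd and userSource are mutually exclusive.
  if (has("pwd") && has("userSource")) {
    problems.push("pwd and userSource are mutually exclusive");
  }
  // Rule 2: otherDBRoles is only supported on the admin database.
  if (has("otherDBRoles") && dbName !== "admin") {
    problems.push("otherDBRoles is only supported on the admin database");
  }
  return problems;
}

// A document that follows both rules (stored in admin).
var ok = checkPrivilegeDocument("admin", {
  user: "admin", userSource: "$external",
  roles: ["clusterAdmin"], otherDBRoles: { config: ["read"] }
});

// A document that breaks both rules (stored in a database named records).
var bad = checkPrivilegeDocument("records", {
  user: "app", pwd: "<hash>", userSource: "accounts",
  roles: ["readWrite"], otherDBRoles: { config: ["read"] }
});
```

Here ok is an empty list and bad holds one message per violated rule.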
Consider the content of the following fields in the system.users (page 272) documents:
<database>.system.users.user
user (page 273) is a string that identifies each user. Users exist in the context of a single logical database;
however, users from one database may obtain access in another database by way of the otherDBRoles
(page 274) field on the admin database, the userSource (page 273) field, or the Any Database Roles
(page 271).
<database>.system.users.pwd
pwd (page 273) holds a hashed shared secret used to authenticate the user (page 273). The pwd (page 273)
field is mutually exclusive with the userSource (page 273) field.
<database>.system.users.roles
roles (page 273) holds an array of user roles. The available roles are:
read (page 268)
readWrite (page 268)
dbAdmin (page 269)
userAdmin (page 269)
clusterAdmin (page 269)
readAnyDatabase (page 271)
readWriteAnyDatabase (page 271)
userAdminAnyDatabase (page 271)
dbAdminAnyDatabase (page 271)
See Roles (page 267) for full documentation of all available user roles.
<database>.system.users.userSource
A string that holds the name of the database that contains the credentials for the user. If userSource
(page 273) is $external, then MongoDB will use an external resource, such as Kerberos, for authentication credentials.
Note: In the current release, the only external authentication source is Kerberos, which is only available
in MongoDB Enterprise.
Use userSource (page 273) to ensure that a single user's authentication credentials are only stored in a
single location in a mongod instance's data.
A userSource (page 273) and user (page 273) pair identifies a unique user in a MongoDB system.
admin.system.users.otherDBRoles
A document that holds one or more fields, each named after a database in the MongoDB
instance, whose value is the list of roles this user has on that database. Consider the following
example:
{
user: "admin",
userSource: "$external",
roles: [ "clusterAdmin"],
otherDBRoles:
{
config: [ "read" ],
records: [ "dbAdmin" ]
}
}
Then for every database to which the application0 user requires access, add documents to the system.users
(page 272) collection that resemble the following:
{
user: "application0",
roles: ['readWrite'],
userSource: "accounts"
}
To gain privileges to databases where application0 has access, you must first authenticate to the accounts
database.
Disable Legacy Privilege Documents
By default, MongoDB 2.4 supports both the new, role-based privilege documents and the privilege documents used in
2.2 and earlier. MongoDB assumes any privilege document without a roles (page 273) field is a 2.2 or earlier
document.
To ensure that mongod instances will only provide access to users defined with the new role-based privilege documents, use the following setParameter run-time option:
mongod --setParameter supportCompatibilityFormPrivilegeDocuments=0
Default Port  Description
27017         The default port for mongod and mongos instances. You can change this port with port or --port.
27018         The default port when running with --shardsvr runtime operation or shardsvr setting.
27019         The default port when running with --configsvr runtime operation or configsvr setting.
28017         The default port for the web status page. The web status page is always accessible at a port number that is 1000 greater than the port determined by port.
If a user has the same password for multiple databases, the hash will be the same. A malicious user could exploit this
to gain access on a second database using a different user's credentials.
As a result, always use unique username and password combinations for each database.
Thanks to Will Urbanski, from Dell SecureWorks, for identifying this issue.
CHAPTER 6
Aggregation
Aggregation operations process data records and return computed results. Aggregation operations group values from
multiple documents together, and can perform a variety of operations on the grouped data to return a single result.
MongoDB provides three ways to perform aggregation: the aggregation pipeline (page 281), the map-reduce function
(page 284), and single purpose aggregation methods and commands (page 286).
Aggregation Introduction (page 277) A high-level introduction to aggregation.
Aggregation Concepts (page 281) Introduces the use and operation of the data aggregation modalities available in
MongoDB.
Aggregation Pipeline (page 281) The aggregation pipeline is a framework for performing aggregation tasks,
modeled on the concept of data processing pipelines. Using this framework, MongoDB passes the documents of a single collection through a pipeline. The pipeline transforms the documents into aggregated
results, and is accessed through the aggregate database command.
Map-Reduce (page 284) Map-reduce is a generic multi-phase data aggregation modality for processing quantities of data. MongoDB provides map-reduce with the mapReduce database command.
Single Purpose Aggregation Operations (page 286) MongoDB provides a collection of specific data aggregation operations to support a number of common data aggregation functions. These operations include
returning counts of documents, distinct values of a field, and simple grouping operations.
Aggregation Mechanics (page 289) Details internal optimization operations, limits, support for sharded collections, and concurrency concerns.
Aggregation Examples (page 292) Examples and tutorials for data aggregation operations in MongoDB.
Aggregation Reference (page 308) Reference material for all data aggregation methods in MongoDB.
Figure 6.1: Diagram of the annotated aggregation pipeline operation. The aggregation pipeline has two phases:
$match and $group.
Map-Reduce
MongoDB also provides map-reduce (page 284) operations to perform aggregation. In general, map-reduce operations
have two phases: a map stage that processes each document and emits one or more objects for each input document,
and a reduce phase that combines the output of the map operation. Optionally, map-reduce can have a finalize stage to
make final modifications to the result. Like other aggregation operations, map-reduce can specify a query condition to
select the input documents as well as sort and limit the results.
Map-reduce uses custom JavaScript functions to perform the map and reduce operations, as well as the optional finalize
operation. While the custom JavaScript provides great flexibility compared to the aggregation pipeline, in general, map-reduce is less efficient and more complex than the aggregation pipeline.
Additionally, map-reduce operations can have output sets that exceed the 16 megabyte output limitation of the aggregation pipeline.
Note: Starting in MongoDB 2.4, certain mongo shell functions and properties are inaccessible in map-reduce operations. MongoDB 2.4 also provides support for multiple JavaScript operations to run at the same time. Before
MongoDB 2.4, JavaScript code executed in a single thread, raising concurrency issues for map-reduce.
Figure 6.4: Diagram of the annotated aggregation pipeline operation. The aggregation pipeline has two phases:
$match and $group.
Important: The result of the aggregation pipeline is a document and is subject to the BSON Document size limit, which
is currently 16 megabytes.
See aggregation-pipeline-operator-reference for the list of pipeline operators that define the stages.
For example usage of the aggregation pipeline, consider Aggregation with User Preference Data (page 296)
and Aggregation with the Zip Code Data Set (page 293), as well as the aggregate command and the
db.collection.aggregate() method reference pages.
Pipeline Expressions
Each pipeline operator takes a pipeline expression as its operand. Pipeline expressions specify the transformation to
apply to the input documents. Expressions have a document structure and can contain fields, values, and operators.
Pipeline expressions can only operate on the current document in the pipeline and cannot refer to data from other
documents: expression operations provide in-memory transformation of documents.
Generally, expressions are stateless and are only evaluated when seen by the aggregation process with one exception:
accumulator expressions. The accumulator expressions, used with the $group pipeline operator, maintain their state
(e.g. totals, maximums, minimums, and related data) as documents progress through the pipeline.
For the expression operators, see aggregation-expression-operators.
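To illustrate the difference between stateless expressions and accumulator expressions, the following plain-JavaScript sketch mimics both outside of MongoDB. The function names and sample documents are assumptions made for this illustration and do not reflect the server's implementation.

```javascript
// Stateless expression: the result depends only on the current document.
function toUpperName(doc) {
  return { name: doc.name.toUpperCase() };
}

// Accumulator: keeps running state (a total and a maximum) while documents
// pass through, in the way $group accumulators are described above.
function makeAccumulator() {
  var state = { total: 0, max: -Infinity };
  return {
    add: function (doc) {
      state.total += doc.amount;
      state.max = Math.max(state.max, doc.amount);
    },
    result: function () {
      return { total: state.total, max: state.max };
    }
  };
}

// Made-up sample documents.
var docs = [
  { name: "ann", amount: 5 },
  { name: "bob", amount: 12 },
  { name: "cy", amount: 3 }
];

var upper = docs.map(toUpperName);   // each output uses one input document only
var acc = makeAccumulator();
docs.forEach(function (doc) { acc.add(doc); });
var summary = acc.result();          // state built up across all documents
```

The stateless function can be applied to the documents in any order and in isolation, while the accumulator's result is only meaningful after every document in the group has been seen.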
Aggregation Pipeline Behavior
In MongoDB, the aggregate command operates on a single collection, logically passing the entire collection into
the aggregation pipeline. To optimize the operation, wherever possible, use the following strategies to avoid scanning
the entire collection.
Pipeline Operators and Indexes
The $match, $sort, $limit, and $skip pipeline operators can take advantage of an index when they occur at the
beginning of the pipeline before any of the following aggregation operators: $project, $unwind, and $group.
New in version 2.4: The $geoNear pipeline operator takes advantage of a geospatial index. When using $geoNear,
the $geoNear pipeline operation must appear as the first stage in an aggregation pipeline.
For unsharded collections, when the aggregation pipeline only needs to access the indexed fields to fulfill its operations,
an index can cover (page 37) the pipeline.
Example
Consider the following index on the orders collection:
{ status: 1, amount: 1, cust_id: 1 }
This index can cover the following aggregation pipeline operation because MongoDB does not need to inspect the data
outside of the index to fulfill the operation:
db.orders.aggregate([
{ $match: { status: "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
{ $sort: { total: -1 } }
])
Early Filtering
If your aggregation operation requires only a subset of the data in a collection, use the $match, $limit, and $skip
stages to restrict the documents that enter at the beginning of the pipeline. When placed at the beginning of a pipeline,
$match operations use suitable indexes to scan only the matching documents in a collection.
Placing a $match pipeline stage followed by a $sort stage at the start of the pipeline is logically equivalent to a
single query with a sort and can use an index. When possible, place $match operators at the beginning of the pipeline.
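As a rough way to reason about stage order, the sketch below lists the leading stages of a pipeline that are eligible to use an index under the rule given in Pipeline Operators and Indexes. indexEligiblePrefix is a hypothetical helper in plain JavaScript, not a MongoDB function. It takes a conservative reading of the rule and stops at the first stage that is not $match, $sort, $limit, or $skip.

```javascript
// Stage types that can take advantage of an index at the start of a pipeline.
var INDEXABLE = ["$match", "$sort", "$limit", "$skip"];

// Hypothetical helper: names of the leading stages that may use an index.
function indexEligiblePrefix(pipeline) {
  var prefix = [];
  for (var i = 0; i < pipeline.length; i++) {
    var name = Object.keys(pipeline[i])[0];
    if (INDEXABLE.indexOf(name) === -1) {
      break; // $project, $unwind, $group (or any other stage) ends the prefix
    }
    prefix.push(name);
  }
  return prefix;
}

// Filtering and sorting first: both leading stages are eligible.
var early = [
  { $match: { status: "A" } },
  { $sort: { amount: -1 } },
  { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
];

// Grouping first: the later $match works on computed documents only.
var late = [
  { $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
  { $match: { total: { $gt: 100 } } }
];

var earlyPrefix = indexEligiblePrefix(early);
var latePrefix = indexEligiblePrefix(late);
```

For the first pipeline the helper reports $match and $sort. For the second it reports nothing, because the $match follows a $group.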
Additional Features
The aggregation pipeline has an internal optimization phase that provides improved performance for certain sequences
of operators. For details, see Pipeline Sequence Optimization (page 289).
The aggregation pipeline supports operations on sharded collections. See Aggregation Pipeline and Sharded Collections (page 291).
6.2.2 Map-Reduce
Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. For
map-reduce operations, MongoDB provides the mapReduce database command.
Consider the following map-reduce operation:
In this map-reduce operation, MongoDB applies the map phase to each input document (i.e. the documents in the
collection that match the query condition). The map function emits key-value pairs. For those keys that have multiple
values, MongoDB applies the reduce phase, which collects and condenses the aggregated data. MongoDB then stores
the results in a collection. Optionally, the output of the reduce function may pass through a finalize function to further
condense or process the results of the aggregation.
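The flow described above can be imitated in a few lines of plain JavaScript. This is only an in-memory sketch for illustration and not how mongod executes map-reduce. runMapReduce, the sample orders, and the convention of passing emit to the map function as an argument are all assumptions of this example (in MongoDB, emit is available to the map function directly).

```javascript
// In-memory imitation of the map-reduce flow; not MongoDB's implementation.
function runMapReduce(docs, map, reduce, options) {
  options = options || {};
  var groups = {}; // key -> array of emitted values
  var order = [];  // keys in first-seen order
  function emit(key, value) {
    if (!Object.prototype.hasOwnProperty.call(groups, key)) {
      groups[key] = [];
      order.push(key);
    }
    groups[key].push(value);
  }
  // Map phase: runs once per input document that matches the query.
  docs.filter(options.query || function () { return true; })
      .forEach(function (doc) { map.call(doc, emit); });
  // Reduce phase: applied only to keys that have multiple values.
  return order.map(function (key) {
    var values = groups[key];
    var value = values.length > 1 ? reduce(key, values) : values[0];
    if (options.finalize) {
      value = options.finalize(key, value);
    }
    return { _id: key, value: value };
  });
}

// Made-up sample data.
var orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" }
];

var totals = runMapReduce(
  orders,
  function (emit) { emit(this.cust_id, this.amount); },
  function (key, values) {
    return values.reduce(function (a, b) { return a + b; }, 0);
  },
  { query: function (doc) { return doc.status === "A"; } }
);
```

With this sample, the order with status "D" never reaches the map phase, and the key "B212" skips the reduce phase because it has a single value.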
All map-reduce functions in MongoDB are JavaScript and run within the mongod process. Map-reduce operations
take the documents of a single collection as the input and can perform any arbitrary sorting and limiting before
beginning the map stage. mapReduce can return the results of a map-reduce operation as a document, or may write
the results to collections. The input and the output collections may be sharded.
Note: For most aggregation operations, the Aggregation Pipeline (page 281) provides better performance and a more
coherent interface. However, map-reduce operations provide some flexibility that is not presently available in the
aggregation pipeline.
that replace, merge, or reduce new results with previous results. See mapReduce and Perform Incremental
Map-Reduce (page 302) for details and examples.
When returning the results of a map-reduce operation inline, the result documents must be within the BSON
Document Size limit, which is currently 16 megabytes. For additional information on limits and restrictions on
map-reduce operations, see the http://docs.mongodb.org/manualreference/command/mapReduce
reference page.
MongoDB supports map-reduce operations on sharded collections (page 479). Map-reduce operations can also output
the results to a sharded collection. See Map-Reduce and Sharded Collections (page 291).
{ a: 1, b: 0 }
{ a: 1, b: 1 }
{ a: 1, b: 4 }
{ a: 2, b: 2 }
The following operation would count all documents in the collection and return the number 4:
db.records.count()
The following operation will count only the documents where the value of the field a is 1 and return 3:
db.records.count( { a: 1 } )
Distinct
The distinct operation takes a number of documents that match a query and returns all of the unique values for a field
in the matching documents. The distinct command and db.collection.distinct() method provide this
operation in the mongo shell. Consider the following examples of a distinct operation:
Example
Given a collection named records with only the following documents:
{ a: 1, b: 0 }
{ a: 1, b: 1 }
{ a: 1, b: 1 }
{ a: 1, b: 4 }
{ a: 2, b: 2 }
{ a: 2, b: 2 }
Consider the following db.collection.distinct() operation which returns the distinct values of the field b:
db.records.distinct( "b" )
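To make the semantics concrete, the following plain-JavaScript sketch imitates a distinct operation over the six sample documents. distinctValues is a hypothetical helper written for this illustration. It returns values in order of first appearance, which is a property of this sketch and not a statement about the order MongoDB returns.

```javascript
// The six sample documents from the example above.
var records = [
  { a: 1, b: 0 }, { a: 1, b: 1 }, { a: 1, b: 1 },
  { a: 1, b: 4 }, { a: 2, b: 2 }, { a: 2, b: 2 }
];

// Hypothetical helper: unique values of `field` among the documents
// accepted by the optional `query` predicate.
function distinctValues(docs, field, query) {
  var seen = [];
  docs.forEach(function (doc) {
    if (query && !query(doc)) {
      return; // document does not match the query
    }
    if (seen.indexOf(doc[field]) === -1) {
      seen.push(doc[field]);
    }
  });
  return seen;
}

// Unique values of b across all documents.
var allB = distinctValues(records, "b");

// Unique values of b among documents where a is 1.
var bWhereA1 = distinctValues(records, "b", function (doc) {
  return doc.a === 1;
});
```

The sample contains four unique values of b overall (0, 1, 4, and 2) and three among the documents where a is 1.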
Group
The group operation takes a number of documents that match a query, and then collects groups of documents based
on the value of a field or fields. It returns an array of documents with computed results for each group of documents.
Access the grouping functionality via the group command or the db.collection.group() method in the
mongo shell.
Warning: group does not support data in sharded collections. In addition, the results of the group operation
must be no larger than 16 megabytes.
Consider the following group operation:
Example
Given a collection named records with the following documents:
{ a: 1, count: 4 }
{ a: 1, count: 2 }
{ a: 1, count: 4 }
{ a: 2, count: 3 }
{ a: 2, count: 1 }
{ a: 1, count: 5 }
{ a: 4, count: 4 }
Consider the following group operation which groups documents by the field a, where a is less than 3, and sums the
field count for each group:
db.records.group( {
key: { a: 1 },
cond: { a: { $lt: 3 } },
reduce: function(cur, result) { result.count += cur.count },
initial: { count: 0 }
} )
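The way the key, cond, reduce, and initial options interact can be traced with a small in-memory imitation. The sketch below is plain JavaScript and handles a single key field only. groupBy is a hypothetical helper, not the group command itself.

```javascript
// The seven sample documents from the example above.
var records = [
  { a: 1, count: 4 }, { a: 1, count: 2 }, { a: 1, count: 4 },
  { a: 2, count: 3 }, { a: 2, count: 1 }, { a: 1, count: 5 },
  { a: 4, count: 4 }
];

// Hypothetical helper imitating group for one key field.
function groupBy(docs, keyField, cond, reduce, initial) {
  var results = [];
  var index = {}; // key value -> result document
  docs.filter(cond).forEach(function (cur) {
    var k = cur[keyField];
    if (!Object.prototype.hasOwnProperty.call(index, k)) {
      // Start each group from a copy of the initial document.
      var fresh = {};
      fresh[keyField] = k;
      Object.keys(initial).forEach(function (f) { fresh[f] = initial[f]; });
      index[k] = fresh;
      results.push(fresh);
    }
    reduce(cur, index[k]); // fold the current document into its group
  });
  return results;
}

var grouped = groupBy(
  records,
  "a",
  function (doc) { return doc.a < 3; },                  // cond
  function (cur, result) { result.count += cur.count; }, // reduce
  { count: 0 }                                           // initial
);
```

For the sample data this yields one document per remaining value of a: a count of 15 for a equal to 1 and a count of 4 for a equal to 2. The document with a equal to 4 is excluded by the condition.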
See also:
$group for related functionality in the aggregation pipeline (page 281).
$sort + $skip + $limit Sequence Optimization
When you have a sequence with $sort followed by a
$skip followed by a $limit, an optimization occurs that moves the $limit operator before the $skip operator.
For example, if the pipeline consists of the following stages:
{ $sort: { age : -1 } },
{ $skip: 10 },
{ $limit: 5 }
During the optimization phase, the optimizer transforms the sequence to the following:
{ $sort: { age : -1 } },
{ $limit: 15 },
{ $skip: 10 }
Note: The $limit value has increased to the sum of the initial value and the $skip value.
The optimized sequence now has $sort immediately preceding the $limit. See $sort for information on the
behavior of the $sort operation when it immediately precedes $limit.
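The rewrite can be expressed as a small function over a pipeline array. The sketch below is plain JavaScript written for illustration. moveLimitBeforeSkip is a hypothetical name, and the code is not the server's optimizer. It also checks on an in-memory array that both stage orders select the same documents.

```javascript
// Rewrites $sort, $skip, $limit into $sort, $limit, $skip, where the new
// limit is the sum of the original skip and limit values.
function moveLimitBeforeSkip(pipeline) {
  var out = [];
  for (var i = 0; i < pipeline.length; i++) {
    var a = pipeline[i], b = pipeline[i + 1], c = pipeline[i + 2];
    if (b && c && "$sort" in a && "$skip" in b && "$limit" in c) {
      out.push(a, { $limit: b.$skip + c.$limit }, { $skip: b.$skip });
      i += 2; // the $skip and $limit stages were consumed
    } else {
      out.push(a);
    }
  }
  return out;
}

var optimized = moveLimitBeforeSkip([
  { $sort: { age: -1 } },
  { $skip: 10 },
  { $limit: 5 }
]);

// Equivalence check on an array that is already sorted by age, descending.
var sorted = [];
for (var n = 40; n > 0; n--) {
  sorted.push({ age: n });
}
var original = sorted.slice(10).slice(0, 5);   // skip 10, then limit 5
var rewritten = sorted.slice(0, 15).slice(10); // limit 15, then skip 10
```

Both original and rewritten contain the same five documents, which is why the optimizer is free to reorder the stages.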
$limit + $skip + $limit + $skip Sequence Optimization
When you have a continuous sequence of a
$limit pipeline stage followed by a $skip pipeline stage, the optimization phase attempts to arrange the pipeline
stages to combine the limits and skips. For example, if the pipeline consists of the following stages:
{ $limit: 100 },
{ $skip: 5 },
{ $limit: 10 },
{ $skip: 2 }
During the intermediate step, the optimizer reverses the position of the $skip followed by a $limit to $limit
followed by the $skip.
{ $limit: 100 },
{ $limit: 15 },
{ $skip: 5 },
{ $skip: 2 }
The $limit value has increased to the sum of the initial value and the $skip value. Then, for the final $limit
value, the optimizer selects the minimum between the adjacent $limit values. For the final $skip value, the
optimizer adds the adjacent $skip values, to transform the sequence to the following:
{ $limit: 15 },
{ $skip: 7 }
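The three rules described above (swap a $skip followed by a $limit, take the minimum of adjacent $limit values, and add adjacent $skip values) can be applied repeatedly until nothing changes. The following plain-JavaScript sketch does exactly that. coalesceLimitSkip is a hypothetical helper for illustration, not the optimizer's actual code.

```javascript
// Repeatedly applies the three rewrite rules to a run of $limit/$skip stages.
function coalesceLimitSkip(stages) {
  var s = stages.slice();
  var changed = true;
  while (changed) {
    changed = false;
    for (var i = 0; i + 1 < s.length; i++) {
      var a = s[i], b = s[i + 1];
      if ("$skip" in a && "$limit" in b) {
        // Move the limit forward, enlarged by the skip value.
        s.splice(i, 2, { $limit: a.$skip + b.$limit }, { $skip: a.$skip });
      } else if ("$limit" in a && "$limit" in b) {
        // Adjacent limits: the smaller one wins.
        s.splice(i, 2, { $limit: Math.min(a.$limit, b.$limit) });
      } else if ("$skip" in a && "$skip" in b) {
        // Adjacent skips: add them.
        s.splice(i, 2, { $skip: a.$skip + b.$skip });
      } else {
        continue;
      }
      changed = true;
      break; // restart the scan after every rewrite
    }
  }
  return s;
}

var coalesced = coalesceLimitSkip([
  { $limit: 100 },
  { $skip: 5 },
  { $limit: 10 },
  { $skip: 2 }
]);
```

For the four-stage example, the function passes through the same intermediate form shown above and ends with a limit of 15 followed by a skip of 7.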
Projection Optimization
If the aggregation pipeline contains a $project stage that specifies the fields to include, then MongoDB applies the
projection to the head of the pipeline. This reduces the amount of data passing through the pipeline from the start.
In the following example, the $project stage specifies that the results of this stage return only the _id and the
amount fields. The optimization phase applies the projection to the head of the pipeline such that only the _id and
the amount fields return in the resulting documents from the $match stage as well.
db.orders.aggregate(
{ $match: { status: "A" } },
{ $project: { amount: 1 } }
)
The aggregation pipeline (page 281) cannot operate on values of the following types: Symbol, MinKey, MaxKey,
DBRef, Code, CodeWScope.
Changed in version 2.4: Removed restriction on Binary type data. In MongoDB 2.2, the pipeline could not operate
on Binary type data.
Result Size Restrictions
Output from the pipeline cannot exceed the BSON Document Size limit, which is currently 16 megabytes. If the
result set exceeds this limit, the aggregate command produces an error.
Memory Restrictions
If any single aggregation operation consumes more than 10 percent of system RAM, the operation will produce an
error.
Cumulative operators, such as $sort and $group, require access to the entire input set before they can produce any
output. These operators log a warning if the cumulative operator consumes 5% or more of the physical memory on the
host. Like any aggregation operation, these operators produce an error if they consume 10% or more of the physical
memory on the host. See the $sort and $group reference pages for details on their specific memory requirements
and use.
When operating on a sharded collection, the aggregation pipeline is split into two parts. First, the aggregation pipeline
pushes all of the operators up to the first $group or $sort operation to each shard [1]. Then, a second pipeline runs
on the mongos. This pipeline consists of the first $group or $sort and any remaining pipeline operators, and runs
on the results received from the shards.
The $group operator brings in any sub-totals from the shards and combines them: in some cases these may be
structures. For example, the $avg expression maintains a total and count for each shard; mongos combines these
values and then divides.
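The reason $avg carries a total and a count instead of a finished average can be seen with a small calculation. The sketch below is plain JavaScript with made-up shard names and figures. combineAverages is a hypothetical helper, not mongos code.

```javascript
// Each shard reports a partial result for $avg: a running total and a count.
// Shard names and numbers are invented for this illustration.
var shardPartials = [
  { shard: "shard0000", total: 300, count: 3 },
  { shard: "shard0001", total: 50, count: 2 }
];

// The merging step adds the totals and the counts, then divides once.
function combineAverages(partials) {
  var total = 0;
  var count = 0;
  partials.forEach(function (p) {
    total += p.total;
    count += p.count;
  });
  return count === 0 ? null : total / count;
}

var overall = combineAverages(shardPartials); // 350 / 5

// Averaging the per-shard averages weights both shards equally and gives
// a different, incorrect answer when the shards hold unequal counts.
var naive = (300 / 3 + 50 / 2) / 2;
```

With these figures the combined average is 70, whereas the average of the two per-shard averages is 62.5.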
Impact of Aggregation Pipelines on mongos
When using a sharded collection as the input for a map-reduce operation, mongos will automatically dispatch the map-reduce job to each shard in parallel. There is no special option required. mongos will wait for jobs on all shards to
finish.
Sharded Collection as Output
[1] If an early $match can exclude shards through the use of the shard key in the predicate, then these operators are only pushed to the relevant
shards.
mongos dispatches, in parallel, a map-reduce post-processing job to every shard that owns a chunk. During
the post-processing, each shard will pull the results for its own chunks from the other shards, run the final
reduce/finalize, and write locally to the output collection.
Note:
During later map-reduce jobs, MongoDB splits chunks as needed.
Balancing of chunks for the output collection is automatically prevented during post-processing to avoid concurrency issues.
In MongoDB 2.0:
mongos retrieves the results from each shard, performs a merge sort to order the results, and proceeds to the
reduce/finalize phase as needed. mongos then writes the result to the output collection in sharded mode.
This model requires only a small amount of memory, even for large data sets.
Shard chunks are not automatically split during insertion. This requires manual intervention until the chunks
are granular and balanced.
Important: For best results, only use the sharded output options for mapReduce in version 2.2 or later.
Map-Reduce Examples (page 300) Define map-reduce operations that select ranges, group data, and calculate sums
and averages.
Perform Incremental Map-Reduce (page 302) Run map-reduce operations over one collection and output results
to another collection.
Troubleshoot the Map Function (page 304) Steps to troubleshoot the map function.
Troubleshoot the Reduce Function (page 305) Steps to troubleshoot the reduce function.
Data Model
Each document in the zipcodes collection has the following form:
{
"_id": "10280",
"city": "NEW YORK",
"state": "NY",
"pop": 5574,
"loc": [
-74.016323,
40.710537
]
}
Aggregation operations using the aggregate() helper process all documents in the zipcodes collection.
aggregate() connects a number of pipeline (page 281) operators, which define the aggregation process.
In this example, the pipeline passes all documents in the zipcodes collection through the following steps:
2 http://media.mongodb.org/zips.json
the $group operator collects all documents and creates documents for each state.
These new per-state documents have one field in addition to the _id field: totalPop which is a generated
field using the $sum operation to calculate the total value of all pop fields in the source documents.
After the $group operation the documents in the pipeline resemble the following:
{
"_id" : "AK",
"totalPop" : 550043
}
the $match operation filters these documents so that the only documents that remain are those where the value
of totalPop is greater than or equal to 10 million.
The $match operation does not alter the documents, which have the same format as the documents output by
$group.
The equivalent SQL for this operation is:
SELECT state, SUM(pop) AS totalPop
FROM zipcodes
GROUP BY state
HAVING totalPop >= (10*1000*1000)
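A pipeline consistent with the two steps described above, together with a plain-JavaScript equivalent, might look like the following sketch. The pipeline is only defined as data here (in the mongo shell it would be passed to db.zipcodes.aggregate()). largeStates, the state codes, and the sample figures are assumptions made for this illustration.

```javascript
// Pipeline matching the $group and $match steps described above.
var pipeline = [
  { $group: { _id: "$state", totalPop: { $sum: "$pop" } } },
  { $match: { totalPop: { $gte: 10 * 1000 * 1000 } } }
];

// Hypothetical plain-JavaScript equivalent of the same two steps.
function largeStates(zipcodes, threshold) {
  var totals = {};
  // $group with $sum: add up pop for each state.
  zipcodes.forEach(function (z) {
    totals[z.state] = (totals[z.state] || 0) + z.pop;
  });
  // $match with $gte: keep only the states at or above the threshold.
  return Object.keys(totals)
    .filter(function (s) { return totals[s] >= threshold; })
    .map(function (s) { return { _id: s, totalPop: totals[s] }; });
}

// Made-up sample documents; "XX" and "YY" are placeholder state codes.
var sample = [
  { _id: "00001", state: "XX", pop: 6000000 },
  { _id: "00002", state: "XX", pop: 5000000 },
  { _id: "00003", state: "YY", pop: 550043 }
];

var result = largeStates(sample, 10 * 1000 * 1000);
```

Only the placeholder state "XX" survives the filter in this sample, because its zip codes together exceed 10 million.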
Aggregation operations using the aggregate() helper process all documents in the zipcodes collection.
aggregate() connects a number of pipeline (page 281) operators that define the aggregation process.
In this example, the pipeline passes all documents in the zipcodes collection through the following steps:
the $group operator collects all documents and creates new documents for every combination of the city and
state fields in the source document.
After this stage in the pipeline, the documents resemble the following:
{
"_id" : {
"state" : "CO",
"city" : "EDGEWATER"
},
"pop" : 13154
}
the second $group operator collects documents by the state field and uses the $avg expression to compute
a value for the avgCityPop field.
The final output of this aggregation operation is:
{
"_id" : "MN",
"avgCityPop" : 5335
},
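The two $group stages described above can be sketched as a pipeline and traced with a plain-JavaScript equivalent. The pipeline below is an assumption reconstructed from the description, and averageCityPopulation and the sample data are hypothetical.

```javascript
// Two grouping stages: per (state, city) totals, then a per-state average.
var pipeline = [
  { $group: { _id: { state: "$state", city: "$city" }, pop: { $sum: "$pop" } } },
  { $group: { _id: "$_id.state", avgCityPop: { $avg: "$pop" } } }
];

// Hypothetical plain-JavaScript equivalent.
function averageCityPopulation(zipcodes) {
  var cityPop = {}; // "state|city" -> { state, pop }
  zipcodes.forEach(function (z) {
    var key = z.state + "|" + z.city;
    if (!cityPop[key]) {
      cityPop[key] = { state: z.state, pop: 0 };
    }
    cityPop[key].pop += z.pop; // first $group: sum zip codes per city
  });
  var perState = {}; // state -> { total, cities }
  Object.keys(cityPop).forEach(function (key) {
    var c = cityPop[key];
    if (!perState[c.state]) {
      perState[c.state] = { total: 0, cities: 0 };
    }
    perState[c.state].total += c.pop; // second $group: average per state
    perState[c.state].cities += 1;
  });
  return Object.keys(perState).map(function (s) {
    return { _id: s, avgCityPop: perState[s].total / perState[s].cities };
  });
}

// Made-up sample: one city spanning two zip codes and one single-zip city.
var sample = [
  { state: "XX", city: "ALPHA", pop: 100 },
  { state: "XX", city: "ALPHA", pop: 300 },
  { state: "XX", city: "BETA", pop: 200 }
];

var result = averageCityPopulation(sample);
```

Grouping by city first matters: the average here is taken over two cities (400 and 200), not over three zip codes.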
Aggregation operations using the aggregate() helper process all documents in the zipcodes collection.
aggregate() combines a number of pipeline (page 281) operators that define the aggregation process.
All documents from the zipcodes collection pass into the pipeline, which consists of the following steps:
the $group operator collects all documents and creates new documents for every combination of the city and
state fields in the source documents.
By specifying the value of _id as a sub-document that contains both fields, the operation preserves the state
field for use later in the pipeline. The documents produced by this stage of the pipeline have a second field,
pop, which uses the $sum operator to provide the total of the pop fields in the source document.
At this stage in the pipeline, the documents resemble the following:
{
"_id" : {
"state" : "CO",
"city" : "EDGEWATER"
},
"pop" : 13154
}
the $sort operator orders the documents in the pipeline based on the value of the pop field from smallest to largest.
This operation does not alter the documents.
the second $group operator collects the documents in the pipeline by the state field, which is a field inside
the nested _id document.
Within each per-state document this $group operator specifies four fields: Using the $last expression, the
$group operator creates the biggestCity and biggestPop fields that store the city with the largest population
and that population. Using the $first expression, the $group operator creates the smallestCity
and smallestPop fields that store the city with the smallest population and that population.
The documents, at this stage in the pipeline resemble the following:
{
"_id" : "WA",
"biggestCity" : "SEATTLE",
"biggestPop" : 520096,
"smallestCity" : "BENGE",
"smallestPop" : 2
}
The final operation is $project, which renames the _id field to state and moves the biggestCity,
biggestPop, smallestCity, and smallestPop into biggestCity and smallestCity subdocuments.
The output of this aggregation operation is:
{
"state" : "RI",
"biggestCity" : {
"name" : "CRANSTON",
"pop" : 176404
},
"smallestCity" : {
"name" : "CLAYVILLE",
"pop" : 45
}
}
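The interplay of sorting with $first and $last can be traced with a plain-JavaScript sketch. It sorts per-city documents in ascending order of pop so that the first document seen for a state is its smallest city and the last is its largest, which is what the use of $last for the biggest city implies. extremesByState is a hypothetical helper, and the MIDTOWN entry is made up. The other two figures come from the sample output above.

```javascript
// Assumes `cities` holds one document per city, shaped like the output of
// the first $group stage: { _id: { state, city }, pop }.
function extremesByState(cities) {
  // $sort: ascending by pop.
  var sorted = cities.slice().sort(function (x, y) { return x.pop - y.pop; });
  var byState = {};
  var order = [];
  sorted.forEach(function (c) {
    var s = c._id.state;
    if (!byState[s]) {
      // $first: the first document seen for a state is its smallest city.
      byState[s] = { smallestCity: { name: c._id.city, pop: c.pop } };
      order.push(s);
    }
    // $last: overwritten on every document, so the largest city wins.
    byState[s].biggestCity = { name: c._id.city, pop: c.pop };
  });
  // $project: reshape into the final output form.
  return order.map(function (s) {
    return {
      state: s,
      biggestCity: byState[s].biggestCity,
      smallestCity: byState[s].smallestCity
    };
  });
}

var cities = [
  { _id: { state: "RI", city: "CRANSTON" }, pop: 176404 },
  { _id: { state: "RI", city: "CLAYVILLE" }, pop: 45 },
  { _id: { state: "RI", city: "MIDTOWN" }, pop: 9000 } // hypothetical city
];

var result = extremesByState(cities);
```

Reversing the sort direction would swap the roles of the first and last documents, so the sort order and the choice of $first or $last have to agree.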
{ $sort : { name : 1 } }
]
)
All documents from the users collection pass through the pipeline, which consists of the following operations:
The $project operator:
creates a new field called name.
converts the value of the _id to upper case, with the $toUpper operator. Then the $project creates
a new field, named name to hold this value.
suppresses the _id field. $project will pass the _id field by default, unless explicitly suppressed.
The $sort operator orders the results by the name field.
The results of the aggregation would resemble the following:
{
"name" : "JANE"
},
{
"name" : "JILL"
},
{
"name" : "JOE"
}
The pipeline passes all documents in the users collection through the following operations:
The $project operator:
Creates two new fields: month_joined and name.
Suppresses the _id from the results. The aggregate() method includes the _id, unless explicitly
suppressed.
The $month operator converts the values of the joined field to integer representations of the month. Then
the $project operator assigns those values to the month_joined field.
The $sort operator sorts the results by the month_joined field.
The operation returns results that resemble the following:
{
"month_joined" : 1,
"name" : "ruth"
},
{
"month_joined" : 1,
"name" : "harold"
},
{
"month_joined" : 1,
"name" : "kate"
}
{
"month_joined" : 2,
"name" : "jill"
}
The pipeline passes all documents in the users collection through the following operations:
The $project operator creates a new field called month_joined.
The $month operator converts the values of the joined field to integer representations of the month. Then
the $project operator assigns the values to the month_joined field.
The $group operator collects all documents with a given month_joined value and counts how many documents there are for that value. Specifically, for each unique value, $group creates a new per-month document
with two fields:
_id, which contains a nested document with the month_joined field and its value.
number, which is a generated field. The $sum operator increments this field by 1 for every document
containing the given month_joined value.
The $sort operator sorts the documents created by $group according to the contents of the month_joined
field.
The result of this aggregation operation would resemble the following:
{
"_id" : {
"month_joined" : 1
},
"number" : 3
},
{
"_id" : {
"month_joined" : 2
},
"number" : 9
},
{
"_id" : {
"month_joined" : 3
},
"number" : 5
}
The pipeline begins with all documents in the users collection, and passes these documents through the following
operations:
The $unwind operator separates each value in the likes array, and creates a new version of the source
document for every element in the array.
Example
Given the following document from the users collection:
{
_id : "jane",
joined : ISODate("2011-03-02"),
likes : ["golf", "racquetball"]
}
The $group operator collects all documents with the same value for the likes field and counts each grouping.
With this information, $group creates a new document with two fields:
_id, which contains the likes value.
number, which is a generated field. The $sum operator increments this field by 1 for every document
containing the given likes value.
The $sort operator sorts these documents by the number field in reverse order.
The $limit operator only includes the first 5 result documents.
The results of aggregation would resemble the following:
{
"_id" : "golf",
"number" : 33
},
{
"_id" : "racquetball",
"number" : 31
},
{
"_id" : "swimming",
"number" : 24
},
{
"_id" : "handball",
"number" : 19
},
{
"_id" : "tennis",
"number" : 18
}
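The four stages described above can be imitated in plain JavaScript to see how the counts arise. This is only a sketch over three made-up user documents, not how the server executes the pipeline.

```javascript
// Made-up sample documents; only the fields the pipeline reads.
var users = [
  { _id: "jane", likes: ["golf", "racquetball"] },
  { _id: "joe", likes: ["tennis", "golf", "swimming"] },
  { _id: "ann", likes: ["golf"] }
];

// $unwind and $group: count 1 for every (document, likes element) pair.
var counts = {};
users.forEach(function (user) {
  user.likes.forEach(function (like) {
    counts[like] = (counts[like] || 0) + 1;
  });
});

// Reshape into { _id, number } documents, the form $group produces here.
var grouped = Object.keys(counts).map(function (like) {
  return { _id: like, number: counts[like] };
});

// $sort by number in descending order, then $limit to the first 5.
var top = grouped
  .sort(function (a, b) { return b.number - a.number; })
  .slice(0, 5);
// top[0] is { _id: "golf", number: 3 }
```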
The function maps the price to the cust_id for each document and emits the cust_id and price
pair.
var mapFunction1 = function() {
emit(this.cust_id, this.price);
};
2. Define the corresponding reduce function with two arguments keyCustId and valuesPrices:
The valuesPrices is an array whose elements are the price values emitted by the map function and
grouped by keyCustId.
The function reduces the valuesPrices array to the sum of its elements.
var reduceFunction1 = function(keyCustId, valuesPrices) {
return Array.sum(valuesPrices);
};
3. Perform the map-reduce on all documents in the orders collection using the mapFunction1 map function
and the reduceFunction1 reduce function.
db.orders.mapReduce(
mapFunction1,
reduceFunction1,
{ out: "map_reduce_example" }
)
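The three steps above can be traced without a server. The following is a minimal sketch, assuming a hand-written emit function and made-up sample orders; it sums by hand because Array.sum is a mongo shell helper rather than standard JavaScript.

```javascript
var emitted = {};                      // key -> array of emitted values

// Stand-in for the emit() that MongoDB provides to map functions.
function emit(key, value) {
  if (!emitted.hasOwnProperty(key)) {
    emitted[key] = [];
  }
  emitted[key].push(value);
}

var mapFunction1 = function () {
  emit(this.cust_id, this.price);
};

var reduceFunction1 = function (keyCustId, valuesPrices) {
  var total = 0;
  for (var i = 0; i < valuesPrices.length; i++) {
    total += valuesPrices[i];
  }
  return total;
};

// Made-up sample orders; only the fields the map function reads.
var sampleOrders = [
  { cust_id: "abc123", price: 250 },
  { cust_id: "abc123", price: 100 },
  { cust_id: "xyz789", price: 40 }
];

sampleOrders.forEach(function (doc) {
  mapFunction1.apply(doc);             // run map with `this` bound to the document
});

var results = {};
for (var key in emitted) {
  results[key] = reduceFunction1(key, emitted[key]);
}
// results is { abc123: 350, xyz789: 40 }
```

The sketch calls the reduce function for every key. MongoDB does not call reduce for a key that has only a single value, which gives the same result here because the sum of one price is that price.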
2. Define the corresponding reduce function with two arguments keySKU and countObjVals:
countObjVals is an array whose elements are the objects mapped to the grouped keySKU values
passed by map function to the reducer function.
The function reduces the countObjVals array to a single object reducedVal that contains the
count and the qty fields.
In reducedVal, the count field contains the sum of the count fields from the individual array elements, and the qty field contains the sum of the qty fields from the individual array elements.
var reduceFunction2 = function(keySKU, countObjVals) {
var reducedVal = { count: 0, qty: 0 };
for (var idx = 0; idx < countObjVals.length; idx++) {
reducedVal.count += countObjVals[idx].count;
reducedVal.qty += countObjVals[idx].qty;
}
return reducedVal;
};
3. Define a finalize function with two arguments key and reducedVal. The function modifies the
reducedVal object to add a computed field named avg and returns the modified object:
var finalizeFunction2 = function (key, reducedVal) {
reducedVal.avg = reducedVal.qty/reducedVal.count;
return reducedVal;
};
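4. Perform the map-reduce operation on the orders collection using the mapFunction2, reduceFunction2, and
finalizeFunction2 functions: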
db.orders.mapReduce( mapFunction2,
reduceFunction2,
{
out: { merge: "map_reduce_example" },
query: { ord_date:
{ $gt: new Date('01/01/2012') }
},
finalize: finalizeFunction2
}
)
This operation uses the query field to select only those documents with ord_date greater than new
Date('01/01/2012'). Then it outputs the results to a collection map_reduce_example. If the
map_reduce_example collection already exists, the operation will merge the existing contents with the
results of this map-reduce operation.
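As a quick check of the arithmetic, the reduce and finalize functions from steps 2 and 3 can be run directly on hand-written values. This is a sketch: the sample array stands in for what a map function might emit for one key, and the key name "mmm" is only an example.

```javascript
var reduceFunction2 = function (keySKU, countObjVals) {
  var reducedVal = { count: 0, qty: 0 };
  for (var idx = 0; idx < countObjVals.length; idx++) {
    reducedVal.count += countObjVals[idx].count;
    reducedVal.qty += countObjVals[idx].qty;
  }
  return reducedVal;
};

var finalizeFunction2 = function (key, reducedVal) {
  reducedVal.avg = reducedVal.qty / reducedVal.count;
  return reducedVal;
};

// Assumed sample: three emitted values for one SKU.
var sampleVals = [
  { count: 1, qty: 5 },
  { count: 1, qty: 10 },
  { count: 1, qty: 15 }
];

var finalized = finalizeFunction2("mmm", reduceFunction2("mmm", sampleVals));
// finalized is { count: 3, qty: 30, avg: 10 }
```

Computing the average in finalize rather than in reduce keeps the reduce output in the same shape as its input, so partial results can be reduced again safely.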
the query parameter that specifies conditions that match only the new documents.
the out parameter that specifies the reduce action to merge the new results into the existing output
collection.
Consider the following example where you schedule a map-reduce operation on a sessions collection to run at the
end of each day.
Data Setup
The sessions collection contains documents that log user sessions each day, for example:
db.sessions.save( { userid: "a", ts: ISODate('2011-11-03 14:17:00'), length: 95 } );
db.sessions.save( { userid: "b", ts: ISODate('2011-11-03 14:23:00'), length: 110 } );
db.sessions.save( { userid: "c", ts: ISODate('2011-11-03 15:02:00'), length: 120 } );
db.sessions.save( { userid: "d", ts: ISODate('2011-11-03 16:45:00'), length: 45 } );
db.sessions.save( { userid: "a", ts: ISODate('2011-11-04 11:05:00'), length: 105 } );
db.sessions.save( { userid: "b", ts: ISODate('2011-11-04 13:14:00'), length: 120 } );
db.sessions.save( { userid: "c", ts: ISODate('2011-11-04 17:00:00'), length: 130 } );
db.sessions.save( { userid: "d", ts: ISODate('2011-11-04 15:37:00'), length: 65 } );
2. Define the corresponding reduce function with two arguments key and values to calculate the total time and
the count. The key corresponds to the userid, and values is an array whose elements correspond to
the individual objects mapped to the userid in the mapFunction.
var reduceFunction = function(key, values) {
var reducedObject = {
userid: key,
total_time: 0,
count:0,
avg_time:0
};
values.forEach( function(value) {
reducedObject.total_time += value.total_time;
reducedObject.count += value.count;
}
);
return reducedObject;
};
3. Define the finalize function with two arguments key and reducedValue. The function modifies the
reducedValue document to set the avg_time field to the average session length and returns the modified document.
var finalizeFunction = function (key, reducedValue) {
if (reducedValue.count > 0)
reducedValue.avg_time = reducedValue.total_time / reducedValue.count;
return reducedValue;
};
4. Perform map-reduce on the sessions collection using the mapFunction, the reduceFunction, and the
finalizeFunction functions. Output the results to a collection session_stat. If the session_stat
collection already exists, the operation will replace the contents:
db.sessions.mapReduce( mapFunction,
reduceFunction,
{
out: { reduce: "session_stat" },
finalize: finalizeFunction
}
)
db.sessions.save( { userid: "a", ts: ISODate('2011-11-05 14:17:00'), length: 100 } );
db.sessions.save( { userid: "b", ts: ISODate('2011-11-05 14:23:00'), length: 115 } );
db.sessions.save( { userid: "c", ts: ISODate('2011-11-05 15:02:00'), length: 125 } );
db.sessions.save( { userid: "d", ts: ISODate('2011-11-05 16:45:00'), length: 55 } );
At the end of the day, perform incremental map-reduce on the sessions collection, but use the query field to select
only the new documents. Output the results to the collection session_stat, but reduce the contents with the
results of the incremental map-reduce:
db.sessions.mapReduce( mapFunction,
reduceFunction,
{
query: { ts: { $gt: ISODate('2011-11-05 00:00:00') } },
out: { reduce: "session_stat" },
finalize: finalizeFunction
}
);
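The reduce output action works because reducing a stored result together with newly mapped values gives the same totals as reducing everything at once. The sketch below checks this for user "a" with the session lengths listed above. It assumes that each session is mapped to an object of the form { userid, total_time, count, avg_time } with a count of 1; that shape is inferred from the reduce function, and the mapped helper is hypothetical.

```javascript
var reduceFunction = function (key, values) {
  var reducedObject = { userid: key, total_time: 0, count: 0, avg_time: 0 };
  values.forEach(function (value) {
    reducedObject.total_time += value.total_time;
    reducedObject.count += value.count;
  });
  return reducedObject;
};

var finalizeFunction = function (key, reducedValue) {
  if (reducedValue.count > 0) {
    reducedValue.avg_time = reducedValue.total_time / reducedValue.count;
  }
  return reducedValue;
};

// Assumed shape of one mapped session (hypothetical helper).
function mapped(userid, length) {
  return { userid: userid, total_time: length, count: 1, avg_time: 0 };
}

// First run: sessions of user "a" on 2011-11-03 and 2011-11-04.
var stored = finalizeFunction("a",
  reduceFunction("a", [mapped("a", 95), mapped("a", 105)]));

// Incremental run: reduce the stored result with the 2011-11-05 session.
var incremental = finalizeFunction("a",
  reduceFunction("a", [stored, mapped("a", 100)]));

// Full recomputation over all three sessions, for comparison.
var full = finalizeFunction("a",
  reduceFunction("a", [mapped("a", 95), mapped("a", 105), mapped("a", 100)]));
// incremental and full both have total_time 300, count 3, and avg_time 100.
```

The stored avg_time is ignored when the result is reduced again; finalize recomputes it from the combined total_time and count.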
To verify the key and value pairs emitted by the map function, write your own emit function.
Consider a collection orders that contains documents of the following prototype:
{
_id: ObjectId("50a8240b927d5d8b5891743c"),
cust_id: "abc123",
ord_date: new Date("Oct 04, 2012"),
status: 'A',
price: 250,
items: [ { sku: "mmm", qty: 5, price: 2.5 },
{ sku: "nnn", qty: 5, price: 2.5 } ]
}
1. Define the map function that maps the price to the cust_id for each document and emits the cust_id and
price pair:
var map = function() {
emit(this.cust_id, this.price);
};
3. Invoke the map function with a single document from the orders collection:
var myDoc = db.orders.findOne( { _id: ObjectId("50a8240b927d5d8b5891743c") } );
map.apply(myDoc);
5. Invoke the map function with multiple documents from the orders collection:
var myCursor = db.orders.find( { cust_id: "abc123" } );
while (myCursor.hasNext()) {
var doc = myCursor.next();
print ("document _id= " + tojson(doc._id));
map.apply(doc);
print();
}
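The custom emit function this procedure relies on can be very small. The sketch below is one possible version, assuming you want to collect the pairs in an array (the emittedPairs name is made up) rather than print them. It also uses a hand-written document in place of a db.orders.findOne() result so that it runs without a server.

```javascript
var emittedPairs = [];

// Stand-in for the server-side emit(): record each key and value pair.
var emit = function (key, value) {
  emittedPairs.push({ key: key, value: value });
};

var map = function () {
  emit(this.cust_id, this.price);
};

// Hand-written document with the fields the map function reads.
var myDoc = { cust_id: "abc123", price: 250 };

map.apply(myDoc);
// emittedPairs is [ { key: "abc123", value: 250 } ]
```

Collecting the pairs instead of printing them makes it possible to feed them straight into a reduce function when testing the two together.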
The reduce function must return an object whose type must be identical to the type of the value emitted by
the map function.
The order of the elements in the valuesArray should not affect the output of the reduce function.
The reduce function must be idempotent.
For a list of all the requirements for the reduce function, see mapReduce, or the mongo shell helper method
db.collection.mapReduce().
Confirm Output Type
You can test that the reduce function returns a value that is the same type as the value emitted from the map function.
1. Define a reduceFunction1 function that takes the arguments keyCustId and valuesPrices.
valuesPrices is an array of integers:
var reduceFunction1 = function(keyCustId, valuesPrices) {
return Array.sum(valuesPrices);
};
5. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects.
valuesCountObjects is an array of documents that contain two fields count and qty:
var reduceFunction2 = function(keySKU, valuesCountObjects) {
var reducedValue = { count: 0, qty: 0 };
for (var idx = 0; idx < valuesCountObjects.length; idx++) {
reducedValue.count += valuesCountObjects[idx].count;
reducedValue.qty += valuesCountObjects[idx].qty;
}
return reducedValue;
};
8. Verify the reduceFunction2 returned a document with exactly the count and the qty field:
{ "count" : 6, "qty" : 30 }
var values1 = [
  { count: 1, qty: 5 },
  { count: 2, qty: 10 },
  { count: 3, qty: 15 }
];
var values2 = [
  { count: 3, qty: 15 },
  { count: 1, qty: 5 },
  { count: 2, qty: 10 }
];
2. Define a reduceFunction2 function that takes the arguments keySKU and valuesCountObjects.
valuesCountObjects is an array of documents that contain two fields count and qty:
var reduceFunction2 = function(keySKU, valuesCountObjects) {
var reducedValue = { count: 0, qty: 0 };
for (var idx = 0; idx < valuesCountObjects.length; idx++) {
reducedValue.count += valuesCountObjects[idx].count;
reducedValue.qty += valuesCountObjects[idx].qty;
}
return reducedValue;
};
3. Invoke the reduceFunction2 first with values1 and then with values2:
reduceFunction2('myKey', values1);
reduceFunction2('myKey', values2);
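Comparing two orderings by eye works for a small case; a more thorough check is to try every ordering. The sketch below does that with a hypothetical permutations helper, which is practical only for short arrays because the number of orderings grows factorially.

```javascript
var reduceFunction2 = function (keySKU, valuesCountObjects) {
  var reducedValue = { count: 0, qty: 0 };
  for (var idx = 0; idx < valuesCountObjects.length; idx++) {
    reducedValue.count += valuesCountObjects[idx].count;
    reducedValue.qty += valuesCountObjects[idx].qty;
  }
  return reducedValue;
};

// Hypothetical helper: returns every ordering of a small array.
function permutations(items) {
  if (items.length <= 1) {
    return [items.slice()];
  }
  var result = [];
  for (var i = 0; i < items.length; i++) {
    var rest = items.slice(0, i).concat(items.slice(i + 1));
    permutations(rest).forEach(function (tail) {
      result.push([items[i]].concat(tail));
    });
  }
  return result;
}

var sampleValues = [
  { count: 1, qty: 5 },
  { count: 2, qty: 10 },
  { count: 3, qty: 15 }
];

// Reduce every ordering and compare the serialized outputs.
var outputs = permutations(sampleValues).map(function (ordering) {
  return JSON.stringify(reduceFunction2("myKey", ordering));
});
var orderIndependent = outputs.every(function (out) {
  return out === outputs[0];
});
// orderIndependent is true: all 6 orderings reduce to {"count":6,"qty":30}
```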
3. Define a sample valuesIdempotent array that contains an element that is a call to the reduceFunction2
function:
var valuesIdempotent = [
{ count: 1, qty: 5 },
{ count: 2, qty: 10 },
reduceFunction2(myKey, [ { count:3, qty: 15 } ] )
];
4. Define a sample values1 array that combines the values passed to reduceFunction2:
var values1 = [
  { count: 1, qty: 5 },
  { count: 2, qty: 10 },
  { count: 3, qty: 15 }
];
5. Invoke the reduceFunction2 first with myKey and valuesIdempotent and then with myKey and
values1:
reduceFunction2(myKey, valuesIdempotent);
reduceFunction2(myKey, values1);
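The property under test is that feeding reduce output back into reduce does not change the final result. The sketch below wraps that check in a hypothetical helper, survivesRereduce, and contrasts a summing reduce with an averaging reduce that fails the check. Both reduce functions and the sample values are illustrative.

```javascript
// Hypothetical checker: true when reducing a partial result again
// gives the same output as reducing all values in one call.
function survivesRereduce(reduceFn, key, values) {
  var direct = reduceFn(key, values);
  var partial = reduceFn(key, values.slice(1));         // reduce part of the input first
  var rereduced = reduceFn(key, [values[0], partial]);  // then feed that result back in
  return JSON.stringify(direct) === JSON.stringify(rereduced);
}

// Sums count and qty, in the style of reduceFunction2.
var sumReduce = function (key, values) {
  var reduced = { count: 0, qty: 0 };
  values.forEach(function (v) {
    reduced.count += v.count;
    reduced.qty += v.qty;
  });
  return reduced;
};

// Counterexample: averaging inside reduce loses the weight of partial results.
var avgReduce = function (key, values) {
  var total = 0;
  values.forEach(function (v) {
    total += v;
  });
  return total / values.length;
};

var sumOk = survivesRereduce(sumReduce, "myKey",
  [ { count: 1, qty: 5 }, { count: 2, qty: 10 }, { count: 3, qty: 15 } ]);
var avgOk = survivesRereduce(avgReduce, "myKey", [1, 2, 6]);
// sumOk is true; avgOk is false (3 when reduced directly, 2.5 when re-reduced).
```

The averaging case shows why values such as averages are better computed in a finalize function from sums and counts.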
aggregate
Description: New in version 2.2. Designed with specific goals of improving performance and usability for
aggregation tasks. Uses a pipeline approach where objects are transformed as they pass through a series of
pipeline operators such as $group, $match, and $sort. See Aggregation Reference (page 308) for more
information on the pipeline operators.
Key Features: Pipeline operators can be repeated as needed. Pipeline operators need not produce one output
document for every input document. Can also generate new documents or filter out documents.
mapReduce
Description: Implements the Map-Reduce aggregation for processing large data sets.
Key Features: In addition to grouping operations, can perform complex aggregation tasks as well as perform
incremental aggregation on continuously growing datasets. See Map-Reduce Examples (page 300) and Perform
Incremental Map-Reduce (page 302).
Flexibility: Custom map, reduce and finalize JavaScript functions offer flexibility to aggregation logic. See
mapReduce for details and restrictions on the functions.
group
Description: Provides grouping functionality. Is slower than the aggregate command and has less functionality
than the mapReduce command.
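SQL Terms, Functions, and Concepts    MongoDB Aggregation Operators
WHERE                                 $match
GROUP BY                              $group
HAVING                                $match
SELECT                                $project
ORDER BY                              $sort
LIMIT                                 $limit
SUM()                                 $sum
COUNT()                               $sum
join                                  No direct corresponding operator; however, the $unwind operator allows for
                                      somewhat similar functionality, but with fields embedded within the document.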
Examples
The following table presents a quick reference of SQL aggregation statements and the corresponding MongoDB statements. The examples in the table assume the following conditions:
The SQL examples assume two tables, orders and order_lineitem that join by the
order_lineitem.order_id and the orders.id columns.
The MongoDB examples assume one collection orders that contain documents of the following prototype:
{
cust_id: "abc123",
ord_date: ISODate("2012-11-02T17:04:11.102Z"),
status: 'A',
price: 50,
items: [ { sku: "xxx", qty: 25, price: 1 },
{ sku: "yyy", qty: 25, price: 1 } ]
}
The MongoDB statements prefix the names of the fields from the documents in the collection orders with a $
character when they appear as operands to the aggregation operations.
MongoDB Example:
db.orders.aggregate( [
   { $group: { _id: null,
               count: { $sum: 1 } } }
] )
Description: Count all records from orders.

SQL Example:
SELECT cust_id,
       SUM(price) AS total
FROM orders
GROUP BY cust_id
ORDER BY total

SQL Example:
SELECT cust_id,
       ord_date,
       SUM(price) AS total
FROM orders
GROUP BY cust_id, ord_date
MongoDB Example:
db.orders.aggregate( [
   { $group: { _id: { cust_id: "$cust_id",
                      ord_date: "$ord_date" },
               total: { $sum: "$price" } } }
] )
Description: For each unique cust_id, ord_date grouping, sum the price field.

SQL Example:
SELECT cust_id,
       ord_date,
       SUM(price) AS total
FROM orders
GROUP BY cust_id, ord_date
HAVING total > 250
MongoDB Example:
db.orders.aggregate( [
   { $group: { _id: { cust_id: "$cust_id",
                      ord_date: "$ord_date" },
               total: { $sum: "$price" } } },
   { $match: { total: { $gt: 250 } } }
] )
Description: For each unique cust_id, ord_date grouping, sum the price field and return only where the sum is
greater than 250.

SQL Example:
SELECT cust_id,
       SUM(price) as total
FROM orders
WHERE status = 'A'
GROUP BY cust_id

SQL Example:
SELECT cust_id,
       SUM(price) as total
FROM orders
WHERE status = 'A'
GROUP BY cust_id
HAVING total > 250
SELECT cust_id,
db.orders.aggregate( [
Name         Description
aggregate    Performs aggregation tasks (page 281) such as group using the aggregation framework.
count        Counts the number of documents in a collection.
distinct     Displays the distinct values found for a specified key in a collection.
group        Groups documents in a collection by the specified key and performs simple aggregation.
mapReduce    Performs map-reduce (page 284) aggregation for large data sets.
Aggregation Methods
Name                         Description
db.collection.aggregate()    Provides access to the aggregation pipeline (page 281).
db.collection.group()        Groups documents in a collection by the specified key and performs simple aggregation.
db.collection.mapReduce()    Performs map-reduce (page 284) aggregation for large data sets.
CHAPTER 7
Indexes
Indexes provide high performance read operations for frequently used queries.
This section introduces indexes in MongoDB, describes the types and configuration options for indexes, and describes
special types of indexing MongoDB supports. The section also provides tutorials detailing procedures and operational
concerns, and providing information on how applications may use indexes.
Index Introduction (page 313) An introduction to indexes in MongoDB.
Index Concepts (page 318) The core documentation of indexes in MongoDB, including geospatial and text indexes.
Index Types (page 319) MongoDB provides different types of indexes for different purposes and different types
of content.
Index Properties (page 334) The properties you can specify when building indexes.
Index Creation (page 336) The options available when creating indexes.
Indexing Tutorials (page 339) Examples of operations involving indexes, including index creation and querying indexes.
Indexing Reference (page 375) Reference material for indexes in MongoDB.
Figure 7.1: Diagram of a query selecting documents using an index. MongoDB narrows the query by scanning the
range of documents with values of score less than 30.
Create indexes to support common and user-facing queries. Having these indexes will ensure that MongoDB only
scans the smallest possible number of documents.
Indexes can also optimize the performance of other operations in specific situations:
Sorted Results
MongoDB can use indexes to return documents sorted by the index key directly from the index without
requiring an additional sort phase.
Covered Results
When the query criteria and the projection of a query include only the indexed fields, MongoDB will return
results directly from the index without scanning any documents or bringing documents into memory.
These covered queries can be very efficient. Indexes can also cover aggregation pipeline operations
(page 281).
Figure 7.2: Diagram of a query that uses an index to select and return sorted results. The index stores score values
in ascending order. MongoDB can traverse the index in either ascending or descending order to return sorted results.
Figure 7.3: Diagram of a query that uses only the index to match the query criteria and return the results. MongoDB
does not need to inspect data outside of the index to fulfill the query.
Single Field
In addition to the MongoDB-defined _id index, MongoDB supports user-defined indexes on a single field of a document (page 320). Consider the following illustration of a single-field index:
Compound Index
MongoDB also supports user-defined indexes on multiple fields. These compound indexes (page 322) behave like
single-field indexes; however, the query can select documents based on additional fields. The order of fields listed
in a compound index has significance. For instance, if a compound index consists of { userid: 1, score:
-1 }, the index sorts first by userid and then, within each userid value, sorts by score. Consider the following
illustration of this compound index:
Figure 7.5: Diagram of a compound index on the userid field (ascending) and the score field (descending). The
index sorts first by the userid field and then by the score field.
Multikey Index
MongoDB uses multikey indexes (page 324) to index the content stored in arrays. If you index a field that holds an
array value, MongoDB creates separate index entries for every element of the array. These multikey indexes (page 324)
allow queries to select documents that contain arrays by matching on an element or elements of the arrays. MongoDB
automatically determines whether to create a multikey index if the indexed field contains an array value; you do not
need to explicitly specify the multikey type.
Consider the following illustration of a multikey index:
Figure 7.6: Diagram of a multikey index on the addr.zip field. The addr field contains an array of address
documents. The address documents contain the zip field.
Geospatial Index
To support efficient queries of geospatial coordinate data, MongoDB provides two special indexes: 2d indexes
(page 329) that use planar geometry when returning results and 2dsphere indexes (page 328) that use spherical geometry to return results.
See 2d Index Internals (page 331) for a high level introduction to geospatial indexes.
Text Indexes
MongoDB provides a beta text index type that supports searching for string content in a collection. These text
indexes do not store language-specific stop words (e.g. the, a, or) and stem the words in a collection to only
store root words.
See Text Indexes (page 332) for more information on text indexes and search.
Hashed Indexes
To support hash based sharding (page 493), MongoDB provides a hashed index (page 333) type, which indexes the
hash of the value of a field. These indexes have a more random distribution of values along their range, but only
support equality matches and cannot support range-based queries.
Sparse Indexes (page 335) A sparse index does not index documents that do not have the indexed field.
Index Creation (page 336) The options available when creating indexes.
MongoDB indexes may be ascending (i.e. 1) or descending (i.e. -1) in their ordering. Nevertheless, MongoDB may
also traverse the index in either direction. As a result, for single-field indexes, ascending and descending indexes are
interchangeable. This is not the case for compound indexes: in compound indexes, the direction of the sort order can
have a greater impact on the results.
See Sort Order (page 323) for more information on the impact of index order on results in compound indexes.
Redundant Indexes
A single query can only use one index, except for queries that use the $or operator that can use a different index for
each clause.
See also:
Index Limitations.
Index Type Documentation
Single Field Indexes (page 320) A single field index only includes data from a single field of the documents in a
collection. MongoDB supports single field indexes on fields at the top level of a document and on fields in
sub-documents.
Compound Indexes (page 322) A compound index includes more than one field of the documents in a collection.
Multikey Indexes (page 324) A multikey index references an array and records a match if a query includes any value
in the array.
Geospatial Indexes and Queries (page 326) Geospatial indexes support location-based searches on data that is stored
as either GeoJSON objects or legacy coordinate pairs.
Text Indexes (page 332) Text indexes support search of string content in documents.
Hashed Index (page 333) Hashed indexes maintain entries with hashes of the values of the indexed field.
Single Field Indexes
MongoDB provides complete support for indexes on any field in a collection of documents. By default, all collections
have an index on the _id field (page 321), and applications and users may add additional indexes to support important
queries and operations.
MongoDB supports indexes that contain either a single field or multiple fields depending on the operations that the index
supports. This document describes indexes that contain a single field. Consider the following illustration of a single
field index.
Cases
_id Field Index MongoDB creates the _id index, which is an ascending unique index (page 334) on the _id field
for all collections when the collection is created. You cannot remove the index on the _id field.
Think of the _id field as the primary key for a collection. Every document must have a unique _id field. You may
store any unique value in the _id field. The default value of _id is an ObjectId on every insert() operation. An
ObjectId is a 12-byte unique identifier suitable for use as the value of an _id field.
Note: In sharded clusters, if you do not use the _id field as the shard key, then your application must ensure the
uniqueness of the values in the _id field to prevent errors. This is most often done by using a standard auto-generated
ObjectId.
Before version 2.2, capped collections did not have an _id field. In version 2.2 and newer, capped collections do
have an _id field, except those in the local database. See Capped Collections Recommendations and Restrictions
(page 161) for more information.
Indexes on Embedded Fields You can create indexes on fields embedded in sub-documents, just as you can index
top-level fields in documents. Indexes on embedded fields differ from indexes on sub-documents (page 321), which
include the full content up to the maximum index size of the sub-document in the index. Instead, indexes on
embedded fields allow you to use dot notation to introspect into sub-documents.
Consider a collection named people that holds documents that resemble the following example document:
{
  "_id": ObjectId(...),
  "name": "John Doe",
  "address": {
    "street": "Main",
    "zipcode": 53511,
    "state": "WI"
  }
}
You can create an index on the address.zipcode field, using the following specification:
db.people.ensureIndex( { "address.zipcode": 1 } )
The metro field is a subdocument, containing the embedded fields city and state. The following creates an index
on the metro field as a whole:
db.factories.ensureIndex( { metro: 1 } )
The following query can use the index on the metro field:
db.factories.find( { metro: { city: "New York", state: "NY" } } )
This query returns the above document. When performing equality matches on subdocuments, field order matters and
the subdocuments must match exactly. For example, the following query does not match the above document:
db.factories.find( { metro: { state: "NY", city: "New York" } } )
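One way to picture why field order matters: an equality match on a subdocument compares the fields in sequence rather than as an unordered set. The sketch below imitates that with a hypothetical exactMatch helper. It is a simplification for flat subdocuments with primitive values, not MongoDB's actual comparison code.

```javascript
// Hypothetical helper: true only if both objects have the same fields,
// in the same order, with the same values (flat objects only).
function exactMatch(stored, query) {
  var storedKeys = Object.keys(stored);
  var queryKeys = Object.keys(query);
  if (storedKeys.length !== queryKeys.length) {
    return false;
  }
  for (var i = 0; i < storedKeys.length; i++) {
    if (storedKeys[i] !== queryKeys[i]) {
      return false;                      // same position, different field name
    }
    if (stored[storedKeys[i]] !== query[queryKeys[i]]) {
      return false;                      // same field, different value
    }
  }
  return true;
}

var metro = { city: "New York", state: "NY" };

var sameOrder = exactMatch(metro, { city: "New York", state: "NY" });
var swapped = exactMatch(metro, { state: "NY", city: "New York" });
// sameOrder is true; swapped is false
```

To match regardless of field order, queries can instead use dot notation on the individual embedded fields.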
MongoDB supports compound indexes, where a single index structure holds references to multiple fields within a
collection's documents. The following diagram illustrates an example of a compound index on two fields:
Figure 7.8: Diagram of a compound index on the userid field (ascending) and the score field (descending). The
index sorts first by the userid field and then by the score field.
Compound indexes can support queries that match on multiple fields.
Example
Consider a collection named products that holds documents that resemble the following document:
{
  "_id": ObjectId(...),
  "item": "Banana",
  "category": ["food", "produce", "grocery"],
  "location": "4th Street Store",
  "stock": 4,
  "type": "cases",
  "arrival": Date(...)
}
If applications query on the item field as well as query on both the item field and the stock field, you can specify
a single compound index to support both of these queries:
db.products.ensureIndex( { "item": 1, "stock": 1 } )
Important: You may not create compound indexes that have hashed index fields. You will receive an error if you
attempt to create a compound index that includes a hashed index (page 333).
The order of the fields in a compound index is very important. In the previous example, the index will contain
references to documents sorted first by the values of the item field and, within each value of the item field, sorted
by values of the stock field. See Sort Order (page 323) for more information.
In addition to supporting queries that match on all the index fields, compound indexes can support queries that match
on the prefix of the index fields. For details, see Prefixes (page 323).
Sort Order Indexes store references to fields in either ascending (1) or descending (-1) sort order. For single-field
indexes, the sort order of keys doesn't matter because MongoDB can traverse the index in either direction. However,
for compound indexes (page 322), sort order can matter in determining whether the index can support a sort operation.
Consider a collection events that contains documents with the fields username and date. Applications can issue
queries that return results sorted first by ascending username values and then by descending (i.e. more recent to last)
date values, such as:
db.events.find().sort( { username: 1, date: -1 } )
or queries that return results sorted first by descending username values and then by ascending date values, such
as:
db.events.find().sort( { username: -1, date: 1 } )
However, the above index cannot support sorting by ascending username values and then by ascending date values,
such as the following:
db.events.find().sort( { username: 1, date: 1 } )
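Why one index can serve the first two sorts but not the third follows from reverse traversal: reading an index backward flips the direction of every field at once. The sketch below assumes an index on { username: 1, date: -1 }, which is an inference from the two supported sorts, and uses made-up events with small numbers standing in for dates. A sorted array stands in for the index.

```javascript
// Builds a comparator from a list of [field, direction] pairs.
function compareBy(spec) {
  return function (a, b) {
    for (var i = 0; i < spec.length; i++) {
      var field = spec[i][0];
      var dir = spec[i][1];
      if (a[field] < b[field]) return -dir;
      if (a[field] > b[field]) return dir;
    }
    return 0;
  };
}

function same(x, y) {
  return JSON.stringify(x) === JSON.stringify(y);
}

// Made-up events; the date values are placeholders.
var events = [
  { username: "ann", date: 1 },
  { username: "ann", date: 2 },
  { username: "bob", date: 1 },
  { username: "bob", date: 2 }
];

// Order of entries in the assumed { username: 1, date: -1 } index.
var indexOrder = events.slice().sort(compareBy([["username", 1], ["date", -1]]));
var backward = indexOrder.slice().reverse();

var wantDescAsc = events.slice().sort(compareBy([["username", -1], ["date", 1]]));
var wantAscAsc = events.slice().sort(compareBy([["username", 1], ["date", 1]]));

var backwardMatches = same(backward, wantDescAsc);
var ascAscMatches = same(indexOrder, wantAscAsc) || same(backward, wantAscAsc);
// backwardMatches is true; ascAscMatches is false
```

Neither traversal direction yields ascending username with ascending date, so that sort needs either a different index or an in-memory sort.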
Prefixes Compound indexes support queries on any prefix of the index fields. Index prefixes are the beginning
subset of indexed fields. For example, given the index { a: 1, b: 1, c: 1 }, both { a: 1 } and {
a: 1, b: 1 } are prefixes of the index.
If you have a collection that has a compound index on { a: 1, b: 1 }, as well as an index that consists of the
prefix of that index, i.e. { a: 1 }, assuming neither index has a sparse or unique constraint, then you can
drop the { a: 1 } index. MongoDB will be able to use the compound index in all of the situations that it would have
used the { a: 1 } index.
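The prefix rule can be stated as a small test: a key pattern is a prefix when its fields match the leading fields of the index, in order. The sketch below uses a hypothetical isIndexPrefix helper and represents key patterns as arrays of [field, direction] pairs so that field order is explicit. It also returns true for the full key pattern, which is a choice made for this sketch.

```javascript
// Hypothetical helper: is `candidate` a prefix of the compound index `index`?
// Both arguments are arrays of [field, direction] pairs, in index order.
function isIndexPrefix(candidate, index) {
  if (candidate.length === 0 || candidate.length > index.length) {
    return false;
  }
  for (var i = 0; i < candidate.length; i++) {
    if (candidate[i][0] !== index[i][0] || candidate[i][1] !== index[i][1]) {
      return false;
    }
  }
  return true;
}

var abc = [["a", 1], ["b", 1], ["c", 1]];

var aIsPrefix = isIndexPrefix([["a", 1]], abc);
var abIsPrefix = isIndexPrefix([["a", 1], ["b", 1]], abc);
var bcIsPrefix = isIndexPrefix([["b", 1], ["c", 1]], abc);
// aIsPrefix and abIsPrefix are true; bcIsPrefix is false because it skips field a
```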
Example
Given the following index:
{ "item": 1, "location": 1, "stock": 1 }
Multikey Indexes
To index a field that holds an array value, MongoDB adds index items for each item in the array. These multikey indexes
allow MongoDB to return documents from queries using the value of an array. MongoDB automatically determines
whether to create a multikey index if the indexed field contains an array value; you do not need to explicitly specify
the multikey type.
Consider the following illustration of a multikey index:
Figure 7.9: Diagram of a multikey index on the addr.zip field. The addr field contains an array of address
documents. The address documents contain the zip field.
Multikey indexes support all operations supported by other MongoDB indexes; however, applications may use multikey indexes to select documents based on ranges of values for the value of an array. Multikey indexes support arrays
that hold both values (e.g. strings, numbers) and nested documents.
Limitations
Interactions between Compound and Multikey Indexes While you can create multikey compound indexes
(page 322), at most one field in a compound index may hold an array. For example, given an index on { a: 1,
b: 1 }, the following documents are permissible:
However, the following document is impermissible, and MongoDB cannot insert such a document into a collection
with the {a: 1, b: 1 } index:
{a: [1, 2], b: [1, 2]}
If you attempt to insert such a document, MongoDB will reject the insertion, and produce an error that says cannot
index parallel arrays. MongoDB does not index parallel arrays because they require the index to include
each value in the Cartesian product of the compound keys, which could quickly result in incredibly large and difficult
to maintain indexes.
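The growth described above is easy to quantify. The sketch below uses a hypothetical compoundKeys helper to count the keys that an index on { a: 1, b: 1 } would need if both fields were allowed to hold arrays; it illustrates the arithmetic only and is not how MongoDB builds index keys.

```javascript
// Hypothetical helper: one key per combination of a value and b value.
function compoundKeys(doc) {
  var aValues = Array.isArray(doc.a) ? doc.a : [doc.a];
  var bValues = Array.isArray(doc.b) ? doc.b : [doc.b];
  var keys = [];
  aValues.forEach(function (a) {
    bValues.forEach(function (b) {
      keys.push([a, b]);
    });
  });
  return keys;
}

// One array field: the key count equals the array length.
var oneArray = compoundKeys({ a: [1, 2], b: 1 });

// Two array fields: the key count is the product of the array lengths.
var parallel = compoundKeys({ a: [1, 2], b: [1, 2] });
// oneArray.length is 2; parallel.length is 4
```

With two arrays of 100 elements each, the same rule would require 10,000 keys for a single document.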
Shard Keys
Important: The index of a shard key cannot be a multi-key index.
Hashed Indexes hashed indexes are not compatible with multi-key indexes.
To compute the hash for a hashed index, MongoDB collapses sub-documents and computes the hash for the entire
value. For fields that hold arrays or sub-documents, you cannot use the index to support queries that introspect the
sub-document.
Examples
Index Basic Arrays Given the following document:
{
"_id" : ObjectId("..."),
"name" : "Warm Weather",
"author" : "Steve",
"tags" : [ "weather", "hot", "record", "april" ]
}
"weather",
"hot",
"record", and
"april".
Queries could use the multikey index to return documents that match any of the above values.
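As a rough picture of the separate entries a multikey index holds for the tags array, the sketch below builds one entry per array element with a hypothetical multikeyEntries helper. The numeric _id is a placeholder, and an array of entries stands in for the real index structure.

```javascript
// Hypothetical helper: one index entry per element of the indexed array.
function multikeyEntries(doc, field) {
  var value = doc[field];
  var elements = Array.isArray(value) ? value : [value];
  return elements.map(function (element) {
    return { key: element, docId: doc._id };
  });
}

var post = {
  _id: 1,                                // placeholder identifier
  name: "Warm Weather",
  author: "Steve",
  tags: ["weather", "hot", "record", "april"]
};

var entries = multikeyEntries(post, "tags");

// Looking up a single tag value finds the document through its entry.
var hotMatches = entries
  .filter(function (entry) { return entry.key === "hot"; })
  .map(function (entry) { return entry.docId; });
// entries has four items; hotMatches is [1]
```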
Index Arrays with Embedded Documents You can create multikey indexes on fields in objects embedded in arrays,
as in the following example:
Consider a feedback collection with documents in the following form:
{
"_id": ObjectId(...),
"title": "Grocery Quality",
"comments": [
{ author_id: ObjectId(...),
date: Date(...),
text: "Please expand the cheddar selection." },
{ author_id: ObjectId(...),
date: Date(...),
text: "Please expand the mustard selection." },
{ author_id: ObjectId(...),
date: Date(...),
text: "Please expand the olive selection." }
]
}
An index on the comments.text field would be a multikey index and would add items to the index for all embedded
documents in the array.
With the index { "comments.text":
The query would select the documents in the collection that contain the following embedded document in the
comments array:
{ author_id: ObjectId(...),
date: Date(...),
text: "Please expand the olive selection." }
MongoDB offers a number of indexes and query mechanisms to handle geospatial information. This section introduces
MongoDB's geospatial features. For complete examples of geospatial queries in MongoDB, see Geospatial Index
Tutorials (page 350).
Surfaces Before storing your location data and writing queries, you must decide the type of surface to use to perform
calculations. The type you choose affects how you store data, what type of index to build, and the syntax of your
queries.
MongoDB offers two surface types:
Spherical To calculate geometry over an Earth-like sphere, store your location data on a spherical surface and use a
2dsphere (page 328) index.
Store your location data as GeoJSON objects with this coordinate-axis order: longitude, latitude. The coordinate
reference system for GeoJSON uses the WGS84 datum.
Flat To calculate distances on a Euclidean plane, store your location data as legacy coordinate pairs and use a 2d
(page 329) index.
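The two storage forms differ only in shape; both list longitude before latitude. The sketch below shows the same location in each form. The geoJSONPoint helper is hypothetical and the coordinates are approximate values chosen for illustration.

```javascript
// Hypothetical helper: builds a GeoJSON Point, longitude first.
function geoJSONPoint(longitude, latitude) {
  return { type: "Point", coordinates: [longitude, latitude] };
}

// For spherical calculations: a GeoJSON object.
var place = geoJSONPoint(-73.97, 40.77);

// For flat calculations: the same location as a legacy coordinate pair.
var legacyPair = [-73.97, 40.77];
// place is { type: "Point", coordinates: [ -73.97, 40.77 ] }
```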
Location Data If you choose spherical surface calculations, you store location data as either:
GeoJSON Objects Queries on GeoJSON objects always calculate on a sphere. The default coordinate reference
system for GeoJSON uses the WGS84 datum.
New in version 2.4: Support for GeoJSON storage and queries is new in version 2.4. Prior to version 2.4, all geospatial
data used coordinate pairs.
MongoDB supports the following GeoJSON objects:
Point
LineString
Polygon
Legacy Coordinate Pairs MongoDB supports spherical surface calculations on legacy coordinate pairs by converting the data to the GeoJSON Point type.
If you choose flat surface calculations, you can store data only as legacy coordinate pairs.
Query Operations MongoDB's geospatial query operators let you query for:
Inclusion MongoDB can query for locations contained entirely within a specified polygon. Inclusion queries use
the $geoWithin operator.
Intersection MongoDB can query for locations that intersect with a specified geometry. These queries apply only
to data on a spherical surface. These queries use the $geoIntersects operator.
Proximity MongoDB can query for the points nearest to another point. Proximity queries use the $near operator.
The $near operator requires a 2d or 2dsphere index.
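The three kinds of query can be sketched as plain query documents. This is a minimal illustration, not a complete reference: the loc field name, the places collection, and the sample coordinates are assumptions made for this example, and the $geometry forms shown assume GeoJSON data with a 2dsphere index.

```javascript
// Hypothetical GeoJSON shapes used only for illustration.
var polygon = {
  type: "Polygon",
  coordinates: [ [ [ 0, 0 ], [ 3, 6 ], [ 6, 1 ], [ 0, 0 ] ] ]
};
var point = { type: "Point", coordinates: [ 3, 3 ] };

// Inclusion: documents whose loc lies entirely within the polygon.
var inclusionQuery = { loc: { $geoWithin: { $geometry: polygon } } };

// Intersection: documents whose loc intersects the polygon.
var intersectionQuery = { loc: { $geoIntersects: { $geometry: polygon } } };

// Proximity: documents nearest to the point.
var proximityQuery = { loc: { $near: { $geometry: point } } };

// In the mongo shell, each document would be passed to find(), e.g.:
//   db.places.find( inclusionQuery )
```

Building the query as a separate document keeps the geometry reusable across the three operators.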
Geospatial Indexes
MongoDB provides the following geospatial index types to support the geospatial queries.
2dsphere 2dsphere (page 328) indexes support:
Calculations on a sphere
Both GeoJSON objects and legacy coordinate pairs
A compound index with scalar index fields (i.e. ascending or descending) as a prefix or suffix of the 2dsphere
index field
New in version 2.4: 2dsphere indexes are not available before version 2.4.
See also:
Query a 2dsphere Index (page 350)
2d 2d (page 329) indexes support:
Calculations using flat geometry
Legacy coordinate pairs (i.e., geospatial points on a flat coordinate system)
A compound index with only one additional field, as a suffix of the 2d index field
See also:
Query a 2d Index (page 353)
Geospatial Indexes and Sharding
You cannot use a geospatial index as the shard key index.
You can create and maintain a geospatial index on a sharded collection by using a different field as the shard key.
Queries using $near are not supported for sharded collections. Use geoNear instead. You also can query for
geospatial data using $geoWithin.
Additional Resources
The following pages provide complete documentation for geospatial indexes and queries:
2dsphere Indexes (page 328) A 2dsphere index supports queries that calculate geometries on an earth-like sphere.
The index supports data stored as both GeoJSON objects and as legacy coordinate pairs.
2d Indexes (page 329) The 2d index supports data stored as legacy coordinate pairs and is intended for use in MongoDB 2.2 and earlier.
Haystack Indexes (page 330) A haystack index is a special index optimized to return results over small areas. For
queries that use spherical geometry, a 2dsphere index is a better option than a haystack index.
2d Index Internals (page 331) Provides a more in-depth explanation of the internals of geospatial indexes. This material is not necessary for normal operations but may be useful for troubleshooting and for further understanding.
2dsphere Indexes
New in version 2.4.
A 2dsphere index supports queries that calculate geometries on an earth-like sphere. The index supports data stored
as both GeoJSON objects and as legacy coordinate pairs. The index supports legacy coordinate pairs by converting
the data to the GeoJSON Point type.
The 2dsphere index supports all MongoDB geospatial queries: queries for inclusion, intersection and proximity.
A compound (page 322) 2dsphere index can reference multiple location and non-location fields within a collection's
documents. You can arrange the fields in any order.
The default datum for an earth-like sphere in MongoDB 2.4 is WGS84. Coordinate-axis order is longitude, latitude.
See http://docs.mongodb.org/manual/reference/operator/query-geospatial for the
query operators that support geospatial queries.
Considerations
MongoDB allows only one geospatial index per collection. You can create either a 2dsphere or
a 2d (page 329) index per collection.
You cannot use a 2dsphere index as a shard key when sharding a collection. However, you can create and maintain
a geospatial index on a sharded collection by using a different field as the shard key.
GeoJSON Objects
New in version 2.4.
MongoDB supports the following GeoJSON objects:
Point
LineString
Polygon
To index GeoJSON data, you must store the data in a location field that you name. The location field contains
a subdocument with a type field specifying the GeoJSON object type and a coordinates field specifying the
object's coordinates. Always store coordinates in longitude, latitude order.
Use the following syntax:
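The general shape is a location field holding a subdocument with a type field and a coordinates field. As a sketch, assuming a location field named loc and made-up coordinates, a Point and a LineString could be stored as follows. Each coordinate pair lists longitude first, then latitude.

```javascript
// Assumed field name "loc"; the coordinate values are illustrative only.
var pointDoc = {
  loc: { type: "Point", coordinates: [ 40, 5 ] }   // [ longitude, latitude ]
};

var lineDoc = {
  loc: { type: "LineString", coordinates: [ [ 40, 5 ], [ 41, 6 ] ] }
};
```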
Polygons consist of an array of GeoJSON LinearRing coordinate arrays. These LinearRings are closed
LineStrings. Closed LineStrings have at least four coordinate pairs and specify the same position as the
first and last coordinates.
The following example stores a GeoJSON Polygon with an exterior ring and no interior rings (or holes). Note the
first and last coordinate pair with the [ 0 , 0 ] coordinate:
{ loc :
{ type : "Polygon" ,
coordinates : [ [ [ 0 , 0 ] , [ 3 , 6 ] , [ 6 , 1 ] , [ 0 , 0 ] ] ]
} }
2d Indexes
Use a 2d index for data stored as points on a two-dimensional plane. The 2d index is intended for
legacy coordinate pairs used in MongoDB 2.2 and earlier.
Use a 2d index if:
your database has legacy location data from MongoDB 2.2 or earlier, and
you do not intend to store any location data as GeoJSON objects.
See http://docs.mongodb.org/manual/reference/operator/query-geospatial for the
query operators that support geospatial queries.
Considerations
MongoDB allows only one geospatial index per collection. You can create either a 2d or a 2dsphere
(page 328) index per collection.
Arrays are preferred as certain languages do not guarantee associative map ordering.
For all points, if you use longitude and latitude, store coordinates in longitude, latitude order.
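For illustration, the same legacy coordinate pair can be written in either form. The loc field name and the values are assumptions for this sketch; lng and lat follow the field names used in the multi-location example in this chapter.

```javascript
// Array form: [ longitude, latitude ]. Element order is always preserved.
var asArray = { loc: [ 55.5, 42.3 ] };

// Embedded document form: relies on the field order surviving serialization.
var asDocument = { loc: { lng: 55.5, lat: 42.3 } };
```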
Haystack Indexes
A haystack index is a special index that is optimized to return results over small areas. Haystack
indexes improve performance on queries that use flat geometry.
For queries that use spherical geometry, a 2dsphere index is a better option than a haystack index. 2dsphere indexes
(page 328) allow field reordering; haystack indexes require the first field to be the location field. Also, haystack indexes
are only usable via commands and so always return all results at once.
Haystack indexes create buckets of documents from the same geographic area in order to improve performance for
queries limited to that area. Each bucket in a haystack index contains all the documents within a specified proximity
to a given longitude and latitude.
To create a geoHaystack index, see Create a Haystack Index (page 355). For information and examples on querying a
haystack index, see Query a Haystack Index (page 356).
2d Index Internals
This document provides a more in-depth explanation of the internals of MongoDB's 2d geospatial indexes. This material is not necessary for normal operations or application development but may be useful for
troubleshooting and for further understanding.
Calculation of Geohash Values for 2d Indexes
When you create a geospatial index on legacy coordinate pairs,
MongoDB computes geohash values for the coordinate pairs within the specified location range (page 353) and then
indexes the geohash values.
To calculate a geohash value, recursively divide a two-dimensional map into quadrants. Then assign each quadrant a
two-bit value. For example, a two-bit representation of four quadrants would be:
01  11
00  10
These two-bit values (00, 01, 10, and 11) represent each of the quadrants and all points within each quadrant. For
a geohash with two bits of resolution, all points in the bottom left quadrant would have a geohash of 00. The top
left quadrant would have the geohash of 01. The bottom right and top right would have a geohash of 10 and 11,
respectively.
To provide additional precision, continue dividing each quadrant into sub-quadrants. Each sub-quadrant would have
the geohash value of the containing quadrant concatenated with the value of the sub-quadrant. The geohash for the
upper-right quadrant is 11, and the geohash for the sub-quadrants would be (clockwise from the top left): 1101,
1111, 1110, and 1100, respectively.
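The quadrant scheme above can be sketched as a short function. This is a toy model that follows the quadrant numbering described in this section, not MongoDB's internal implementation. The function name and the default bounds of -180 and 180 are assumptions for the example.

```javascript
// Toy geohash: append one x bit and one y bit per level of subdivision.
// min and max bound the square map; -180 and 180 are assumed defaults.
function geohash(x, y, levels, min, max) {
  var xLo = (min === undefined) ? -180 : min;
  var xHi = (max === undefined) ? 180 : max;
  var yLo = xLo, yHi = xHi;
  var bits = "";
  for (var i = 0; i < levels; i++) {
    var xMid = (xLo + xHi) / 2;
    var yMid = (yLo + yHi) / 2;
    // First bit of the pair: 0 = left half, 1 = right half.
    if (x >= xMid) { bits += "1"; xLo = xMid; } else { bits += "0"; xHi = xMid; }
    // Second bit of the pair: 0 = bottom half, 1 = top half.
    if (y >= yMid) { bits += "1"; yLo = yMid; } else { bits += "0"; yHi = yMid; }
  }
  return bits;
}

var bottomLeft = geohash(-90, -90, 1);   // "00"
var subQuadrant = geohash(45, 135, 2);   // "1101": top-left part of the upper-right quadrant
```

Each extra level adds two bits, so points that share a longer prefix lie in the same smaller sub-quadrant.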
Multi-location Documents for 2d Indexes
New in version 2.0: Support for multiple locations in a document.
While 2d geospatial indexes do not support more than one set of coordinates in a document, you can use a multi-key
index (page 324) to index multiple coordinate pairs in a single document. In the simplest example you may have a
field (e.g. locs) that holds an array of coordinates, as in the following example:
{ _id : ObjectId(...),
locs : [ [ 55.5 , 42.3 ] ,
[ -74 , 44.74 ] ,
{ lng : 55.5 , lat : 42.3 } ]
}
The values of the array may be either arrays, as in [ 55.5, 42.3 ], or embedded documents, as in { lng :
55.5 , lat : 42.3 }.
You could then create a geospatial index on the locs field, as in the following:
db.places.ensureIndex( { "locs": "2d" } )
You may also model the location data as a field inside of a sub-document. In this case, the document would contain
a field (e.g. addresses) that holds an array of documents where each document has a field (e.g. loc) that holds
location coordinates. For example:
{ _id : ObjectId(...),
name : "...",
addresses : [ {
context : "home" ,
loc : [ 55.5, 42.3 ]
} ,
{
context : "home",
loc : [ -74 , 44.74 ]
}
]
}
You could then create the geospatial index on the addresses.loc field as in the following example:
db.records.ensureIndex( { "addresses.loc": "2d" } )
For documents with multiple coordinate values, queries may return the same document multiple times if more than
one indexed coordinate pair satisfies the query constraints. To return each document only once, use the uniqueDocs
parameter to geoNear or the $uniqueDocs operator with $geoWithin.
To include the location field with the distance field in multi-location document queries, specify includeLocs:
true in the geoNear command.
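A sketch of a geoNear command document that combines both options. The records collection name and the query point are assumptions for illustration.

```javascript
var geoNearCommand = {
  geoNear: "records",     // collection to search (assumed name)
  near: [ -74, 44.74 ],   // legacy coordinate pair to search around
  uniqueDocs: true,       // return each matching document only once
  includeLocs: true       // also return the location that matched
};

// In the mongo shell, the document would be passed to runCommand:
//   db.runCommand( geoNearCommand )
```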
See also:
geospatial-query-compatibility-chart
Text Indexes
Create Text Index
To create a text index, use the db.collection.ensureIndex() method. To index a
field that contains a string or an array of string elements, include the field and specify the string literal "text" in the
index document, as in the following example:
db.reviews.ensureIndex( { comments: "text" } )
For examples of creating text indexes on multiple fields, see Create a text Index (page 359).
text indexes drop language-specific stop words (e.g. in English, the, an, a, and, etc.) and use simple
language-specific suffix stemming. See text-search-languages for the supported languages and Specify a Language for
Text Index (page 363) for details on specifying languages with text indexes.
text indexes can satisfy the filter component of a text search. For details, see Create text Index to Satisfy the
filter Component of Text Search (page 367).
Storage Requirements and Performance Costs
text indexes have the following storage requirements and performance costs:
text indexes change the space allocation method for all future record allocations in a collection to
usePowerOf2Sizes.
text indexes can be large. They contain one index entry for each unique post-stemmed word in each indexed
field for each document inserted.
Building a text index is very similar to building a large multi-key index and will take longer than building a
simple ordered (scalar) index on the same data.
When building a large text index on an existing collection, ensure that you have a sufficiently high limit on
open file descriptors. See the recommended settings (page 223).
text indexes will impact insertion throughput because MongoDB must add an index entry for each unique
post-stemmed word in each indexed field of each new source document.
Additionally, text indexes do not store phrases or information about the proximity of words in the documents.
As a result, phrase queries will run much more effectively when the entire collection fits in RAM.
Text Search
Text search supports the search of string content in documents of a collection. MongoDB provides the
text command to perform the text search. The text command accesses the text index.
The text search process:
tokenizes and stems the search term(s) during both the index creation and the text command execution.
assigns a score to each document that contains the search term in the indexed fields. The score determines the
relevance of a document to a given search query.
By default, the text command returns at most the top 100 matching documents as determined by the scores. The
command can search for words and phrases. The command matches on the complete stemmed words. For example, if
a document field contains the word blueberry, a search on the term blue will not match the document. However,
a search on either blueberry or blueberries will match.
For information and examples on various text search patterns, see Search String Content for Text (page 360).
Hashed Index
This operation creates a hashed index for the active collection on the a field.
Examples
Sparse Index On A Collection Can Result In Incomplete Results
Consider a collection scores that contains the following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
If the collection has a sparse index on the score field, then the following query, which returns all documents in the scores collection sorted by the score field, gives incomplete results:
db.scores.find().sort( { score: -1 } )
Because the document for the userid "newbie" does not contain the score field, the query, which uses the sparse
index, will return incomplete results that omit that document:
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
Sparse Index with Unique Constraint
Consider a collection scores that contains the following documents:
{ "_id" : ObjectId("523b6e32fb408eea0eec2647"), "userid" : "newbie" }
{ "_id" : ObjectId("523b6e61fb408eea0eec2648"), "userid" : "abby", "score" : 82 }
{ "_id" : ObjectId("523b6e6ffb408eea0eec2649"), "userid" : "nina", "score" : 90 }
You could create an index with a unique constraint (page 334) and sparse filter on the score field using the following
operation:
db.scores.ensureIndex( { score: 1 } , { sparse: true, unique: true } )
This index would permit inserting documents that had unique values for the score field or did not include a score
field. Consider the following insert operations (page 59):
db.scores.insert( { "userid": "PWWfO8lFs1", "score": 43 } )
db.scores.insert( { "userid": "XlSOX66gEy", "score": 34 } )
db.scores.insert( { "userid": "nuZHu2tcRm" } )
db.scores.insert( { "userid": "HIGvEZfdc5" } )
However, this index would not permit adding the following documents:
db.scores.insert( { "userid": "PWWfO8lFs1", "score": 82 } )
db.scores.insert( { "userid": "XlSOX66gEy", "score": 90 } )
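The accept and reject behavior described above can be modeled in a few lines of plain JavaScript. This is only a toy model of a sparse, unique index on score, not how MongoDB implements indexes. The function name makeSparseUniqueIndex is invented for this sketch.

```javascript
// Toy model: a sparse index skips documents that lack the field,
// and a unique index rejects a value that is already present.
function makeSparseUniqueIndex(field) {
  var seen = {};
  var has = Object.prototype.hasOwnProperty;
  return {
    insert: function (doc) {
      if (!has.call(doc, field)) {
        return true;                // sparse: document is not indexed at all
      }
      var key = String(doc[field]);
      if (has.call(seen, key)) {
        return false;               // unique: duplicate value is rejected
      }
      seen[key] = true;
      return true;
    }
  };
}

var index = makeSparseUniqueIndex("score");
[ { userid: "newbie" },
  { userid: "abby", score: 82 },
  { userid: "nina", score: 90 } ].forEach(function (doc) { index.insert(doc); });

var accepted = index.insert({ userid: "PWWfO8lFs1", score: 43 });  // new value
var noField  = index.insert({ userid: "nuZHu2tcRm" });             // no score field
var rejected = index.insert({ userid: "PWWfO8lFs1", score: 82 });  // 82 already used
```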
Background Construction
By default, creating an index blocks all other operations on a database. When building an index on a collection, the
database that holds the collection is unavailable for read or write operations until the index build completes. Any
operation that requires a read or write lock on all databases (e.g. listDatabases) will wait for the foreground index
build to complete.
For potentially long-running index builds, consider building the index in the background so that the MongoDB
database remains available during the build. For example, to create an index in the background on
the zipcode field of the people collection, issue the following:
db.people.ensureIndex( { zipcode: 1}, {background: true} )
Behavior
As of MongoDB version 2.4, a mongod instance can build more than one index in the background concurrently.
Changed in version 2.4: Before 2.4, a mongod instance could only build one background index per database at a time.
Changed in version 2.2: Before 2.2, a single mongod instance could only build one index at a time.
Background indexing operations run in the background so that other database operations can run while creating the
index. However, the mongo shell session or connection where you are creating the index will block until the index
build is complete. To continue issuing commands to the database, open another connection or mongo instance.
Queries will not use partially-built indexes: the index will only be usable once the index build is complete.
Note:
If MongoDB is building an index in the background, you cannot perform other administrative operations involving that collection, including running repairDatabase, dropping the collection (i.e.
db.collection.drop()), and running compact. These operations will return an error during background
index builds.
Performance
The background index operation uses an incremental approach that is slower than the normal foreground index
builds. If the index is larger than the available RAM, then the incremental process can take much longer than the
foreground build.
If your application includes ensureIndex() operations and an index does not already exist, for whatever operational reason,
building the index can have a severe impact on the performance of the database.
To avoid performance issues, make sure that your application checks for the indexes at start up using the
getIndexes() method or the equivalent method for your driver4 and terminates if the proper indexes do not exist. Always build indexes in production instances using separate application code, during designated maintenance
windows.
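A minimal sketch of such a startup check, assuming the application receives index documents shaped like the output of getIndexes() (each with a key field). The helper name hasIndex and the sample data are hypothetical.

```javascript
// indexes: array of index documents, as returned by getIndexes().
// keySpec: the key pattern the application requires, e.g. { zipcode: 1 }.
function hasIndex(indexes, keySpec) {
  var wanted = JSON.stringify(keySpec);
  return indexes.some(function (ix) {
    return JSON.stringify(ix.key) === wanted;
  });
}

// Sample data shaped like getIndexes() output (values are made up).
var sampleIndexes = [
  { v: 1, key: { _id: 1 }, ns: "test.people", name: "_id_" },
  { v: 1, key: { zipcode: 1 }, ns: "test.people", name: "zipcode_1" }
];

var ok = hasIndex(sampleIndexes, { zipcode: 1 });        // true
var missing = hasIndex(sampleIndexes, { lastname: 1 });  // false
```

In the mongo shell the first argument would be db.people.getIndexes(), and the application would exit when the check returns false. The comparison is by serialized key pattern, so field order matters for compound indexes, and a key stored as NumberLong(1) may not compare equal to 1.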
Building Indexes on Secondaries
Background index operations on a replica set primary become foreground indexing operations on secondary members
of the set. All indexing operations on secondaries block replication.
4 http://api.mongodb.org/
To build large indexes on secondaries the best approach is to restart one secondary at a time in standalone mode and
build the index. After building the index, restart as a member of the replica set, allow it to catch up with the other
members of the set, and then build the index on the next secondary. When all the secondaries have the new index, step
down the primary, restart it as a standalone, and build the index on the former primary.
Remember, the amount of time required to build the index on a secondary must be within the window of the oplog, so
that the secondary can catch up with the primary.
Indexes on secondary members in recovering mode are always built in the foreground to allow them to catch up as
soon as possible.
See Build Indexes on Replica Sets (page 344) for a complete procedure for building indexes on secondaries.
Drop Duplicates
MongoDB cannot create a unique index (page 334) on a field that has duplicate values. To force the creation of a
unique index, you can specify the dropDups option, which will only index the first occurrence of a value for the key,
and delete all subsequent values.
Important: As in all unique indexes, if a document does not have the indexed field, MongoDB will include it in the
index with a null value.
If subsequent documents do not have the indexed field, and you have set {dropDups: true}, MongoDB will remove
these documents from the collection when creating the index. If you combine dropDups with the sparse (page 335)
option, the index will only include documents that have the indexed field, and the documents without the field
will remain in the database.
To create a unique index that drops duplicates on the username field of the accounts collection, use a command
in the following form:
db.accounts.ensureIndex( { username: 1 }, { unique: true, dropDups: true } )
Warning: Specifying { dropDups: true } will delete data from your database. Use with extreme caution.
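To make the effect of dropDups concrete, the following toy model keeps the first document seen for each key value and drops the rest, treating a missing field as null. It is a sketch of the behavior described in this section, not MongoDB's implementation. The function name and sample documents are invented.

```javascript
// Toy model of { unique: true, dropDups: true }.
function dropDuplicates(docs, field) {
  var seen = {};
  var has = Object.prototype.hasOwnProperty;
  return docs.filter(function (doc) {
    var value = has.call(doc, field) ? doc[field] : null;  // missing field counts as null
    var key = JSON.stringify(value);
    if (has.call(seen, key)) {
      return false;   // later occurrence: deleted from the collection
    }
    seen[key] = true;
    return true;      // first occurrence: kept and indexed
  });
}

var accounts = [
  { _id: 1, username: "ann" },
  { _id: 2, username: "bob" },
  { _id: 3, username: "ann" },  // duplicate of _id 1
  { _id: 4 },                   // no username: indexed as null
  { _id: 5 }                    // second null: treated as a duplicate
];

var kept = dropDuplicates(accounts, "username");
var keptIds = kept.map(function (d) { return d._id; });  // [ 1, 2, 4 ]
```

In this model two of five documents are lost, including one that merely lacked the field, which is why the option calls for caution.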
Create an Index
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the
documents in a collection. MongoDB creates an index on the _id field of every collection by default, but allows users
to create indexes for any collection on any field in a document.
This tutorial describes how to create an index on a single field. MongoDB also supports compound indexes (page 322),
which are indexes on multiple fields. See Create a Compound Index (page 341) for instructions on building compound
indexes.
Create an Index on a Single Field
To create an index, use ensureIndex() or a similar method from your driver5. For example, the following creates
an index on the phone-number field of the people collection:
db.people.ensureIndex( { "phone-number": 1 } )
ensureIndex() only creates an index if an index of the same specification does not already exist.
Indexes support and optimize the performance of queries that select on the indexed field. For queries that cannot use an
index, MongoDB must scan all documents in a collection for documents that match the query.
Tip
The value of the field in the index specification describes the kind of index for that field. For example, a value of 1
specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending
order.
Examples
If you create an index on the user_id field in the records collection, the index will support the following
query:
db.records.find( { user_id: 2 } )
However, the following query on the profile_url field is not supported by this index:
db.records.find( { profile_url: 2 } )
Additional Considerations
If your collection holds a large amount of data, and your application needs to be able to access the data while building
the index, consider building the index in the background, as described in Background Construction (page 336). To
build indexes on replica sets, see the Build Indexes on Replica Sets (page 344) section for more information.
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 344).
Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any
effect on the resulting index.
See also:
Create a Compound Index (page 341), Indexing Tutorials (page 339) and Index Concepts (page 318) for more information.
5 http://api.mongodb.org/
Create a Compound Index
Indexes allow MongoDB to process and fulfill queries quickly by creating small and efficient representations of the
documents in a collection. MongoDB supports indexes that include content on a single field, as well as compound
indexes (page 322) that include content from multiple fields. Continue reading for instructions and examples of
building a compound index.
Build a Compound Index
To create a compound index (page 322) use an operation that resembles the following prototype:
db.collection.ensureIndex( { a: 1, b: 1, c: 1 } )
Example
The following operation will create an index on the item, category, and price fields of the products collection:
db.products.ensureIndex( { item: 1, category: 1, price: 1 } )
Additional Considerations
If your collection holds a large amount of data, and your application needs to be able to access the data while building
the index, consider building the index in the background, as described in Background Construction (page 336). To
build indexes on replica sets, see the Build Indexes on Replica Sets (page 344) section for more information.
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 344).
Some drivers may specify indexes using NumberLong(1) rather than 1 as the specification. This does not have any
effect on the resulting index.
Tip
The value of the field in the index specification describes the kind of index for that field. For example, a value of 1
specifies an index that orders items in ascending order. A value of -1 specifies an index that orders items in descending
order.
See also:
Create an Index (page 340), Indexing Tutorials (page 339) and Index Concepts (page 318) for more information.
Create a Unique Index
MongoDB allows you to specify a unique constraint (page 334) on an index. These constraints prevent applications
from inserting documents that have duplicate values for the indexed fields. Additionally, if you want to create an index
on a collection that has existing data that might have duplicate values for the indexed field, you may choose to combine
unique enforcement with duplicate dropping (page 338).
Unique Indexes
For example, you may want to create a unique index on the "tax-id" field of the accounts collection to prevent
storing multiple account records for the same legal entity:
db.accounts.ensureIndex( { "tax-id": 1 }, { unique: true } )
The _id index (page 321) is a unique index. In some situations you may consider using the _id field itself for this kind of
data rather than using a unique index on another field.
In many situations you will want to combine the unique constraint with the sparse option. When MongoDB
indexes a field, if a document does not have a value for a field, the index entry for that item will be null. Since
unique indexes cannot have duplicate values for a field, without the sparse option, MongoDB will reject the second
document and all subsequent documents without the indexed field. Consider the following prototype.
db.collection.ensureIndex( { a: 1 }, { unique: true, sparse: true } )
You can also enforce a unique constraint on compound indexes (page 322), as in the following prototype:
db.collection.ensureIndex( { a: 1, b: 1 }, { unique: true } )
These indexes enforce uniqueness for the combination of index keys and not for either key individually.
Drop Duplicates
To force the creation of a unique index (page 334) on a collection with duplicate values in the field you are
indexing, you can use the dropDups option. This will force MongoDB to create a unique index by deleting documents
with duplicate values when building the index. Consider the following prototype invocation of ensureIndex():
db.collection.ensureIndex( { a: 1 }, { unique: true, dropDups: true } )
See the full documentation of duplicate dropping (page 338) for more information.
Warning: Specifying { dropDups: true } may delete data from your database. Use with extreme caution.
To create a sparse index (page 335) on a field, use an operation that resembles the following prototype:
Example
The following operation creates a sparse index on the users collection that only includes a document in the index if
the twitter_name field exists in that document.
db.users.ensureIndex( { twitter_name: 1 }, { sparse: true } )
The index excludes all documents that do not include the twitter_name field.
Considerations
Note: Sparse indexes can affect the results returned by the query, particularly with respect to sorts on fields not
included in the index. See the sparse index (page 335) section for more information.
Procedure
To create a hashed index (page 333), specify hashed as the value of the index key, as in the following example:
Example
Specify a hashed index on _id
db.collection.ensureIndex( { _id: "hashed" } )
Considerations
MongoDB supports hashed indexes of any single field. The hashing function collapses sub-documents and computes
the hash for the entire value, but does not support multi-key (i.e. arrays) indexes.
You may not create compound indexes that have hashed index fields.
Considerations
Ensure that your oplog is large enough to permit the indexing or re-indexing operation to complete without
falling too far behind to catch up. See the oplog sizing (page 411) documentation for additional information.
This procedure does take one member out of the replica set at a time. However, this procedure will only affect
one member of the set at a time rather than all secondaries at the same time.
Do not use this procedure when building a unique index (page 334) with the dropDups option.
Procedure
Note: If you need to build an index in a sharded cluster, repeat the following procedure for each replica set that
provides each shard.
Stop One Secondary
Stop the mongod process on one secondary. Restart the mongod process without the
--replSet option and on a different port. 6 This instance is now in standalone mode.
For example, if your mongod normally runs on the default port of 27017 with the --replSet option, you
would use the following invocation:
mongod --port 47017
Build the Index
Create the new index using the ensureIndex() method in the mongo shell, or a comparable method in
your driver. This operation will create or rebuild the index on this mongod instance.
For example, to create an ascending index on the username field of the records collection, use the following
mongo shell operation:
db.records.ensureIndex( { username: 1 } )
See also:
Create an Index (page 340) and Create a Compound Index (page 341) for more information.
6 By running the mongod on a different port, you ensure that the other members of the replica set and all clients will not contact the member
while you are building the index.
Restart the Program mongod
When the index build completes, start the mongod instance with the --replSet
option on its usual port:
mongod --port 27017 --replSet rs0
Modify the port number (e.g. 27017) or the replica set name (e.g. rs0) as needed.
Allow replication to catch up on this member.
Build Indexes on all Secondaries
For each secondary in the set, build an index according to the following steps:
1. Stop One Secondary (page 344)
2. Build the Index (page 344)
3. Restart the Program mongod (page 345)
Build the Index on the Primary
To build an index on the primary you can either:
1. Build the index in the background (page 345) on the primary.
2. Step down the primary using the rs.stepDown() method in the mongo shell to cause the current primary to
gracefully become a secondary and allow the set to elect another member as primary.
Then repeat the index building procedure, listed below, to build the index on the former primary:
(a) Stop One Secondary (page 344)
(b) Build the Index (page 344)
(c) Restart the Program mongod (page 345)
Building the index in the background takes longer than the foreground index build and results in a less compact index
structure. Additionally, the background index build may impact write performance on the primary. However, building
the index in the background allows the set to remain continuously available for write operations while MongoDB builds
the index.
Build Indexes in the Background
By default, MongoDB builds indexes in the foreground and prevents all read and write operations to the database while
the index builds. Also, no operation that requires a read or write lock on all databases (e.g. listDatabases) can occur
during a foreground index build.
Background index construction (page 336) allows read and write operations to continue while building the index.
See also:
Index Concepts (page 318) and Indexing Tutorials (page 339) for more information.
Considerations
Background index builds take longer to complete and result in an index that is initially larger, or less compact, than an
index built in the foreground. Over time, the compactness of indexes built in the background will approach that of foreground-built indexes.
After MongoDB finishes building the index, background-built indexes are functionally identical to any other index.
Procedure
To create an index in the background, add the background argument to the ensureIndex() operation, as in the
following example:
db.collection.ensureIndex( { a: 1 }, { background: true } )
Consider the section on background index construction (page 336) for more information about these indexes and their
implications.
Build Old Style Indexes
Important: Use this procedure only if you must have indexes that are compatible with a version of MongoDB earlier
than 2.0.
MongoDB version 2.0 introduced the {v:1} index format. MongoDB versions 2.0 and later support both the {v:1}
format and the earlier {v:0} format.
MongoDB versions prior to 2.0, however, support only the {v:0} format. If you need to roll back MongoDB to a
version prior to 2.0, you must drop and re-create your indexes.
To build pre-2.0 indexes, use the dropIndexes() and ensureIndex() methods. You cannot simply reindex the
collection. When you reindex on versions that only support {v:0} indexes, the v fields in the index definition still
hold values of 1, even though the indexes would now use the {v:0} format. If you were to upgrade again to version
2.0 or later, these indexes would not work.
Example
Suppose you rolled back from MongoDB 2.0 to MongoDB 1.8, and suppose you had the following index on the
items collection:
{ "v" : 1, "key" : { "name" : 1 }, "ns" : "mydb.items", "name" : "name_1" }
The v field tells you the index is a {v:1} index, which is incompatible with version 1.8.
To drop the index, issue the following command:
db.items.dropIndex( { name : 1 } )
See also:
Index Performance Enhancements (page 620).
Return a List of All Indexes (page 348) Obtain a list of all indexes on a collection or of all indexes on all collections
in a database.
Measure Index Use (page 349) Study query operations and observe index use for your database.
Remove Indexes
To remove an index from a collection, use the dropIndex() method and the following procedure. If you simply
need to rebuild indexes, you can use the process described in the Rebuild Indexes (page 347) document.
See also:
Indexing Tutorials (page 339) and Index Concepts (page 318) for more information about indexes and indexing operations in MongoDB.
Operations
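To remove an index, use the db.collection.dropIndex() method, as in the following example:
db.accounts.dropIndex( { "tax-id": 1 } )
This removes the index on the "tax-id" field in the accounts collection. The shell returns the following
document after completing the operation: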
{ "nIndexesWas" : 3, "ok" : 1 }
Here, the value of nIndexesWas reflects the number of indexes before removing this index. You can also use the
db.collection.dropIndexes() method to remove all indexes from a collection, except for the _id index (page 321).
These shell helpers provide wrappers around the dropIndexes database command. Your client library (page 95)
may have a different or additional interface for these operations.
Rebuild Indexes
If you need to rebuild indexes for a collection you can use the db.collection.reIndex() method to rebuild all
indexes on a collection in a single operation. This operation drops all indexes, including the _id index (page 321), and
then rebuilds all indexes.
See also:
Index Concepts (page 318) and Indexing Tutorials (page 339).
Process
The operation takes the following form:
db.accounts.reIndex()
MongoDB will return the following document when the operation completes:
{
    "nIndexesWas" : 2,
    "msg" : "indexes dropped for collection",
    "nIndexes" : 2,
    "indexes" : [
        {
            "key" : {
                "_id" : 1,
                "tax-id" : 1
            },
            "ns" : "records.accounts",
            "name" : "_id_"
        }
    ],
    "ok" : 1
}
This shell helper provides a wrapper around the reIndex database command. Your client library (page 95) may
have a different or additional interface for this operation.
Additional Considerations
Note: To build or rebuild indexes for a replica set see Build Indexes on Replica Sets (page 344).
To return a list of all indexes on a collection, use the db.collection.getIndexes() method or a similar
method for your driver (see http://api.mongodb.org/).
For example, to view all indexes on the people collection:
db.people.getIndexes()
To return a list of all indexes on all collections in a database, use the following operation in the mongo shell:
db.system.indexes.find()
See system.indexes (page 227) for more information about these documents.
Measure Index Use
Synopsis
Query performance is a good general indicator of index use; however, for more precise insight into index use, MongoDB provides a number of tools that allow you to study query operations and observe index use for your database.
See also:
Index Concepts (page 318) and Indexing Tutorials (page 339) for more information.
Operations
Return Query Plan with explain()
Append the explain() method to any cursor (e.g. query) to return a
document with statistics about the query process, including the index used, the number of documents scanned, and the
time the query takes to process in milliseconds.
Control Index Use with hint()
Append the hint() method to any cursor (e.g. query) with the index as the argument to
force MongoDB to use a specific index to fulfill the query. Consider the following example:
db.people.find( { name: "John Doe", zipcode: { $gt: 63000 } } ).hint( { zipcode: 1 } )
You can use hint() and explain() in conjunction with each other to compare the effectiveness of a specific
index. Specify the $natural operator to the hint() method to prevent MongoDB from using any index:
db.people.find( { name: "John Doe", zipcode: { $gt: 63000 } } ).hint( { $natural: 1 } )
Instance Index Use Reporting
MongoDB provides a number of metrics of index use and operation that you may
want to consider when analyzing index use for your database:
In the output of serverStatus:
    indexCounters
    scanned
    scanAndOrder
In the output of collStats:
    totalIndexSize
    indexSizes
In the output of dbStats:
    dbStats.indexes
    dbStats.indexSize
The following are four example commands for creating a 2dsphere index:
db.points.ensureIndex( { loc : "2dsphere" } )
db.points.ensureIndex( { loc : "2dsphere" , type : 1 } )
db.points.ensureIndex( { rating : 1 , loc : "2dsphere" } )
db.points.ensureIndex( { loc : "2dsphere" , rating : 1 , category : -1 } )
The first example creates a simple geospatial index on the location field loc. The second example creates a compound
index where the second field contains non-location data. The third example creates an index where the location field
is not the primary field: the location field does not have to be the first field in a 2dsphere index. The fourth example
creates a compound index with three fields. You can include as many fields as you like in a 2dsphere index.
Query a 2dsphere Index
The following sections describe queries supported by the 2dsphere index. For an overview of recommended geospatial queries, see geospatial-query-compatibility-chart.
GeoJSON Objects Bounded by a Polygon
The $geoWithin operator queries for location data found within a GeoJSON polygon. Your location data must be
stored in GeoJSON format. Use the following syntax:
db.<collection>.find( { <location field> :
                         { $geoWithin :
                           { $geometry :
                             { type : "Polygon" ,
                               coordinates : [ <coordinates> ]
                      } } } } )
The following example selects all points and shapes that exist entirely within a GeoJSON polygon:
db.places.find( { loc :
{ $geoWithin :
{ $geometry :
{ type : "Polygon" ,
coordinates : [ [
[ 0 , 0 ] ,
[ 3 , 6 ] ,
[ 6 , 1 ] ,
[ 0 , 0 ]
] ]
} } } } )
The following example uses $geoIntersects to select all indexed points and shapes that intersect with the polygon
defined by the coordinates array.
db.places.find( { loc :
{ $geoIntersects :
{ $geometry :
{ type : "Polygon" ,
coordinates: [ [
[ 0 , 0 ] ,
[ 3 , 6 ] ,
[ 6 , 1 ] ,
[ 0 , 0 ]
] ]
} } } } )
Proximity queries return the points closest to the defined point and sort the results by distance. A proximity query on
GeoJSON data requires a 2dsphere index.
To query for proximity to a GeoJSON point, use either the $near operator or geoNear command. Distance is in
meters.
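The $near operator uses the following syntax:
db.<collection>.find( { <location field> :
                         { $near :
                           { $geometry :
                             { type : "Point" ,
                               coordinates : [ <longitude> , <latitude> ] } ,
                             $maxDistance : <distance in meters>
                      } } } )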
The geoNear command offers more options and returns more information than does the $near operator. To run the
command, see geoNear.
Points within a Circle Defined on a Sphere
To select all grid coordinates in a spherical cap on a sphere, use $geoWithin with the $centerSphere operator.
Specify an array that contains:
The grid coordinates of the circle's center point
The circle's radius measured in radians. To calculate radians, see Calculate Distance Using Spherical Geometry
(page 356).
Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $centerSphere :
[ [ <x>, <y> ] , <radius> ] }
} } )
The following example queries grid coordinates and returns all documents within a 10 mile radius of longitude 88 W
and latitude 30 N. The example converts the distance, 10 miles, to radians by dividing by the approximate radius of
the earth, 3959 miles:
db.places.find( { loc :
{ $geoWithin :
{ $centerSphere :
[ [ -88 , 30 ] , 10 / 3959 ]
} } } )
Create a 2d Index
To build a geospatial 2d index, use the ensureIndex() method and specify 2d. Use the following syntax:
db.<collection>.ensureIndex( { <location field> : "2d" ,
<additional field> : <value> } ,
{ <index-specification options> } )
By default, a 2d index assumes longitude and latitude and has boundaries of -180 inclusive and 180 non-inclusive
(i.e. [ -180 , 180 )). If documents contain coordinate data outside of the specified range, MongoDB returns an
error.
Important: The default boundaries allow applications to insert documents with invalid latitudes greater than 90 or
less than -90. The behavior of geospatial queries with such invalid points is not defined.
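Because the default bounds accept latitudes outside the valid range, an application may want to check coordinate pairs before inserting them. The following is a minimal sketch in plain JavaScript. The function name isValidLngLat is hypothetical and is not part of MongoDB or the mongo shell.

```javascript
// Hypothetical helper (not a MongoDB API): checks a legacy
// [ longitude , latitude ] pair before the application inserts it.
function isValidLngLat(pair) {
  if (!Array.isArray(pair) || pair.length !== 2) return false;
  var lng = pair[0];
  var lat = pair[1];
  if (typeof lng !== "number" || typeof lat !== "number") return false;
  if (isNaN(lng) || isNaN(lat)) return false;
  // Default 2d index bounds: -180 inclusive, 180 non-inclusive.
  var lngOk = lng >= -180 && lng < 180;
  // The index would accept latitudes beyond 90 or -90,
  // so the application enforces the valid latitude range itself.
  var latOk = lat >= -90 && lat <= 90;
  return lngOk && latOk;
}
```

An application could call isValidLngLat( [ -74 , 40.74 ] ) before an insert and reject the document when the result is false.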
On 2d indexes you can change the location range.
You can build a 2d geospatial index with a location range other than the default. Use the min and max options when
creating the index. Use the following syntax:
db.collection.ensureIndex( { <location field> : "2d" } ,
{ min : <lower bound> , max : <upper bound> } )
By default, a 2d index on legacy coordinate pairs uses 26 bits of precision, which is roughly equivalent to 2 feet or 60
centimeters of precision using the default range of -180 to 180. Precision is measured by the size in bits of the geohash
values used to store location data. You can configure geospatial indexes with up to 32 bits of precision.
Index precision does not affect query accuracy. The actual grid coordinates are always used in the final query processing. Advantages to lower precision are a lower processing overhead for insert operations and use of less space. An
advantage to higher precision is that queries scan smaller portions of the index to return results.
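The relationship between bits of precision and cell size can be estimated with a short calculation. The sketch below assumes that each axis of the location range is divided into 2^bits cells and that one degree of longitude is about 111.32 km at the equator. Both are simplifying assumptions for illustration, and the function name approxCellSizeMeters is hypothetical.

```javascript
// Assumption: one degree of longitude is about 111,320 meters at the equator.
var METERS_PER_DEGREE = 111320;

// Hypothetical helper: estimates the width of one geohash cell in meters,
// assuming each axis of the range [min, max) is split into 2^bits cells.
function approxCellSizeMeters(bits, min, max) {
  var cellDegrees = (max - min) / Math.pow(2, bits);
  return cellDegrees * METERS_PER_DEGREE;
}

// Default precision over the default range: roughly 0.6 meters.
var defaultCell = approxCellSizeMeters(26, -180, 180);

// Maximum precision over the default range: under one centimeter.
var finestCell = approxCellSizeMeters(32, -180, 180);
```

Under these assumptions the default of 26 bits gives a cell of about 60 centimeters, which agrees with the figure stated above.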
To configure a location precision other than the default, use the bits option when creating the index. Use the following
syntax:
db.<collection>.ensureIndex( {<location field> : "<index type>"} ,
{ bits : <bit precision> } )
For information on the internals of geohash values, see Calculation of Geohash Values for 2d Indexes (page 331).
Query a 2d Index
The following sections describe queries supported by the 2d index. For an overview of recommended geospatial
queries, see geospatial-query-compatibility-chart.
Points within a Shape Defined on a Flat Surface
To select all legacy coordinate pairs found within a given shape on a flat surface, use the $geoWithin operator along
with a shape operator. Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $box|$polygon|$center : <coordinates>
} } } )
The following example queries for documents within a rectangle defined by [ 0 , 0 ] at the bottom left corner and by [
100 , 100 ] at the top right corner.
db.places.find( { loc :
{ $geoWithin :
{ $box : [ [ 0 , 0 ] ,
[ 100 , 100 ] ]
} } } )
The following example queries for documents that are within the circle centered on [ -74 , 40.74 ] and with a radius of
10:
db.places.find( { loc: { $geoWithin :
{ $center : [ [-74, 40.74 ] , 10 ]
} } } )
For syntax and examples for each shape, see the following:
$box
$polygon
$center (defines a circle)
Points within a Circle Defined on a Sphere
MongoDB supports rudimentary spherical queries on flat 2d indexes for legacy reasons. In general, spherical calculations should use a 2dsphere index, as described in 2dsphere Indexes (page 328).
To query for legacy coordinate pairs in a spherical cap on a sphere, use $geoWithin with the $centerSphere
operator. Specify an array that contains:
The grid coordinates of the circle's center point
The circle's radius measured in radians. To calculate radians, see Calculate Distance Using Spherical Geometry
(page 356).
Use the following syntax:
db.<collection>.find( { <location field> :
{ $geoWithin :
{ $centerSphere : [ [ <x>, <y> ] , <radius> ] }
} } )
The following example query returns all documents within a 10-mile radius of longitude 88 W and latitude 30 N.
The example converts distance to radians by dividing distance by the approximate radius of the earth, 3959 miles:
db.<collection>.find( { loc : { $geoWithin :
{ $centerSphere :
[ [ -88 , 30 ] , 10 / 3959 ]
} } } )
Proximity queries return the 100 legacy coordinate pairs closest to the defined point and sort the results by distance.
Use either the $near operator or geoNear command. Both require a 2d index.
The $near operator uses the following syntax:
db.<collection>.find( { <location field> :
                         { $near : [ <x> , <y> ]
                      } } )
The geoNear command offers more options and returns more information than does the $near operator. To run the
command, see geoNear.
Exact Matches on a Flat Surface
You can use the db.collection.find() method to query for an exact match on a location. These queries use
the following syntax:
db.<collection>.find( { <location field>: [ <x> , <y> ] } )
This query will return any documents with the value of [ <x> , <y> ].
Create a Haystack Index
To build a haystack index, use the bucketSize option when creating the index. A bucketSize of 5 creates an
index that groups location values that are within 5 units of the specified longitude and latitude. The bucketSize also
determines the granularity of the index. You can tune the parameter to the distribution of your data so that in general
you search only very small regions. The areas defined by buckets can overlap. A document can exist in multiple
buckets.
A haystack index can reference two fields: the location field and a second field. The second field is used for exact
matches. Haystack indexes return documents based on location and an exact match on a single additional criterion.
These indexes are not necessarily suited to returning the closest documents to a particular location.
To build a haystack index, use the following syntax:
db.coll.ensureIndex( { <location field> : "geoHaystack" ,
<additional field> : 1 } ,
{ bucketSize : <bucket value> } )
Example
If you have a collection with documents that contain fields similar to the following:
{ _id : 100, pos: { lng : 126.9, lat : 35.2 } , type : "restaurant"}
{ _id : 200, pos: { lng : 127.5, lat : 36.1 } , type : "restaurant"}
{ _id : 300, pos: { lng : 128.0, lat : 36.7 } , type : "national park"}
The following operation creates a haystack index with buckets that store keys within 1 unit of longitude or latitude.
db.places.ensureIndex( { pos : "geoHaystack", type : 1 } ,
{ bucketSize : 1 } )
This index stores the document with an _id field that has the value 200 in two different buckets:
In a bucket that includes the document where the _id field has a value of 100
In a bucket that includes the document where the _id field has a value of 300
To query using a haystack index you use the geoSearch command. See Query a Haystack Index (page 356).
By default, queries that use a haystack index return 50 documents.
Query a Haystack Index
A haystack index is a special 2d geospatial index that is optimized to return results over small areas. To create a
haystack index see Create a Haystack Index (page 355).
To query a haystack index, use the geoSearch command. You must specify both the coordinates and the additional
field to geoSearch. For example, to return all documents with the value restaurant in the type field near the
example point, the command would resemble:
db.runCommand( { geoSearch : "places" ,
search : { type: "restaurant" } ,
near : [-74, 40.74] ,
maxDistance : 10 } )
Note: Haystack indexes are not suited to queries for the complete list of documents closest to a particular location.
The closest documents could be more distant compared to the bucket size.
Note: Spherical query operations (page 356) are not currently supported by haystack indexes.
The find() method and geoNear command cannot access the haystack index.
The geoNear command with the { spherical: true } option.
Important: These three queries use radians for distance. Other query types do not.
For spherical query operators to function properly, you must convert distances to radians, and convert from radians to
the distance units used by your application.
To convert:
distance to radians: divide the distance by the radius of the sphere (e.g. the Earth) in the same units as the
distance measurement.
radians to distance: multiply the radian measure by the radius of the sphere (e.g. the Earth) in the units system
that you want to convert the distance to.
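The two conversions above can be sketched in plain JavaScript as follows. The function and constant names are hypothetical. The radius of 3959 miles is the approximation used in this section, and the figure of 6371 kilometers is an assumed approximation that does not appear in the text above.

```javascript
// Approximate radius of the Earth. 3959 miles is the value used in this
// section; 6371 kilometers is an assumed equivalent approximation.
var EARTH_RADIUS_MILES = 3959;
var EARTH_RADIUS_KM = 6371;

// Distance to radians: divide by the sphere's radius in the same units.
function distanceToRadians(distance, radius) {
  return distance / radius;
}

// Radians to distance: multiply by the sphere's radius in the target units.
function radiansToDistance(radians, radius) {
  return radians * radius;
}

// Ten miles expressed in radians, suitable for $centerSphere.
var tenMilesInRadians = distanceToRadians(10, EARTH_RADIUS_MILES);

// The same angle converted to kilometers instead of miles.
var tenMilesInKm = radiansToDistance(tenMilesInRadians, EARTH_RADIUS_KM);
```

Choosing the radius constant selects the unit system, so the same radian value can be reported in miles or kilometers without a separate conversion step.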
You may also use the distanceMultiplier option of the geoNear command to convert radians in the mongod process,
rather than in your application code. See distance multiplier (page 357).
The following spherical query returns all documents in the collection places within 100 miles of the point [
-74, 40.74 ]. The maxDistance value is expressed in radians:
db.runCommand( { geoNear: "places",
                 near: [ -74, 40.74 ],
                 spherical: true,
                 maxDistance: 100 / 3959
               } )
Warning: Spherical queries that wrap around the poles or at the transition from -180 to 180 longitude raise an
error.
Note: While the default Earth-like bounds for geospatial indexes are between -180 inclusive, and 180, valid values
for latitude are between -90 and 90.
Distance Multiplier
The distanceMultiplier option of the geoNear command returns distances only after multiplying the results
by an assigned value. This allows MongoDB to return converted values, and removes the requirement to convert units
in application logic.
Using distanceMultiplier in spherical queries provides results from the geoNear command that do not need
radian-to-distance conversion. The following example uses distanceMultiplier in the geoNear command
with a spherical (page 356) example:
db.runCommand( { geoNear: "places",
near: [ -74, 40.74 ],
spherical: true,
distanceMultiplier: 3959
} )
You may prefer to set the textSearchEnabled parameter in the configuration file.
Additionally, you can enable the feature in the mongo shell with the setParameter command. This command
does not propagate from the primary to the secondaries. For replica sets, you must enable the feature on each mongod instance.
Note: You must set the parameter every time you start the server. You may prefer to add the parameter to the
configuration files.
The following example creates a text index on the fields subject and content:
db.collection.ensureIndex(
{
subject: "text",
content: "text"
}
)
This text index catalogs all string data in the subject field and the content field, where the field value is either
a string or an array of string elements.
Index All Fields
To allow for text search on all fields with string content, use the wildcard specifier ($**) to index all fields that contain
string content.
The following example indexes any string value in the data of every field of every document in collection and
names the index TextIndex:
db.collection.ensureIndex(
{ "$**": "text" },
{ name: "TextIndex" }
)
Because the text command is case-insensitive, the text search will match the following document in the quotes collection:
{
"_id" : ObjectId("50ecef5f8abea0fda30ceab3"),
"quote" : "tomorrow, and tomorrow, and tomorrow, creeps in this petty pace",
"related_quotes" : [
"is this a dagger which I see before me",
"the handle toward my hand?"
],
"src" : {
"title" : "Macbeth",
"from" : "Act V, Scene V"
},
"speaker" : "macbeth"
}
If the search string is space-delimited text, the text command performs a logical OR search on each term and returns
documents that contain any of the terms.
For example, the search string "tomorrow largo" searches for the term tomorrow OR the term largo:
db.quotes.runCommand( "text", { search: "tomorrow largo" } )
The command will match the following documents in the quotes collection:
{
"_id" : ObjectId("50ecef5f8abea0fda30ceab3"),
"quote" : "tomorrow, and tomorrow, and tomorrow, creeps in this petty pace",
"related_quotes" : [
Match Phrases
To match the exact phrase that includes a space(s) as a single term, escape the quotes.
For example, the following command searches for the exact phrase "and tomorrow":
db.quotes.runCommand( "text", { search: "\"and tomorrow\"" } )
If the search string contains both phrases and individual terms, the text command performs a compound logical AND
of the phrases with the compound logical OR of the single terms, including the individual terms from each phrase.
For example, the following search string contains both individual terms corto and largo as well as the phrase
\"and tomorrow\":
db.quotes.runCommand( "text", { search: "corto largo \"and tomorrow\"" } )
The text command performs the equivalent to the following logical operation, where the individual terms corto,
largo, as well as the term tomorrow from the phrase "and tomorrow", are part of a logical OR expression:
(corto OR largo OR tomorrow) AND ("and tomorrow")
As such, the results for this search will include documents that only contain the phrase "and tomorrow" as well as
documents that contain the phrase "and tomorrow" and the terms corto and/or largo. Documents that contain
the phrase "and tomorrow" as well as the terms corto and largo will generally receive a higher score for this
search.
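The combination of phrases and terms described above can be modeled with a small JavaScript function. This is a simplified sketch and not MongoDB's implementation: it performs no stemming, stop word removal, or scoring, and the names parseSearch and matchesSearch are hypothetical.

```javascript
// Simplified model of the matching rule, NOT MongoDB's implementation:
// no stemming, no stop words, no scoring.
function parseSearch(search) {
  var phrases = [];
  var terms = [];
  var re = /"([^"]+)"|(\S+)/g;
  var m;
  while ((m = re.exec(search)) !== null) {
    if (m[1] !== undefined) {
      var phrase = m[1].toLowerCase();
      phrases.push(phrase);
      // The individual words of a phrase also join the OR group.
      phrase.split(/\s+/).filter(Boolean).forEach(function (w) {
        terms.push(w);
      });
    } else {
      terms.push(m[2].toLowerCase());
    }
  }
  return { phrases: phrases, terms: terms };
}

// A text matches when it contains at least one term (logical OR)
// and every quoted phrase (logical AND).
function matchesSearch(text, search) {
  var parsed = parseSearch(search);
  var lower = text.toLowerCase();
  var words = lower.split(/[^a-z0-9]+/).filter(Boolean);
  var anyTerm = parsed.terms.some(function (t) {
    return words.indexOf(t) !== -1;
  });
  var allPhrases = parsed.phrases.every(function (p) {
    return lower.indexOf(p) !== -1;
  });
  return anyTerm && allPhrases;
}
```

Under this model, a text that contains the phrase "and tomorrow" matches the search string from the example whether or not it also contains corto or largo, while a text that contains only corto does not match.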
Match Some Words But Not Others
A negated term is a term that is prefixed by a minus sign -. If you negate a term, the text command will exclude the
documents that contain those terms from the results.
Note: If the search text contains only negated terms, the text command will not return any results.
The following example returns those documents that contain the term tomorrow but not the term petty.
db.quotes.runCommand( "text" , { search: "tomorrow -petty" } )
Note: The result from the text command must fit within the maximum BSON Document Size.
By default, the text command will return up to 100 matching documents, from highest to lowest scores. To override
this default limit, use the limit option in the text command, as in the following example:
db.quotes.runCommand( "text", { search: "tomorrow", limit: 2 } )
The text command will return at most 2 of the highest scoring results.
The limit can be any number as long as the result set fits within the maximum BSON Document Size.
Specify Which Fields to Return in the Result Set
In the text command, use the project option to specify the fields to include (1) or exclude (0) in the matching
documents.
Note: The _id field is always returned unless explicitly excluded in the project document.
The following example returns only the _id field and the src field in the matching documents:
db.quotes.runCommand( "text", { search: "tomorrow",
project: { "src": 1 } } )
The text command can also use the filter option to specify additional query conditions.
The following example will return the documents that contain the term tomorrow AND the speaker is macbeth:
db.quotes.runCommand( "text", { search: "tomorrow",
filter: { "speaker" : "macbeth" } } )
See also:
Limit the Number of Entries Scanned (page 366)
Search for Text in Specific Languages
You can specify the language that determines the tokenization, stemming, and removal of stop words, as in the following example:
db.quotes.runCommand( "text", { search: "amor", language: "spanish" } )
See text-search-languages for a list of supported languages as well as Specify a Language for Text Index (page 363)
for specifying languages for the text index.
The text command returns a document that contains the result set.
See text-search-output for information on the output.
Specify a Language for Text Index
This tutorial describes how to specify the default language associated with the text index (page 363) and also how to
create text indexes for collections that contain documents in different languages (page 363).
Specify the Default Language for a text Index
The default language associated with the indexed data determines the list of stop words and the rules for the stemmer
and tokenizer. The default language for the indexed data is english.
To specify a different language, use the default_language option when creating the text index. See text-search-languages for the languages available for default_language.
The following example creates a text index on the content field and sets the default_language to spanish:
db.collection.ensureIndex(
{ content : "text" },
{ default_language: "spanish" }
)
Specify the Index Language within the Document
If a collection contains documents that are in different languages, include a field in the documents that contains the language to use:
If you include a field named language in the document, by default, the ensureIndex() method will use
the value of this field to override the default language.
To use a field with a name other than language, you must specify the name of this field to the
ensureIndex() method with the language_override option.
See text-search-languages for a list of supported languages.
Include the language Field
Include a field language that specifies the language to use for the individual documents.
For example, the documents of a multi-language collection quotes contain the field language:
{ _id: 1, language: "portuguese", quote: "A sorte protege os audazes" }
{ _id: 2, language: "spanish", quote: "Nada hay más surreal que la realidad." }
{ _id: 3, language: "english", quote: "is this a dagger which I see before me" }
For the documents that contain the language field, the text index uses that language to determine the stop
words and the rules for the stemmer and the tokenizer.
For documents that do not contain the language field, the index uses the default language, which is English,
to determine the stop words and rules for the stemmer and the tokenizer.
For example, the Spanish word que is a stop word. So the following text command would not match any document:
db.quotes.runCommand( "text", { search: "que", language: "spanish" } )
Use any Field to Specify the Language for a Document
Include a field that specifies the language to use for the
individual documents. To use a field with a name other than language, include the language_override option
when creating the index.
For example, the documents of a multi-language collection quotes contain the field idioma:
{ _id: 1, idioma: "portuguese", quote: "A sorte protege os audazes" }
{ _id: 2, idioma: "spanish", quote: "Nada hay más surreal que la realidad." }
{ _id: 3, idioma: "english", quote: "is this a dagger which I see before me" }
Create a text index on the field quote with the language_override option:
db.quotes.ensureIndex( { quote : "text" },
{ language_override: "idioma" } )
For the documents that contain the idioma field, the text index uses that language to determine the stop
words and the rules for the stemmer and the tokenizer.
For documents that do not contain the idioma field, the index uses the default language, which is English, to
determine the stop words and rules for the stemmer and the tokenizer.
For example, the Spanish word que is a stop word. So the following text command would not match any document:
db.quotes.runCommand( "text", { search: "que", language: "spanish" } )
To avoid creating an index with a name that exceeds the index name length limit, you can pass the name
option to the db.collection.ensureIndex() method:
db.collection.ensureIndex(
                           {
                             content: "text",
                             "users.comments": "text",
                             "users.profiles": "text"
                           },
                           {
                             name: "MyTextIndex"
                           }
                         )
Note:
To drop the text index, use the index name. To get the name of an index, use
db.collection.getIndexes().
To create a text index with different field weights for the content field and the keywords field, include the
weights option to the ensureIndex() method. For example, the following command creates an index on three
fields and assigns weights to two of the fields:
db.blog.ensureIndex(
{
content: "text",
keywords: "text",
about: "text"
},
{
weights: {
content: 10,
keywords: 5,
},
name: "TextIndex"
}
)
The inventory collection in the following examples contains six documents, with _id values 1 through 6, each of which has a dept field and a description field.
A common use case is to perform text searches by individual departments, such as:
db.inventory.runCommand( "text", {
search: "green",
filter: { dept : "kitchen" }
}
)
To limit the text search to scan only those documents within a specific dept, create a compound index that specifies
an ascending/descending index key on the field dept and a text index key on the field description:
db.inventory.ensureIndex(
{
dept: 1,
description: "text"
}
)
Important:
The ascending/descending index keys must be listed before, or prefix, the text index keys.
By prefixing the text index fields with ascending/descending index fields, MongoDB will only index documents that have the prefix fields.
You cannot include multi-key (page 324) index fields or geospatial (page 327) index fields.
The text command must include the filter option that specifies an equality condition for the prefix fields.
Then, the text search within a particular department will limit the scan of indexed documents. For example, the
following text command scans only those documents with dept equal to kitchen:
db.inventory.runCommand( "text", {
search: "green",
filter: { dept : "kitchen" }
}
)
The returned result includes statistics that show that the command scanned 1 document, as indicated by the
nscanned field:
{
"queryDebugString" : "green||||||",
"language" : "english",
"results" : [
{
"score" : 0.75,
"obj" : {
"_id" : 3,
"dept" : "kitchen",
"description" : "a green placemat"
}
}
],
"stats" : {
"nscanned" : 1,
"nscannedObjects" : 0,
"n" : 1,
"nfound" : 1,
"timeMicros" : 211
},
"ok" : 1
}
To create a text index that can fulfill the filter condition of a text search:
1. Append scalar index fields to a text index, as in the following example which specifies an ascending index
key on cited:
db.quotes.ensureIndex(
                       {
                         comments: "text",
                         cited: 1
                       }
                     )
2. Use the filter option in the text command to specify a condition on the cited field, as in the following:
db.quotes.runCommand( "text",
{
search: "tomorrow",
filter: { cited: { $gt: 10 } }
}
)
Considerations
When creating a compound index that includes the text index, you cannot include multi-key (page 324) index
fields or geospatial (page 327) index fields.
With a compound index that includes the text index and an ascending/descending key or keys, sort operations do not
use the ascending/descending key from this index; only the text score determines the sort order.
This document describes strategies for creating indexes that support queries.
Create a Single-Key Index if All Queries Use the Same, Single Key
If you only ever query on a single key in a given collection, then you need to create just one single-key index for that
collection. For example, you might create an index on category in the products collection:
db.products.ensureIndex( { "category": 1 } )
If you sometimes query on only one key and at other times query on that key combined with a second key, then creating
a compound index is more efficient than creating a single-key index. MongoDB will use the compound index for both
queries. For example, you might create an index on both category and item.
db.products.ensureIndex( { "category": 1, "item": 1 } )
This allows you both options. You can query on just category, and you also can query on category combined
with item. A single compound index (page 322) on multiple fields can support all the queries that search a prefix
subset of those fields.
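The prefix rule described above can be expressed as a small check in plain JavaScript. This is a simplified sketch for reasoning about index design, not the logic of MongoDB's query optimizer. The function name isIndexPrefix is hypothetical, and the sketch assumes that the query field names are distinct.

```javascript
// Hypothetical helper: returns true when the queried fields are exactly
// the first N fields of the compound index. The order of fields within
// the query document does not matter.
function isIndexPrefix(indexFields, queryFields) {
  if (queryFields.length === 0) return false;
  if (queryFields.length > indexFields.length) return false;
  var prefix = indexFields.slice(0, queryFields.length);
  return queryFields.every(function (field) {
    return prefix.indexOf(field) !== -1;
  });
}

// The compound index from the example above.
var productIndex = [ "category", "item" ];

var categoryOnly = isIndexPrefix(productIndex, [ "category" ]);
var categoryAndItem = isIndexPrefix(productIndex, [ "item", "category" ]);
var itemOnly = isIndexPrefix(productIndex, [ "item" ]);
```

In this sketch the first two checks succeed and the third fails, because item alone is not a prefix of the index on category and item.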
Note: With the exception of queries that use the $or operator, a query does not use multiple indexes. A query uses
only one index.
Example
The following index on a collection:
{ x: 1, y: 1, z: 1 }
can support the queries that the following prefix indexes support:
{ x: 1 }
{ x: 1, y: 1 }
There are some situations where the prefix indexes may offer better query performance: for example if z is a large
array.
The { x: 1, y: 1, z: 1 } index can also support many of the same queries as the following index:
{ x: 1, z: 1 }
Also, the { x: 1, z: 1 } index has an additional use. Consider the following query:
db.collection.find( { x: 5 } ).sort( { z: 1} )
The { x: 1, z: 1 } index supports both the query and the sort operation, while the { x: 1, y: 1,
z: 1 } index only supports the query. For more information on sorting, see Use Indexes to Sort Query Results
(page 371).
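The difference between the two indexes in this example can be captured with a simplified rule: the equality-matched fields must form the index prefix, and the sort fields must follow immediately in index order. The sketch below encodes only that rule. It ignores sort direction and is not MongoDB's planner logic, and the function name indexSupportsSort is hypothetical.

```javascript
// Simplified model: the equality fields must be the leading index fields,
// and the sort fields must come next, in the same order as in the index.
// Sort direction is not checked in this sketch.
function indexSupportsSort(indexFields, equalityFields, sortFields) {
  var n = equalityFields.length;
  var prefix = indexFields.slice(0, n);
  var prefixOk = equalityFields.every(function (field) {
    return prefix.indexOf(field) !== -1;
  });
  if (!prefixOk) return false;
  var next = indexFields.slice(n, n + sortFields.length);
  if (next.length !== sortFields.length) return false;
  return sortFields.every(function (field, i) {
    return next[i] === field;
  });
}

// Query { x: 5 } sorted by { z: 1 } against both indexes from the example.
var withXZ = indexSupportsSort([ "x", "z" ], [ "x" ], [ "z" ]);
var withXYZ = indexSupportsSort([ "x", "y", "z" ], [ "x" ], [ "z" ]);
```

Here the { x: 1, z: 1 } index passes the check, while the { x: 1, y: 1, z: 1 } index fails because y sits between the query field and the sort field.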
all the fields returned in the results are in the same index.
Because the index covers the query, MongoDB can both match the query conditions (page 60) and return the results
using only the index; MongoDB does not need to look at the documents, only the index, to fulfill the query. An index
can also cover an aggregation pipeline operation (page 283) on unsharded collections.
Querying only the index can be much faster than querying documents outside of the index. Index keys are typically
smaller than the documents they catalog, and indexes are typically available in RAM or located sequentially on disk.
MongoDB automatically uses an index that covers a query when possible. To ensure that an index can cover a query,
create an index that includes all the fields listed in the query document (page 60) and in the query result. You can
specify the fields to return in the query results with a projection (page 64) document. By default, MongoDB includes
the _id field in the query result. So, if the index does not include the _id field, then you must exclude the _id field
(i.e. _id: 0) from the query results.
Example
Given a collection users with an index on the fields status and user, as created by the following operation:
db.users.ensureIndex( { status: 1, user: 1 } )
Then, this index will cover the following query which selects on the status field and returns only the user field:
db.users.find( { status: "A" }, { user: 1, _id: 0 } )
If the projection document does not specify the exclusion of the _id field, the query returns the _id field. The
following query is not covered by the index on the status and the user fields because with the projection document
{ user: 1 }, the query returns both the user field and the _id field:
db.users.find( { status: "A" }, { user: 1 } )
Chapter 7. Indexes
To determine whether a query is a covered query, use the explain() method. If the explain() output displays
true for the indexOnly field, the query is covered by an index, and MongoDB queries only that index to match
the query and return the results.
For more information see Measure Index Use (page 349).
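The covering rule can be illustrated outside the database with a minimal sketch. The helper couldCover is hypothetical and is not a mongo shell method; it only checks the rule described in this section (every queried field and every returned field is in the index, with _id returned unless excluded), looks at top-level field names only, and ignores any further restrictions the server may apply.

```javascript
// Hypothetical helper (not part of MongoDB): could this index cover the query?
// Only top-level field names are considered; operators such as $or are not handled.
function couldCover(indexSpec, query, projection) {
  const indexed = new Set(Object.keys(indexSpec));
  // Fields explicitly included by the projection document.
  const returned = Object.keys(projection).filter((field) => projection[field]);
  // Without any included field, whole documents are returned.
  if (returned.length === 0) return false;
  // _id is returned by default unless the projection mentions it.
  if (!("_id" in projection)) returned.push("_id");
  const needed = Object.keys(query).concat(returned);
  return needed.every((field) => indexed.has(field));
}

const index = { status: 1, user: 1 };
console.log(couldCover(index, { status: "A" }, { user: 1, _id: 0 })); // true
console.log(couldCover(index, { status: "A" }, { user: 1 })); // false
```

On a real deployment, the indexOnly field of the explain() output remains the authoritative check.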
Use Indexes to Sort Query Results
In MongoDB, sort operations that sort documents based on an indexed field provide the greatest performance. Indexes
in MongoDB, as in other databases, have an order: as a result, using an index to access documents returns results in the
same order as the index.
To sort on multiple fields, create a compound index (page 322). With compound indexes, the results can be in the
sorted order of either the full index or an index prefix. An index prefix is a subset of a compound index; the subset
consists of one or more fields at the start of the index, in order. For example, given an index { a:1, b: 1, c:
1, d: 1 }, the following subsets are index prefixes:
{ a: 1 }
{ a: 1, b: 1 }
{ a: 1, b: 1, c: 1 }
For more information on sorting by index prefixes, see Sort Subset Starts at the Index Beginning (page 372).
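As a rough illustration of the prefix definition, the following sketch lists the prefixes of an index specification. The function name indexPrefixes is hypothetical and not part of the mongo shell; the sketch relies on JavaScript preserving the insertion order of string keys.

```javascript
// Hypothetical helper (not a mongo shell method): list the proper prefixes
// of a compound index specification, in index order.
function indexPrefixes(indexSpec) {
  const fields = Object.keys(indexSpec); // insertion order is preserved
  const prefixes = [];
  for (let n = 1; n < fields.length; n++) {
    const prefix = {};
    for (const field of fields.slice(0, n)) {
      prefix[field] = indexSpec[field];
    }
    prefixes.push(prefix);
  }
  return prefixes;
}

const prefixes = indexPrefixes({ a: 1, b: 1, c: 1, d: 1 });
console.log(JSON.stringify(prefixes));
// [{"a":1},{"a":1,"b":1},{"a":1,"b":1,"c":1}]
```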
If the query includes equality match conditions on an index prefix, you can sort on a subset of the index that starts
after or overlaps with the prefix. For example, given an index { a: 1, b: 1, c: 1, d: 1 }, if the
query condition includes equality match conditions on a and b, you can specify a sort on the subsets { c: 1 } or
{ c: 1, d: 1 }:
db.collection.find( { a: 5, b: 3 } ).sort( { c: 1 } )
db.collection.find( { a: 5, b: 3 } ).sort( { c: 1, d: 1 } )
In these operations, the equality match and the sort documents together cover the index prefixes { a: 1, b: 1,
c: 1 } and { a: 1, b: 1, c: 1, d: 1 } respectively.
You can also specify a sort order that includes the prefix; however, since the query condition specifies equality matches
on these fields, they are constant in the resulting documents and do not contribute to the sort order:
db.collection.find( { a: 5, b: 3 } ).sort( { a: 1, b: 1, c: 1 } )
db.collection.find( { a: 5, b: 3 } ).sort( { a: 1, b: 1, c: 1, d: 1 } )
For more information on sorting by index subsets that are not prefixes, see Sort Subset Does Not Start at the Index
Beginning (page 372).
Note: For in-memory sorts that do not use an index, the sort() operation is significantly slower. The sort()
operation will abort when it uses 32 megabytes of memory.
If the sort document contains a subset of the compound index fields, the subset can determine whether MongoDB can
use the index efficiently to both retrieve and sort the query results. If MongoDB can efficiently use the index to both
retrieve and sort the query results, the output from the explain() will display scanAndOrder as false or 0.
If MongoDB can only use the index for retrieving documents that meet the query criteria, MongoDB must manually
sort the resulting documents without the use of the index. For in-memory sort operations, explain() will display
scanAndOrder as true or 1.
Sort Subset Starts at the Index Beginning If the sort document is a subset of a compound index and starts from
the beginning of the index, MongoDB can use the index to both retrieve and sort the query results.
For example, the collection collection has the following index:
{ a: 1, b: 1, c: 1, d: 1 }
The following operations include a sort with a subset of the index. Because the sort subset starts at the beginning of the
index, the operations can use the index for both the query retrieval and sort:
db.collection.find().sort( { a:1 } )
db.collection.find().sort( { a:1, b:1 } )
db.collection.find().sort( { a:1, b:1, c:1 } )
db.collection.find( { a: 4 } ).sort( { a: 1, b: 1 } )
db.collection.find( { a: { $gt: 4 } } ).sort( { a: 1, b: 1 } )
db.collection.find( { b: 5 } ).sort( { a: 1, b: 1 } )
db.collection.find( { b: { $gt:5 }, c: { $gt: 1 } } ).sort( { a: 1, b: 1 } )
The last two operations include query conditions on the field b but do not include a query condition on the field a:
db.collection.find( { b: 5 } ).sort( { a: 1, b: 1 } )
db.collection.find( { b: { $gt:5 }, c: { $gt: 1 } } ).sort( { a: 1, b: 1 } )
Consider the case where the collection has the index { b: 1 } in addition to the { a: 1, b: 1, c: 1,
d: 1 } index. Because of the query condition on b, it is not immediately obvious which index MongoDB may
select as the best index. To explicitly specify the index to use, see hint().
Sort Subset Does Not Start at the Index Beginning The sort document can be a subset of a compound index that
does not start from the beginning of the index. For instance, { c: 1 } is a subset of the index { a: 1, b:
1, c: 1, d: 1 } that omits the preceding index fields a and b. MongoDB can use the index efficiently if the
query document includes all the preceding fields of the index, in this case a and b, in equality conditions. In other
words, the equality conditions in the query document and the subset in the sort document contiguously cover a prefix
of the index.
For example, the collection collection has the following index:
{ a: 1, b: 1, c: 1, d: 1 }
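The following operations can use the index efficiently:
db.collection.find( { a: 5 } ).sort( { b: 1, c: 1 } )
db.collection.find( { a: 5, c: 4, b: 3 } ).sort( { d: 1 } )
In the first operation, the query document { a: 5 } with the sort document { b: 1, c: 1 } covers
the prefix { a: 1, b: 1, c: 1 } of the index.
In the second operation, the query document { a: 5, c: 4, b: 3 } with the sort document { d: 1 }
covers the full index.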
Only the index fields preceding the sort subset must have the equality conditions in the query document. The other
index fields may have other conditions. The following operations can efficiently use the index since the equality
conditions in the query document and the subset in the sort document contiguously cover a prefix of the index:
db.collection.find( { a: 5, b: 3 } ).sort( { c: 1 } )
db.collection.find( { a: 5, b: 3, c: { $lt: 4 } } ).sort( { c: 1 } )
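The rule that equality conditions and the sort subset must contiguously cover an index prefix can be sketched as a small check. The helper sortCanUseIndex is hypothetical; it works on field names only, ignores sort direction, and is an illustration of the rule rather than of how the query optimizer is implemented.

```javascript
// Hypothetical helper: apply the prefix rule described above.
//   indexFields:    index keys in order, e.g. ["a", "b", "c", "d"]
//   equalityFields: fields with equality conditions in the query document
//   sortFields:     fields of the sort document, in order
// Sort direction is ignored here for simplicity.
function sortCanUseIndex(indexFields, equalityFields, sortFields) {
  if (sortFields.length === 0) return false;
  const start = indexFields.indexOf(sortFields[0]);
  if (start === -1) return false;
  // Every index field before the sort subset needs an equality condition.
  const precedingOk = indexFields
    .slice(0, start)
    .every((field) => equalityFields.includes(field));
  // The sort fields must match consecutive index fields.
  const contiguous = sortFields.every(
    (field, i) => indexFields[start + i] === field
  );
  return precedingOk && contiguous;
}

const indexFields = ["a", "b", "c", "d"];
console.log(sortCanUseIndex(indexFields, ["a", "b"], ["c"])); // true
console.log(sortCanUseIndex(indexFields, [], ["a", "b"])); // true
console.log(sortCanUseIndex(indexFields, ["a"], ["c"])); // false
```

In the last call the field b, which precedes c in the index, has no equality condition, so the sort subset does not continue a covered prefix.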
The above example shows an index size of almost 4.3 gigabytes. To ensure this index fits in RAM, you must not only
have more than that much RAM available but also must have RAM available for the rest of the working set. Also
remember:
If you have and use multiple collections, you must consider the size of all indexes on all collections. The indexes and
the working set must be able to fit in memory at the same time.
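The sizing rule amounts to simple arithmetic, sketched below. The function fitsInRam and all the figures are hypothetical; on a real deployment the index sizes would come from the statistics commands listed under "See also".

```javascript
// Hypothetical sizing check; every size is in bytes and made up for illustration.
function fitsInRam(indexSizes, otherWorkingSetBytes, ramBytes) {
  // All indexes on all collections count toward the total.
  const totalIndexBytes = indexSizes.reduce((sum, size) => sum + size, 0);
  return totalIndexBytes + otherWorkingSetBytes <= ramBytes;
}

const GB = 1024 * 1024 * 1024;
const indexSizes = [4.3 * GB, 1.2 * GB]; // two collections' indexes
console.log(fitsInRam(indexSizes, 2 * GB, 8 * GB)); // true
console.log(fitsInRam(indexSizes, 2 * GB, 6 * GB)); // false
```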
There are some limited cases where indexes do not need to fit in memory. See Indexes that Hold Only Recent Values
in RAM (page 373).
See also:
collStats and db.collection.stats()
Indexes that Hold Only Recent Values in RAM
Indexes do not have to fit entirely into RAM in all cases. If the value of the indexed field increments with every insert,
and most queries select recently added documents, then MongoDB only needs to keep the parts of the index that hold
the most recent or "right-most" values in RAM. This allows for efficient index use for read and write operations and
minimizes the amount of RAM required to support the index.
Create Queries that Ensure Selectivity
Selectivity is the ability of a query to narrow results using the index. Effective indexes are more selective and allow
MongoDB to use the index for a larger portion of the work associated with fulfilling the query.
To ensure selectivity, write queries that limit the number of possible documents with the indexed field. Write queries
that are appropriately selective relative to your indexed data.
Example
Suppose you have a field called status where the possible values are new and processed. If you add an index
on status you've created a low-selectivity index. The index will be of little help in locating records.
A better strategy, depending on your queries, would be to create a compound index (page 322) that includes the low-selectivity field and another field. For example, you could create a compound index on status and created_at.
Another option, again depending on your use case, might be to use separate collections, one for each status.
Example
Consider an index { a : 1 } (i.e. an index on the key a sorted in ascending order) on a collection where a has
three values evenly distributed across the collection:
{ _id: ObjectId(), a: 1, b: "ab" }
{ _id: ObjectId(), a: 1, b: "cd" }
{ _id: ObjectId(), a: 1, b: "ef" }
{ _id: ObjectId(), a: 2, b: "jk" }
{ _id: ObjectId(), a: 2, b: "lm" }
{ _id: ObjectId(), a: 2, b: "no" }
{ _id: ObjectId(), a: 3, b: "pq" }
{ _id: ObjectId(), a: 3, b: "rs" }
{ _id: ObjectId(), a: 3, b: "tv" }
If you query for { a: 2, b: "no" } MongoDB must scan 3 documents in the collection to return the one
matching result. Similarly, a query for { a: { $gt: 1}, b: "tv" } must scan 6 documents, also to
return one result.
Consider the same index on a collection where a has nine values evenly distributed across the collection:
{ _id: ObjectId(), a: 1, b: "ab" }
{ _id: ObjectId(), a: 2, b: "cd" }
{ _id: ObjectId(), a: 3, b: "ef" }
{ _id: ObjectId(), a: 4, b: "jk" }
{ _id: ObjectId(), a: 5, b: "lm" }
{ _id: ObjectId(), a: 6, b: "no" }
{ _id: ObjectId(), a: 7, b: "pq" }
{ _id: ObjectId(), a: 8, b: "rs" }
{ _id: ObjectId(), a: 9, b: "tv" }
If you query for { a: 2, b: "cd" }, MongoDB must scan only one document to fulfill the query. The index
and query are more selective because the values of a are evenly distributed and the query can select a specific document
using the index.
However, although the index on a is more selective, a query such as { a: { $gt: 5 }, b: "tv" } would
still need to scan 4 documents.
If overall selectivity is low, and if MongoDB must read a number of documents to return results, then some queries
may perform faster without indexes. To determine performance, see Measure Index Use (page 349).
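The scan counts in this example can be reproduced with a small in-memory model. This is only a sketch: the functions makeCollection and scanWithIndexOnA are hypothetical, and the model assumes that an index on a narrows the candidates to the documents whose a value satisfies the condition, each of which must then be read to test b.

```javascript
// Build nine documents whose "a" values come from the given list.
function makeCollection(aValues) {
  const bValues = ["ab", "cd", "ef", "jk", "lm", "no", "pq", "rs", "tv"];
  return bValues.map((b, i) => ({ a: aValues[i], b }));
}

// Model of a query { a: <condition>, b: <value> } with an index on "a" only:
// the index selects candidates by "a"; each candidate is scanned to check "b".
function scanWithIndexOnA(collection, matchesA, b) {
  const candidates = collection.filter((doc) => matchesA(doc.a));
  const results = candidates.filter((doc) => doc.b === b);
  return { scanned: candidates.length, returned: results.length };
}

const lowSelectivity = makeCollection([1, 1, 1, 2, 2, 2, 3, 3, 3]);
const highSelectivity = makeCollection([1, 2, 3, 4, 5, 6, 7, 8, 9]);

console.log(scanWithIndexOnA(lowSelectivity, (a) => a === 2, "no").scanned); // 3
console.log(scanWithIndexOnA(lowSelectivity, (a) => a > 1, "tv").scanned); // 6
console.log(scanWithIndexOnA(highSelectivity, (a) => a === 2, "cd").scanned); // 1
console.log(scanWithIndexOnA(highSelectivity, (a) => a > 5, "tv").scanned); // 4
```

Each of the four queries returns a single document; only the number of documents scanned differs.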
For a conceptual introduction to indexes in MongoDB see Index Concepts (page 318).
Description
Removes indexes from a collection.
Defragments a collection and rebuilds the indexes.
Rebuilds all indexes on a collection.
Internal command that scans a collection's data and indexes for correctness.
Experimental command that collects and aggregates statistics on all indexes.
Performs a geospatial query that returns the documents closest to a given point.
Performs a geospatial query that uses MongoDB's haystack index functionality.
An internal command to support geospatial queries.
Internal command that validates index on shard key.
Description
Selects geometries within a bounding GeoJSON geometry.
Selects geometries that intersect with a GeoJSON geometry.
Returns geospatial objects in proximity to a point.
Returns geospatial objects in proximity to a point on a sphere.
Description
Forces MongoDB to report on query execution plans. See explain().
Forces MongoDB to use a specific index. See hint().
Specifies an exclusive upper limit for the index to use in a query. See max().
Specifies an inclusive lower limit for the index to use in a query. See min().
Forces the cursor to only return fields included in the index.
Forces the query to use the index on the _id field. See snapshot().
CHAPTER 8
Replication
A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide
redundancy and high availability, and are the basis for all production deployments. This section introduces replication
in MongoDB as well as the components and architecture of replica sets. The section also provides tutorials for common
tasks related to replica sets.
Replication Introduction (page 377) An introduction to replica sets, their behavior, operation, and use.
Replication Concepts (page 381) The core documentation of replica set operations, configurations, architectures and
behaviors.
Replica Set Members (page 382) Introduces the components of replica sets.
Replica Set Deployment Architectures (page 390) Introduces architectural considerations related to replica
sets deployment planning.
Replica Set High Availability (page 397) Presents the details of the automatic failover and recovery process
with replica sets.
Replica Set Read and Write Semantics (page 402) Presents the semantics for targeting read and write operations to the replica set, with an awareness of location and set configuration.
Replica Set Tutorials (page 419) Tutorials for common tasks related to the use and maintenance of replica sets.
Replication Reference (page 466) Reference for functions and operations related to replica sets.
Figure 8.1: Diagram of default routing of reads and writes to the primary.
The secondaries replicate the primary's oplog and apply the operations to their data sets. Secondaries' data sets reflect
the primary's data set. If the primary is unavailable, the replica set will elect a secondary to be primary. By default,
clients read from the primary; however, clients can specify a read preference (page 406) to send read operations to
secondaries. See secondaries (page 382) for more information.
You may add an extra mongod instance to a replica set as an arbiter. Arbiters do not maintain a data set. Arbiters
only exist to vote in elections. If your replica set has an even number of members, add an arbiter to obtain a majority
of votes in an election for primary. Arbiters do not require dedicated hardware. See arbiter (page 389) for more
information.
Figure 8.2: Diagram of a 3 member replica set that consists of a primary and two secondaries.
Figure 8.3: Diagram of a replica set that consists of a primary, a secondary, and an arbiter.
Note: The primary may, under some conditions, step down and become a secondary. A secondary may become
the primary during an election. An arbiter, however, will never change state and will always be an arbiter.
Asynchronous Replication
Secondaries apply operations from the primary asynchronously. By applying operations after the primary, replica sets
can continue to function without some members. However, as a result, secondaries may not return the most current
data to clients.
See Replica Set Oplog (page 410) and Replica Set Data Synchronization (page 412) for more information. See Read
Preference (page 406) for more on read operations and secondaries.
Automatic Failover
When a primary does not communicate with the other members of the replica set for more than 10 seconds, the replica
set will attempt to select another member to become the new primary. The first secondary that receives a majority of
the votes becomes primary.
Figure 8.4: Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary
becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new
primary
See Replica Set Elections (page 397) and Rollbacks During Replica Set Failover (page 401) for more information.
Additional Features
Replica sets provide a number of options to support application needs. For example, you may deploy a replica set
with members in multiple data centers (page 396), or control the outcome of elections by adjusting the priority
(page 469) of some members. Replica sets also support configuring dedicated members for reporting, disaster recovery,
or backup functions.
See Priority 0 Replica Set Members (page 386), Hidden Replica Set Members (page 387) and Delayed Replica Set
Members (page 388) for more information.
The minimum requirements for a replica set are: a primary (page 382), a secondary (page 382), and an arbiter (page 389).
Most deployments, however, will keep three members that store data: a primary (page 382) and two secondary members
(page 382).
Replica Set Primary
The primary is the only member in the replica set that receives write operations. MongoDB applies write operations
on the primary and then records the operations on the primary's oplog (page 410). Secondary (page 382) members
replicate this log and apply the operations to their data sets.
In the following three-member replica set, the primary accepts all write operations. Then the secondaries replicate the
oplog to apply to their data sets.
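The flow of writes through the oplog can be pictured with a toy in-memory model. The classes PrimaryModel and SecondaryModel are hypothetical and deliberately simplified (a counter stands in for operation timestamps); they illustrate why a secondary may briefly return older data, not how mongod implements replication.

```javascript
// Toy in-memory model of asynchronous replication; not MongoDB's implementation.
class PrimaryModel {
  constructor() {
    this.data = {};
    this.oplog = [];
  }
  // Apply the write, then record it in the oplog.
  write(key, value) {
    this.data[key] = value;
    this.oplog.push({ ts: this.oplog.length + 1, key, value });
  }
}

class SecondaryModel {
  constructor() {
    this.data = {};
    this.lastApplied = 0; // ts of the last oplog entry applied
  }
  // Copy and apply up to maxOps oplog entries that have not been applied yet.
  replicateFrom(primary, maxOps = Infinity) {
    const pending = primary.oplog
      .filter((op) => op.ts > this.lastApplied)
      .slice(0, maxOps);
    for (const op of pending) {
      this.data[op.key] = op.value;
      this.lastApplied = op.ts;
    }
    return pending.length;
  }
}

const primary = new PrimaryModel();
const secondary = new SecondaryModel();
primary.write("x", 1);
primary.write("x", 2);
secondary.replicateFrom(primary, 1);
console.log(primary.data.x, secondary.data.x); // 2 1 (the secondary is behind)
secondary.replicateFrom(primary);
console.log(primary.data.x, secondary.data.x); // 2 2
```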
All members of the replica set can accept read operations. However, by default, an application directs its read operations to the primary member. See Read Preference (page 406) for details on changing the default read behavior.
The replica set can have at most one primary. If the current primary becomes unavailable, an election determines the
new primary. See Replica Set Elections (page 397) for more details.
In the following 3-member replica set, the primary becomes unavailable. This triggers an election which selects one
of the remaining secondaries as the new primary.
Replica Set Secondary Members
A secondary maintains a copy of the primary's data set. To replicate data, a secondary applies operations from the
primary's oplog (page 410) to its own data set in an asynchronous process. A replica set can have one or more
secondaries.
The following three-member replica set has two secondary members. The secondaries replicate the primary's oplog
and apply the operations to their data sets.
Although clients cannot write data to secondaries, clients can read data from secondary members. See Read Preference
(page 406) for more information on how clients direct read operations to replica sets.
A secondary can become a primary. If the current primary becomes unavailable, the replica set holds an election to
choose which of the secondaries becomes the new primary.
In the following three-member replica set, the primary becomes unavailable. This triggers an election where one of
the remaining secondaries becomes the new primary.
See Replica Set Elections (page 397) for more details.
1 While replica sets are the recommended solution for production, a replica set can support only 12 members in total. If your deployment requires
more than 12 members, you'll need to use master-slave (page 413) replication. Master-slave replication lacks the automatic failover capabilities.
Figure 8.5: Diagram of default routing of reads and writes to the primary.
Figure 8.6: Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary
becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new
primary
Figure 8.7: Diagram of a 3 member replica set that consists of a primary and two secondaries.
Figure 8.8: Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary
becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new
primary
You can configure a secondary member for a specific purpose. You can configure a secondary to:
Prevent it from becoming a primary in an election, which allows it to reside in a secondary data center or to
serve as a cold standby. See Priority 0 Replica Set Members (page 386).
Prevent applications from reading from it, which allows it to run applications that require separation from normal
traffic. See Hidden Replica Set Members (page 387).
Keep a running historical snapshot for use in recovery from certain errors, such as unintentionally deleted
databases. See Delayed Replica Set Members (page 388).
Priority 0 Replica Set Members
A priority 0 member is a secondary that cannot become primary. Priority 0 members cannot trigger elections.
Otherwise these members function as normal secondaries. A priority 0 member maintains a copy of the data set,
accepts read operations, and votes in elections. Configure a priority 0 member to prevent secondaries from becoming
primary, which is particularly useful in multi-data center deployments.
In the following three-member replica set, one data center hosts the primary and a secondary. A second data center
hosts one priority 0 member that cannot become primary.
Figure 8.9: Diagram of a 3 member replica set distributed across two data centers. Replica set includes a priority 0
member.
Priority 0 Members as Standbys A priority 0 member can function as a standby. In some replica sets, it might not
be possible to add a new member in a reasonable amount of time. A standby member keeps a current copy of the data
to be able to replace an unavailable member.
In many cases, you need not set standby to priority 0. However, in sets with varied hardware or geographic distribution
(page 396), a priority 0 standby ensures that only qualified members become primary.
A priority 0 standby may also be valuable for some members of a set with different hardware or workload profiles.
In these cases, deploy a member with priority 0 so it can't become primary. Also consider using a hidden member
(page 387) for this purpose.
If your set already has seven voting members, also configure the member as non-voting (page 400).
Priority 0 Members and Failover When configuring a priority 0 member, consider potential failover patterns,
including all possible network partitions. Always ensure that your main data center contains both a quorum of voting
members and contains members that are eligible to be primary.
Configuration To configure a priority 0 member, see Prevent Secondary from Becoming Primary (page 438).
Hidden Replica Set Members
A hidden member maintains a copy of the primary's data set but is invisible to client applications. Hidden members
are good for workloads with different usage patterns from the other members in the replica set. Hidden members are
always priority 0 members (page 386) and cannot become primary. The db.isMaster() method does not display
hidden members. Hidden members, however, do vote in elections (page 397).
In the following five-member replica set, all four secondary members have copies of the primary's data set, but one of
the secondary members is hidden.
Figure 8.10: Diagram of a 5 member replica set with a hidden priority 0 member.
Behavior
Read Operations Clients will not distribute reads with the appropriate read preference (page 406) to hidden members. As a result, these members receive no traffic other than basic replication. Use hidden members for dedicated
tasks such as reporting and backups. Delayed members (page 388) should be hidden.
In a sharded cluster, mongos instances do not interact with hidden members.
Voting Hidden members do vote in replica set elections. If you stop a hidden member, ensure that the set has an
active majority or the primary will step down.
For the purposes of backups, you can avoid stopping a hidden member with the db.fsyncLock() and
db.fsyncUnlock() operations to flush all writes and lock the mongod instance for the duration of the backup
operation.
Further Reading For more information about backing up MongoDB databases, see MongoDB Backup Methods
(page 136). To configure a hidden member, see Configure a Hidden Replica Set Member (page 439).
8.2. Replication Concepts
Delayed members contain copies of a replica set's data set. However, a delayed member's data set reflects an earlier,
or delayed, state of the set. For example, if the current time is 09:52 and a member has a delay of an hour, the delayed
member has no operation more recent than 08:52.
Because delayed members are a rolling backup or a running historical snapshot of the data set, they may help
you recover from various kinds of human error. For example, a delayed member can make it possible to recover from
unsuccessful application upgrades and operator errors including dropped databases and collections.
Considerations
Requirements Delayed members:
Must be priority 0 (page 386) members. Set the priority to 0 to prevent a delayed member from becoming
primary.
Should be hidden (page 387) members. Always prevent applications from seeing and querying delayed members.
Do vote in elections for primary.
Behavior Delayed members apply operations from the oplog on a delay. When choosing the amount of delay,
consider that the amount of delay:
must be equal to or greater than your maintenance windows.
must be smaller than the capacity of the oplog. For more information on oplog size, see Oplog Size (page 411).
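The timing in the earlier 09:52 example and the two sizing rules above can be written down as a short sketch. Both helpers are hypothetical, and treating the oplog capacity as a time span in seconds is an assumption made for the illustration.

```javascript
// Hypothetical helpers for reasoning about a delayed member.

// Latest operation time a delayed member can reflect at the moment `now`.
function latestAppliedTime(now, slaveDelaySeconds) {
  return new Date(now.getTime() - slaveDelaySeconds * 1000);
}

// Check the two rules: the delay is at least as long as the maintenance window
// and shorter than the time span that the oplog can hold (assumed to be known).
function isDelayAcceptable(delaySeconds, maintenanceWindowSeconds, oplogWindowSeconds) {
  return (
    delaySeconds >= maintenanceWindowSeconds &&
    delaySeconds < oplogWindowSeconds
  );
}

const now = new Date(Date.UTC(2014, 0, 1, 9, 52)); // 09:52 UTC on an arbitrary day
console.log(latestAppliedTime(now, 3600).toISOString());
// 2014-01-01T08:52:00.000Z
console.log(isDelayAcceptable(3600, 1800, 24 * 3600)); // true
console.log(isDelayAcceptable(3600, 7200, 24 * 3600)); // false
```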
Sharding In sharded clusters, delayed members have limited utility when the balancer is enabled. Because delayed
members replicate chunk migrations with a delay, the state of delayed members in a sharded cluster is not useful for
recovering to a previous state of the sharded cluster if any migrations occur during the delay window.
Example In the following 5-member replica set, the primary and all secondaries have copies of the data set. One
member applies operations with a delay of 3600 seconds, or an hour. This delayed member is also hidden and is a
priority 0 member.
Configuration A delayed member has its priority (page 469) equal to 0, hidden (page 468) equal to true,
and its slaveDelay (page 469) equal to the number of seconds of delay:
{
"_id" : <num>,
"host" : <hostname:port>,
"priority" : 0,
"slaveDelay" : <seconds>,
"hidden" : true
}
To configure a delayed member, see Configure a Delayed Replica Set Member (page 441).
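As a minimal sketch, the member document above can be produced by a small function. The name makeDelayedMember and the host name in the example are hypothetical; the function only builds a plain document with the three settings described in this section and does not change any replica set configuration.

```javascript
// Hypothetical helper: build a member document with the delayed-member settings.
function makeDelayedMember(id, host, delaySeconds) {
  if (!Number.isInteger(delaySeconds) || delaySeconds < 0) {
    throw new RangeError("delaySeconds must be a non-negative integer");
  }
  return {
    _id: id,
    host: host,
    priority: 0, // cannot become primary
    slaveDelay: delaySeconds, // delay in seconds
    hidden: true, // invisible to client applications
  };
}

// The host name is a placeholder for illustration.
const member = makeDelayedMember(4, "mongodb4.example.net:27017", 3600);
console.log(JSON.stringify(member));
// {"_id":4,"host":"mongodb4.example.net:27017","priority":0,"slaveDelay":3600,"hidden":true}
```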
Figure 8.11: Diagram of a 5 member replica set with a hidden delayed priority 0 member.
Replica Set Arbiter
An arbiter does not have a copy of the data set and cannot become a primary. Replica sets may have arbiters to add a vote
in elections for primary (page 397). Arbiters allow replica sets to have an uneven number of members, without the
overhead of a member that replicates data.
Important: Do not run an arbiter on systems that also host the primary or the secondary members of the replica set.
Only add an arbiter to sets with even numbers of members. If you add an arbiter to a set with an odd number of
members, the set may suffer from tied elections. To add an arbiter, see Add an Arbiter to Replica Set (page 431).
Example
For example, in the following replica set, an arbiter allows the set to have an odd number of votes for elections:
Figure 8.12: Diagram of a four member replica set plus an arbiter for odd number of votes.
Security
Authentication When running with auth, arbiters exchange credentials with other members of the set to authenticate. MongoDB encrypts the authentication process. The MongoDB authentication exchange is cryptographically
secure.
Arbiters use keyfiles to authenticate to the replica set.
Communication The only communications between arbiters and other set members are: votes during elections,
heartbeats, and configuration data. These exchanges are not encrypted.
However, if your MongoDB deployment uses SSL, MongoDB will encrypt all communication between replica set
members. See Connect to MongoDB with SSL (page 252) for more information.
As with all MongoDB components, run arbiters in trusted network environments.
Strategies
Determine the Number of Members
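Fault tolerance is the number of members that can become unavailable while the set can still elect a primary.
Number of Members    Majority Required to Elect a New Primary    Fault Tolerance
3                    2                                           1
4                    3                                           1
5                    3                                           2
6                    4                                           2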
Adding a member to the replica set does not always increase the fault tolerance. However, in these cases, additional
members can provide support for dedicated functions, such as backups or reporting.
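The relationship between set size and fault tolerance follows from the majority requirement and can be checked with a few lines of arithmetic. The function names are hypothetical, and the sketch assumes that every member has one vote and that electing a primary requires a strict majority of the members.

```javascript
// Smallest number of votes that forms a strict majority of `members`.
function majorityRequired(members) {
  return Math.floor(members / 2) + 1;
}

// Number of members that can be unavailable while a majority remains.
function faultTolerance(members) {
  return members - majorityRequired(members);
}

for (const n of [3, 4, 5, 6]) {
  console.log(n, majorityRequired(n), faultTolerance(n));
}
// 3 2 1
// 4 3 1
// 5 3 2
// 6 4 2
```

Going from three members to four, or from five to six, raises the required majority without raising the fault tolerance, which is why adding a member does not always help.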
Use Hidden and Delayed Members for Dedicated Functions Add hidden (page 387) or delayed (page 388) members to support dedicated functions, such as backup or reporting.
Load Balance on Read-Heavy Deployments In a deployment with very high read traffic, you can improve read
throughput by distributing reads to secondary members. As your deployment grows, add or move members to alternate
data centers to improve redundancy and availability.
Always ensure that the main facility is able to elect a primary.
Add Capacity Ahead of Demand The existing members of a replica set must have spare capacity to support adding
a new member. Always add new members before the current demand saturates the capacity of the set.
Determine the Distribution of Members
Distribute Members Geographically To protect your data if your main data center fails, keep at least one member
in an alternate data center. Set these members' priority (page 469) to 0 to prevent them from becoming primary.
Keep a Majority of Members in One Location When a replica set has members in multiple data centers, network
partitions can prevent communication between data centers. To replicate data, members must be able to communicate
with other members.
In an election, members must see each other to create a majority. To ensure that the replica set members can confirm
a majority and elect a primary, keep a majority of the sets members in one location.
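The effect of member placement during a network partition can be sketched as follows. The function canElectPrimary and the member fields location, votes, and priority are a hypothetical model for this illustration; it checks only that the reachable location holds a majority of the votes and at least one member that is eligible to become primary.

```javascript
// Hypothetical model: can the members in one location elect a primary when
// that location is cut off from all others?
// members: array of { location, votes, priority }
function canElectPrimary(members, reachableLocation) {
  const totalVotes = members.reduce((sum, m) => sum + m.votes, 0);
  const reachable = members.filter((m) => m.location === reachableLocation);
  const reachableVotes = reachable.reduce((sum, m) => sum + m.votes, 0);
  const hasMajority = reachableVotes > totalVotes / 2;
  const hasCandidate = reachable.some((m) => m.priority > 0);
  return hasMajority && hasCandidate;
}

// Two members in the main data center, one priority 0 member elsewhere.
const members = [
  { location: "main", votes: 1, priority: 1 },
  { location: "main", votes: 1, priority: 1 },
  { location: "remote", votes: 1, priority: 0 },
];
console.log(canElectPrimary(members, "main")); // true
console.log(canElectPrimary(members, "remote")); // false
```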
Target Operations with Tags
Use replica set tags (page 450) to ensure that operations replicate to specific data centers. Tags also support targeting
read operations to specific machines.
See also:
Data Center Awareness (page 159) and Operational Segregation in MongoDB Deployments (page 160).
Use Journaling to Protect Against Power Failures
Enable journaling to protect data against service interruptions. Without journaling MongoDB cannot recover data after
unexpected shutdowns, including power failures and unexpected reboots.
All 64-bit versions of MongoDB after version 2.0 have journaling enabled by default.
Deployment Patterns
The following documents describe common replica set deployment patterns. Other patterns are possible and effective depending on the application's requirements. If needed, combine features of each architecture in your own
deployment:
Three Member Replica Sets (page 392) Three-member replica sets provide the minimum recommended architecture
for a replica set.
Replica Sets with Four or More Members (page 395) Four or more member replica sets provide greater redundancy
and can support greater distribution of read operations and dedicated functionality.
Geographically Distributed Replica Sets (page 396) Geographically distributed sets include members in multiple locations to protect against facility-specific failures, such as power outages.
Three Member Replica Sets
The minimum architecture of a replica set has three members. A three member replica set can have either three
members that hold data, or two members that hold data and an arbiter.
Primary with Two Secondary Members A replica set with three members that store data has:
One primary (page 382).
Two secondary (page 382) members. Both secondaries can become the primary in an election (page 397).
Figure 8.13: Diagram of a 3 member replica set that consists of a primary and two secondaries.
These deployments provide two complete copies of the data set at all times in addition to the primary. These replica
sets provide additional fault tolerance and high availability (page 397). If the primary is unavailable, the replica set
elects a secondary to be primary and continues normal operation. The old primary rejoins the set when available.
Primary with a Secondary and an Arbiter A three member replica set with two members that store data has:
One primary (page 382).
One secondary (page 382) member. The secondary can become primary in an election (page 397).
One arbiter (page 389). The arbiter only votes in elections.
Since the arbiter does not hold a copy of the data, these deployments provide only one complete copy of the data.
Arbiters require fewer resources, at the expense of more limited redundancy and fault tolerance.
However, a deployment with a primary, secondary, and an arbiter ensures that a replica set remains available if the
primary or the secondary is unavailable. If the primary is unavailable, the replica set will elect the secondary to be
primary.
See also:
Deploy a Replica Set (page 420).
Chapter 8. Replication
Figure 8.14: Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary
becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new
primary
Figure 8.15: Diagram of a replica set that consists of a primary, a secondary, and an arbiter.
Figure 8.16: Diagram of an election of a new primary. In a three member replica set with a secondary and an arbiter, the
primary becomes unreachable. The loss of a primary triggers an election where the secondary becomes new primary.
Replica Sets with Four or More Members
Although the standard replica set configuration has three members, you can deploy larger sets. Add additional members
to a set to increase redundancy or to add capacity for distributing secondary read operations.
When adding members, ensure that:
The set has an odd number of voting members. If you have an even number of voting members, deploy an
arbiter (page ??) so that the set has an odd number.
The following replica set needs an arbiter to have an odd number of voting members.
Figure 8.17: Diagram of a four member replica set plus an arbiter for odd number of votes.
A replica set can have up to 12 members, [2] but only 7 voting members. See non-voting members (page 400) for
more information.
The following 9 member replica set has 7 voting members and 2 non-voting members.
Figure 8.18: Diagram of a 9 member replica set with the maximum of 7 voting members.
[2] While replica sets are the recommended solution for production, a replica set can support only 12 members in total. If your deployment requires
more than 12 members, you'll need to use master-slave (page 413) replication. Master-slave replication lacks automatic failover capabilities.
Members that cannot become primary in a failover have priority 0 configuration (page 386).
For instance, some members have limited resources or networking constraints and should never be able to
become primary. Configure members that should not become primary to have priority 0 (page 386). In the following
replica set, the secondary member in the third data center has a priority of 0:
Figure 8.19: Diagram of a 5 member replica set distributed across three data centers. Replica set includes a priority 0
member.
A majority of the set's members should be in your application's main data center.
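The membership limits above can be checked against a configuration document before reconfiguring a set. The following is a minimal sketch in plain JavaScript, assuming a configuration object shaped like the output of rs.conf(). The function name checkMemberLimits is hypothetical, and members without an explicit votes field are assumed to have one vote, which is the default described in this chapter.

```javascript
// Hypothetical helper: checks the limits described above against a
// replica set configuration object shaped like the output of rs.conf().
function checkMemberLimits(cfg) {
  var problems = [];

  // Members with no explicit "votes" field are assumed to have one vote.
  var voting = cfg.members.filter(function (m) {
    return (m.votes === undefined ? 1 : m.votes) > 0;
  }).length;

  if (cfg.members.length > 12) {
    problems.push("more than 12 members");
  }
  if (voting > 7) {
    problems.push("more than 7 voting members");
  }
  if (voting % 2 === 0) {
    problems.push("even number of voting members; consider adding an arbiter");
  }
  return problems;
}
```

An empty array means that none of the three checks failed. The sketch does not check data center placement or priorities.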
See also:
Deploy a Replica Set (page 420), Add an Arbiter to Replica Set (page 431), and Add Members to a Replica Set
(page 433).
Geographically Distributed Replica Sets
Adding members to a replica set in multiple data centers adds redundancy and provides fault tolerance if one data
center is unavailable. Members in additional data centers should have a priority of 0 (page 386) to prevent them from
becoming primary.
For example, the architecture of a geographically distributed replica set may be:
One primary in the main data center.
One secondary member in the main data center. This member can become primary at any time.
One priority 0 (page 386) member in a second data center. This member cannot become primary.
In the following replica set, the primary and one secondary are in Data Center 1, while Data Center 2 has a priority 0
(page 386) secondary that cannot become a primary.
If the primary is unavailable, the replica set will elect a new primary from Data Center 1. If the data centers cannot
connect to each other, the member in Data Center 2 will not become the primary.
If Data Center 1 becomes unavailable, you can manually recover the data set from Data Center 2 with minimal
downtime. With sufficient write concern (page 47), there will be no data loss.
To facilitate elections, the main data center should hold a majority of members. Also ensure that the set has an odd
number of members. If adding a member in another data center results in a set with an even number of members,
deploy an arbiter (page ??). For more information on elections, see Replica Set Elections (page 397).
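As an illustration of the layout described above, the following sketch marks the member in the second data center as priority 0. It is written as plain JavaScript operating on a configuration object. The helper name setPriorityZero, the set name, and the host names are hypothetical placeholders. In the mongo shell, the configuration object would come from rs.conf() and be applied with rs.reconfig(cfg).

```javascript
// Hypothetical helper: marks the members at the given array positions as
// priority 0 so that they cannot become primary.
function setPriorityZero(cfg, memberIndexes) {
  memberIndexes.forEach(function (i) {
    cfg.members[i].priority = 0;
  });
  return cfg;
}

// Example configuration; the set name and host names are placeholders.
var exampleCfg = {
  _id: "rs0",
  members: [
    { _id: 0, host: "dc1-a.example.net:27017" },
    { _id: 1, host: "dc1-b.example.net:27017" },
    { _id: 2, host: "dc2-a.example.net:27017" }
  ]
};

// The member in the second data center is at array position 2.
setPriorityZero(exampleCfg, [2]);
```

The helper addresses members by array position rather than by their _id value, because that is how the members array is indexed.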
Figure 8.20: Diagram of a 3 member replica set distributed across two data centers. Replica set includes a priority 0
member.
See also:
Deploy a Geographically Redundant Replica Set (page 425).
Replica Set Elections
Replica sets use elections to determine which set member will become primary. Elections occur after initiating a
replica set, and also any time the primary becomes unavailable. The primary is the only member in the set that can
accept write operations. If a primary becomes unavailable, elections allow the set to recover normal operations without
manual intervention. Elections are part of the failover process (page 397).
[3] Replica sets remove rollback data when needed without intervention. Administrators must apply or discard rollback data manually.
Important: Elections are essential for independent operation of a replica set; however, elections take time to complete. While an election is in process, the replica set has no primary and cannot accept writes. MongoDB avoids
elections unless necessary.
In the following three-member replica set, the primary is unavailable. The remaining secondaries hold an election to
choose a new primary.
Figure 8.21: Diagram of an election of a new primary. In a three member replica set with two secondaries, the primary
becomes unreachable. The loss of a primary triggers an election where one of the secondaries becomes the new
primary
Members with a priority value of 0 cannot become primary and do not seek election. For details, see Priority 0 Replica
Set Members (page 386).
A replica set does not hold an election as long as the current primary has the highest priority value and is within 10
seconds of the latest oplog entry in the set. If a higher-priority member catches up to within 10 seconds of the latest
oplog entry of the current primary, the set holds an election in order to provide the higher-priority node a chance to
become primary.
Optime The optime is the timestamp of the last operation that a member applied from the oplog. A replica set
member cannot become primary unless it has the highest (i.e. most recent) optime of any visible member in the set.
Connections A replica set member cannot become primary unless it can connect to a majority of the members in the
replica set. For the purposes of elections, a majority refers to the total number of votes, rather than the total number of
members.
If you have a three-member replica set, where every member has one vote, the set can elect a primary as long as two
members can connect to each other. If two members are unavailable, the remaining member remains a secondary
because it cannot connect to a majority of the set's members. If the remaining member is a primary and two members
become unavailable, the primary steps down and becomes a secondary.
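The majority rule can be expressed as a small calculation over votes. The following plain JavaScript sketch illustrates the rule only and is not the server's implementation. The function name hasVotingMajority is hypothetical, and members without a votes field are assumed to have one vote.

```javascript
// Hypothetical helper: returns true when the reachable members hold a
// strict majority of all votes configured in the set.
function hasVotingMajority(members, reachableIds) {
  var votesOf = function (m) {
    return m.votes === undefined ? 1 : m.votes;
  };
  var total = 0;
  var reachable = 0;
  members.forEach(function (m) {
    total += votesOf(m);
    if (reachableIds.indexOf(m._id) !== -1) {
      reachable += votesOf(m);
    }
  });
  return reachable > total / 2;
}

var threeMembers = [{ _id: 0 }, { _id: 1 }, { _id: 2 }];
hasVotingMajority(threeMembers, [0, 1]); // true: two of three votes
hasVotingMajority(threeMembers, [0]);    // false: one of three votes
```

The comparison counts votes rather than members, so a non-voting member does not contribute to either side.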
Network Partitions Network partitions affect the formation of a majority for an election. If a primary steps down
and neither portion of the replica set has a majority, the set will not elect a new primary. The replica set becomes
read-only.
To avoid this situation, place a majority of instances in one data center and a minority of instances in any other data
centers combined.
Election Mechanics
Election Triggering Events Replica sets hold an election any time there is no primary. Specifically, the following events trigger an election:
the initiation of a new replica set.
a secondary loses contact with a primary. Secondaries call for elections when they cannot see a primary.
a primary steps down.
Note: Priority 0 members (page 386) do not trigger elections, even when they cannot connect to the primary.
A primary will step down:
after receiving the replSetStepDown command.
if one of the current secondaries is eligible for election and has a higher priority.
if the primary cannot contact a majority of the members of the replica set.
Important: When a primary steps down, it closes all open client connections, so that clients don't attempt to write
data to a secondary. This helps clients maintain an accurate view of the replica set and helps prevent rollbacks.
Participation in Elections Every replica set member has a priority that helps determine its eligibility to become a
primary. In an election, the replica set elects an eligible member with the highest priority (page 469) value as
primary. By default, all members have a priority of 1 and have an equal chance of becoming primary. By default,
all members can also trigger an election.
You can set the priority (page 469) value to weight the election in favor of a particular member or group of
members. For example, if you have a geographically distributed replica set (page 396), you can adjust priorities so
that only members in a specific data center can become primary.
The first member to receive the majority of votes becomes primary. By default, all members have a single vote, unless
you modify the votes (page 469) setting. Non-voting members (page 442) have a votes (page 469) value of 0.
The state of a member also affects its eligibility to vote. Only members in the following states can vote: PRIMARY,
SECONDARY, RECOVERING, ARBITER, and ROLLBACK.
Important: Do not alter the number of votes in a replica set to control the outcome of an election. Instead, modify
the priority (page 469) value.
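The voting rules above can be summarized in a short check. This plain JavaScript sketch assumes member documents that carry a state name in a stateStr field, as in replica set status output, and an optional votes field that defaults to 1. The function name canVote is hypothetical.

```javascript
// States in which a member may vote, as listed above.
var VOTING_STATES = ["PRIMARY", "SECONDARY", "RECOVERING", "ARBITER", "ROLLBACK"];

// Hypothetical helper: a member can vote if it has at least one vote
// and is currently in one of the voting states.
function canVote(member) {
  var votes = member.votes === undefined ? 1 : member.votes;
  return votes > 0 && VOTING_STATES.indexOf(member.stateStr) !== -1;
}
```

For example, canVote({ stateStr: "ARBITER" }) returns true, while a member with votes set to 0 returns false in any state.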
Vetoes in Elections All members of a replica set can veto an election, including non-voting members (page 400). A
member will veto an election:
If the member seeking an election is not a member of the voters set.
If the member seeking an election is not up-to-date with the most recent operation accessible in the replica set.
If the member seeking an election has a lower priority than another member in the set that is also eligible for
election.
If a priority 0 member (page 386) [4] is the most current member at the time of the election. In this case, another
eligible member of the set will catch up to the state of this secondary member and then attempt to become
primary.
If the current primary has more recent operations (i.e. a higher optime) than the member seeking election,
from the perspective of the voting member.
If the current primary has the same or more recent operations (i.e. a higher or equal optime) than the member
seeking election.
Non-Voting Members Non-voting members hold copies of the replica set's data and can accept read operations from
client applications. Non-voting members do not vote in elections, but can veto (page 400) an election and become
primary.
Because a replica set can have up to 12 members but only up to seven voting members, non-voting members allow a
replica set to have more than seven members.
For instance, the following nine-member replica set has seven voting members and two non-voting members.
A non-voting member has a votes (page 469) setting equal to 0 in its member configuration:
{
"_id" : <num>,
"host" : <hostname:port>,
"votes" : 0
}
Important: Do not alter the number of votes to control which members will become primary. Instead, modify the
priority (page 469) option. Only alter the number of votes in exceptional cases, for example, to permit more than
seven members.
Figure 8.22: Diagram of a 9 member replica set with the maximum of 7 voting members.
[4] Remember that hidden (page 387) and delayed (page 388) imply priority 0 (page 386) configuration.
When possible, all members should have only one vote. Changing the number of votes can cause ties, deadlocks, and
the wrong members to become primary.
To configure a non-voting member, see Configure Non-Voting Replica Set Member (page 442).
Rollbacks During Replica Set Failover
A rollback reverts write operations on a former primary when the member rejoins its replica set after a failover.
A rollback is necessary only if the primary had accepted write operations that the secondaries had not successfully
replicated before the primary stepped down. When the primary rejoins the set as a secondary, it reverts, or rolls back,
its write operations to maintain database consistency with the other members.
MongoDB attempts to avoid rollbacks, which should be rare. When a rollback does occur, it is often the result of a
network partition. Secondaries that cannot keep up with the throughput of operations on the former primary increase
the size and impact of the rollback.
A rollback does not occur if the write operations replicate to another member of the replica set before the primary
steps down and if that member remains available and accessible to a majority of the replica set.
Collect Rollback Data When a rollback does occur, administrators must decide whether to apply or ignore the
rollback data. MongoDB writes the rollback data to BSON files in the rollback/ folder under the database's
dbpath directory. The names of rollback files have the following form:
<database>.<collection>.<timestamp>.bson
For example:
records.accounts.2011-05-09T18-10-04.0.bson
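When many rollback files accumulate, it can help to split their names into parts. The following plain JavaScript sketch assumes that the timestamp always follows the pattern of the single example above (date, time with hyphens, then a numeric suffix). The function name parseRollbackFileName is hypothetical.

```javascript
// Hypothetical helper: splits a rollback file name of the form
// <database>.<collection>.<timestamp>.bson into its parts.
// The timestamp pattern is inferred from the one example above.
function parseRollbackFileName(name) {
  var pattern = /^([^.]+)\.(.+)\.(\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}\.\d+)\.bson$/;
  var match = pattern.exec(name);
  if (match === null) {
    return null; // not a recognizable rollback file name
  }
  return { database: match[1], collection: match[2], timestamp: match[3] };
}

var parsed = parseRollbackFileName("records.accounts.2011-05-09T18-10-04.0.bson");
// parsed.database is "records", parsed.collection is "accounts",
// parsed.timestamp is "2011-05-09T18-10-04.0"
```

The collection part is allowed to contain dots, since collection names can include them.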
Administrators must apply rollback data manually after the member completes the rollback and returns to secondary
status. Use bsondump to read the contents of the rollback files. Then use mongorestore to apply the changes to
the new primary.
Avoid Replica Set Rollbacks To prevent rollbacks, use replica acknowledged write concern (page 49) to guarantee
that the write operations propagate to the members of a replica set.
Rollback Limitations A mongod instance will not roll back more than 300 megabytes of data. If your system must
roll back more than 300 megabytes, you must manually intervene to recover the data. If this is the case, the following
line will appear in your mongod log:
[replica set sync] replSet syncThread: 13410 replSet too much data to roll back
In this situation, save the data directly or force the member to perform an initial sync. To force initial sync, sync from
a current member of the set by deleting the content of the dbpath directory for the member that requires a larger
rollback.
See also:
Replica Set High Availability (page 397) and Replica Set Elections (page 397).
From the perspective of a client application, whether a MongoDB instance is running as a single server (i.e. standalone) or a replica set is transparent. However, replica sets offer some configuration options for write and read
operations. [5]
Verify Write Operations
The default write concern confirms write operations only on the primary. You can configure write concern to confirm
write operations to additional replica set members as well by issuing the getLastError command with the w
option.
The w option confirms that write operations have replicated to the specified number of replica set members, including
the primary. You can either specify a number or specify majority, which ensures the write propagates to a majority
of set members.
If you specify a w value greater than the number of members that hold a copy of the data (i.e., greater than the number
of non-arbiter members), the operation blocks until those members become available. This can cause the operation to
block forever. To specify a timeout threshold for the getLastError operation, use the wtimeout argument. A
wtimeout value of 0 means that the operation will never time out.
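To make the w and wtimeout options concrete, the following plain JavaScript sketch builds a getLastError command document. The helper name buildGetLastError is hypothetical, and the 5000 millisecond timeout is only an example value. In the mongo shell, the resulting document would be passed to db.runCommand().

```javascript
// Hypothetical helper: builds a getLastError command document with a
// write concern level and an optional timeout in milliseconds.
function buildGetLastError(w, wtimeoutMs) {
  var command = { getLastError: 1, w: w };
  // Without wtimeout (or with a value of 0) the operation can block
  // forever if too few members are available.
  if (wtimeoutMs !== undefined) {
    command.wtimeout = wtimeoutMs;
  }
  return command;
}

// Wait for a majority of members, but give up after 5 seconds.
var majorityCommand = buildGetLastError("majority", 5000);
// In the mongo shell: db.runCommand(majorityCommand)
```

Setting an explicit timeout is a defensive choice for values of w that depend on members which may be unavailable.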
See getLastError Examples for example invocations.
Modify Default Write Concern
You can configure your own default getLastError behavior for a replica set. Use the getLastErrorDefaults (page 470)
setting in the replica set configuration (page 467). The following sequence of commands creates a configuration that
waits for the write operation to complete on a majority of the set members before returning:
cfg = rs.conf()
cfg.settings = {}
cfg.settings.getLastErrorDefaults = {w: "majority"}
rs.reconfig(cfg)
The getLastErrorDefaults (page 470) setting affects only those getLastError commands that have no
other arguments.
Note: Use of insufficient write concern can lead to rollbacks (page 401) in the case of replica set failover (page 397).
Always ensure that your operations have specified the required write concern for your application.
See also:
Write Concern (page 47) and connections-write-concern
Custom Write Concerns
You can use replica set tags to create custom write concerns using the getLastErrorDefaults (page 470) and
getLastErrorModes (page 470) replica set settings.
Note: Custom write concern modes specify the field name and a number of distinct values for that field. By contrast,
read preferences use the value of fields in the tag document to direct read operations.
In some cases, you may be able to use the same tags for read preferences and write concerns; however, you may need
to create additional tags for write concerns depending on the requirements of your application.
[5] Sharded clusters where the shards are also replica sets provide the same configuration options with regard to write and read operations.
Figure 8.23: Write operation to a replica set with write concern level of w:2 or write to the primary and at least one
secondary.
Consider a five member replica set, where each member has one of the following tag sets:
{ "use": "reporting" }
{ "use": "backup" }
{ "use": "application" }
{ "use": "application" }
{ "use": "application" }
You could create a custom write concern mode that will ensure that applicable write operations will not return until
members with two different values of the use tag have acknowledged the write operation. Create the mode with the
following sequence of operations in the mongo shell:
cfg = rs.conf()
cfg.settings = { getLastErrorModes: { use2: { "use": 2 } } }
rs.reconfig(cfg)
To use this mode pass the string use2 to the w option of getLastError as follows:
db.runCommand( { getLastError: 1, w: "use2" } )
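To illustrate how such a mode counts distinct tag values, the following plain JavaScript sketch checks whether a group of acknowledging members would satisfy a mode. It illustrates the semantics described above and is not the server's implementation. The function name modeSatisfied is hypothetical.

```javascript
// Hypothetical helper: returns true when the acknowledging members cover
// the number of distinct values that the mode requires for each tag.
function modeSatisfied(mode, acknowledgedTagSets) {
  return Object.keys(mode).every(function (tagName) {
    var distinct = {};
    acknowledgedTagSets.forEach(function (tags) {
      if (tags[tagName] !== undefined) {
        distinct[tags[tagName]] = true;
      }
    });
    return Object.keys(distinct).length >= mode[tagName];
  });
}

var use2 = { use: 2 };
modeSatisfied(use2, [{ use: "application" }, { use: "application" }]); // false
modeSatisfied(use2, [{ use: "application" }, { use: "backup" }]);      // true
```

Two members with the same use value count once, which is why acknowledgement from two application members alone does not satisfy the mode.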
If you have a three member replica set with the following tag sets:
{ "disk": "ssd" }
{ "disk": "san" }
{ "disk": "spinning" }
You cannot specify a custom getLastErrorModes (page 470) value to ensure that the write propagates to the san
before returning. However, you may implement this write concern policy by creating the following additional tags, so
that the set resembles the following:
{ "disk": "ssd" }
{ "disk": "san", "disk.san": "san" }
{ "disk": "spinning" }
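The text does not show the definition of the mode itself. A definition consistent with the tags above might look like the following sketch, where the mode name san and the requirement of one distinct value of the disk.san tag are assumptions inferred from the surrounding text.

```javascript
// Assumed mode definition: a write must be acknowledged by members that
// cover one distinct value of the "disk.san" tag.
var sanSettings = { getLastErrorModes: { san: { "disk.san": 1 } } };

// In the mongo shell, the settings could be applied in the same way as
// the use2 mode earlier in this section:
//   cfg = rs.conf()
//   cfg.settings = sanSettings
//   rs.reconfig(cfg)
```

Only the member tagged with disk.san can satisfy this mode, because no other member carries that tag.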
To use this mode pass the string san to the w option of getLastError as follows:
db.runCommand( { getLastError: 1, w: "san" } )
This operation will not return until a replica set member with the tag disk.san acknowledges the write operation.
You may set a custom write concern mode as the default write concern mode using the getLastErrorDefaults
(page 470) replica set setting, as in the following example, which uses the use2 mode defined above:
cfg = rs.conf()
cfg.settings.getLastErrorDefaults = { w: "use2" }
rs.reconfig(cfg)
See also:
Configure Replica Set Tag Sets (page 450) for further information about replica set reconfiguration and tag sets.
Read Preference
Read preference describes how MongoDB clients route read operations to members of a replica set.
By default, an application directs its read operations to the primary member in a replica set. Reading from the primary
guarantees that read operations reflect the latest version of a document. However, by distributing some or all reads to
secondary members of the replica set, you can improve read throughput or reduce latency for an application that does
not require fully up-to-date data.
Important: You must exercise care when specifying read preferences: modes other than primary (page 476) can
and will return stale data because the secondary queries will not include the most recent write operations to the replica
set's primary.
The following are common use cases for using non-primary (page 476) read preference modes:
Running systems operations that do not affect the front-end application.
Issuing reads to secondaries helps distribute load and prevent operations from affecting the main workload of
the primary. This can be a good choice for reporting and analytics workloads, for example.
Note: Read preferences aren't relevant to direct connections to a single mongod instance. However, in order
to perform read operations on a direct connection to a secondary member of a replica set, you must set a read
preference, such as secondary.
Providing local reads for geographically distributed applications.
If you have application servers in multiple data centers, you may consider having a geographically distributed
replica set (page 396) and using a non-primary read preference or nearest (page 477). This allows the
client to read from the lowest-latency members, rather than always reading from the primary.
Maintaining availability during a failover.
Use primaryPreferred (page 476) if you want an application to read from the primary under normal
circumstances, but to allow stale reads from secondaries in an emergency. This provides a read-only mode for
your application during a failover.
Note: In general, do not use secondary (page 476) and secondaryPreferred (page 477) to provide extra capacity.
Sharding (page 479) increases read and write capacity by distributing read and write operations across a group of
machines, and is often a better strategy for adding capacity.
See Read Preference Processes (page 408) for more information about the internal application of read preferences.
Figure 8.24: Read operations to a replica set. Default read preference routes the read to the primary. Read preference
of nearest routes the read to the nearest member.
For more information, see read preference background (page 406) and read preference behavior (page 408). See also
the documentation for your driver [7].
Tag Sets
Tag sets allow you to target read operations to specific members of a replica set.
Custom read preferences and write concerns evaluate tag sets in different ways. Read preferences consider the value
of a tag when selecting a member to read from. Write concerns ignore the value of a tag when selecting a member,
except to consider whether or not the value is unique.
You can specify tag sets with the following read preference modes:
primaryPreferred (page 476)
secondary (page 476)
secondaryPreferred (page 477)
nearest (page 477)
Tags are not compatible with mode primary (page 476) and, in general, only apply when selecting (page 408) a
secondary member of a set for a read operation. However, the nearest (page 477) read mode, when combined with
a tag set, selects the matching member with the lowest network latency. This member may be a primary or secondary.
All interfaces use the same member selection logic (page 408) to choose the member to which to direct read operations,
basing the choice on read preference mode and tag sets.
For information on configuring tag sets, see the Configure Replica Set Tag Sets (page 450) tutorial.
For more information on how read preference modes (page 476) interact with tag sets, see the documentation for each
read preference mode (page 476).
Read Preference Processes
Changed in version 2.2.
MongoDB drivers use the following procedures to direct operations to replica sets and sharded clusters. To determine
how to route their operations, applications periodically update their view of the replica set's state, identifying which
members are up or down, which member is primary, and verifying the latency to each mongod instance.
Member Selection
Clients, by way of their drivers, and mongos instances for sharded clusters, periodically update their view of the
replica set's state.
When you select a non-primary (page 476) read preference, the driver will determine which member to target using
the following process:
1. Assembles a list of suitable members, taking into account member type (i.e. secondary, primary, or all members).
2. Excludes members not matching the tag sets, if specified.
3. Determines which suitable member is the closest to the client in absolute terms.
4. Builds a list of members that are within a defined ping distance (in milliseconds) of the absolute nearest
member.
Applications can configure the threshold used in this stage. The default acceptable latency is 15 milliseconds,
which you can override in the drivers with their own secondaryAcceptableLatencyMS option. For
mongos you can use the --localThreshold or localThreshold runtime options to set this value.
5. Selects a member from these hosts at random. The member receives the read operation.
[7] http://api.mongodb.org/
Drivers can then associate the thread or connection with the selected member. This request association (page 409) is
configurable by the application. See your driver (page 95) documentation about request association configuration and
default behavior.
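The five member selection steps can be sketched in plain JavaScript as follows. This is an illustration under stated assumptions rather than any driver's actual code: the function name selectMember, the shape of the member documents (host, type, tags, pingMs), and the injectable random function are all hypothetical.

```javascript
// Hypothetical sketch of the member selection steps described above.
// members: [{ host, type: "primary" | "secondary", tags: {...}, pingMs }]
// options.types: member types allowed by the read preference mode
// options.tags: tag set that members must match (optional)
// options.thresholdMs: acceptable latency window (defaults to 15)
// options.random: function returning a number in [0, 1) (defaults to Math.random)
function selectMember(members, options) {
  var thresholdMs = options.thresholdMs === undefined ? 15 : options.thresholdMs;
  var random = options.random || Math.random;
  var tags = options.tags || {};

  // Steps 1 and 2: keep members of a suitable type that match the tag set.
  var suitable = members.filter(function (m) {
    if (options.types.indexOf(m.type) === -1) {
      return false;
    }
    return Object.keys(tags).every(function (key) {
      return m.tags !== undefined && m.tags[key] === tags[key];
    });
  });
  if (suitable.length === 0) {
    return null;
  }

  // Step 3: find the lowest ping time among the suitable members.
  var nearestMs = Math.min.apply(null, suitable.map(function (m) {
    return m.pingMs;
  }));

  // Step 4: keep members within the threshold of the nearest member.
  var candidates = suitable.filter(function (m) {
    return m.pingMs - nearestMs <= thresholdMs;
  });

  // Step 5: pick one of the candidates at random.
  return candidates[Math.floor(random() * candidates.length)];
}
```

Passing the random function as an option makes the last step repeatable in tests. A real driver would also refresh ping times periodically, which this sketch leaves out.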
Request Association
Important: Request association is configurable by the application. See your driver (page 95) documentation about
request association configuration and default behavior.
Because secondary members of a replica set may lag behind the current primary by different amounts, reads for
secondary members may reflect data at different points in time. To prevent sequential reads from jumping around in
time, the driver can associate application threads to a specific member of the set after the first read, thereby preventing
reads from other members. The thread will continue to read from the same member until:
The application performs a read with a different read preference,
The thread terminates, or
The client receives a socket exception, as is the case when there's a network error or when the mongod closes
connections during a failover. This triggers a retry (page 409), which may be transparent to the application.
When using request association, if the client detects that the set has elected a new primary, the driver will discard all
associations between threads and members.
Auto-Retry
Connections between MongoDB drivers and mongod instances in a replica set must balance two concerns:
1. The client should attempt to prefer current results, and any connection should read from the same member of
the replica set as much as possible.
2. The client should minimize the amount of time that the database is inaccessible as the result of a connection
issue, networking problem, or failover in a replica set.
As a result, MongoDB drivers and mongos:
Reuse a connection to a specific mongod for as long as possible after establishing a connection to that instance.
This connection is pinned to this mongod.
Attempt to reconnect to a new member, obeying existing read preference modes (page 476), if the connection to
mongod is lost.
Reconnections are transparent to the application itself. If the connection permits reads from secondary members, after reconnecting, the application can receive two sequential reads returning from different secondaries.
Depending on the state of the individual secondary members' replication, the documents can reflect the state of
your database at different moments.
Return an error only after attempting to connect to three members of the set that match the read preference mode
(page 476) and tag set (page 408). If there are fewer than three members of the set, the client will error after
connecting to all existing members of the set.
8.2. Replication Concepts
After this error, the driver selects a new member using the specified read preference mode. In the absence of a
specified read preference, the driver uses primary (page 476).
After detecting a failover situation, the driver attempts to refresh the state of the replica set as quickly as
possible.
Changed in version 2.2: Before version 2.2, mongos did not support the read preference mode semantics (page 476).
In most sharded clusters, each shard consists of a replica set. As such, read preferences are also applicable. With
regard to read preference, read operations in a sharded cluster are identical to unsharded replica sets.
Unlike simple replica sets, in sharded clusters, all interactions with the shards pass from the clients to the mongos
instances that are actually connected to the set members. mongos is then responsible for the application of read
preferences, which is transparent to applications.
There are no configuration changes required for full support of read preference modes in sharded environments, as long
as the mongos is at least version 2.2. All mongos maintain their own connection pool to the replica set members.
As a result:
A request without a specified preference has primary (page 476), the default, unless the mongos reuses an
existing connection that has a different mode set.
To prevent confusion, always explicitly set your read preference mode.
All nearest (page 477) and latency calculations reflect the connection between the mongos and the mongod
instances, not the client and the mongod instances.
This produces the desired result, because all results must pass through the mongos before returning to the
client.
initial sync
post-rollback catch-up
sharding chunk migrations
Oplog Size
When you start a replica set member for the first time, MongoDB creates an oplog of a default size. The size depends
on the architectural details of your operating system.
In most cases, the default oplog size is sufficient. For example, if an oplog is 5% of free disk space and fills up in 24
hours of operations, then secondaries can stop copying entries from the oplog for up to 24 hours without becoming
stale. However, most replica sets have much lower operation volumes, and their oplogs can hold much higher numbers
of operations.
Before mongod creates an oplog, you can specify its size with the oplogSize option. However, after you have
started a replica set member for the first time, you can only change the size of the oplog using the Change the Size of
the Oplog (page 445) procedure.
By default, the size of the oplog is as follows:
For 64-bit Linux, Solaris, FreeBSD, and Windows systems, MongoDB allocates 5% of the available free disk
space to the oplog. If this amount is smaller than a gigabyte, then MongoDB allocates 1 gigabyte of space.
For 64-bit OS X systems, MongoDB allocates 183 megabytes of space to the oplog.
For 32-bit systems, MongoDB allocates about 48 megabytes of space to the oplog.
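The default sizes listed above can be written as a small function. This plain JavaScript sketch encodes only the rules stated in this section. The function name defaultOplogSizeBytes and the platform labels are hypothetical, and a megabyte is assumed to be 1024 * 1024 bytes.

```javascript
var MB = 1024 * 1024;
var GB = 1024 * MB;

// Hypothetical helper: default oplog size in bytes for a platform label.
// platform: "32bit", "64bit-osx", or any other value for 64-bit Linux,
// Solaris, FreeBSD, and Windows systems.
function defaultOplogSizeBytes(platform, freeDiskBytes) {
  if (platform === "32bit") {
    return 48 * MB; // "about 48 megabytes"
  }
  if (platform === "64bit-osx") {
    return 183 * MB;
  }
  // 5% of free disk space, but never less than 1 gigabyte.
  return Math.max(freeDiskBytes / 20, 1 * GB);
}
```

For example, with 100 gigabytes of free disk space on 64-bit Linux the function returns 5 gigabytes, and with 10 gigabytes free it returns the 1 gigabyte minimum.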
Workloads that Might Require a Larger Oplog Size
If you can predict your replica set's workload to resemble one of the following patterns, then you might want to create
an oplog that is larger than the default. Conversely, if your application predominantly performs reads and writes only
a small amount of data, a smaller oplog may be sufficient.
The following workloads might require a larger oplog size.
Updates to Multiple Documents at Once The oplog must translate multi-updates into individual operations in order
to maintain idempotency. This can use a great deal of oplog space without a corresponding increase in data size or
disk use.
Deletions Equal the Same Amount of Data as Inserts If you delete roughly the same amount of data as you insert,
the database will not grow significantly in disk use, but the size of the operation log can be quite large.
Significant Number of In-Place Updates If a significant portion of the workload is in-place updates, the database
records a large number of operations but does not change the quantity of data on disk.
Oplog Status
To view oplog status, including the size and the time range of operations, issue the
db.printReplicationInfo() method. For more information on oplog status, see Check the Size of the
Oplog (page 464).
Under various exceptional situations, updates to a secondary's oplog might lag behind the desired performance time.
Use db.getReplicationInfo() from a secondary member and the replication status output to assess
the current state of replication and determine if there is any unintended replication delay.
See Replication Lag (page 461) for more information.
Replica Set Data Synchronization
In order to maintain up-to-date copies of the shared data set, members of a replica set sync or replicate data from other
members. MongoDB uses two forms of data synchronization: initial sync (page 412) to populate new members with
the full data set, and replication to apply ongoing changes to the entire data set.
Initial Sync
Initial sync copies all the data from one member of the replica set to another member. A member uses initial sync
when the member has no data, such as when the member is new, or when the member has data but is missing a history
of the set's replication.
When you perform an initial sync, MongoDB does the following:
1. Clones all databases. To clone, the mongod queries every collection in each source database and inserts all data
into its own copies of these collections.
2. Applies all changes to the data set. Using the oplog from the source, the mongod updates its data set to reflect
the current state of the replica set.
3. Builds all indexes on all collections.
When the mongod finishes building all indexes, the member can transition to a normal state, i.e. secondary.
To perform an initial sync, see Resync a Member of a Replica Set (page 449).
Replication
Replica set members replicate data continuously after the initial sync. This process keeps the members up to date
with all changes to the replica set's data. In most cases, secondaries synchronize from the primary. Secondaries
may automatically change their sync targets if needed based on changes in the ping time and state of other members'
replication.
For a member to sync from another, the buildIndexes (page 468) setting for both members must have the same
value: buildIndexes (page 468) must be either true for both members or false for both members.
Beginning in version 2.2, secondaries avoid syncing from delayed members (page 388) and hidden members
(page 387).
Validity and Durability
In a replica set, only the primary can accept write operations. Writing only to the primary provides strict consistency
among members.
Journaling provides single-instance write durability. Without journaling, if a MongoDB instance terminates ungracefully, you must assume that the database is in an invalid state.
Chapter 8. Replication
Multithreaded Replication
MongoDB applies write operations in batches using multiple threads to improve concurrency. MongoDB groups
batches by namespace and applies operations using a group of threads, but always applies the write operations to a
namespace in order.
While applying a batch, MongoDB blocks all reads. As a result, secondaries can never return data that reflects a state
that never existed on the primary.
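The grouping described above can be sketched as follows. This is an illustrative model only, not MongoDB's replication code: the function name groupBatchByNamespace and the ns and seq fields are hypothetical. It shows how a batch can be split by namespace while the operations for each namespace keep their original order.

```javascript
// Hypothetical batch entries: ns is the namespace, seq the original position.
function groupBatchByNamespace(batch) {
    var groups = {};
    for (var i = 0; i < batch.length; i++) {
        var op = batch[i];
        if (!Object.prototype.hasOwnProperty.call(groups, op.ns)) {
            groups[op.ns] = [];
        }
        groups[op.ns].push(op);   // appending preserves per-namespace order
    }
    return groups;
}

var batch = [
    { ns: "app.users",  seq: 1 },
    { ns: "app.orders", seq: 2 },
    { ns: "app.users",  seq: 3 }
];
var groups = groupBatchByNamespace(batch);
```

Each group could then be handed to a separate worker, because operations in different namespaces do not need to be ordered relative to one another in this model.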
Pre-Fetching Indexes to Improve Replication Throughput
To help improve the performance of applying oplog entries, MongoDB fetches memory pages that hold affected data
and indexes. This pre-fetch stage minimizes the amount of time MongoDB holds the write lock while applying oplog
entries. By default, secondaries will pre-fetch all Indexes (page 313).
Optionally, you can disable all pre-fetching or only pre-fetch the index on the _id field. See the
replIndexPrefetch setting for more information.
To configure a master-slave deployment, start two mongod instances: one in master mode, and the other in slave
mode.
To start a mongod instance in master mode, invoke mongod as follows:
mongod --master --dbpath /data/masterdb/
With the --master option, the mongod will create a local.oplog.$main (page 474) collection, which is the operation log that queues operations that the slaves will apply to replicate operations from the master. The --dbpath
option is optional.
To start a mongod instance in slave mode, invoke mongod as follows:
mongod --slave --source <masterhostname><:<port>> --dbpath /data/slavedb/
Specify the hostname and port of the master instance to the --source argument. The --dbpath is optional.
For slave instances, MongoDB stores data about the source server in the local.sources (page 474) collection.
As an alternative to specifying the --source run-time option, you can add a document to local.sources (page 474)
specifying the master instance, as in the following operation in the mongo shell:
1 use local
2 db.sources.find()
3 db.sources.insert( { host: <masterhostname> <,only: databasename> } );
In line 1, you switch context to the local database. In line 2, the find() operation should return no documents, to
ensure that there are no documents in the sources collection. Finally, line 3 uses db.collection.insert()
to insert the source document into the local.sources (page 474) collection. The model of the local.sources
(page 474) document is as follows:
host
The host field specifies the master mongod instance, and holds a resolvable hostname, i.e. an IP address, a
name from a host file, or preferably a fully qualified domain name.
You can append <:port> to the host name if the mongod is not running on the default 27017 port.
only
Optional. Specify a name of a database. When specified, MongoDB will only replicate the indicated database.
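The document model above can be assembled programmatically. The following is a minimal sketch under stated assumptions: buildSourceDocument is a hypothetical helper, and the host and database names in the usage comment are placeholders. It follows the two rules described above: append :port only when the master does not use the default port 27017, and include only when a single database should be replicated.

```javascript
// Hypothetical helper that builds a document for the local.sources collection.
function buildSourceDocument(hostname, port, onlyDatabase) {
    var doc = { host: hostname };
    if (port !== undefined && port !== 27017) {
        doc.host = hostname + ":" + port;   // append :port for non-default ports
    }
    if (onlyDatabase) {
        doc.only = onlyDatabase;            // optional: replicate one database only
    }
    return doc;
}

// Possible use in the mongo shell, after "use local":
//   db.sources.insert( buildSourceDocument("master.example.net", 27018, "inventory") )
```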
Operational Considerations for Replication with Master Slave Deployments
Master instances store operations in an oplog which is a capped collection (page 161). As a result, if a slave falls too
far behind the state of the master, it cannot catch up and must re-sync from scratch. A slave may become out of sync
with a master if:
The slave falls far behind the data updates available from that master.
The slave stops (i.e. shuts down) and restarts later after the master has overwritten the relevant operations in
its oplog.
When slaves are out of sync, replication stops. Administrators must intervene manually to restart replication. Use the
resync command. Alternatively, the --autoresync option allows a slave to restart replication automatically, after a
ten-second pause, when the slave falls out of sync with the master. With --autoresync specified, the slave will only
attempt to re-sync once in a ten-minute period.
To prevent these situations you should specify a larger oplog when you start the master instance, by adding the
--oplogSize option when starting mongod. If you do not specify --oplogSize, mongod will allocate 5%
of available disk space on start up to the oplog, with a minimum of 1GB for 64bit machines and 50MB for 32bit
machines.
Run time Master-Slave Configuration
MongoDB provides a number of run time configuration options for mongod instances in master-slave deployments.
You can specify these options in configuration files (page 146) or on the command-line. See documentation of the
following:
For master nodes:
master
slave
For slave nodes:
source
only
slaveDelay
Also consider the Master-Slave Replication Command Line Options for related options.
Diagnostics
On a master instance, issue the following operation in the mongo shell to return replication status from the perspective
of the master:
db.printReplicationInfo()
On a slave instance, use the following operation in the mongo shell to return the replication status from the perspective
of the slave:
db.printSlaveReplicationInfo()
Use the serverStatus command, as in the following operation, to return the status of replication:
db.serverStatus()
See server status repl fields for documentation of the relevant section of output.
Security
When running with auth enabled, in master-slave deployments configure a keyFile so that slave mongod instances can authenticate and communicate with the master mongod instance.
To enable authentication and configure the keyFile add the following option to your configuration file:
keyFile = /srv/mongodb/keyfile
Note: You may choose to set this run-time configuration option using the --keyFile option on the command line.
Setting keyFile enables authentication and specifies a key file for the mongod instances to use when authenticating
to each other. The content of the key file is arbitrary but must be the same on all members of the deployment so that they can
connect to each other.
The key file must be less than one kilobyte in size and may only contain characters in the base64 set. The key file must not
have group or world permissions on UNIX systems. Use the following command to use the OpenSSL package to
generate random content for use in a key file:
openssl rand -base64 741
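The two content rules for a key file can be checked before deployment. The sketch below is illustrative only and is not how mongod validates the file: isValidKeyFileContent is a hypothetical name, one kilobyte is assumed to mean 1024 bytes, whitespace such as the line breaks in the openssl output is assumed to be ignored, and an empty file is treated as invalid by assumption.

```javascript
// Hypothetical check of key file content against the two stated rules.
function isValidKeyFileContent(content) {
    var MAX_BYTES = 1024;                        // assumption: 1 KB = 1024 bytes
    if (content.length >= MAX_BYTES) {           // must be less than one kilobyte
        return false;
    }
    var stripped = content.replace(/\s+/g, "");  // assumption: whitespace is ignored
    return /^[A-Za-z0-9+\/=]+$/.test(stripped);  // base64 character set only
}
```

The function does not check file permissions; those still need to be verified on the host itself.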
See also:
Security (page 237) for more information about security in MongoDB
Ongoing Administration and Operation of Master-Slave Deployments
Deploy Master-Slave Equivalent using Replica Sets
If you want a replication configuration that resembles master-slave replication using replica sets, consider the following replica configuration document. In this deployment hosts <master> and <slave> 9 provide
9 In replica set configurations, the host (page 468) field must hold a resolvable hostname.
See Replica Set Configuration (page 467) for more information about replica set configurations.
Convert a Master-Slave Deployment to a Replica Set
To convert a master-slave deployment to a replica set, restart the current master as a one-member replica set. Then
remove the data directories from previous secondaries and add them as new secondaries to the new replica set.
1. To confirm that the current instance is master, run:
db.isMaster()
2. Shut down the mongod processes on the master and all slave(s), using the following command while connected
to each instance:
db.adminCommand({shutdown : 1, force : true})
3. Back up your /data/db directories, in case you need to revert to the master-slave deployment.
4. Start the former master with the --replSet option, as in the following:
mongod --replSet <setname>
5. Connect to the mongod with the mongo shell, and initiate the replica set with the following command:
rs.initiate()
When the command returns, you will have successfully deployed a one-member replica set. You can check the
status of your replica set at any time by running the following command:
rs.status()
You can now follow the convert a standalone to a replica set (page 432) tutorial to deploy your replica set, picking up
from the Expand the Replica Set (page 432) section.
Failing over to a Slave (Promotion)
To permanently fail over from an unavailable or damaged master (A in the following example) to a slave (B):
1. Shut down A.
2. Stop mongod on B.
3. Back up and move all data files that begin with local on B from the dbpath.
Warning: Removing local.* is irrevocable and cannot be undone. Perform this step with extreme
caution.
If you have a master (A) and a slave (B) and you would like to reverse their roles, follow this procedure. The procedure
assumes A is healthy, up-to-date and available.
If A is not healthy but the hardware is okay (power outage, server crash, etc.), skip steps 1 and 2, and in step 8 replace
all of A's files with B's files.
If A is not healthy and the hardware is not okay, replace A with a new machine. Also follow the instructions in the
previous paragraph.
To invert the master and slave in a deployment:
1. Halt writes on A using the fsync command.
2. Make sure B is up to date with the state of A.
3. Shut down B.
4. Back up and move all data files that begin with local on B from the dbpath to remove the existing
local.sources data.
Warning: Removing local.* is irrevocable and cannot be undone. Perform this step with extreme
caution.
If you can stop write operations to the master for an indefinite period, you can copy the data files from the master to
the new slave and then start the slave with --fastsync.
Warning: Be careful with --fastsync. If the data on both instances is not identical, a discrepancy will exist
forever.
fastsync is a way to start a slave by starting with an existing master disk image/backup. This option declares that
the administrator guarantees the image is correct and completely up-to-date with that of the master. If you have a full
and complete copy of data from a master you can use this option to avoid a full synchronization upon starting the
slave.
Creating a Slave from an Existing Slave's Disk Image
You can just copy the other slave's data file snapshot without any special options. Only take data snapshots when a
mongod process is down or locked using db.fsyncLock().
Resyncing a Slave that is too Stale to Recover
Slaves asynchronously apply write operations from the master that the slaves poll from the master's oplog. The oplog
is finite in length, and if a slave is too far behind, a full resync will be necessary. To resync the slave, connect to a
slave using the mongo shell and issue the resync command:
use admin
db.runCommand( { resync: 1 } )
This forces a full resync of all data (which will be very slow on a large database). You can achieve the same effect by
stopping mongod on the slave, deleting the entire content of the dbpath on the slave, and restarting the mongod.
Slave Chaining
Slaves cannot be chained. They must all connect to the master directly.
If a slave attempts to slave from another slave, you will see the following line in the mongod log or the shell:
assertion 13051 tailable cursor requested on non capped collection ns:local.oplog.$main
To change a slave's source, manually modify the slave's local.sources (page 474) collection.
Example
Consider the following: If you accidentally set an incorrect hostname for the slave's source, as in the following
example:
mongod --slave --source prod.mississippi
You can correct this by restarting the slave without the --slave and --source arguments:
mongod
Connect to this mongod instance using the mongo shell and update the local.sources (page 474) collection,
with the following operation sequence:
use local
db.sources.update( { host : "prod.mississippi" },
{ $set : { host : "prod.mississippi.example.net" } } )
Restart the slave with the correct command line arguments or with no --source option. After configuring
local.sources (page 474) the first time, the --source will have no subsequent effect. Therefore, both of
the following invocations are correct:
mongod --slave --source prod.mississippi.example.net
or
mongod --slave
Troubleshoot Replica Sets (page 461) Describes common issues and operational challenges for replica sets. For additional diagnostic information, see FAQ: MongoDB Diagnostics (page 587).
Three member replica sets provide enough redundancy to survive most network partitions and other system failures.
These sets also have sufficient capacity for many distributed read operations. Replica sets should always have an odd
number of members. This ensures that elections (page 397) will proceed smoothly. For more about designing replica
sets, see the Replication overview (page 377).
The basic procedure is to start the mongod instances that will become members of the replica set, configure the replica
set itself, and then add the mongod instances to it.
Requirements
For production deployments, you should maintain as much separation between members as possible by hosting the
mongod instances on separate machines. When using virtual machines for production deployments, you should place
each mongod instance on a separate host server serviced by redundant power circuits and redundant network paths.
Before you can deploy a replica set, you must install MongoDB on each system that will be part of your replica set. If
you have not already installed MongoDB, see the installation tutorials (page 3).
Before creating your replica set, you should verify that your network configuration allows all possible connections
between each member. For a successful replica set deployment, every member must be able to connect to every other
member. For instructions on how to check your connection, see Test Connections Between all Members (page 463).
Procedure
Each member of the replica set resides on its own machine and all of the MongoDB processes bind to port
27017 (the standard MongoDB port).
Each member of the replica set must be accessible by way of resolvable DNS or hostnames, as in the following
scheme:
mongodb0.example.net
mongodb1.example.net
mongodb2.example.net
mongodbn.example.net
You will need to either configure your DNS names appropriately, or set up your systems' /etc/hosts file to
reflect this configuration.
Ensure that network traffic can pass between all members in the network securely and efficiently. Consider the
following:
Establish a virtual private network. Ensure that your network topology routes all traffic between members
within a single site over the local area network.
Configure authentication using auth and keyFile, so that only servers and processes with authentication can connect to the replica set.
Configure networking and firewall rules so that only traffic (incoming and outgoing packets) on the default
MongoDB port (e.g. 27017) from within your deployment is permitted.
For more information on security and firewalls, see Inter-Process Authentication (page 240).
You must specify the run time configuration on each system in a configuration file stored in
/etc/mongodb.conf or a related location. Do not specify the set's configuration in the mongo shell.
Use the following configuration for each of your MongoDB instances. You should set values that are appropriate
for your systems, as needed:
port = 27017
bind_ip = 10.8.0.10
dbpath = /srv/mongodb/
fork = true
replSet = rs0
The dbpath indicates where you want mongod to store data files. The dbpath must exist before you start
mongod. If it does not exist, create the directory and ensure mongod has permission to read and write data to
this path. For more information on permissions, see the security operations documentation (page 238).
Modifying bind_ip ensures that mongod will only listen for connections from applications on the configured
address.
For more information about the run time options used above and other configuration options, see
http://docs.mongodb.org/manual/reference/configuration-options.
8.3. Replica Set Tutorials
3. Use rs.initiate() to initiate a replica set consisting of the current member and using the default configuration, as follows:
rs.initiate()
1. Add the remaining members to the replica set using rs.add() in the mongo shell connected to the
current primary (in this example, mongodb0.example.net). The commands
should resemble the following:
rs.add("mongodb1.example.net")
rs.add("mongodb2.example.net")
When complete, you should have a fully functional replica set. The new replica set will elect a primary.
Check the status of your replica set at any time with the rs.status() operation.
See also:
The documentation of the following shell functions for more information:
rs.initiate()
rs.conf()
rs.reconfig()
rs.add()
Refer to Replica Set Read and Write Semantics (page 402) for a detailed explanation of read and write semantics in
MongoDB.
Deploy a Replica Set for Testing and Development
Note: This tutorial provides instructions for deploying a replica set in a development or test environment. For a
production deployment, refer to the Deploy a Replica Set (page 420) tutorial.
This tutorial describes how to create a three-member replica set from three existing mongod instances.
If you wish to deploy a replica set from a single MongoDB instance, see Convert a Standalone to a Replica Set
(page 432). For more information on replica set deployments, see the Replication (page 377) and Replica Set Deployment Architectures (page 390) documentation.
Overview
Three member replica sets provide enough redundancy to survive most network partitions and other system failures.
These sets also have sufficient capacity for many distributed read operations. Replica sets should always have an odd
number of members. This ensures that elections (page 397) will proceed smoothly. For more about designing replica
sets, see the Replication overview (page 377).
The basic procedure is to start the mongod instances that will become members of the replica set, configure the replica
set itself, and then add the mongod instances to it.
Requirements
For test and development systems, you can run your mongod instances on a local system, or within a virtual instance.
Before you can deploy a replica set, you must install MongoDB on each system that will be part of your replica set. If
you have not already installed MongoDB, see the installation tutorials (page 3).
Before creating your replica set, you should verify that your network configuration allows all possible connections
between each member. For a successful replica set deployment, every member must be able to connect to every other
member. For instructions on how to check your connection, see Test Connections Between all Members (page 463).
Procedure
Important: These instructions should only be used for test or development deployments.
The examples in this procedure create a new replica set named rs0.
Important: If your application connects to more than one replica set, each set should have a distinct
name. Some drivers group replica set connections by replica set name.
You will begin by starting three mongod instances as members of a replica set named rs0.
1. Create the necessary data directories for each member by issuing a command similar to the following:
mkdir -p /srv/mongodb/rs0-0 /srv/mongodb/rs0-1 /srv/mongodb/rs0-2
This will create directories called rs0-0, rs0-1, and rs0-2, which will contain the instances' database files.
2. Start your mongod instances in their own shell windows by issuing the following commands:
First member:
mongod --port 27017 --dbpath /srv/mongodb/rs0-0 --replSet rs0 --smallfiles --oplogSize 128
Second member:
mongod --port 27018 --dbpath /srv/mongodb/rs0-1 --replSet rs0 --smallfiles --oplogSize 128
Third member:
mongod --port 27019 --dbpath /srv/mongodb/rs0-2 --replSet rs0 --smallfiles --oplogSize 128
This starts each instance as a member of a replica set named rs0, each running on a distinct port, and specifies
the path to your data directory with the --dbpath setting. If you are already using the suggested ports, select
different ports.
The --smallfiles and --oplogSize settings reduce the disk space that each mongod instance uses.
This is ideal for testing and development deployments as it prevents overloading your machine.
For more information on these and other configuration options, see
http://docs.mongodb.org/manual/reference/configuration-options.
3. Connect to one of your mongod instances through the mongo shell. You will need to indicate which instance
by specifying its port number. For the sake of simplicity and clarity, you may want to choose the first one, as in
the following command:
mongo --port 27017
4. In the mongo shell, use rs.initiate() to initiate the replica set. You can create a replica set configuration
object in the mongo shell environment, as in the following example:
rsconf = {
_id: "rs0",
members: [
{
_id: 0,
host: "<hostname>:27017"
}
]
}
replacing <hostname> with your system's hostname, and then pass the rsconf object to rs.initiate() as
follows:
rs.initiate( rsconf )
5. Display the current replica configuration (page 467) by issuing the following command:
rs.conf()
{
"_id" : 1,
"host" : "localhost:27017"
}
]
}
6. In the mongo shell connected to the primary, add the second and third mongod instances to the replica set
using the rs.add() method. Replace <hostname> with your system's hostname in the following examples:
rs.add("<hostname>:27018")
rs.add("<hostname>:27019")
When complete, you should have a fully functional replica set. The new replica set will elect a primary.
Check the status of your replica set at any time with the rs.status() operation.
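The configuration object used in step 4 can also be generated rather than typed by hand. The following is a minimal sketch, assuming all members run on one host and differ only by port, as in this test deployment; buildReplSetConfig is a hypothetical helper and "localhost" is a placeholder hostname.

```javascript
// Hypothetical helper: builds a replica set configuration object with one
// member per port, numbering the members' _id values from 0.
function buildReplSetConfig(setName, hostname, ports) {
    var members = [];
    for (var i = 0; i < ports.length; i++) {
        members.push({ _id: i, host: hostname + ":" + ports[i] });
    }
    return { _id: setName, members: members };
}

var rsconf = buildReplSetConfig("rs0", "localhost", [27017, 27018, 27019]);
// In the mongo shell, the result could be passed to rs.initiate( rsconf ).
```

Passing only the first port reproduces the single-member object shown in step 4, after which the other members can still be added with rs.add().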
See also:
The documentation of the following shell functions for more information:
rs.initiate()
rs.conf()
rs.reconfig()
rs.add()
You may also consider the simple setup script10 as an example of a basic automatically-configured replica set.
Refer to Replica Set Read and Write Semantics (page 402) for a detailed explanation of read and write semantics in
MongoDB.
Deploy a Geographically Redundant Replica Set
This tutorial outlines the process for deploying a replica set with members in multiple locations. The tutorial addresses
three-member sets, four-member sets, and sets with more than four members.
For appropriate background, see Replication (page 377) and Replica Set Deployment Architectures (page 390). For
related tutorials, see Deploy a Replica Set (page 420) and Add Members to a Replica Set (page 433).
Overview
While replica sets provide basic protection against single-instance failure, replica sets whose members are all located
in a single facility are susceptible to errors in that facility. Power outages, network interruptions, and natural disasters
are all issues that can affect replica sets whose members are colocated. To protect against these classes of failures,
deploy a replica set with one or more members in a geographically distinct facility or data center to provide redundancy.
Requirements
In general, the requirements for any geographically redundant replica set are as follows:
Ensure that a majority of the voting members (page 400) are within a primary facility, Site A. This includes
priority 0 members (page 386) and arbiters (page 389). Deploy other members in secondary facilities, Site B,
Site C, etc., to provide additional copies of the data. See Determine the Distribution of Members (page 391)
for more information on the voting requirements for geographically redundant replica sets.
10 https://github.com/mongodb/mongo-snippets/blob/master/replication/simple-setup.py
If you deploy a replica set with an even number of members, deploy an arbiter (page 389) on Site A. The arbiter
must be on site A to keep the majority there.
For instance, for a three-member replica set you need two members in Site A, and one member in a secondary facility,
Site B. Site A should be in the same facility as, or very close to, your primary application infrastructure (i.e. application
servers, caching layer, users, etc.).
A four-member replica set should have at least two members in Site A, with the remaining members in one or more
secondary sites, as well as a single arbiter in Site A.
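The majority requirement can be checked on paper before deploying. The sketch below is a simplified model, not a MongoDB API: siteHoldsMajority and the site field are hypothetical, and every listed member, including the arbiter, is assumed to hold exactly one vote.

```javascript
// Hypothetical planning check: does the given site hold a strict majority
// of the members? Assumes one vote per listed member.
function siteHoldsMajority(members, site) {
    var inSite = 0;
    for (var i = 0; i < members.length; i++) {
        if (members[i].site === site) {
            inSite++;
        }
    }
    return inSite > members.length / 2;
}

// Four data-bearing members split evenly, plus an arbiter in Site A.
var plan = [
    { host: "mongodb0.example.net", site: "A" },
    { host: "mongodb1.example.net", site: "A" },
    { host: "mongodb2.example.net", site: "B" },
    { host: "mongodb3.example.net", site: "B" },
    { host: "mongodb4.example.net", site: "A", arbiter: true }
];
```

In this plan Site A holds three of five votes. Without the arbiter the sites would be tied at two votes each, which is why the arbiter belongs in Site A.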
For all configurations in this tutorial, deploy each replica set member on a separate system. Although you may deploy
more than one replica set member on a single system, doing so reduces the redundancy and capacity of the replica set.
Such deployments are typically for testing purposes and beyond the scope of this tutorial.
This tutorial assumes you have installed MongoDB on each system that will be part of your replica set. If you have
not already installed MongoDB, see the installation tutorials (page 3).
Procedures
General Considerations
Each member of the replica set resides on its own machine and all of the MongoDB processes bind to port
27017 (the standard MongoDB port).
Each member of the replica set must be accessible by way of resolvable DNS or hostnames, as in the following
scheme:
mongodb0.example.net
mongodb1.example.net
mongodb2.example.net
mongodbn.example.net
You will need to either configure your DNS names appropriately, or set up your systems' /etc/hosts file to
reflect this configuration.
Ensure that network traffic can pass between all members in the network securely and efficiently. Consider the
following:
Establish a virtual private network. Ensure that your network topology routes all traffic between members
within a single site over the local area network.
Configure authentication using auth and keyFile, so that only servers and processes with authentication can connect to the replica set.
Configure networking and firewall rules so that only traffic (incoming and outgoing packets) on the default
MongoDB port (e.g. 27017) from within your deployment is permitted.
For more information on security and firewalls, see Inter-Process Authentication (page 240).
You must specify the run time configuration on each system in a configuration file stored in
/etc/mongodb.conf or a related location. Do not specify the set's configuration in the mongo shell.
Use the following configuration for each of your MongoDB instances. You should set values that are appropriate
for your systems, as needed:
port = 27017
bind_ip = 10.8.0.10
dbpath = /srv/mongodb/
fork = true
replSet = rs0
The dbpath indicates where you want mongod to store data files. The dbpath must exist before you start
mongod. If it does not exist, create the directory and ensure mongod has permission to read and write data to
this path. For more information on permissions, see the security operations documentation (page 238).
Modifying bind_ip ensures that mongod will only listen for connections from applications on the configured
address.
For more information about the run time options used above and other configuration options, see
http://docs.mongodb.org/manual/reference/configuration-options.
Figure 8.25: Diagram of a 3 member replica set distributed across two data centers. Replica set includes a priority 0
member.
Deploy a Geographically Redundant Three-Member Replica Set
1. Start a mongod instance on each system that will be part of your replica set. Specify the same replica set name
on each instance. For additional mongod configuration options specific to replica sets, see cli-mongod-replicaset.
Important: If your application connects to more than one replica set, each set should have a distinct name.
Some drivers group replica set connections by replica set name.
If you use a configuration file, then start each mongod instance with a command that resembles the following:
mongod --config /etc/mongodb.conf
3. Use rs.initiate() to initiate a replica set consisting of the current member and using the default configuration, as follows:
rs.initiate()
5. Add the remaining members to the replica set using rs.add() in the mongo shell connected to the
current primary (in this example, mongodb0.example.net). The commands
should resemble the following:
rs.add("mongodb1.example.net")
rs.add("mongodb2.example.net")
When complete, you should have a fully functional replica set. The new replica set will elect a primary.
6. Make sure that you have configured the member located in Site B (in this example,
mongodb2.example.net) as a priority 0 member (page 386):
(a) Issue the following command to determine the members (page 467) array position for the member:
rs.conf()
(b) In the members (page 467) array, save the position of the member whose priority you wish to change.
The example in the next step assumes this value is 2, for the third item in the list. You must record array
position, not _id, as these ordinals will be different if you remove a member.
(c) In the mongo shell connected to the replica set's primary, issue a command sequence similar to the following:
cfg = rs.conf()
cfg.members[2].priority = 0
rs.reconfig(cfg)
When the operations return, mongodb2.example.net has a priority of 0. It cannot become primary.
Note: The rs.reconfig() shell method can force the current primary to step down, causing an
election. When the primary steps down, all clients will disconnect. This is the intended behavior. While
most elections complete within a minute, always make sure any replica configuration changes occur during
scheduled maintenance periods.
After these commands return, you have a geographically redundant three-member replica set.
Check the status of your replica set at any time with the rs.status() operation.
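Because the array position and the _id of a member can differ, it can be safer to look the position up by host than to count entries by eye. The following is a minimal sketch on a plain configuration object: memberIndexByHost and withPriorityZero are hypothetical helpers, the sample object is illustrative, and the host strings are assumed to match exactly, including the port.

```javascript
// Hypothetical helper: returns the array position of the member with the
// given host string, or -1 if no member matches.
function memberIndexByHost(cfg, host) {
    for (var i = 0; i < cfg.members.length; i++) {
        if (cfg.members[i].host === host) {
            return i;
        }
    }
    return -1;
}

// Hypothetical helper: sets priority 0 on the member found by host.
function withPriorityZero(cfg, host) {
    var idx = memberIndexByHost(cfg, host);
    if (idx === -1) {
        throw new Error("no member with host " + host);
    }
    cfg.members[idx].priority = 0;
    return cfg;
}

// Illustrative configuration in which _id 4 sits at array position 2.
var sample = { _id: "rs0", members: [
    { _id: 0, host: "mongodb0.example.net:27017" },
    { _id: 1, host: "mongodb1.example.net:27017" },
    { _id: 4, host: "mongodb2.example.net:27017" }
] };
```

In the mongo shell, the same helpers could be applied to the object returned by rs.conf() before passing it to rs.reconfig().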
See also:
3. Use rs.initiate() to initiate a replica set consisting of the current member and using the default configuration, as follows:
rs.initiate()
rs.conf()
5. Add the remaining members to the replica set using rs.add() in a mongo shell connected to the current
primary. The commands should resemble the following:
rs.add("mongodb1.example.net")
rs.add("mongodb2.example.net")
rs.add("mongodb3.example.net")
When complete, you should have a fully functional replica set. The new replica set will elect a primary.
6. In the same shell session, issue the following command to add the arbiter (e.g. mongodb4.example.net):
rs.addArb("mongodb4.example.net")
7. Make sure that you have configured each member located outside of Site A (e.g. mongodb3.example.net)
as a priority 0 member (page 386):
(a) Issue the following command to determine the members (page 467) array position for the member:
rs.conf()
(b) In the members (page 467) array, save the position of the member whose priority you wish to change.
The example in the next step assumes this value is 2, for the third item in the list. You must record array
position, not _id, as these ordinals will be different if you remove a member.
(c) In the mongo shell connected to the replica set's primary, issue a command sequence similar to the following:
cfg = rs.conf()
cfg.members[2].priority = 0
rs.reconfig(cfg)
When the operations return, the member at array position 2 has a priority of 0. It cannot become primary.
Note: The rs.reconfig() shell method can force the current primary to step down, causing an
election. When the primary steps down, all clients will disconnect. This is the intended behavior. While
most elections complete within a minute, always make sure any replica configuration changes occur during
scheduled maintenance periods.
After these commands return, you have a geographically redundant four-member replica set.
Check the status of your replica set at any time with the rs.status() operation.
See also:
The documentation of the following shell functions for more information:
rs.initiate()
rs.conf()
rs.reconfig()
rs.add()
Refer to Replica Set Read and Write Semantics (page 402) for a detailed explanation of read and write semantics in
MongoDB.
Deploy a Geographically Redundant Set with More than Four Members
The above procedures detail the steps necessary for deploying a geographically redundant replica set. Larger replica
set deployments follow the same steps, but have additional considerations:
Never deploy more than seven voting members.
If you have an even number of members, use the procedure for a four-member set (page 429). Ensure that
a single facility, Site A, always has a majority of the members by deploying the arbiter in that site. For
example, if a set has six members, deploy at least three voting members in addition to the arbiter in Site A, and
the remaining members in alternate sites.
If you have an odd number of members, use the procedure for a three-member set (page 427). Ensure that a
single facility, Site A, always has a majority of the members of the set. For example, if a set has five members,
deploy three members within Site A and two members in other facilities.
If you have a majority of the members of the set outside of Site A and the network partitions to prevent
communication between sites, the current primary in Site A will step down, even if none of the members outside of
Site A are eligible to become primary.
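The placement rules above can be checked mechanically before deployment. The following is a minimal sketch under stated assumptions: checkSitePlacement is a hypothetical helper, and the "site" labels are invented for illustration and are not part of any MongoDB configuration document. Each array entry stands for one voting member, including the arbiter if there is one.

```javascript
// Hypothetical planning check. votingMembers is an array of
// { host: ..., site: ... } entries, one per voting member.
function checkSitePlacement(votingMembers, primarySite) {
  var inSite = 0;
  for (var i = 0; i < votingMembers.length; i++) {
    if (votingMembers[i].site === primarySite) {
      inSite++;
    }
  }
  var total = votingMembers.length;
  var majority = Math.floor(total / 2) + 1; // strict majority of all voters
  return {
    withinVotingLimit: total <= 7,      // never more than seven voting members
    siteHasMajority: inSite >= majority // the primary site can win an election alone
  };
}

// Five-member example from the text: three members in Site A, two elsewhere.
var fiveMemberPlan = [
  { host: "mongodb0.example.net", site: "A" },
  { host: "mongodb1.example.net", site: "A" },
  { host: "mongodb2.example.net", site: "A" },
  { host: "mongodb3.example.net", site: "B" },
  { host: "mongodb4.example.net", site: "B" }
];
var fiveMemberResult = checkSitePlacement(fiveMemberPlan, "A");
```

The check only counts voters per site. It does not model priorities or network behavior.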
Add an Arbiter to Replica Set
Arbiters are mongod instances that are part of a replica set but do not hold data. Arbiters participate in elections
(page 397) in order to break ties. If a replica set has an even number of members, add an arbiter.
Arbiters have minimal resource requirements and do not require dedicated hardware. You can deploy an arbiter on an
application server or monitoring host.
Important: Do not run an arbiter on the same system as a member of the replica set.
Add an Arbiter
1. Create a data directory (e.g. dbpath) for the arbiter. The mongod instance uses the directory for configuration
data. The directory will not hold the data set. For example, create the /data/arb directory:
mkdir /data/arb
2. Start the arbiter. Specify the data directory and the replica set name. The following command starts an arbiter using the
/data/arb dbpath for the rs replica set:
mongod --port 30000 --dbpath /data/arb --replSet rs
3. Connect to the primary and add the arbiter to the replica set. Use the rs.addArb() method, as in the following
example:
rs.addArb("m1.example.net:30000")
This operation adds the arbiter running on port 30000 on the m1.example.net host.
8.3. Replica Set Tutorials
Important: If your application connects to more than one replica set, each set should have a distinct name.
Some drivers group replica set connections by replica set name.
Expand the Replica Set Add additional replica set members by doing the following:
1. On two distinct systems, start two new standalone mongod instances. For information on starting a standalone
instance, see the installation tutorial (page 3) specific to your environment.
2. On your connection to the original mongod instance (the former standalone instance), issue a command in the
following form for each new instance to add to the replica set:
rs.add("<hostname><:port>")
Replace <hostname> and <port> with the resolvable hostname and port of the mongod instance to add to
the set. For more information on adding a host to a replica set, see Add Members to a Replica Set (page 433).
Sharding Considerations If the new replica set is part of a sharded cluster, change the shard host information in
the config database by doing the following:
1. Connect to one of the sharded cluster's mongos instances and issue a command in the following form:
Replace <name> with the name of the shard. Replace <replica-set> with the name of the replica set.
Replace <member,><member,><> with the list of the members of the replica set.
2. Restart all mongos instances. If possible, restart all components of the replica sets (i.e., all mongos and all
shard mongod instances).
Add Members to a Replica Set
Overview
This tutorial explains how to add an additional member to an existing replica set. For background on replication
deployment patterns, see the Replica Set Deployment Architectures (page 390) document.
Maximum Voting Members A replica set can have a maximum of seven voting members (page 397). To add
a member to a replica set that already has seven votes, you must either add the member as a non-voting member
(page 400) or remove a vote from an existing member (page 469).
Control Scripts In production deployments you can configure a control script to manage member processes.
Existing Members You can use these procedures to add new members to an existing set. You can also use the same
procedure to re-add a removed member. If the removed member's data is still relatively recent, it can recover and
catch up easily.
Data Files If you have a backup or snapshot of an existing member, you can move the data files (e.g. the dbpath
directory) to a new system and use them to quickly initiate a new member. The files must be:
A valid copy of the data files from a member of the same replica set. See the Backup and Restore with Filesystem
Snapshots (page 190) document for more information.
Important: Always use filesystem snapshots to create a copy of a member of the existing replica set. Do not
use mongodump and mongorestore to seed a new replica set member.
More recent than the oldest operation in the primary's oplog. The new member must be able to become current
by applying operations from the primary's oplog.
Requirements
Prepare the Data Directory Before adding a new member to an existing replica set, prepare the new member's data
directory using one of the following strategies:
Make sure the new member's data directory does not contain data. The new member will copy the data from an
existing member.
If the new member is in a recovering state, it must exit and become a secondary before MongoDB can copy all
data as part of the replication process. This process takes time but does not require administrator intervention.
Manually copy the data directory from an existing member. The new member becomes a secondary member
and will catch up to the current state of the replica set. Copying the data over may shorten the amount of time
for the new member to become current.
Ensure that you can copy the data directory to the new member and begin replication within the window allowed
by the oplog (page 411). Otherwise, the new instance will have to perform an initial sync, which completely
resynchronizes the data, as described in Resync a Member of a Replica Set (page 449).
Use db.printReplicationInfo() to check the current state of replica set members with regard to the
oplog.
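The oplog window requirement above comes down to a comparison of timestamps. The following is a minimal sketch, assuming you have noted the first and last oplog event times reported for the primary and the time at which the data copy was taken. The function names and the sample dates are hypothetical.

```javascript
var MS_PER_HOUR = 60 * 60 * 1000;

// Length of the oplog window in hours, from its first to its last entry.
function oplogWindowHours(firstEntryAt, lastEntryAt) {
  return (lastEntryAt.getTime() - firstEntryAt.getTime()) / MS_PER_HOUR;
}

// A copied data directory can catch up only if it is more recent than
// the oldest operation still held in the primary's oplog.
function copyIsWithinOplogWindow(copyTakenAt, firstEntryAt) {
  return copyTakenAt.getTime() > firstEntryAt.getTime();
}

// Invented sample times (UTC).
var firstEntryAt = new Date(Date.UTC(2014, 0, 10, 0, 0, 0));
var lastEntryAt = new Date(Date.UTC(2014, 0, 11, 12, 0, 0));
var copyTakenAt = new Date(Date.UTC(2014, 0, 10, 6, 0, 0));

var windowHours = oplogWindowHours(firstEntryAt, lastEntryAt);
var usable = copyIsWithinOplogWindow(copyTakenAt, firstEntryAt);
```

Because the oldest oplog entry keeps advancing on a busy primary, repeat the comparison just before the new member starts replicating, not only when the copy is taken.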
For background on replication deployment patterns, see the Replica Set Deployment Architectures (page 390) document.
Add a Member to an Existing Replica Set
1. Start the new mongod instance. Specify the data directory and the replica set name. The following example
specifies the /srv/mongodb/db0 data directory and the rs0 replica set:
mongod --dbpath /srv/mongodb/db0 --replSet rs0
Take note of the host name and port information for the new mongod instance.
For more information on configuration options, see the mongod manual page.
Optional
You can specify the data directory and replica set in the mongodb.conf configuration file, and start the
mongod with the following command:
mongod --config /etc/mongodb.conf
rs.add("mongodb3.example.net")
4. Verify that the member is now part of the replica set. Call the rs.conf() method, which displays the replica
set configuration (page 467):
rs.conf()
To view replica set status, issue the rs.status() method. For a description of the status fields, see
http://docs.mongodb.org/manual/reference/command/replSetGetStatus.
Configure and Add a Member You can add a member to a replica set by passing to the rs.add() method a
members (page 467) document. The document must be in the form of a local.system.replset.members
(page 467) document. These documents define a replica set member in the same form as the replica set configuration
document (page 467).
Important: Specify a value for the _id field of the members (page 467) document. MongoDB does not automatically populate the _id field in this case. Finally, the members (page 467) document must declare the host value.
All other fields are optional.
Example
To add a member with the following configuration:
an _id of 1.
a hostname and port number (page 468) of mongodb3.example.net:27017.
a priority (page 469) value within the replica set of 0.
a configuration as hidden (page 468).
Issue the following:
rs.add({_id: 1, host: "mongodb3.example.net:27017", priority: 0, hidden: true})
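Because MongoDB does not populate _id in this form of rs.add(), it can help to build the member document with a helper that refuses to omit the required fields. The following is a minimal sketch: buildMemberDoc is a hypothetical name, and it checks only the two requirements stated above (_id and host), not the full set of rules that the server applies.

```javascript
// Hypothetical helper: build a member document for rs.add(), insisting
// on the two required fields and copying any optional ones.
function buildMemberDoc(id, host, options) {
  if (typeof id !== "number") {
    throw new Error("_id is required and must be a number");
  }
  if (typeof host !== "string" || host.length === 0) {
    throw new Error("host is required");
  }
  var doc = { _id: id, host: host };
  options = options || {};
  for (var key in options) {
    if (Object.prototype.hasOwnProperty.call(options, key)) {
      doc[key] = options[key];
    }
  }
  return doc;
}

// Same member as in the example above.
var memberDoc = buildMemberDoc(1, "mongodb3.example.net:27017", {
  priority: 0,
  hidden: true
});
```

In a mongo shell connected to the primary, the result could then be passed as rs.add(memberDoc).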
1. Shut down the mongod instance for the member you wish to remove. To shut down the instance, connect using
the mongo shell and the db.shutdownServer() method.
2. Connect to the replica set's current primary. To determine the current primary, use db.isMaster() while
connected to any member of the replica set.
3. Use rs.remove() in either of the following forms to remove the member:
rs.remove("mongod3.example.net:27017")
rs.remove("mongod3.example.net")
MongoDB disconnects the shell briefly as the replica set elects a new primary. The shell then automatically
reconnects. The shell displays a DBClientCursor::init call() failed error even though the command succeeds.
Remove a Member Using rs.reconfig()
To remove a member you can manually edit the replica set configuration document (page 467), as described here.
1. Shut down the mongod instance for the member you wish to remove. To shut down the instance, connect using
the mongo shell and the db.shutdownServer() method.
2. Connect to the replica set's current primary. To determine the current primary, use db.isMaster() while
connected to any member of the replica set.
3. Issue the rs.conf() method to view the current configuration document and determine the position in the
members array of the member to remove:
Example
mongod_C.example.net is in position 2 of the following configuration file:
{
"_id" : "rs",
"version" : 7,
"members" : [
{
"_id" : 0,
"host" : "mongod_A.example.net:27017"
},
{
"_id" : 1,
"host" : "mongod_B.example.net:27017"
},
{
"_id" : 2,
"host" : "mongod_C.example.net:27017"
}
]
}
6. Overwrite the replica set configuration document with the new configuration by issuing the following:
rs.reconfig(cfg)
As a result of rs.reconfig() the shell will disconnect while the replica set renegotiates which member is
primary. The shell displays a DBClientCursor::init call() failed error even though the command succeeds, and then automatically reconnects.
7. To confirm the new configuration, issue rs.conf().
For the example above the output would be:
{
"_id" : "rs",
"version" : 8,
"members" : [
{
"_id" : 0,
"host" : "mongod_A.example.net:27017"
},
{
"_id" : 1,
"host" : "mongod_B.example.net:27017"
}
]
}
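The edit that turns the first configuration document into the second one is the removal of a single element from the members array. The following is a minimal sketch on a plain object, with a hypothetical helper name. It does not touch the version field; the change from 7 to 8 shown above is not produced by this helper.

```javascript
// Hypothetical helper: remove the member at the given array position
// from a configuration document and return the removed entry.
function removeMemberAt(cfg, position) {
  if (position < 0 || position >= cfg.members.length) {
    throw new Error("no member at array position " + position);
  }
  return cfg.members.splice(position, 1)[0];
}

// The three-member configuration from the example above.
var exampleCfg = {
  _id: "rs",
  version: 7,
  members: [
    { _id: 0, host: "mongod_A.example.net:27017" },
    { _id: 1, host: "mongod_B.example.net:27017" },
    { _id: 2, host: "mongod_C.example.net:27017" }
  ]
};

// mongod_C.example.net is in position 2.
var removedMember = removeMemberAt(exampleCfg, 2);
```

In the mongo shell, the equivalent edit on cfg = rs.conf() would be followed by rs.reconfig(cfg).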
To change the hostname for a replica set member, modify the host (page 468) field. The value of the _id (page 467)
field will not change when you reconfigure the set.
See Replica Set Configuration (page 467) and rs.reconfig() for more information.
Note: Any replica set configuration change can trigger the current primary to step down, which forces an election
(page 397). During the election, the current shell session and clients connected to this replica set disconnect, which
produces an error even when the operation succeeds.
Example
To change the hostname to mongo2.example.net for the replica set member configured at members[0], issue
the following sequence of commands:
cfg = rs.conf()
cfg.members[0].host = "mongo2.example.net"
rs.reconfig(cfg)
cfg = rs.conf()
cfg.members[0].priority = 0.5
cfg.members[1].priority = 2
cfg.members[2].priority = 2
rs.reconfig(cfg)
The first operation uses rs.conf() to set the local variable cfg to the contents of the current replica set configuration, which is a document. The next three operations change the priority (page 469) value in the cfg document
for the first three members configured in the members (page 467) array. The final operation calls rs.reconfig()
with the argument of cfg to initialize the new configuration.
When updating the replica configuration object, access the replica set members in the members (page 467) array with
the array index. The array index begins with 0. Do not confuse this index value with the value of the _id (page 467)
field in each document in the members (page 467) array.
If a member has priority (page 469) set to 0, it is ineligible to become primary and will not seek election. Hidden
members (page 387), delayed members (page 388), and arbiters all have priority (page 469) set to 0.
All members have a priority (page 469) equal to 1 by default.
The value of priority (page 469) can be any floating point (i.e. decimal) number between 0 and 1000. Priorities
are only used to determine the preference in election. The priority value is used only in relation to other members.
With the exception of members with a priority of 0, the absolute value of the priority (page 469) value is irrelevant.
Replica sets will preferentially elect and maintain the primary status of the member with the highest priority
(page 469) setting.
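The priority rules above can be expressed as two small functions. The following is a minimal sketch with hypothetical names: it models only the stated range (0 to 1000), the default of 1, and the preference for the highest priority. It says nothing about how an election actually proceeds, which also depends on which members are reachable and up to date.

```javascript
// A priority is valid if it is a number between 0 and 1000 inclusive.
function isValidPriority(p) {
  return typeof p === "number" && p >= 0 && p <= 1000;
}

// Return the array position of the member the set would prefer as
// primary, based on priority alone, or -1 if every member has priority 0.
function preferredPrimaryIndex(members) {
  var best = -1;
  var bestPriority = 0;
  for (var i = 0; i < members.length; i++) {
    // Members without an explicit priority use the default of 1.
    var p = members[i].priority === undefined ? 1 : members[i].priority;
    if (p > bestPriority) {
      best = i;
      bestPriority = p;
    }
  }
  return best;
}

// Invented sample: one preferred member, one default, one unlikely, one ineligible.
var sampleMembers = [
  { _id: 0, host: "mongodb0.example.net:27017", priority: 2 },
  { _id: 1, host: "mongodb1.example.net:27017" },
  { _id: 2, host: "mongodb2.example.net:27017", priority: 0.5 },
  { _id: 3, host: "mongodb3.example.net:27017", priority: 0 }
];
var preferred = preferredPrimaryIndex(sampleMembers);
```

Only the relative order of the values matters here: replacing 2, 1, and 0.5 with 20, 10, and 5 gives the same result.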
Warning: Replica set reconfiguration can force the current primary to step down, leading to an election for
primary in the replica set. Elections cause the current primary to close all open client connections.
Perform routine replica set reconfiguration during scheduled maintenance windows.
See also:
The Replica Reconfiguration Usage (page 470) example revolves around changing the priorities of the members
(page 467) of a replica set.
Prevent Secondary from Becoming Primary
To prevent a secondary member from ever becoming a primary in a failover, assign the secondary a priority of 0,
as described here. You can set this secondary-only mode for any member of the replica set, except the current
primary. For a detailed description of secondary-only members and their purposes, see Priority 0 Replica Set Members
(page 386).
To configure a member as secondary-only, set its priority (page 469) value to 0 in the members (page 467)
document in its replica set configuration. Any member with a priority (page 469) equal to 0 will never seek
election (page 397) and cannot become primary in any situation.
{
"_id" : <num>,
"host" : <hostname:port>,
"priority" : 0
}
MongoDB does not permit the current primary to have a priority of 0. To prevent the current primary from again
becoming a primary, you must first step down the current primary using rs.stepDown(), and then you must
reconfigure the replica set (page 470) with rs.conf() and rs.reconfig().
Example
As an example of modifying member priorities, assume a four-member replica set. Use the following sequence of
operations to modify member priorities in the mongo shell connected to the primary. Identify each member by its
array index in the members (page 467) array:
cfg = rs.conf()
cfg.members[0].priority = 2
cfg.members[1].priority = 1
cfg.members[2].priority = 0.5
cfg.members[3].priority = 0
rs.reconfig(cfg)
The sequence of operations reconfigures the set with the following priority settings:
Member at 0 has a priority of 2 so that it becomes primary under most circumstances.
Member at 1 has a priority of 1, which is the default value. Member at 1 becomes primary if no member with a
higher priority is eligible.
Member at 2 has a priority of 0.5, which makes it less likely to become primary than other members but doesn't
prohibit the possibility.
Member at 3 has a priority of 0. Member at 3 cannot become the primary member under any circumstances.
When updating the replica configuration object, access the replica set members in the members (page 467) array with
the array index. The array index begins with 0. Do not confuse this index value with the value of the _id (page 467)
field in each document in the members (page 467) array.
Warning:
The rs.reconfig() shell method can force the current primary to step down, which causes an election
(page 397). When the primary steps down, the mongod closes all client connections. While this typically
takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set
has an even number of members, add an arbiter (page 431) to ensure that members can quickly obtain a
majority of votes in an election for primary.
Related Documents
Considerations
The most common use of hidden nodes is to support delayed members (page 388). If you only need to prevent a
member from becoming primary, configure a priority 0 member.
If the chainingAllowed (page 470) setting allows secondary members to sync from other secondaries, MongoDB
by default prefers non-hidden members over hidden members when selecting a sync target. MongoDB will only choose
hidden members as a last resort. If you want a secondary to sync from a hidden member, use the replSetSyncFrom
database command to override the default sync target. See the documentation for replSetSyncFrom before using
the command.
See also:
Manage Chained Replication (page 456)
Changed in version 2.0: For sharded clusters running with replica sets before 2.0, if you reconfigured a member as
hidden, you had to restart mongos to prevent queries from reaching the hidden member.
Examples
Member Configuration Document To configure a secondary member as hidden, set its priority (page 469)
value to 0 and set its hidden (page 468) value to true in its member configuration:
{
"_id" : <num>,
"host" : <hostname:port>,
"priority" : 0,
"hidden" : true
}
Configuration Procedure The following example hides the secondary member currently at the index 0 in the
members (page 467) array. To configure a hidden member, use the following sequence of operations in a mongo
shell connected to the primary, specifying the member to configure by its array index in the members (page 467)
array:
cfg = rs.conf()
cfg.members[0].priority = 0
cfg.members[0].hidden = true
rs.reconfig(cfg)
After re-configuring the set, this secondary member has a priority of 0 so that it cannot become primary and is hidden.
The other members in the set will not advertise the hidden member in the isMaster or db.isMaster() output.
When updating the replica configuration object, access the replica set members in the members (page 467) array with
the array index. The array index begins with 0. Do not confuse this index value with the value of the _id (page 467)
field in each document in the members (page 467) array.
Warning:
The rs.reconfig() shell method can force the current primary to step down, which causes an election
(page 397). When the primary steps down, the mongod closes all client connections. While this typically
takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set
has an even number of members, add an arbiter (page 431) to ensure that members can quickly obtain a
majority of votes in an election for primary.
Related Documents
The following example sets a 1-hour delay on a secondary member currently at the index 0 in the members (page 467)
array. To set the delay, issue the following sequence of operations in a mongo shell connected to the primary:
cfg = rs.conf()
cfg.members[0].priority = 0
cfg.members[0].hidden = true
cfg.members[0].slaveDelay = 3600
rs.reconfig(cfg)
After the replica set reconfigures, the delayed secondary member cannot become primary and is hidden from applications. The slaveDelay (page 469) value delays both replication and the member's oplog by 3600 seconds (1
hour).
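Hidden and delayed members are both expected to have a priority of 0, so a member document can be checked before it is sent to rs.reconfig(). The following is a minimal sketch with a hypothetical function name and invented messages. It covers only the two combinations described in this section and is not a substitute for the validation the server performs.

```javascript
// Hypothetical pre-check: return a list of problems found in one
// member document; an empty list means no problem was detected.
function memberConfigProblems(member) {
  var problems = [];
  // A member without an explicit priority uses the default of 1.
  var priority = member.priority === undefined ? 1 : member.priority;
  if (member.hidden === true && priority !== 0) {
    problems.push("hidden member should have priority 0");
  }
  if (member.slaveDelay > 0 && priority !== 0) {
    problems.push("delayed member should have priority 0");
  }
  return problems;
}

// The delayed member from the example above: hidden, priority 0, 1 hour delay.
var delayedMember = {
  _id: 0,
  host: "mongodb0.example.net:27017",
  priority: 0,
  hidden: true,
  slaveDelay: 3600
};
var delayedProblems = memberConfigProblems(delayedMember);
```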
When updating the replica configuration object, access the replica set members in the members (page 467) array with
the array index. The array index begins with 0. Do not confuse this index value with the value of the _id (page 467)
field in each document in the members (page 467) array.
Warning:
The rs.reconfig() shell method can force the current primary to step down, which causes an election
(page 397). When the primary steps down, the mongod closes all client connections. While this typically
takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set
has an even number of members, add an arbiter (page 431) to ensure that members can quickly obtain a
majority of votes in an election for primary.
Related Documents
To disable the ability to vote in elections for the fourth, fifth, and sixth replica set members, use the following command
sequence in the mongo shell connected to the primary. You identify each replica set member by its array index in the
members (page 467) array:
cfg = rs.conf()
cfg.members[3].votes = 0
cfg.members[4].votes = 0
cfg.members[5].votes = 0
rs.reconfig(cfg)
This sequence gives 0 votes to the fourth, fifth, and sixth members of the set according to the order of the members
(page 467) array in the output of rs.conf(). This setting allows the set to elect these members as primary but does
not allow them to vote in elections. Place voting members so that your designated primary or primaries can reach a
majority of votes in the event of a network partition.
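To see how removing votes changes the number needed for a majority, the votes in a configuration document can be tallied. The following is a minimal sketch with a hypothetical helper name and an invented six-member configuration. It assumes that a member without an explicit votes field has 1 vote.

```javascript
// Hypothetical helper: total the votes in a configuration document and
// compute how many votes form a strict majority.
function votingSummary(cfg) {
  var total = 0;
  for (var i = 0; i < cfg.members.length; i++) {
    var votes = cfg.members[i].votes === undefined ? 1 : cfg.members[i].votes;
    total += votes;
  }
  return { totalVotes: total, majority: Math.floor(total / 2) + 1 };
}

// Invented six-member set.
var sixMemberCfg = { _id: "rs0", version: 1, members: [] };
for (var n = 0; n < 6; n++) {
  sixMemberCfg.members.push({ _id: n, host: "mongodb" + n + ".example.net:27017" });
}
var before = votingSummary(sixMemberCfg);

// Same change as in the command sequence above.
sixMemberCfg.members[3].votes = 0;
sixMemberCfg.members[4].votes = 0;
sixMemberCfg.members[5].votes = 0;
var after = votingSummary(sixMemberCfg);
```

With six votes a majority is four; after the change three votes remain and a majority is two, so the first three members must be placed where at least two of them can reach each other during a partition.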
When updating the replica configuration object, access the replica set members in the members (page 467) array with
the array index. The array index begins with 0. Do not confuse this index value with the value of the _id (page 467)
field in each document in the members (page 467) array.
Warning:
The rs.reconfig() shell method can force the current primary to step down, which causes an election
(page 397). When the primary steps down, the mongod closes all client connections. While this typically
takes 10-20 seconds, try to make these changes during scheduled maintenance periods.
To successfully reconfigure a replica set, a majority of the members must be accessible. If your replica set
has an even number of members, add an arbiter (page 431) to ensure that members can quickly obtain a
majority of votes in an election for primary.
In general and when possible, all members should have only 1 vote. This prevents intermittent ties, deadlocks, or the
wrong members from becoming primary. Use priority (page 469) to control which members are more likely to
become primary.
Related Documents
1. If your application is connecting directly to the secondary, modify the application so that MongoDB queries
don't reach the secondary.
2. Shut down the secondary.
3. Remove the secondary from the replica set by calling the rs.remove() method. Perform this operation while
connected to the current primary in the mongo shell:
rs.remove("<hostname><:port>")
4. Verify that the replica set no longer includes the secondary by calling the rs.conf() method in the mongo
shell:
rs.conf()
Optional
You may remove the data instead.
6. Create a new, empty data directory to point to when restarting the mongod instance. You can reuse the previous
name. For example:
mkdir /data/db
7. Restart the mongod instance for the secondary, specifying the port number, the empty data directory, and the
replica set. You can use the same port number you used before. Issue a command similar to the following:
mongod --port 27021 --dbpath /data/db --replSet rs
8. In the mongo shell convert the secondary to an arbiter using the rs.addArb() method:
rs.addArb("<hostname><:port>")
9. Verify the arbiter belongs to the replica set by calling the rs.conf() method in the mongo shell.
rs.conf()
"arbiterOnly" : true
1. If your application is connecting directly to the secondary or has a connection string referencing the secondary,
modify the application so that MongoDB queries don't reach the secondary.
2. Create a new, empty data directory to be used with the new port number. For example:
mkdir /data/db-temp
3. Start a new mongod instance on the new port number, specifying the new data directory and the existing replica
set. Issue a command similar to the following:
mongod --port 27021 --dbpath /data/db-temp --replSet rs
4. In the mongo shell connected to the current primary, convert the new mongod instance to an arbiter using the
rs.addArb() method:
rs.addArb("<hostname><:port>")
5. Verify the arbiter has been added to the replica set by calling the rs.conf() method in the mongo shell.
rs.conf()
8. Verify that the replica set no longer includes the old secondary by calling the rs.conf() method in the mongo
shell:
rs.conf()
Optional
You may remove the data instead.
Resync a Member of a Replica Set (page 449) Sync the data on a member. Either perform initial sync on a new
member or resync the data on an existing member that has fallen too far behind to catch up by way of normal
replication.
Configure Replica Set Tag Sets (page 450) Assign tags to replica set members for use in targeting read and write
operations to specific members.
Reconfigure a Replica Set with Unavailable Members (page 454) Reconfigure a replica set when a majority of
replica set members are down or unreachable.
Manage Chained Replication (page 456) Disable or enable chained replication. Chained replication occurs when a
secondary replicates from another secondary instead of the primary.
Change Hostnames in a Replica Set (page 457) Update the replica set configuration to reflect changes in members'
hostnames.
Configure a Secondary's Sync Target (page 461) Specify the member that a secondary member synchronizes from.
Change the Size of the Oplog
The oplog exists internally as a capped collection, so you cannot modify its size in the course of normal operations. In
most cases the default oplog size (page 411) is an acceptable size; however, in some situations you may need a larger
or smaller oplog. For example, you might need to change the oplog size if your applications perform large numbers of
multi-updates or deletes in short periods of time.
This tutorial describes how to resize the oplog. For a detailed explanation of oplog sizing, see Oplog Size (page 411).
For details on how oplog size affects delayed members and replication lag, see Delayed Replica Set Members
(page 388).
Overview
To change the size of the oplog, you must perform maintenance on each member of the replica set in turn. The
procedure requires: stopping the mongod instance and starting as a standalone instance, modifying the oplog size,
and restarting the member.
Important: Always start rolling replica set maintenance with the secondaries, and finish with the maintenance on
the primary member.
Procedure
Restart a Secondary in Standalone Mode on a Different Port Shut down the mongod instance for one of the
non-primary members of your replica set. For example, to shut down, use the db.shutdownServer() method:
db.shutdownServer()
Restart this mongod as a standalone instance running on a different port and without the --replSet parameter. Use
a command similar to the following:
mongod --port 37017 --dbpath /srv/mongodb
Create a Backup of the Oplog (Optional) Optionally, back up the existing oplog on the standalone instance, as in
the following example:
mongodump --db local --collection 'oplog.rs' --port 37017
Recreate the Oplog with a New Size and a Seed Entry Save the last entry from the oplog. For example, connect
to the instance using the mongo shell, and enter the following command to switch to the local database:
use local
In mongo shell scripts you can use the following operation to set the db object:
db = db.getSiblingDB('local')
Use the db.collection.save() method and a sort on reverse natural order to find the last entry and save it to a
temporary collection:
db.temp.save( db.oplog.rs.find( { }, { ts: 1, h: 1 } ).sort( {$natural : -1} ).limit(1).next() )
Remove the Existing Oplog Collection Drop the old oplog.rs collection in the local database. Use the following command:
db = db.getSiblingDB('local')
db.oplog.rs.drop()
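The next step inserts into "the new oplog", which presupposes that a new capped collection named oplog.rs has been created in the local database at the desired size. The following is a minimal sketch of a command document for that purpose, assuming the create database command with its capped and size options, where size is given in bytes. The helper name and the 2 gigabyte figure are illustrative only.

```javascript
// Hypothetical helper: build a "create" command document for a new
// capped oplog.rs collection of the requested size.
function createOplogCommand(sizeInGigabytes) {
  if (!(sizeInGigabytes > 0)) {
    throw new Error("oplog size must be a positive number of gigabytes");
  }
  return {
    create: "oplog.rs",                         // command name comes first
    capped: true,                               // the oplog is a capped collection
    size: sizeInGigabytes * 1024 * 1024 * 1024  // size in bytes
  };
}

// Illustrative size: 2 gigabytes.
var createCmd = createOplogCommand(2);
```

In a mongo shell connected to the standalone instance, with db set to the local database, the document could be passed as db.runCommand(createCmd).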
Insert the Last Entry of the Old Oplog into the New Oplog Insert the previously saved last entry from the old
oplog into the new oplog. For example:
db.oplog.rs.save( db.temp.findOne() )
To confirm the entry is in the new oplog, use the following operation:
db.oplog.rs.find()
Restart the Member Restart the mongod as a member of the replica set on its usual port. For example:
db.shutdownServer()
mongod --replSet rs0 --dbpath /srv/mongodb
The replica set member will recover and catch up before it is eligible for election to primary.
Repeat Process for all Members that may become Primary Repeat this procedure for all members whose oplog size
you want to change. Repeat the procedure for the primary as part of the following step.
Change the Size of the Oplog on the Primary To finish the rolling maintenance operation, step down the primary
with the rs.stepDown() method and repeat the oplog resizing procedure above.
Force a Member to Become Primary
Synopsis
You can force a replica set member to become primary by giving it a higher priority (page 469) value than any
other member in the set.
Optionally, you also can force a member never to become primary by setting its priority (page 469) value to 0,
which means the member can never seek election (page 397) as primary. For more information, see Priority 0 Replica
Set Members (page 386).
Procedures
Force a Member to be Primary by Setting its Priority High Changed in version 2.0.
For more information on priorities, see priority (page 469).
This procedure assumes your current primary is m1.example.net and that you'd like to instead make
m3.example.net primary. The procedure also assumes you have a three-member replica set with the configuration below. For more information on configurations, see Replica Set Configuration Use (page 470).
This procedure assumes this configuration:
{
    "_id" : "rs",
    "version" : 7,
    "members" : [
        {
            "_id" : 0,
            "host" : "m1.example.net:27017"
        },
        {
            "_id" : 1,
            "host" : "m2.example.net:27017"
        },
        {
            "_id" : 2,
            "host" : "m3.example.net:27017"
        }
    ]
}
1. In the mongo shell, use the following sequence of operations to make m3.example.net the primary:
cfg = rs.conf()
cfg.members[0].priority = 0.5
cfg.members[1].priority = 0.5
cfg.members[2].priority = 1
rs.reconfig(cfg)
This prevents m1.example.net from being primary for 86,400 seconds (24 hours), even if there is no other
member that can become primary. When m3.example.net catches up with m1.example.net it will
become primary.
If you later want to make m1.example.net primary again while it waits for m3.example.net to catch
up, issue the following command to make m1.example.net seek election again:
rs.freeze()
rs.freeze(120)
3. In a mongo shell connected to the mongod running on mdb0.example.net, step down this instance so that the
mongod is not eligible to become primary for 120 seconds:
rs.stepDown(120)
Warning: During initial sync, mongod will remove the content of the dbpath.
This procedure relies on MongoDB's regular process for initial sync (page 412). This will store the current data on the
member. For an overview of MongoDB's initial sync process, see the Replication Processes (page 410) section.
If the instance has no data, you can simply follow the Add Members to a Replica Set (page 433) or Replace a Replica
Set Member (page 437) procedure to add a new member to a replica set.
You can also force a mongod that is already a member of the set to perform an initial sync by restarting the instance
without the content of the dbpath as follows:
1. Stop the member's mongod instance. To ensure a clean shutdown, use the db.shutdownServer() method
from the mongo shell or, on Linux systems, the mongod --shutdown option.
2. Delete all data and sub-directories from the member's data directory. With the contents of the dbpath removed, MongoDB
will perform a complete resync. Consider making a backup first.
At this point, the mongod will perform an initial sync. The length of the initial sync process depends on the size of
the database and network connection between members of the replica set.
Initial sync operations can impact the other members of the set and create additional traffic to the primary. An initial
sync can only occur if another member of the set is accessible and up to date.
Sync by Copying Data Files from Another Member
This approach seeds a new or stale member using the data
files from an existing member of the replica set. The data files must be sufficiently recent to allow the new member to
catch up with the oplog. Otherwise the member would need to perform an initial sync.
Copy the Data Files
You can capture the data files as either a snapshot or a direct copy. However, in most cases you
cannot copy data files from a running mongod instance to another because the data files will change during the file
copy operation.
Important: If copying data files, you must copy the content of the local database.
You cannot use a mongodump backup for the data files, only a snapshot backup. For approaches to capture a
consistent snapshot of a running mongod instance, see the MongoDB Backup Methods (page 136) documentation.
Sync the Member
After you have copied the data files from the seed source, start the mongod instance and allow
it to apply all operations from the oplog until it reflects the current state of the replica set.
Configure Replica Set Tag Sets
Tag sets let you customize write concern and read preferences for a replica set. MongoDB stores tag sets in the replica
set configuration object, which is the document returned by rs.conf(), in the members[n].tags (page 469)
sub-document.
This section introduces the configuration of tag sets. For an overview on tag sets and their use, see Replica Set Write
Concern (page 49) and Tag Sets (page 408).
Differences Between Read Preferences and Write Concerns
Custom read preferences and write concerns evaluate tag sets in different ways:
Read preferences consider the value of a tag when selecting a member to read from.
Write concerns do not use the value of a tag to select a member except to consider whether or not the value is
unique.
For example, a tag set for a read operation may resemble the following document:
{ "disk": "ssd", "use": "reporting" }
To fulfill such a read operation, a member would need to have both of these tags. Any of the following tag sets would
satisfy this requirement:
{ "disk": "ssd", "use": "reporting" }
{ "disk": "ssd", "use": "reporting", "rack": "a" }
{ "disk": "ssd", "use": "reporting", "rack": "d" }
{ "disk": "ssd", "use": "reporting", "mem": "r"}
The following tag sets would not be able to fulfill this query:
{ "disk": "ssd" }
{ "use": "reporting" }
{ "disk": "ssd", "use": "production" }
{ "disk": "ssd", "use": "production", "rack": "k" }
{ "disk": "spinning", "use": "reporting", "mem": "32" }
You could add tag sets to the members of this replica set with the following command sequence in the mongo shell:
conf = rs.conf()
conf.members[0].tags = { "dc": "east", "use": "production" }
conf.members[1].tags = { "dc": "east", "use": "reporting" }
conf.members[2].tags = { "use": "production" }
rs.reconfig(conf)
After this operation the output of rs.conf() would resemble the following:
{
    "_id" : "rs0",
    "version" : 2,
    "members" : [
        {
            "_id" : 0,
            "host" : "mongodb0.example.net:27017",
            "tags" : {
                "dc": "east",
                "use": "production"
            }
        },
        {
            "_id" : 1,
            "host" : "mongodb1.example.net:27017",
            "tags" : {
                "dc": "east",
                "use": "reporting"
            }
        },
        {
            "_id" : 2,
            "host" : "mongodb2.example.net:27017",
            "tags" : {
                "use": "production"
            }
        }
    ]
}
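To see how a read preference tag set would select among tagged members like these, consider the following sketch. The helper matchesTagSet is hypothetical and only models the rule described above: a member qualifies when it carries every field of the requested tag set with the same value, and any extra tags on the member are ignored.

```javascript
// Hypothetical helper, not a mongo shell method.
// Returns true when memberTags contains every field of tagSet
// with an identical value.
function matchesTagSet(memberTags, tagSet) {
  return Object.keys(tagSet).every(function (key) {
    return memberTags[key] === tagSet[key];
  });
}

var members = [
  { host: "mongodb0.example.net:27017", tags: { dc: "east", use: "production" } },
  { host: "mongodb1.example.net:27017", tags: { dc: "east", use: "reporting" } },
  { host: "mongodb2.example.net:27017", tags: { use: "production" } }
];

// Hosts eligible for a read that requests { dc: "east", use: "reporting" }.
var eligible = members.filter(function (m) {
  return matchesTagSet(m.tags, { dc: "east", use: "reporting" });
}).map(function (m) { return m.host; });
```

Only mongodb1.example.net:27017 carries both requested tags, so it is the only eligible host in this sketch.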
Given a five member replica set with members in two data centers:
1. a facility VA tagged dc.va
2. a facility GTO tagged dc.gto
Create a custom write concern to require confirmation from two data centers using replica set tags, using the following
sequence of operations in the mongo shell:
1. Create a replica set configuration JavaScript object conf:
conf = rs.conf()
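2. Add tags to the replica set members reflecting their locations:
conf.members[0].tags = { "dc.va": "rack1"}
conf.members[1].tags = { "dc.va": "rack2"}
conf.members[2].tags = { "dc.gto": "rack1"}
conf.members[3].tags = { "dc.gto": "rack2"}
conf.members[4].tags = { "dc.va": "rack1"}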
3. Create a custom getLastErrorModes (page 470) setting to ensure that the write operation will propagate
to at least one member of each facility:
conf.settings = { getLastErrorModes: { MultipleDC : { "dc.va": 1, "dc.gto": 1 } } }
4. Reconfigure the replica set using the modified conf configuration object:
rs.reconfig(conf)
To ensure that a write operation propagates to at least one member of the set in both data centers, use the MultipleDC
write concern mode as follows:
db.runCommand( { getLastError: 1, w: "MultipleDC" } )
Alternatively, if you want to ensure that each write operation propagates to at least 2 racks in each facility, reconfigure
the replica set as follows in the mongo shell:
1. Create a replica set configuration object conf:
conf = rs.conf()
2. Redefine the getLastErrorModes (page 470) value to require two different values of both dc.va and
dc.gto:
conf.settings = { getLastErrorModes: { MultipleDC : { "dc.va": 2, "dc.gto": 2 } } }
3. Reconfigure the replica set using the modified conf configuration object:
rs.reconfig(conf)
Now, the following write concern operation will only return after the write operation propagates to at least two different
racks in each facility:
db.runCommand( { getLastError: 1, w: "MultipleDC" } )
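The difference between the two MultipleDC definitions can be sketched in plain JavaScript. The helper modeSatisfied is hypothetical and only models the rule stated earlier: for each tag named in the mode, the write must have reached members carrying at least the required number of distinct values for that tag. It is not the server's implementation.

```javascript
// Hypothetical helper, not a mongo shell method.
// mode:      maps a tag name to the number of distinct values required.
// ackedTags: tag documents of the members that already hold the write.
function modeSatisfied(mode, ackedTags) {
  return Object.keys(mode).every(function (tag) {
    var seen = {};
    ackedTags.forEach(function (tags) {
      if (tags.hasOwnProperty(tag)) { seen[tags[tag]] = true; }
    });
    return Object.keys(seen).length >= mode[tag];
  });
}

var oneRackPerDC  = { "dc.va": 1, "dc.gto": 1 };
var twoRacksPerDC = { "dc.va": 2, "dc.gto": 2 };

// The write has reached two VA members in the same rack and one GTO member.
var acked = [ { "dc.va": "rack1" }, { "dc.va": "rack1" }, { "dc.gto": "rack2" } ];

var firstModeOk  = modeSatisfied(oneRackPerDC, acked);
var secondModeOk = modeSatisfied(twoRacksPerDC, acked);
```

In this sketch the first mode is satisfied, because one rack in each facility has the write. The second is not, because the two VA members share the value rack1 and therefore count as a single distinct value.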
Configure Tag Sets for Functional Segregation of Read and Write Operations
To target a read operation to a member of the replica set with a disk type of ssd, you could use the following tag set:
{ disk: "ssd" }
However, to create comparable write concern modes, you would specify a different set of getLastErrorModes
(page 470) configuration. Consider the following sequence of operations in the mongo shell:
1. Create a replica set configuration object conf:
conf = rs.conf()
2. Redefine the getLastErrorModes (page 470) value to configure two write concern modes:
conf.settings = {
"getLastErrorModes" : {
"ssd" : {
"ssd" : 1
},
"MultipleDC" : {
"dc.va" : 1,
"dc.gto" : 1
}
}
}
3. Reconfigure the replica set using the modified conf configuration object:
rs.reconfig(conf)
Note: Since read preferences and write concerns use the value of fields in tag sets differently, larger deployments may have some redundancy.
Now you can specify the MultipleDC write concern mode, as in the following operation, to ensure that a write
operation propagates to each data center.
db.runCommand( { getLastError: 1, w: "MultipleDC" } )
Additionally, you can specify the ssd write concern mode to ensure that a write operation propagates to at least one
instance with an SSD.
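Following the same pattern as the MultipleDC operation, a request that uses the ssd mode might look like the sketch below. It assumes the ssd mode defined in the configuration above. The typeof guard is only there so that the snippet can be loaded outside the mongo shell, where db and printjson do not exist.

```javascript
// Command document that requests the "ssd" write concern mode.
var cmd = { getLastError: 1, w: "ssd" };

// "db" and "printjson" exist only inside the mongo shell.
if (typeof db !== "undefined") {
  printjson(db.runCommand(cmd));
}
```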
Reconfigure a Replica Set with Unavailable Members
To reconfigure a replica set when a minority of members are unavailable, use the rs.reconfig() operation on
the current primary, following the example in the Replica Set Reconfiguration Procedure (page 470).
This document provides the following options for re-configuring a replica set when a majority of members are not
accessible:
Reconfigure by Forcing the Reconfiguration (page 454)
Reconfigure by Replacing the Replica Set (page 455)
You may need to use one of these procedures, for example, in a geographically distributed replica set, where no local
group of members can reach a majority. See Replica Set Elections (page 397) for more information on this situation.
Reconfigure by Forcing the Reconfiguration
3. On the same member, remove the down and unreachable members of the replica set from the members
(page 467) array by setting the array equal to the surviving members alone. Consider the following example,
which uses the cfg variable created in the previous step:
cfg.members = [cfg.members[0] , cfg.members[4] , cfg.members[7]]
4. On the same member, reconfigure the set by using the rs.reconfig() command with the force option set
to true:
rs.reconfig(cfg, {force : true})
This operation forces the secondary to use the new configuration. The configuration is then propagated to all the
surviving members listed in the members array. The replica set then elects a new primary.
Note: When you use force : true, the version number in the replica set configuration increases significantly, by tens or hundreds of thousands. This is normal and designed to prevent set version collisions if you
accidentally force re-configurations on both sides of a network partition and then the network partitioning ends.
5. If the failure or partition was only temporary, shut down or decommission the removed members as soon as
possible.
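Step 3 selects the surviving members by array index. As an alternative sketch, the same selection can be made by host name, which avoids counting positions in a long members array. The helper keepReachable and the host names are hypothetical; the resulting object would still be passed to rs.reconfig() with the force option as in step 4.

```javascript
// Hypothetical helper, not a mongo shell method.
// Keeps only the members whose host appears in reachableHosts.
// Each surviving member keeps its original _id value.
function keepReachable(cfg, reachableHosts) {
  cfg.members = cfg.members.filter(function (m) {
    return reachableHosts.indexOf(m.host) !== -1;
  });
  return cfg;
}

// Hypothetical four-member configuration in which two members are down.
var cfg = {
  _id: "rs0",
  members: [
    { _id: 0, host: "a.example.net:27017" },
    { _id: 1, host: "b.example.net:27017" },
    { _id: 2, host: "c.example.net:27017" },
    { _id: 3, host: "d.example.net:27017" }
  ]
};

keepReachable(cfg, [ "a.example.net:27017", "d.example.net:27017" ]);
```

After the call, cfg.members holds the members with _id 0 and _id 3 only.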
Reconfigure by Replacing the Replica Set
Use the following procedure only for versions of MongoDB prior to version 2.0. If you're running MongoDB 2.0 or
later, use the above procedure, Reconfigure by Forcing the Reconfiguration (page 454).
These procedures are for situations where a majority of the replica set members are down or unreachable. If a majority
is running, then skip these procedures and instead use the rs.reconfig() command according to the examples in
Example Reconfiguration Operations (page 470).
If you run a pre-2.0 version and a majority of your replica set is down, you have the two options described here. Both
involve replacing the replica set.
Reconfigure by Turning Off Replication
This option replaces the replica set with a standalone server.
1. Stop the surviving mongod instances. To ensure a clean shutdown, use an existing control script or use the
db.shutdownServer() method.
For example, to use the db.shutdownServer() method, connect to the server using the mongo shell and
issue the following sequence of commands:
use admin
db.shutdownServer()
2. Create a backup of the data directory (i.e. dbpath) of the surviving members of the set.
Optional
If you have a backup of the database you may instead remove this data.
3. Restart one of the mongod instances without the --replSet parameter.
The data is now accessible and provided by a single server that is not a replica set member. Clients can use this
server for both reads and writes.
When possible, re-deploy a replica set to provide redundancy and to protect your deployment from operational interruption.
Reconfigure by Breaking the Mirror
This option selects a surviving replica set member to be the new primary
and to seed a new replica set. In the following procedure, the new primary is db0.example.net. MongoDB
copies the data from db0.example.net to all the other members.
1. Stop the surviving mongod instances. To ensure a clean shutdown, use an existing control script or use the
db.shutdownServer() method.
For example, to use the db.shutdownServer() method, connect to the server using the mongo shell and
issue the following sequence of commands:
use admin
db.shutdownServer()
2. Move the data directories (i.e. dbpath) for all the members except db0.example.net, so that all the
members except db0.example.net have empty data directories. For example:
mv /data/db /data/db-old
3. Move the data files for the local database (i.e. local.*) so that db0.example.net has no local database.
For example:
mkdir /data/local-old
mv /data/db/local* /data/local-old/
MongoDB performs an initial sync on the added members by copying all data from db0.example.net to
the added members.
See also:
Resync a Member of a Replica Set (page 449)
Manage Chained Replication
Starting in version 2.0, MongoDB supports chained replication. Chained replication occurs when a secondary
member replicates from another secondary member instead of from the primary. This might be the case, for example,
if a secondary selects its replication target based on ping time and if the closest member is another secondary.
Chained replication can reduce load on the primary. But chained replication can also result in increased replication
lag, depending on the topology of the network.
New in version 2.2.2.
You can use the chainingAllowed (page 470) setting in Replica Set Configuration (page 467) to disable chained
replication for situations where chained replication is causing lag.
MongoDB enables chained replication by default. This procedure describes how to disable it and how to re-enable it.
Note: If chained replication is disabled, you still can use replSetSyncFrom to specify that a secondary replicates
from another secondary. But that configuration will last only until the secondary recalculates which member to sync
from.
To disable chained replication, set the chainingAllowed (page 470) field in Replica Set Configuration (page 467)
to false.
You can use the following sequence of commands to set chainingAllowed (page 470) to false:
1. Copy the configuration settings into the cfg object:
cfg = rs.config()
2. Take note of whether the current configuration settings contain the settings sub-document. If they do, skip
this step.
Warning: To avoid data loss, skip this step if the configuration settings contain the settings sub-document.
If the current configuration settings do not contain the settings sub-document, create the sub-document by
issuing the following command:
cfg.settings = { }
3. Issue the following sequence of commands to set chainingAllowed (page 470) to false:
cfg.settings.chainingAllowed = false
rs.reconfig(cfg)
To re-enable chained replication, set chainingAllowed (page 470) to true. You can use the following sequence
of commands:
cfg = rs.config()
cfg.settings.chainingAllowed = true
rs.reconfig(cfg)
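The caution about the settings sub-document can be folded into a small helper, sketched below. The function setChaining is hypothetical. It creates settings only when the sub-document is absent, so existing values such as getLastErrorModes are left in place. The modified object would still need to be applied with rs.reconfig().

```javascript
// Hypothetical helper, not a mongo shell method.
// Sets chainingAllowed without overwriting an existing settings sub-document.
function setChaining(cfg, allowed) {
  if (cfg.settings === undefined) {
    cfg.settings = {};  // create the sub-document only when it is absent
  }
  cfg.settings.chainingAllowed = allowed;
  return cfg;
}

// A configuration that already has settings keeps them.
var withSettings = setChaining(
  { _id: "rs0", settings: { getLastErrorModes: { ssd: { ssd: 1 } } } },
  false
);

// A configuration without settings gets a new sub-document.
var withoutSettings = setChaining({ _id: "rs0" }, false);
```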
Overview
This document provides two separate procedures for changing the hostnames in the host (page 468) field. Use either
of the following approaches:
Change hostnames without disrupting availability (page 458). This approach ensures your applications will
always be able to read and write data to the replica set, but the approach can take a long time and may incur
downtime at the application layer.
If you use the first procedure, you must configure your applications to connect to the replica set at both the old
and new locations, which often requires a restart and reconfiguration at the application layer and which may
affect the availability of your applications. Re-configuring applications is beyond the scope of this document.
Stop all members running on the old hostnames at once (page 460). This approach has a shorter maintenance
window, but the replica set will be unavailable during the operation.
See also:
Replica Set Reconfiguration Process (page 470), Deploy a Replica Set (page 420), and Add Members to a Replica Set
(page 433).
Assumptions
(d) Use rs.reconfig() to update the replica set configuration document (page 467) with the new hostname.
For example, the following sequence of commands updates the hostname for the secondary at the array
index 1 of the members array (i.e. members[1]) in the replica set configuration document:
cfg = rs.conf()
cfg.members[1].host = "mongodb1.example.net:27017"
rs.reconfig(cfg)
For more information on updating the configuration document, see Example Reconfiguration Operations
(page 470).
(e) Make sure your client applications are able to access the set at the new location and that the secondary has
a chance to catch up with the other members of the set.
Repeat the above steps for each non-primary member of the set.
2. Open a mongo shell connected to the primary and step down the primary using the rs.stepDown() method:
rs.stepDown()
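The hostname update shown in step (d) can also be expressed as a helper that rewrites every matching host in a configuration object. This is only a sketch: renameHosts and the old host names are hypothetical, and the result would still be applied with rs.reconfig() as described above.

```javascript
// Hypothetical helper, not a mongo shell method.
// newNames maps an old "host:port" string to its replacement.
// Members whose host is not listed are left unchanged.
function renameHosts(cfg, newNames) {
  cfg.members.forEach(function (m) {
    if (newNames.hasOwnProperty(m.host)) {
      m.host = newNames[m.host];
    }
  });
  return cfg;
}

// Hypothetical configuration that still uses an old name for member 1.
var cfg = {
  _id: "rs",
  members: [
    { _id: 0, host: "mongodb0.example.net:27017" },
    { _id: 1, host: "old1.example.net:27017" }
  ]
};

renameHosts(cfg, { "old1.example.net:27017": "mongodb1.example.net:27017" });
```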
3. For each member of the replica set, perform the following sequence of operations:
(a) Open a mongo shell connected to the mongod running on the new, temporary port. For example, for a
member running on a temporary port of 37017, you would issue this command:
mongo --port 37017
(b) Edit the replica set configuration manually. The replica set configuration is the only document in the
system.replset collection in the local database. Edit the replica set configuration with the new
hostnames and correct ports for all the members of the replica set. Consider the following sequence of
commands to change the hostnames in a three-member set:
use local
cfg = db.system.replset.findOne( { "_id": "rs" } )
cfg.members[0].host = "mongodb0.example.net:27017"
cfg.members[1].host = "mongodb1.example.net:27017"
cfg.members[2].host = "mongodb2.example.net:27017"
db.system.replset.update( { "_id": "rs" } , cfg )
5. Connect to one of the mongod instances using the mongo shell. For example:
mongo --port 27017
"_id" : 1,
"host" : "mongodb1.example.net:27017"
},
{
"_id" : 2,
"host" : "mongodb2.example.net:27017"
}
]
}
Excessive replication lag makes lagged members ineligible to quickly become primary and increases the possibility
that distributed read operations will be inconsistent.
To check the current length of replication lag:
In a mongo shell connected to the primary, call the db.printSlaveReplicationInfo() method.
The returned document displays the syncedTo value for each member, which shows you when each member
last read from the oplog, as shown in the following example:
source:   m1.example.net:30001
    syncedTo: Tue Oct 02 2012 11:33:40 GMT-0400 (EDT)
        = 7475 secs ago (2.08hrs)
source:   m2.example.net:30002
    syncedTo: Tue Oct 02 2012 11:33:40 GMT-0400 (EDT)
        = 7475 secs ago (2.08hrs)
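The "secs ago" figure in this output is simply the difference between the current time and the syncedTo time. The sketch below reproduces that arithmetic; describeLag is a hypothetical helper, and the rounding to two decimal places is an assumption chosen to match the sample output.

```javascript
// Hypothetical helper, not a mongo shell method.
// Formats the gap between a member's syncedTo time and the current time.
function describeLag(syncedTo, now) {
  var secs = Math.round((now.getTime() - syncedTo.getTime()) / 1000);
  var hrs = Math.round(secs / 36) / 100;  // hours, rounded to two decimals
  return secs + " secs ago (" + hrs + "hrs)";
}

// 11:33:40 GMT-0400 expressed in UTC, plus the 7475 seconds from the example.
var syncedTo = new Date("2012-10-02T15:33:40Z");
var now = new Date(syncedTo.getTime() + 7475 * 1000);

var lag = describeLag(syncedTo, now);
```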
Note: The rs.status() method is a wrapper around the replSetGetStatus database command.
Monitor the rate of replication by watching the oplog time in the replica graph in the MongoDB Management
Service [12]. For more information, see the documentation for MMS [13].
Possible causes of replication lag include:
Network Latency
Check the network routes between the members of your set to ensure that there is no packet loss or network
routing issue.
Use tools including ping to test latency between set members and traceroute to expose the routing of
packets between network endpoints.
Disk Throughput
If the file system and disk device on the secondary are unable to flush data to disk as quickly as the primary,
then the secondary will have difficulty keeping state. Disk-related issues are incredibly prevalent on multitenant systems, including virtualized instances, and can be transient if the system accesses disk devices over an IP
network (as is the case with Amazon's EBS system).
Use system-level tools to assess disk status, including iostat or vmstat.
Concurrency
In some cases, long-running operations on the primary can block replication on secondaries. For best results,
configure write concern (page 47) to require confirmation of replication to secondaries, as described in replica
set write concern (page 49). This prevents write operations from returning if replication cannot keep up with
the write load.
Use the database profiler to see if there are slow queries or long-running operations that correspond to the
incidences of lag.
Appropriate Write Concern
If you are performing a large data ingestion or bulk load operation that requires a large number of writes to the
primary, particularly with unacknowledged write concern (page 47), the secondaries will not be able to read the
oplog fast enough to keep up with changes.
To prevent this, require write acknowledgment or journaled write concern (page 47) after every 100, 1,000, or
another interval to provide an opportunity for secondaries to catch up with the primary.
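One way to structure such a bulk load is sketched below. The function loadWithPeriodicAck and its two callbacks are hypothetical. In the mongo shell, insertOne might wrap a collection insert and waitForAck might wrap a getLastError call with the desired write concern, but those choices are assumptions and depend on your deployment.

```javascript
// Hypothetical helper, not a mongo shell method.
// Inserts every document and waits for acknowledgment after each full
// interval, plus once more for a trailing partial batch.
// Returns the number of acknowledgment waits performed.
function loadWithPeriodicAck(docs, interval, insertOne, waitForAck) {
  var acks = 0;
  docs.forEach(function (doc, i) {
    insertOne(doc);
    if ((i + 1) % interval === 0) {
      waitForAck();
      acks += 1;
    }
  });
  if (docs.length % interval !== 0) {
    waitForAck();  // cover the final partial batch
    acks += 1;
  }
  return acks;
}

// Stand-in callbacks so the sketch runs without a database.
var inserted = 0;
var docs = [];
for (var n = 0; n < 2500; n++) { docs.push({ n: n }); }

var ackCount = loadWithPeriodicAck(
  docs,
  1000,
  function (doc) { inserted += 1; },
  function () { /* e.g. a getLastError call in the mongo shell */ }
);
```

With 2,500 documents and an interval of 1,000, the sketch waits three times: twice after full batches and once for the remaining 500 documents.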
For more information see:
[12] http://mms.mongodb.com/
[13] http://mms.mongodb.com/help/
2. Test the connection from m2.example.net to the other two hosts with the following operation set from
m2.example.net, as in:
mongo --host m1.example.net --port 27017
mongo --host m3.example.net --port 27017
You have now tested the connection between m2.example.net and m1.example.net in both directions.
3. Test the connection from m3.example.net to the other two hosts with the following operation set from the
m3.example.net host, as in:
mongo --host m1.example.net --port 27017
mongo --host m2.example.net --port 27017
If any connection, in any direction, fails, check your networking and firewall configuration and reconfigure your environment to allow these connections.
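For sets with more members, writing out every pair by hand becomes error-prone. The sketch below generates the full list of checks, one per direction. The helper connectivityChecks is hypothetical and only builds command strings of the form used above; it does not run them.

```javascript
// Hypothetical helper, not a mongo shell method.
// Returns one entry per ordered pair of distinct hosts: the host to run the
// check on, and the mongo command that tests the connection to the other host.
function connectivityChecks(hosts, port) {
  var checks = [];
  hosts.forEach(function (from) {
    hosts.forEach(function (to) {
      if (from !== to) {
        checks.push({
          runOn: from,
          command: "mongo --host " + to + " --port " + port
        });
      }
    });
  });
  return checks;
}

var checks = connectivityChecks(
  [ "m1.example.net", "m2.example.net", "m3.example.net" ],
  27017
);
```

For three hosts the sketch produces six checks, two to run on each host.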
Example
Given a three-member replica set where every member has one vote, the set can elect a primary only as long as two
members can connect to each other. If you reboot the two secondaries at once, the primary steps down and becomes
a secondary. Until at least one secondary becomes available, the set has no primary and cannot elect a new primary.
For more information on votes, see Replica Set Elections (page 397). For related information on connection errors,
see Does TCP keepalive time affect sharded clusters and replica sets? (page 587).
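The majority requirement in this example reduces to a single comparison, sketched below. The helper canElectPrimary is hypothetical and deliberately ignores everything except vote counts.

```javascript
// Hypothetical helper, not a mongo shell method.
// A primary can be elected only when the members that can reach each other
// hold a strict majority of all votes in the set.
function canElectPrimary(totalVotes, reachableVotes) {
  return reachableVotes > totalVotes / 2;
}

var twoOfThree = canElectPrimary(3, 2);  // two members connected
var oneOfThree = canElectPrimary(3, 1);  // both secondaries rebooting
```

With three votes in total, two connected members can elect a primary, while a single isolated member cannot.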
Check the Size of the Oplog
A larger oplog can give a replica set a greater tolerance for lag, and make the set more resilient.
To check the size of the oplog for a given replica set member, connect to the member in a mongo shell and run the
db.printReplicationInfo() method.
The output displays the size of the oplog and the date ranges of the operations contained in the oplog. In the following
example, the oplog is about 10MB and is able to fit about 26 hours (94400 seconds) of operations:
configured oplog size:   10.10546875MB
log length start to end: 94400 (26.22hrs)
oplog first event time:  Mon Mar 19 2012 13:50:38 GMT-0400 (EDT)
oplog last event time:   Wed Oct 03 2012 14:59:10 GMT-0400 (EDT)
now:                     Wed Oct 03 2012 15:00:21 GMT-0400 (EDT)
The oplog should be long enough to hold all transactions for the longest downtime you expect on a secondary. At a
minimum, an oplog should be able to hold 24 hours of operations; however, many users prefer to have 72
hours or even a week's worth of operations.
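The comparison between the reported log length and a target window can be sketched as follows. The helper oplogWindow is hypothetical; it takes the first and last event times in seconds and reports whether the span meets a minimum number of hours that you choose.

```javascript
// Hypothetical helper, not a mongo shell method.
// Computes the time span covered by the oplog and compares it with a
// required minimum window, given in hours.
function oplogWindow(firstEventSecs, lastEventSecs, minimumHours) {
  var secs = lastEventSecs - firstEventSecs;
  var hours = Math.round(secs / 36) / 100;  // hours, rounded to two decimals
  return { seconds: secs, hours: hours, meetsMinimum: hours >= minimumHours };
}

// The "log length start to end" value from the example output is 94400.
var against24 = oplogWindow(0, 94400, 24);
var against72 = oplogWindow(0, 94400, 72);
```

A log length of 94400 seconds covers 26.22 hours, which satisfies a 24-hour minimum but not a 72-hour target.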
For more information on how oplog size affects operations, see:
Oplog Size (page 411),
Delayed Replica Set Members (page 388), and
Check the Replication Lag (page 461).
Note: You normally want the oplog to be the same size on all members. If you resize the oplog, resize it on all
members.
To change oplog size, see the Change the Size of the Oplog (page 445) tutorial.
Oplog Entry Timestamp Error
Consider the following error in mongod output and logs:
replSet error fatal couldn't query the local local.oplog.rs collection.
<timestamp> [rsStart] bad replSet oplog entry?
Often, an incorrectly typed value in the ts field in the last oplog entry causes this error. The correct data type is
Timestamp.
Check the type of the ts value using the following two queries against the oplog collection:
db = db.getSiblingDB("local")
db.oplog.rs.find().sort({$natural:-1}).limit(1)
db.oplog.rs.find({ts:{$type:17}}).sort({$natural:-1}).limit(1)
The first query returns the last document in the oplog, while the second returns the last document in the oplog where
the ts value is a Timestamp. The $type operator allows you to select BSON type 17, which is the Timestamp data type.
If the queries don't return the same document, then the last document in the oplog has the wrong data type in the ts
field.
Example
If the first query returns this as the last oplog entry:
{ "ts" : {t: 1347982456000, i: 1},
"h" : NumberLong("8191276672478122996"),
"op" : "n",
"ns" : "",
"o" : { "msg" : "Reconfig set", "version" : 4 } }
And the second query returns this as the last entry where ts has the Timestamp type:
{ "ts" : Timestamp(1347982454000, 1),
"h" : NumberLong("6188469075153256465"),
"op" : "n",
"ns" : "",
"o" : { "msg" : "Reconfig set", "version" : 3 } }
Then the value for the ts field in the last oplog entry is of the wrong data type.
To set the proper type for this value and resolve this issue, use an update operation that resembles the following:
db.oplog.rs.update( { ts: { t:1347982456000, i:1 } },
{ $set: { ts: new Timestamp(1347982456000, 1)}})
Modify the timestamp values as needed based on your oplog entry. This operation may take some time to complete
because the update must scan and pull the entire oplog into memory.
Duplicate Key Error on local.slaves
The duplicate key on local.slaves error occurs when a secondary or slave changes its hostname and the primary or
master tries to update its local.slaves collection with the new name. The update fails because it contains the
same _id value as the document containing the previous hostname. The error itself will resemble the following:
exception 11000 E11000 duplicate key error index: local.slaves.$_id_
This is a benign error and does not affect replication operations on the secondary or slave.
To prevent the error from appearing, drop the local.slaves collection from the primary or master, with the
following sequence of operations in the mongo shell:
use local
db.slaves.drop()
The next time a secondary or slave polls the primary or master, the primary or master recreates the local.slaves
collection.
This reference provides an overview of replica set configuration options and settings.
Use rs.conf() in the mongo shell to retrieve this configuration. Note that default values are not explicitly displayed.
Example Configuration Document
The following document provides a representation of a replica set configuration document. Angle brackets (e.g. < and
>) enclose all optional fields.
{
    _id : <setname>,
    version: <int>,
    members: [
        {
            _id : <ordinal>,
            host : hostname<:port>,
            <arbiterOnly : <boolean>,>
            <buildIndexes : <boolean>,>
            <hidden : <boolean>,>
            <priority: <priority>,>
            <tags: { <document> },>
            <slaveDelay : <number>,>
            <votes : <number>>
        }
        , ...
    ],
    <settings: {
        <getLastErrorDefaults : <lasterrdefaults>,>
        <chainingAllowed : <boolean>,>
        <getLastErrorModes : <modes>>
    }>
}
Configuration Variables
local.system.replset._id
Type: string
Value: <setname>
An _id field holding the name of the replica set. This reflects the set name configured with replSet or
mongod --replSet.
local.system.replset.members
Type: array
Contains an array holding an embedded document for each member of the replica set. The members document
contains a number of fields that describe the configuration of each member of the replica set.
The members (page 467) field in the replica set configuration document is a zero-indexed array.
local.system.replset.members[n]._id
Type: ordinal
Provides the zero-indexed identifier of every member in the replica set.
Note: When updating the replica configuration object, access the replica set members in the members
(page 467) array with the array index. The array index begins with 0. Do not confuse this index value with the
value of the _id (page 467) field in each document in the members (page 467) array.
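The distinction matters most after members have been removed, when array positions and _id values no longer line up. The following sketch looks up the array index for a given _id; the helper indexOfMemberId and the host names are hypothetical.

```javascript
// Hypothetical helper, not a mongo shell method.
// Returns the position in cfg.members of the member with the given _id,
// or -1 when no member has that _id.
function indexOfMemberId(cfg, id) {
  for (var i = 0; i < cfg.members.length; i++) {
    if (cfg.members[i]._id === id) { return i; }
  }
  return -1;
}

// Hypothetical configuration left with members 0, 4, and 7.
var cfg = {
  _id: "rs0",
  members: [
    { _id: 0, host: "a.example.net:27017" },
    { _id: 4, host: "b.example.net:27017" },
    { _id: 7, host: "c.example.net:27017" }
  ]
};

var position = indexOfMemberId(cfg, 7);
```

Here the member with an _id of 7 sits at array index 2, so it is addressed as cfg.members[2] rather than cfg.members[7].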
local.system.replset.members[n].host
Type: <hostname><:port>
Identifies the host name of the set member with a hostname and port number. This name must be resolvable for
every host in the replica set.
Warning: host (page 468) cannot hold a value that resolves to localhost or the local interface unless
all members of the set are on hosts that resolve to localhost.
local.system.replset.members[n].arbiterOnly
Optional.
Type: boolean
Default: false
Identifies an arbiter. For arbiters, this value is true, and is automatically configured by rs.addArb().
local.system.replset.members[n].buildIndexes
Optional.
Type: boolean
Default: true
Determines whether the mongod builds indexes on this member. Do not set to false for instances that receive
queries from clients.
Omitting index creation, and thus this setting, may be useful, if:
You are only using this instance to perform backups using mongodump,
this instance will receive no queries, and
index creation and maintenance overburdens the host system.
If set to false, secondaries configured with this option do build indexes on the _id field, to facilitate operations required for replication.
Warning: You may only set this value when adding a member to a replica set. You may not reconfigure a
replica set to change the value of the buildIndexes (page 468) field after adding the member to the set.
buildIndexes (page 468) is only valid when priority is 0 to prevent these members from becoming
primary. Make all instances that do not build indexes hidden.
Other secondaries cannot replicate from a member where buildIndexes (page 468) is false.
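The constraints on buildIndexes can be checked before a member document is added to the set. The sketch below is a hypothetical validation helper that encodes only the two rules stated above: a member that does not build indexes needs a priority of 0 and should be hidden.

```javascript
// Hypothetical helper, not a mongo shell method.
// Returns a list of problems for a member document that sets
// buildIndexes to false; an empty list means no problem was found.
function buildIndexesProblems(member) {
  var problems = [];
  if (member.buildIndexes === false) {
    if (member.priority !== 0) {
      problems.push("priority must be 0");
    }
    if (member.hidden !== true) {
      problems.push("member should be hidden");
    }
  }
  return problems;
}

var ok = buildIndexesProblems({
  _id: 3, host: "backup.example.net:27017",
  buildIndexes: false, priority: 0, hidden: true
});
var notOk = buildIndexesProblems({
  _id: 3, host: "backup.example.net:27017",
  buildIndexes: false
});
```

The host name backup.example.net is a placeholder. The second call reports both problems, because priority and hidden keep their defaults.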
local.system.replset.members[n].hidden
Optional.
Type: boolean
Default: false
When this value is true, the replica set hides this instance, and does not include the member in the output of
db.isMaster() or isMaster. This prevents read operations (i.e. queries) from ever reaching this host by
way of secondary read preference.
See also:
Hidden Replica Set Members (page 387)
local.system.replset.members[n].priority
Optional.
Type: Number, between 0 and 100.0 including decimals.
Default: 1
Specify higher values to make a member more eligible to become primary, and lower values to make the member
less eligible to become primary. Priorities are only used in comparison to each other. Members of the set will
veto election requests from members when another eligible member has a higher priority value. Changing the
balance of priority in a replica set will trigger an election.
A priority (page 469) of 0 makes it impossible for a member to become primary.
See also:
priority (page 469) and Replica Set Elections (page 397).
local.system.replset.members[n].tags
Optional.
Type: MongoDB Document
Default: none
Used to represent arbitrary values for describing or tagging members for the purposes of extending write concern
to allow configurable data center awareness.
Use in conjunction with getLastErrorModes (page 470) and getLastErrorDefaults (page 470) and
db.getLastError() (i.e. getLastError).
For procedures on configuring tag sets, see Configure Replica Set Tag Sets (page 450).
Important: In tag sets, all tag values must be strings.
local.system.replset.members[n].slaveDelay
Optional.
Type: Integer (seconds)
Default: 0
Describes the number of seconds behind the primary that this replica set member should lag. Use this
option to create delayed members (page 388), which maintain a copy of the data that reflects the state of the data
set at some point in the past, specified in seconds. Typically such delayed members help protect
against human error and provide some measure of insurance against the unforeseen consequences of changes
and updates.
local.system.replset.members[n].votes
Optional.
Type: Integer
Default: 1
Controls the number of votes a server will cast in a replica set election (page 397). The number of votes each
member has can be any non-negative integer, but it is highly recommended each member has 1 or 0 votes.
If you need more than 7 members in one replica set, use this setting to add additional non-voting members with
a votes (page 469) value of 0.
For most deployments and most members, use the default value, 1, for votes (page 469).
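The constraints described for the member fields above can be collected into a small check. The following is a minimal sketch in plain JavaScript (for example under Node.js), not part of the mongo shell: validateMember is a hypothetical helper, and the rule that slaveDelay must be non-negative is an assumption, since the text above only gives its type and default.

```javascript
// Hypothetical helper: checks one entry of the members array against the
// constraints described above. It never contacts a replica set.
function validateMember(member) {
  const problems = [];

  // priority: a number between 0 and 100, decimals allowed, default 1.
  const priority = member.priority === undefined ? 1 : member.priority;
  if (typeof priority !== "number" || !(priority >= 0 && priority <= 100)) {
    problems.push("priority must be a number between 0 and 100");
  }

  // votes: any non-negative integer, default 1.
  const votes = member.votes === undefined ? 1 : member.votes;
  if (!Number.isInteger(votes) || votes < 0) {
    problems.push("votes must be a non-negative integer");
  }

  // slaveDelay: integer number of seconds, default 0 (non-negative is assumed).
  const slaveDelay = member.slaveDelay === undefined ? 0 : member.slaveDelay;
  if (!Number.isInteger(slaveDelay) || slaveDelay < 0) {
    problems.push("slaveDelay must be a non-negative integer number of seconds");
  }

  // tags: every tag value must be a string.
  for (const [key, value] of Object.entries(member.tags || {})) {
    if (typeof value !== "string") {
      problems.push("tag '" + key + "' must have a string value");
    }
  }
  return problems;
}
```

An empty array means the member document passed every check; each string in a non-empty result names one violated constraint.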
local.system.replset.settings
Optional.
Type: MongoDB Document
The settings document configures options that apply to the whole replica set.
local.system.replset.settings.chainingAllowed
Optional.
Type: boolean
Default: true
New in version 2.2.4.
When chainingAllowed (page 470) is true, the replica set allows secondary members to replicate from
other secondary members. When chainingAllowed (page 470) is false, secondaries can replicate only
from the primary.
When you run rs.config() to view a replica set's configuration, the chainingAllowed (page 470) field
appears only when set to false. If not set, chainingAllowed (page 470) is true.
See also:
Manage Chained Replication (page 456)
local.system.replset.settings.getLastErrorDefaults
Optional.
Type: MongoDB Document
Specify arguments to getLastError that members of this replica set will use when getLastError has no
arguments. If you specify any arguments, getLastError ignores these defaults.
local.system.replset.settings.getLastErrorModes
Optional.
Type: MongoDB Document
Defines the names and combination of members (page 467) that the application layer can use with the
getLastError command to guarantee a write concern that provides data-center awareness.
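As a rough illustration of how a custom mode relates to member tags, the sketch below models a configuration in plain JavaScript. It assumes the usual interpretation that a mode such as { dc: 2 } is satisfied once a write has reached members carrying at least two distinct values of the dc tag. The mode name multiDC, the tag values, and the helper modeSatisfied are all hypothetical; the helper only inspects the document and is not a server or driver API.

```javascript
// Hypothetical configuration fragment: two data centers and one custom mode.
const taggedCfg = {
  members: [
    { _id: 0, host: "mongodb0.example.net:27017", tags: { dc: "east" } },
    { _id: 1, host: "mongodb1.example.net:27017", tags: { dc: "east" } },
    { _id: 2, host: "mongodb2.example.net:27017", tags: { dc: "west" } }
  ],
  settings: { getLastErrorModes: { multiDC: { dc: 2 } } }
};

// Returns true when the members that acknowledged a write cover enough
// distinct values of every tag named in the mode (assumed semantics).
function modeSatisfied(config, modeName, acknowledgedIds) {
  const mode = config.settings.getLastErrorModes[modeName];
  return Object.entries(mode).every(([tag, required]) => {
    const seen = new Set();
    for (const m of config.members) {
      if (acknowledgedIds.includes(m._id) && m.tags && m.tags[tag] !== undefined) {
        seen.add(m.tags[tag]);
      }
    }
    return seen.size >= required;
  });
}
```

Under these assumptions, acknowledgement from members 0 and 1 alone does not satisfy multiDC because both carry the same dc value, while acknowledgement from members 0 and 2 does.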
Example Reconfiguration Operations
Most modifications of replica set configuration use the mongo shell. Consider the following reconfiguration operation:
Example
Given the following replica set configuration:
{
"_id" : "rs0",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "mongodb0.example.net:27017"
},
{
"_id" : 1,
"host" : "mongodb1.example.net:27017"
},
{
"_id" : 2,
"host" : "mongodb2.example.net:27017"
}
]
}
The following reconfiguration operation updates the priority (page 469) of the replica set members:
cfg = rs.conf()
cfg.members[0].priority = 0.5
cfg.members[1].priority = 2
cfg.members[2].priority = 2
rs.reconfig(cfg)
First, this operation sets the local variable cfg to the current replica set configuration using the rs.conf() method.
Then it adds priority values to the cfg document for the three sub-documents in the members (page 467) array,
accessing each replica set member by its array index and not by the member's _id (page 467) field. Finally,
it calls the rs.reconfig() method with cfg as its argument to apply this new configuration. The replica set
configuration after this operation will resemble the following:
{
"_id" : "rs0",
"version" : 2,
"members" : [
{
"_id" : 0,
"host" : "mongodb0.example.net:27017",
"priority" : 0.5
},
{
"_id" : 1,
"host" : "mongodb1.example.net:27017",
"priority" : 2
},
{
"_id" : 2,
"host" : "mongodb2.example.net:27017",
"priority" : 2
}
]
}
Using the dot notation demonstrated in the above example, you can modify any existing setting or specify any of
the optional replica set configuration variables (page 467). Until you run rs.reconfig(cfg) at the shell, no changes
will take effect. You can issue cfg = rs.conf() at any time before using rs.reconfig() to undo your
changes and start from the current configuration. If you issue cfg as an operation at any point, the mongo shell
will output the complete document with modifications for your review.
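Because members are addressed by array index rather than by _id, it can be safer to look a member up by its host before changing it. The following is a minimal sketch in plain JavaScript that operates on a plain configuration object like the one shown above; setPriorityByHost is a hypothetical helper, not a shell method, and it returns a modified copy instead of calling rs.reconfig().

```javascript
// Hypothetical helper: returns a copy of cfg with a new priority for the
// member whose host matches. The input document is left untouched.
function setPriorityByHost(cfg, host, priority) {
  const index = cfg.members.findIndex((m) => m.host === host);
  if (index === -1) {
    throw new Error("no member with host " + host);
  }
  // A JSON round trip is enough to deep-copy a plain configuration object.
  const copy = JSON.parse(JSON.stringify(cfg));
  copy.members[index].priority = priority;
  return copy;
}

const before = {
  _id: "rs0",
  version: 1,
  members: [
    { _id: 0, host: "mongodb0.example.net:27017" },
    { _id: 1, host: "mongodb1.example.net:27017" },
    { _id: 2, host: "mongodb2.example.net:27017" }
  ]
};
const after = setPriorityByHost(before, "mongodb1.example.net:27017", 2);
```

Here after.members[1].priority is 2 while before is unchanged, which mirrors the point above that nothing takes effect until the edited document is passed to rs.reconfig().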
The rs.reconfig() operation has a force option, which makes it possible to reconfigure a replica set if a majority of
the replica set is not visible and there is no primary member of the set. Use the following form:
rs.reconfig(cfg, { force: true })
Warning: Forcing a rs.reconfig() can lead to rollbacks and other situations that are difficult to recover from. Exercise caution when using this option.
Note: The rs.reconfig() shell method can force the current primary to step down and trigger an election in
some situations. When the primary steps down, all clients will disconnect. This is by design. Since this typically takes
10-20 seconds, attempt to make such changes during scheduled maintenance periods.
Every mongod instance has its own local database, which stores data used in the replication process, and other
instance-specific data. The local database is invisible to replication: collections in the local database are not
replicated.
In replication, the local database stores internal replication data for each member of a replica set. The local
database stores the following collections:
Changed in version 2.4: When running with authentication (i.e. auth), authenticating to the local database is
not equivalent to authenticating to the admin database. In previous versions, authenticating to the local database
provided access to all databases.
Collection on all mongod Instances
local.startup_log
On startup, each mongod instance inserts a document into startup_log (page 472) with diagnostic information about the mongod instance itself and host information. startup_log (page 472) is a capped collection.
This information is primarily useful for diagnostic purposes.
Example
Consider the following prototype of a document from the startup_log (page 472) collection:
{
"_id" : "<string>",
"hostname" : "<string>",
"startTime" : ISODate("<date>"),
"startTimeLocal" : "<string>",
"cmdLine" : {
"dbpath" : "<path>",
"<option>" : <value>
},
"pid" : <number>,
"buildinfo" : {
"version" : "<string>",
"gitVersion" : "<string>",
"sysInfo" : "<string>",
"loaderFlags" : "<string>",
"compilerFlags" : "<string>",
"allocator" : "<string>",
"versionArray" : [ <num>, <num>, <...> ],
"javascriptEngine" : "<string>",
"bits" : <number>,
"debug" : <boolean>,
"maxBsonObjectSize" : <number>
}
}
Documents in the startup_log (page 472) collection contain the following fields:
local.startup_log._id
Includes the system hostname and a millisecond epoch value.
local.startup_log.hostname
The system's hostname.
local.startup_log.startTime
A UTC ISODate value that reflects when the server started.
local.startup_log.startTimeLocal
A string that reports the startTime (page 473) in the system's local time zone.
local.startup_log.cmdLine
A sub-document that reports the mongod runtime options and their values.
local.startup_log.pid
The process identifier for this process.
local.startup_log.buildinfo
A sub-document that reports information about the build environment and settings used to compile this
mongod. This is the same output as buildInfo. See buildInfo.
local.system.replset
local.system.replset (page 473) holds the replica set's configuration object as its single document. To
view the object's configuration information, issue rs.conf() from the mongo shell. You can also query this
collection directly.
local.oplog.rs
local.oplog.rs (page 473) is the capped collection that holds the oplog. You set its size at creation using
the oplogSize setting. To resize the oplog after replica set initiation, use the Change the Size of the Oplog
(page 445) procedure. For additional information, see the Oplog Size (page 411) section.
local.replset.minvalid
This contains an object used internally by replica sets to track replication status.
local.slaves
This contains information about each member of the set and the latest point in time that this member has synced
to. If this collection becomes out of date, you can refresh it by dropping the collection and allowing MongoDB
to automatically refresh it during normal replication:
db.getSiblingDB("local").slaves.drop()
On the master:
local.oplog.$main
This is the oplog for the master-slave configuration.
local.slaves
This contains information about each slave.
On each slave:
local.sources
This contains information about the slave's master server.
Replica Set Member States
Members of replica sets have states that reflect the startup process, basic operations, and potential error states.
Number  Name                     State Description
0       STARTUP (page 475)       Cannot vote. All members start up in this state. The mongod parses the replica set configuration document (page 437) while in STARTUP (page 475).
1       PRIMARY (page 474)       Can vote. The primary (page 382) is the only member to accept write operations.
2       SECONDARY (page 474)     Can vote. The secondary (page 382) replicates the data store.
3       RECOVERING (page 475)    Can vote. Members either perform startup self-checks, or transition from completing a rollback (page 401) or resync (page 449).
4       FATAL (page 475)         Cannot vote. Has encountered an unrecoverable error.
5       STARTUP2 (page 475)      Cannot vote. Forks replication and election threads before becoming a secondary.
6       UNKNOWN (page 475)       Cannot vote. Has never connected to the replica set.
7       ARBITER (page 474)       Can vote. Arbiters (page ??) do not replicate data and exist solely to participate in elections.
8       DOWN (page 475)          Cannot vote. Is not accessible to the set.
9       ROLLBACK (page 475)      Can vote. Performs a rollback (page 401).
10      SHUNNED (page 475)       Cannot vote. Was once in the replica set but has now been removed.
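For scripts that need to interpret a numeric state, the table above can be transcribed into a lookup. This is a minimal sketch in plain JavaScript; MEMBER_STATES and describeState are hypothetical names, and the values are copied from the table rather than read from a server.

```javascript
// State numbers, names, and voting eligibility as listed in the table above.
const MEMBER_STATES = {
  0: { name: "STARTUP", canVote: false },
  1: { name: "PRIMARY", canVote: true },
  2: { name: "SECONDARY", canVote: true },
  3: { name: "RECOVERING", canVote: true },
  4: { name: "FATAL", canVote: false },
  5: { name: "STARTUP2", canVote: false },
  6: { name: "UNKNOWN", canVote: false },
  7: { name: "ARBITER", canVote: true },
  8: { name: "DOWN", canVote: false },
  9: { name: "ROLLBACK", canVote: true },
  10: { name: "SHUNNED", canVote: false }
};

// Hypothetical helper: returns the table row for a state number, or null
// when the number is not one of the states listed above.
function describeState(code) {
  return MEMBER_STATES[code] || null;
}
```

For example, describeState(3) returns the RECOVERING entry with canVote set to true, and an unlisted number returns null.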
States
Core States
PRIMARY
Members in PRIMARY (page 474) state accept write operations. A replica set has only one primary at a time.
A SECONDARY (page 474) member becomes primary after an election (page 397). Members in the PRIMARY
(page 474) state are eligible to vote.
SECONDARY
Members in SECONDARY (page 474) state replicate the primary's data set and can be configured to accept read
operations. Secondaries are eligible to vote in elections, and may be elected to the PRIMARY (page 474) state if
the primary becomes unavailable.
ARBITER
Members in ARBITER (page 474) state do not replicate data or accept write operations. They are eligible to
vote, and exist solely to break a tie during elections. Replica sets should only have a member in the ARBITER
(page 474) state if the set would otherwise have an even number of members, and could suffer from tied elections. As with primaries, a replica set should have at most one arbiter.
See Replica Set Members (page 382) for more information on core states.
Initialization States
STARTUP
Each member of a replica set starts up in STARTUP (page 475) state. mongod then loads that member's
replica set configuration, and transitions the member's state to STARTUP2 (page 475). Members in STARTUP
(page 475) are not eligible to vote.
STARTUP2
Each member of a replica set enters the STARTUP2 (page 475) state as soon as mongod finishes loading
that member's configuration. While in the STARTUP2 (page 475) state, the member creates threads to handle
internal replication operations. Members are in the STARTUP2 (page 475) state for a short period of time before
entering the RECOVERING (page 475) state. Members in the STARTUP2 (page 475) state are not eligible to
vote.
RECOVERING
A member of a replica set enters RECOVERING (page 475) state when it is not ready to accept reads. The
RECOVERING (page 475) state can occur during normal operation, and doesn't necessarily reflect an error
condition. Members in the RECOVERING (page 475) state are eligible to vote in elections, but are not eligible to
enter the PRIMARY (page 474) state.
During startup, members transition through RECOVERING (page 475) after STARTUP2 (page 475) and before
becoming SECONDARY (page 474).
During normal operation, if a secondary falls behind the other members of the replica set, it may need to resync
(page 449) with the rest of the set. While resyncing, the member enters the RECOVERING (page 475) state.
Whenever the replica set replaces a primary in an election, the old primary's data collection may contain documents that did not have time to replicate to the secondary members. In this case the member rolls back those
writes. During rollback (page 401), the member will have RECOVERING (page 475) state.
On secondaries, the compact and replSetMaintenance commands force the secondary to enter
RECOVERING (page 475) state. This prevents clients from reading during those operations.
Error States
Members in any error state can't vote.
FATAL
Members that encounter an unrecoverable error enter the FATAL (page 475) state. Members in this state require
administrator intervention.
UNKNOWN
Members that have never communicated status information to the replica set are in the UNKNOWN (page 475)
state.
DOWN
Members that lose their connection to the replica set enter the DOWN (page 475) state.
SHUNNED
Members that are removed from the replica set enter the SHUNNED (page 475) state.
ROLLBACK
When a SECONDARY (page 474) rolls back a write operation after transitioning from PRIMARY (page 474), it
enters the ROLLBACK (page 475) state. See Rollbacks During Replica Set Failover (page 401).
primary
All read operations use only the current replica set primary. This is the default. If the primary is unavailable,
read operations produce an error or throw an exception.
The primary (page 476) read preference mode is not compatible with read preference modes that use tag sets
(page 408). If you specify a tag set with primary (page 476), the driver will produce an error.
primaryPreferred
In most situations, operations read from the primary member of the set. However, if the primary is unavailable,
as is the case during failover situations, operations read from secondary members.
When the read preference includes a tag set (page 408), the client reads first from the primary, if available, and
then from secondaries that match the specified tags. If no secondaries have matching tags, the read operation
produces an error.
Since the application may receive data from a secondary, read operations using the primaryPreferred
(page 476) mode may return stale data in some situations.
Warning: Changed in version 2.2: mongos added full support for read preferences.
When connecting to a mongos instance older than 2.2, using a client that supports read preference modes,
primaryPreferred (page 476) will send queries to secondaries.
secondary
Operations read only from the secondary members of the set. If no secondaries are available, then this read
operation produces an error or exception.
Most sets have at least one secondary, but there are situations where there may be no available secondary. For
example, a set with a primary, a secondary, and an arbiter may not have any secondaries if a member is in
recovering state or unavailable.
When the read preference includes a tag set (page 408), the client attempts to find secondary members that
match the specified tag set and directs reads to a random secondary from among the nearest group (page 408).
Read operations using the secondary (page 476) mode may return stale data.
secondaryPreferred
In most situations, operations read from secondary members, but in situations where the set consists of a single
primary (and no other members), the read operation will use the set's primary.
When the read preference includes a tag set (page 408), the client attempts to find a secondary member that
matches the specified tag set and directs reads to a random secondary from among the nearest group (page 408).
If no secondaries have matching tags, the client ignores tags and reads from the primary.
Read operations using the secondaryPreferred (page 477) mode may return stale data.
nearest
The driver reads from the nearest member of the set according to the member selection (page 408) process.
Reads in the nearest (page 477) mode do not consider the member's type. Reads in nearest (page 477)
mode may read from both primaries and secondaries.
Set this mode to minimize the effect of network latency on read operations without preference for current or
stale data.
If you specify a tag set (page 408), the client attempts to find a replica set member that matches the specified
tag set and directs reads to an arbitrary member from among the nearest group (page 408).
Read operations using the nearest (page 477) mode may return stale data.
Note: All operations read from a member of the nearest group of the replica set that matches the specified
read preference mode. The nearest (page 477) mode prefers low latency reads over a member's primary or
secondary status.
For nearest (page 477), the client assembles a list of acceptable hosts based on tag set and then narrows that
list to the host with the shortest ping time and all other members of the set that are within the local threshold,
or acceptable latency. See Member Selection (page 408) for more information.
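The mode descriptions above can be summarized as a candidate-selection routine. The sketch below is an illustrative model in plain JavaScript, not the code of any driver or of mongos: the member objects, the candidates function, and the single-document tag set are all assumptions made for the example (real tag sets are arrays of tag documents), and the 15 millisecond default only mirrors the threshold mentioned in this section. An empty result stands for a read that would produce an error, and a real client would pick one member from the returned group.

```javascript
// True when the member carries every key/value pair of the tag document.
function matchesTags(member, tagSet) {
  return Object.entries(tagSet || {}).every(([k, v]) => (member.tags || {})[k] === v);
}

// Keeps the fastest member plus every member within thresholdMs of it.
function nearestGroup(members, thresholdMs) {
  if (members.length === 0) return [];
  const fastest = Math.min(...members.map((m) => m.pingMs));
  return members.filter((m) => m.pingMs <= fastest + thresholdMs);
}

// members: [{ host, state: "PRIMARY" | "SECONDARY", pingMs, tags }]
function candidates(members, mode, tagSet, thresholdMs = 15) {
  const primary = members.filter((m) => m.state === "PRIMARY");
  const secondaries = members.filter((m) => m.state === "SECONDARY");
  const tagged = secondaries.filter((m) => matchesTags(m, tagSet));
  switch (mode) {
    case "primary":
      if (tagSet && Object.keys(tagSet).length > 0) {
        throw new Error("primary mode cannot be combined with a tag set");
      }
      return primary;
    case "primaryPreferred":
      return primary.length > 0 ? primary : nearestGroup(tagged, thresholdMs);
    case "secondary":
      return nearestGroup(tagged, thresholdMs);
    case "secondaryPreferred": {
      const group = nearestGroup(tagged, thresholdMs);
      return group.length > 0 ? group : primary; // falls back, ignoring tags
    }
    case "nearest": {
      const readable = primary.concat(secondaries);
      return nearestGroup(readable.filter((m) => matchesTags(m, tagSet)), thresholdMs);
    }
    default:
      throw new Error("unknown read preference mode: " + mode);
  }
}
```

The model makes the trade-offs visible: primary never returns a secondary, secondaryPreferred falls back to the primary when no tagged secondary qualifies, and nearest ignores member type entirely.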
Use Cases
Depending on the requirements of an application, you can configure different applications to use different read preferences, or use different read preferences for different queries in the same application. Consider the following applications for different read preference strategies.
Maximize Consistency To avoid stale reads under all circumstances, use primary (page 476). This prevents all
queries when the set has no primary, which happens during elections, or when a majority of the replica set is not
accessible.
Maximize Availability To permit read operations when possible, use primaryPreferred (page 476). When
there's a primary you will get consistent reads, but if there is no primary you can still query secondaries.
14 If your set has more than one secondary, and you use the secondary (page 476) read preference mode, consider the following effect. If
you have a three member replica set (page 392) with a primary and two secondaries, and if one secondary becomes unavailable, all secondary
(page 476) queries must target the remaining secondary. This will double the load on this secondary. Plan and provide capacity to support this as
needed.
Minimize Latency To always read from a low-latency node, use nearest (page 477). The driver or mongos will
read from the nearest member and those no more than 15 milliseconds (see footnote 15) further away than the nearest member.
nearest (page 477) does not guarantee consistency. If the nearest member to your application server is a secondary
with some replication lag, queries could return stale data. nearest (page 477) only reflects network distance and
does not reflect I/O or CPU load.
Query From Geographically Distributed Members If the members of a replica set are geographically distributed,
you can create replica set tags that reflect the location of each instance and then configure your application to query
the members nearby.
For example, if members in east and west data centers are tagged (page 450) {'dc': 'east'} and {'dc':
'west'}, your application servers in the east data center can read from nearby members with the following read
preference:
db.collection.find().readPref( 'nearest', [ { 'dc': 'east' } ] )
Although nearest (page 477) already favors members with low network latency, including the tag makes the choice
more predictable.
Reduce load on the primary To shift read load from the primary, use mode secondary (page 476). Although
secondaryPreferred (page 477) is tempting for this use case, it carries some risk: if all secondaries are unavailable and your set has enough arbiters to prevent the primary from stepping down, then the primary will receive all
traffic from clients. If the primary is unable to handle this load, queries will compete with writes. For this reason, use
secondary (page 476) to distribute read load to replica sets, not secondaryPreferred (page 477).
Read Preferences for Database Commands
Because some database commands read and return data from the database, all of the official drivers support full read
preference mode semantics (page 476) for the following commands:
group
mapReduce (see footnote 16)
aggregate
collStats
dbStats
count
distinct
geoNear
geoSearch
geoWalk
New in version 2.4: mongos adds support for routing commands to shards using read preferences. Previously
mongos sent all commands to the shards' primaries.
15 This threshold is configurable. See localThreshold for mongos or your driver documentation for the appropriate setting.
16 Only inline mapReduce operations that do not write data support read preference; otherwise these operations must run on the primary
members.
CHAPTER 9
Sharding
Sharding is the process of storing data records across multiple machines and is MongoDB's approach to meeting the
demands of data growth. As the size of the data increases, a single machine may not be sufficient to store the data nor
provide an acceptable read and write throughput. Sharding solves the problem with horizontal scaling. With sharding,
you add more machines to support data growth and the demands of read and write operations.
Sharding Introduction (page 479) A high-level introduction to horizontal scaling, data partitioning, and sharded
clusters in MongoDB.
Sharding Concepts (page 484) The core documentation of sharded cluster features, configuration, architecture and
behavior.
Sharded Cluster Components (page 485) A sharded cluster consists of shards, config servers, and mongos
instances.
Sharded Cluster Architectures (page 489) Outlines the requirements for sharded clusters, and provides examples of several possible architectures for sharded clusters.
Sharded Cluster Behavior (page 490) Discusses the operations of sharded clusters with regards to the automatic balancing of data in a cluster and other related availability and security considerations.
Sharding Mechanics (page 500) Discusses the internal operation and behavior of sharded clusters, including
chunk migration, balancing, and the cluster metadata.
Sharded Cluster Tutorials (page 506) Tutorials that describe common procedures and administrative operations relevant to the use and maintenance of sharded clusters.
Sharding Reference (page 546) Reference for sharding-related functions and operations.
Vertical scaling adds more CPU and storage resources to increase capacity. Scaling by adding capacity has limitations: high performance systems with large numbers of CPUs and large amounts of RAM are disproportionately
more expensive than smaller systems. Additionally, cloud-based providers may only allow users to provision smaller
instances. As a result there is a practical maximum capability for vertical scaling.
Sharding, or horizontal scaling, by contrast, divides the data set and distributes the data over multiple servers, or
shards. Each shard is an independent database, and collectively, the shards make up a single logical database.
Figure 9.1: Diagram of a large collection with data distributed across 4 shards.
Sharding addresses the challenge of scaling to support high throughput and large data sets:
Sharding reduces the number of operations each shard handles. Each shard processes fewer operations as the
cluster grows. As a result, sharded clusters can increase capacity and throughput horizontally.
For example, to insert data, the application only needs to access the shard responsible for that record.
Sharding reduces the amount of data that each server needs to store. Each shard stores less data as the cluster
grows.
For example, if a database has a 1 terabyte data set, and there are 4 shards, then each shard might hold only
256GB of data. If there are 40 shards, then each shard might hold only 25GB of data.
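The figures in the example above follow from a simple division. This sketch assumes a perfectly even distribution and treats 1 terabyte as 1024 gigabytes; gigabytesPerShard is a hypothetical helper used only to show the arithmetic.

```javascript
// Idealized per-shard data size, assuming chunks are spread perfectly evenly.
function gigabytesPerShard(totalGigabytes, shardCount) {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new Error("shardCount must be a positive integer");
  }
  return totalGigabytes / shardCount;
}

const ONE_TERABYTE_IN_GB = 1024;
const withFourShards = gigabytesPerShard(ONE_TERABYTE_IN_GB, 4);   // 256
const withFortyShards = gigabytesPerShard(ONE_TERABYTE_IN_GB, 40); // 25.6, about 25 GB
```

Real clusters rarely balance this precisely, which is why the text says each shard "might" hold these amounts.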
Figure 9.2: Diagram of a sample sharded cluster for production purposes. Contains exactly 3 config servers, 2 or more
mongos query routers, and at least 2 shards. The shards are replica sets.
A sharded cluster has the following components: shards, query routers and config servers.
Shards store the data. To provide high availability and data consistency, in a production sharded cluster, each shard is
a replica set. For more information on replica sets, see Replica Sets (page 381).
Query Routers, or mongos instances, interface with client applications and direct operations to the appropriate shard
or shards. The query router processes and targets operations to shards and then returns results to the clients. A sharded
cluster can contain more than one query router to divide the client request load. A client sends requests to one query
router. Most sharded clusters have many query routers.
Config servers store the cluster's metadata. This data contains a mapping of the cluster's data set to the shards. The
query router uses this metadata to target operations to specific shards. Production sharded clusters have exactly 3
config servers.
Shard Keys
To shard a collection, you need to select a shard key. A shard key is either an indexed field or an indexed compound
field that exists in every document in the collection. MongoDB divides the shard key values into chunks and distributes
the chunks evenly across the shards. To divide the shard key values into chunks, MongoDB uses either range based
partitioning or hash based partitioning. See Shard Keys (page 492) for more information.
Range Based Sharding
For range-based sharding, MongoDB divides the data set into ranges determined by the shard key values to provide
range based partitioning. Consider a numeric shard key: If you visualize a number line that goes from negative
infinity to positive infinity, each value of the shard key falls at some point on that line. MongoDB partitions this line
into smaller, non-overlapping ranges called chunks where a chunk is a range of values from some minimum value to
some maximum value.
Given a range based partitioning system, documents with close shard key values are likely to be in the same chunk,
and therefore on the same shard.
Figure 9.3: Diagram of the shard key value space segmented into smaller ranges or chunks.
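The number-line picture of range based partitioning can be sketched as a lookup over a chunk table. The example below is plain JavaScript with invented data: the chunk boundaries, the shard names, and the findChunk helper are hypothetical, and each chunk is treated as covering its minimum value up to, but not including, its maximum value.

```javascript
// Hypothetical chunk table for a numeric shard key. Each chunk covers
// [min, max): min is inclusive and max is exclusive, so ranges never overlap.
const chunks = [
  { min: -Infinity, max: 0, shard: "shardA" },
  { min: 0, max: 100, shard: "shardB" },
  { min: 100, max: Infinity, shard: "shardC" }
];

// Returns the chunk whose range contains the shard key value.
function findChunk(chunkTable, keyValue) {
  return chunkTable.find((c) => keyValue >= c.min && keyValue < c.max);
}
```

With this table, the key values 25 and 26 both fall in the chunk on shardB, which illustrates why documents with close shard key values tend to live on the same shard.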
Figure 9.5: Diagram of a shard with a chunk that exceeds the default chunk size of 64 MB and triggers a split of the
chunk into two chunks.
Balancing
The balancer (page 501) is a background process that manages chunk migrations. The balancer runs in all of the query
routers in a cluster.
When the distribution of a sharded collection in a cluster is uneven, the balancer process migrates chunks from the
shard that has the largest number of chunks to the shard with the least number of chunks until the collection balances.
For example: if collection users has 100 chunks on shard 1 and 50 chunks on shard 2, the balancer will migrate
chunks from shard 1 to shard 2 until the collection achieves balance.
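The balancing behavior described above can be modeled with a short simulation. This is a deliberately simplified sketch in plain JavaScript, not the balancer's actual algorithm: it moves one chunk at a time from the shard with the most chunks to the shard with the fewest, and the stopping rule (stop once the difference drops below a threshold of 2) is an assumption made for the example. The balance function and the shard names are hypothetical.

```javascript
// Simplified model: repeatedly move one chunk from the fullest shard to the
// emptiest shard until their chunk counts differ by less than `threshold`.
function balance(chunkCounts, threshold = 2) {
  if (threshold < 2) {
    throw new Error("threshold must be at least 2 so that the loop terminates");
  }
  const counts = { ...chunkCounts }; // work on a copy
  const shards = Object.keys(counts);
  let migrations = 0;
  while (shards.length > 1) {
    const most = shards.reduce((a, b) => (counts[b] > counts[a] ? b : a));
    const least = shards.reduce((a, b) => (counts[b] < counts[a] ? b : a));
    if (counts[most] - counts[least] < threshold) break;
    counts[most] -= 1;
    counts[least] += 1;
    migrations += 1;
  }
  return { counts, migrations };
}
```

Starting from 100 chunks on one shard and 50 on another, the model ends with 75 chunks on each shard after 25 migrations.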
The shards manage chunk migrations as a background operation. During migration, all requests for a chunk's data
address the origin shard.
In a chunk migration, the destination shard receives all the documents in the chunk from the origin shard. Then, the
destination shard captures and applies all changes made to the data during the migration process. Finally, the destination
shard updates the metadata regarding the location of the chunk on the config server.
If there's an error during the migration, the balancer aborts the process leaving the chunk on the origin shard. MongoDB removes the chunk's data from the origin shard after the migration completes successfully.
Figure 9.6: Diagram of a collection distributed across three shards. For this collection, the difference in the number of
chunks between the shards reaches the migration thresholds (in this case, 2) and triggers migration.
Config Servers (page 488) Config servers hold the metadata about the cluster, such as the shard location of the
data.
Sharded Cluster Architectures (page 489) Outlines the requirements for sharded clusters, and provides examples of
several possible architectures for sharded clusters.
Sharded Cluster Requirements (page 489) Discusses the requirements for sharded clusters in MongoDB.
Production Cluster Architecture (page 490) A sharded cluster for production has component requirements to
provide redundancy and high availability.
Sharded Cluster Behavior (page 490) Discusses the operations of sharded clusters with regards to the automatic balancing of data in a cluster and other related availability and security considerations.
Shard Keys (page 492) MongoDB uses the shard key to divide a collection's data across the cluster's shards.
Sharded Cluster High Availability (page 494) Sharded clusters provide ways to address some availability concerns.
Sharded Cluster Query Routing (page 496) The cluster's routers, or mongos instances, send reads and writes
to the relevant shard or shards.
Sharding Mechanics (page 500) Discusses the internal operation and behavior of sharded clusters, including chunk
migration, balancing, and the cluster metadata.
Sharded Collection Balancing (page 501) Balancing distributes a sharded collection's data across all of the
shards in the cluster.
Sharded Cluster Metadata (page 505) The cluster maintains internal metadata that reflects the location of data
within the cluster.
particular order to the data set on a specific shard. MongoDB does not guarantee that any two contiguous chunks will
reside on a single shard.
Primary Shard
Every database has a primary shard (see footnote 2) that holds all the un-sharded collections in that database.
Figure 9.8: Diagram of a primary shard. A primary shard contains non-sharded collections as well as chunks of
documents from sharded collections. Shard A is the primary shard.
To change the primary shard for a database, use the movePrimary command.
Warning: The movePrimary command can be expensive because it copies all non-sharded data to the new
shard. During this time, this data will be unavailable for other operations.
When you deploy a new sharded cluster, the first shard becomes the primary shard for all existing databases before
enabling sharding. Databases created subsequently may reside on any shard in the cluster.
Shard Status
Use the sh.status() method in the mongo shell to see an overview of the cluster. This report includes which
shard is primary for the database and the chunk distribution across the shards. See sh.status() method for more
details.
2 The term primary shard has nothing to do with the term primary in the context of replica sets.
Config Servers
Config servers are special mongod instances that store the metadata (page 505) for a sharded cluster. Config servers
use a two-phase commit to ensure immediate consistency and reliability. Config servers do not run as replica sets. All
config servers must be available to deploy a sharded cluster or to make any changes to cluster metadata.
A production sharded cluster has exactly three config servers. For testing purposes you may deploy a cluster with a
single config server. But to ensure redundancy and safety in production, you should always use three.
Warning: If your cluster has a single config server, then the config server is a single point of failure. If the config
server is inaccessible, the cluster is not accessible. If you cannot recover the data on a config server, the cluster
will be inoperable.
Always use three config servers for production deployments.
Config servers store metadata for a single sharded cluster. Each cluster must have its own config servers.
Tip
Use CNAMEs to identify your config servers to the cluster so that you can rename and renumber your config servers
without downtime.
Config Database
Config servers store the metadata in the config database (page 547). The mongos instances cache this data and use it
to route reads and writes to shards.
Read and Write Operations on Config Servers
MongoDB only writes data to the config server in the following cases:
To create splits in existing chunks. For more information, see chunk splitting (page 504).
To migrate a chunk between shards. For more information, see chunk migration (page 502).
MongoDB reads data from the config server in the following cases:
A new mongos starts for the first time, or an existing mongos restarts.
After a chunk migration, the mongos instances update themselves with the new cluster metadata.
MongoDB also uses the config server to manage distributed locks.
Config Server Availability
If one or two config servers become unavailable, the cluster's metadata becomes read only. You can still read and
write data from the shards, but no chunk migrations or splits will occur until all three servers are available.
If all three config servers are unavailable, you can still use the cluster if you do not restart the mongos instances
until after the config servers are accessible again. If you restart the mongos instances before the config servers are
available, the mongos will be unable to route reads and writes.
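The availability rules above can be summarized as a small decision function. This is only an illustrative sketch for a cluster with three config servers; the function name clusterCapabilities and the fields it returns are hypothetical and are not part of MongoDB.

```javascript
// Hypothetical helper that models the availability rules described above
// for a production cluster with three config servers.
function clusterCapabilities(availableConfigServers, mongosRestarted) {
  if (availableConfigServers === 3) {
    // All config servers are reachable: the metadata can change.
    return { readsAndWrites: true, migrationsAndSplits: true };
  }
  if (availableConfigServers >= 1) {
    // One or two servers are down: the metadata is read only.
    return { readsAndWrites: true, migrationsAndSplits: false };
  }
  // No config server is reachable: a running mongos keeps routing with its
  // cached metadata, but a restarted mongos cannot route reads and writes.
  return { readsAndWrites: !mongosRestarted, migrationsAndSplits: false };
}

console.log(clusterCapabilities(2, false));
console.log(clusterCapabilities(0, true));
```

The sketch makes the key operational point visible: with no config servers reachable, the only thing that keeps the cluster usable is not restarting the mongos instances.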
Clusters become inoperable without the cluster metadata. Always ensure that the config servers remain available and
intact. As such, backups of config servers are critical. The data on the config server is small compared to the data
stored in a cluster. This means the config server has a relatively low activity load, and the config server does not need
to be always available to support a sharded cluster. As a result, it is easy to back up the config servers.
Chapter 9. Sharding
If the name or address that a sharded cluster uses to connect to a config server changes, you must restart every mongod
and mongos instance in the sharded cluster. Avoid downtime by using CNAMEs to identify config servers within the
MongoDB deployment.
See Renaming Config Servers and Cluster Availability (page 495) for more information.
Your cluster should manage a large quantity of data if sharding is to have an effect. The default chunk size is 64
megabytes. And the balancer (page 501) will not begin moving data across shards until the imbalance of chunks among
the shards exceeds the migration threshold (page 502). In practical terms, unless your cluster has many hundreds of
megabytes of data, your data will remain on a single shard.
In some situations, you may need to shard a small collection of data. But most of the time, sharding a small collection
is not worth the added complexity and overhead unless you need additional write capacity. If you have a small data
set, a properly configured single MongoDB instance or a replica set will usually be enough for your persistence layer
needs.
Chunk size is user configurable. For most deployments, the default value of 64 megabytes is ideal. See
Chunk Size (page 504) for more information.
Production Cluster Architecture
In a production cluster, you must ensure that data is redundant and that your systems are highly available. To that end,
a production cluster must have the following components:
Components
Config Servers Three config servers (page 488). Each config server must be on a separate machine. A single sharded
cluster must have exclusive use of its config servers (page 488). If you have multiple sharded clusters, you will need
to have a group of config servers for each cluster.
Shards Two or more replica sets. These replica sets are the shards. For information on replica sets, see Replication
(page 377).
Query Routers (mongos) One or more mongos instances. The mongos instances are the routers for the cluster.
Typically, deployments have one mongos instance on each application server.
You may also deploy a group of mongos instances and use a proxy/load balancer between the application and the
mongos. In these deployments, you must configure the load balancer for client affinity so that every connection from
a single client reaches the same mongos.
Because cursors and other resources are specific to a single mongos instance, each client must interact with only
one mongos instance.
Example
Figure 9.9: Diagram of a sample sharded cluster for production purposes. Contains exactly 3 config servers, 2 or more
mongos query routers, and at least 2 shards. The shards are replica sets.
Figure 9.10: Diagram of a sample sharded cluster for testing/development purposes only. Contains only 1 config
server, 1 mongos router, and at least 1 shard. The shard can be either a replica set or a standalone mongod instance.
Shard Keys (page 492) MongoDB uses the shard key to divide a collection's data across the cluster's shards.
Sharded Cluster High Availability (page 494) Sharded clusters provide ways to address some availability concerns.
Sharded Cluster Query Routing (page 496) The cluster's routers, or mongos instances, send reads and writes to the
relevant shard or shards.
Shard Keys
The shard key determines the distribution of the collection's documents among the cluster's shards. The shard key is
either an indexed field or an indexed compound field that exists in every document in the collection.
MongoDB partitions data in the collection using ranges of shard key values. Each range, or chunk, defines a non-overlapping range of shard key values. MongoDB distributes the chunks, and their documents, among the shards in
the cluster.
When a chunk grows beyond the chunk size (page 504), MongoDB splits the chunk into smaller chunks, always based
on ranges in the shard key.
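As a rough illustration of range-based partitioning, the following sketch looks up the chunk that holds a given shard key value. The chunk list, the shard names, and the findChunk function are hypothetical, and -Infinity and Infinity stand in for the minimum and maximum key bounds. Each range includes its lower bound and excludes its upper bound.

```javascript
// Hypothetical in-memory model of chunk metadata. Each chunk covers the
// range [min, max): inclusive lower bound, exclusive upper bound.
const chunks = [
  { min: -Infinity, max: 100, shard: "shardA" },
  { min: 100, max: 200, shard: "shardB" },
  { min: 200, max: Infinity, shard: "shardC" },
];

// Return the chunk whose range contains the given shard key value.
function findChunk(chunkList, shardKeyValue) {
  return chunkList.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
}

console.log(findChunk(chunks, 99).shard);  // "shardA"
console.log(findChunk(chunks, 100).shard); // "shardB": 100 is a lower bound
```

Because the ranges do not overlap and together cover the whole key space, every shard key value maps to exactly one chunk.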
Considerations
Shard keys are immutable and cannot be changed after insertion. See the system limits for sharded cluster for more
information.
The index on the shard key cannot be a multikey index (page 324).
Figure 9.11: Diagram of the shard key value space segmented into smaller ranges or chunks.
Hashed Shard Keys
The shard key affects write and query performance by determining how MongoDB partitions data in the cluster
and how effectively the mongos instances can direct operations to the cluster. Consider the following operational
impacts of shard key selection:
Write Scaling Some possible shard keys will allow your application to take advantage of the increased write capacity
that the cluster can provide, while others do not. Consider the following example where you shard by the values of the
default _id field, which is ObjectId.
MongoDB generates ObjectId values upon document creation to produce a unique identifier for the object. However, the most significant bits of data in this value represent a time stamp, which means that they increment in a regular
and predictable pattern. Even though this value has high cardinality (page 512), when using this, any date, or other
monotonically increasing number as the shard key, all insert operations will be storing data into a single chunk, and
therefore, a single shard. As a result, the write capacity of this shard will define the effective write capacity of the
cluster.
9.2. Sharding Concepts
A shard key that increases monotonically will not hinder performance if you have a very low insert rate, or if most
of your write operations are update() operations distributed through your entire data set. Generally, choose shard
keys that have both high cardinality and will distribute write operations across the entire cluster.
Typically, a computed shard key that has some amount of randomness, such as one that includes a cryptographic
hash (e.g. MD5 or SHA1) of other content in the document, will allow the cluster to scale write operations. However,
random shard keys do not typically provide query isolation (page 494), which is another important characteristic of
shard keys.
New in version 2.4: MongoDB makes it possible to shard a collection on a hashed index. This can greatly improve
write scaling. See Shard a Collection Using a Hashed Shard Key (page 513).
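The effect of a monotonically increasing key on write distribution can be seen in a small simulation. This is a sketch under stated assumptions: the four chunks, the shard names, and the helper functions are hypothetical, and the MD5-based mapping only illustrates a randomized key. It is not the function MongoDB uses for hashed indexes.

```javascript
const crypto = require("crypto");

// Four hypothetical chunks over a numeric key space. The last chunk is
// open-ended, so it receives every key larger than the existing ones.
const rangeChunks = [
  { min: -Infinity, max: 250, shard: "shardA" },
  { min: 250, max: 500, shard: "shardB" },
  { min: 500, max: 750, shard: "shardC" },
  { min: 750, max: Infinity, shard: "shardD" },
];

function shardFor(value) {
  return rangeChunks.find((c) => value >= c.min && value < c.max).shard;
}

// Map a key to a pseudo-random position in [0, 1000) using MD5.
// Illustration only, not MongoDB's hashed index function.
function hashedPosition(key) {
  const hex = crypto.createHash("md5").update(String(key)).digest("hex");
  return parseInt(hex.slice(0, 8), 16) % 1000;
}

// Count how many inserts each shard receives.
function countInserts(keys, toPosition) {
  const counts = {};
  for (const key of keys) {
    const shard = shardFor(toPosition(key));
    counts[shard] = (counts[shard] || 0) + 1;
  }
  return counts;
}

// New, ever-increasing keys, such as timestamps or counters.
const newKeys = Array.from({ length: 1000 }, (_, i) => 1000 + i);

const monotonicCounts = countInserts(newKeys, (k) => k);
const hashedCounts = countInserts(newKeys, hashedPosition);

console.log(monotonicCounts); // every insert lands on shardD
console.log(hashedCounts);    // inserts are spread over several shards
```

The monotonic key sends all 1000 inserts to the shard that owns the open-ended last chunk, while the randomized key spreads them out, at the cost of the query isolation discussed below.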
Querying The mongos provides an interface for applications to interact with sharded clusters that hides the complexity of data partitioning. A mongos receives queries from applications, and uses metadata from the config server
(page 488) to route queries to the mongod instances with the appropriate data. While the mongos succeeds in making all querying operational in sharded environments, the shard key you select can have a profound effect on query
performance.
See also:
The Sharded Cluster Query Routing (page 496) and config server (page 488) sections for a more general overview of
querying in sharded environments.
Query Isolation The fastest queries in a sharded environment are those that mongos will route to a single shard,
using the shard key and the cluster metadata from the config server (page 488). For queries that don't include the
shard key, mongos must query all shards, wait for their responses, and then return the result to the application. These
scatter/gather queries can be long running operations.
If your query includes the first component of a compound shard key 3, the mongos can route the query directly to a
single shard, or a small number of shards, which provides better performance. Even if the shard key values that you query
reside in different chunks, the mongos will route queries directly to specific shards.
To select a shard key for a collection:
determine the most commonly included fields in queries for a given application
find which of these operations are most performance dependent.
If this field has low cardinality (i.e. is not sufficiently selective), you should add a second field to the shard key, making a
compound shard key. The data may become more splittable with a compound shard key.
See Sharded Cluster Query Routing (page 496) for more information on query operations in the context of sharded clusters.
Sorting In sharded systems, the mongos performs a merge-sort of all sorted query results from the shards. See
Sharded Cluster Query Routing (page 496) and Use Indexes to Sort Query Results (page 371) for more information.
Sharded Cluster High Availability
A production (page 490) cluster has no single point of failure. This section introduces the availability concerns for
MongoDB deployments in general and highlights potential failure scenarios and available resolutions.
3 In many ways, you can think of the shard key as a cluster-wide unique index. However, be aware that sharded systems cannot enforce cluster-wide unique indexes unless the unique field is in the shard key. See the Index Concepts (page 318) page for more information on indexes and
compound indexes.
If each application server has its own mongos instance, other application servers can continue to access the database.
Furthermore, mongos instances do not maintain persistent state, and they can restart and become unavailable without
losing any state or data. When a mongos instance starts, it retrieves a copy of the config database and can begin
routing queries.
A Single mongod Becomes Unavailable in a Shard
Replica sets (page 377) provide high availability for shards. If the unavailable mongod is a primary, then the replica
set will elect (page 397) a new primary. If the unavailable mongod is a secondary and it disconnects, the primary and the
remaining secondaries will continue to hold all data. In a three member replica set, even if a single member of the set experiences
catastrophic failure, two other members have full copies of the data. 4
Always investigate availability interruptions and failures. If a system is unrecoverable, replace it and create a new
member of the replica set as soon as possible to replace the lost redundancy.
All Members of a Replica Set Become Unavailable
If all members of a replica set within a shard are unavailable, all data held in that shard is unavailable. However, the
data on all other shards will remain available, and it's possible to read and write data to the other shards. Your
application must be able to deal with partial results, and you should investigate the cause of the interruption and
attempt to recover the shard as soon as possible.
One or Two Config Databases Become Unavailable
Three distinct mongod instances provide the config database, using a special two-phase commit to maintain consistent
state between these mongod instances. Cluster operation will continue as normal, but chunk migration (page 501) stops and
the cluster can create no new chunk splits (page 537). Replace the config server as soon as possible. If all config
databases become unavailable, the cluster can become inoperable.
Note: All config servers must be running and available when you first initiate a sharded cluster.
If the name or address that a sharded cluster uses to connect to a config server changes, you must restart every mongod
and mongos instance in the sharded cluster. Avoid downtime by using CNAMEs to identify config servers within the
MongoDB deployment.
To avoid downtime when renaming config servers, use DNS names unrelated to physical or virtual hostnames to refer
to your config servers (page 488).
Generally, refer to each config server using the DNS alias (e.g. a CNAME record). When specifying the config server
connection string to mongos, use these names. These records make it possible to change the IP address or rename
config servers without changing the connection string and without having to restart the entire cluster.
4 If an unavailable secondary becomes available while it still has current oplog entries, it can catch up to the latest state of the set using the
normal replication process, otherwise it must perform an initial sync.
Routing Process
A mongos instance uses the following processes to route queries and return results.
How mongos Determines which Shards Receive a Query A mongos instance routes a query to a cluster by:
1. Determining the list of shards that must receive the query.
2. Establishing a cursor on all targeted shards.
In some cases, when the shard key or a prefix of the shard key is a part of the query, the mongos can route the query
to a subset of the shards. Otherwise, the mongos must direct the query to all shards that hold documents for that
collection.
Example
Given the following shard key:
{ zipcode: 1, u_id: 1, c_date: 1 }
Depending on the distribution of chunks in the cluster, the mongos may be able to target the query at a subset of
shards, if the query contains the following fields:
{ zipcode: 1 }
{ zipcode: 1, u_id: 1 }
{ zipcode: 1, u_id: 1, c_date: 1 }
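The prefix rule in this example can be sketched as a short function that returns the leading part of the shard key covered by a query. The function name usablePrefix is hypothetical, and it only inspects field names. Whether the mongos really contacts a subset of shards still depends on the distribution of chunks.

```javascript
// Return the longest prefix of the shard key whose fields all appear in the
// query. An empty result means the query cannot be targeted by shard key.
function usablePrefix(shardKeyFields, queryFields) {
  const prefix = [];
  for (const field of shardKeyFields) {
    if (!queryFields.includes(field)) break;
    prefix.push(field);
  }
  return prefix;
}

const shardKey = ["zipcode", "u_id", "c_date"];

console.log(usablePrefix(shardKey, ["zipcode", "u_id"]));   // [ 'zipcode', 'u_id' ]
console.log(usablePrefix(shardKey, ["zipcode", "c_date"])); // [ 'zipcode' ]
console.log(usablePrefix(shardKey, ["u_id", "c_date"]));    // []
```

A query on u_id and c_date alone yields an empty prefix, so in this model it would go to all shards that hold documents for the collection.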
How mongos Handles Query Modifiers If the result of the query is not sorted, the mongos instance opens a result
cursor that round robins results from all cursors on the shards.
Changed in version 2.0.5: In versions prior to 2.0.5, the mongos exhausted each cursor, one by one.
If the query specifies sorted results using the sort() cursor method, the mongos instance passes the $orderby
option to the shards. When the mongos receives results it performs an incremental merge sort of the results while
returning them to the client.
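The merge step can be pictured as a k-way merge over per-shard results that are already sorted. The sketch below is a simplified model, not the mongos implementation: mergeSorted is a hypothetical helper, and it builds the whole result in memory instead of streaming it to the client.

```javascript
// Merge already-sorted result arrays from several shards into one sorted
// array. `compare` follows the Array.prototype.sort convention.
function mergeSorted(shardResults, compare) {
  const positions = shardResults.map(() => 0);
  const merged = [];
  for (;;) {
    let best = -1;
    for (let i = 0; i < shardResults.length; i++) {
      if (positions[i] >= shardResults[i].length) continue; // exhausted
      if (
        best === -1 ||
        compare(shardResults[i][positions[i]], shardResults[best][positions[best]]) < 0
      ) {
        best = i;
      }
    }
    if (best === -1) return merged; // every shard cursor is exhausted
    merged.push(shardResults[best][positions[best]++]);
  }
}

const byAge = (a, b) => a.age - b.age;
const mergedDocs = mergeSorted(
  [
    [{ age: 21 }, { age: 40 }],
    [{ age: 18 }, { age: 35 }, { age: 52 }],
    [],
  ],
  byAge
);
console.log(mergedDocs.map((d) => d.age)); // [ 18, 21, 35, 40, 52 ]
```

Each step only compares the next unread document from every shard, which is why the router never needs to re-sort the full result set.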
If the query limits the size of the result set using the limit() cursor method, the mongos instance passes that limit
to the shards and then re-applies the limit to the result before returning the result to the client.
If the query specifies a number of records to skip using the skip() cursor method, the mongos cannot pass the skip
to the shards, but rather retrieves unskipped results from the shards and skips the appropriate number of documents
when assembling the complete result. However, when used in conjunction with a limit(), the mongos will pass
the limit plus the value of the skip() to the shards to improve the efficiency of these operations.
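The interaction of skip() and limit() can be illustrated with plain arrays. This is a minimal sketch, assuming numeric documents and an ascending sort; routeSkipLimit is a hypothetical name, and the real mongos merges results incrementally.

```javascript
// Simulate how a router could apply skip() and limit() across shards.
// Each shard receives skip + limit as its limit; the router merges the
// sorted partial results, skips, and then re-applies the limit.
function routeSkipLimit(shardData, skip, limit) {
  const perShardLimit = skip + limit;
  const partials = shardData.map((docs) =>
    [...docs].sort((a, b) => a - b).slice(0, perShardLimit)
  );
  const mergedAll = partials.flat().sort((a, b) => a - b);
  return mergedAll.slice(skip, skip + limit);
}

const shardData = [
  [1, 4, 7, 10],
  [2, 5, 8],
  [3, 6, 9],
];

console.log(routeSkipLimit(shardData, 2, 3)); // [ 3, 4, 5 ]
console.log(routeSkipLimit(shardData, 1, 1)); // [ 2 ]
```

Asking each shard for skip + limit documents is enough because, in the worst case, all skipped and all returned documents come from the same shard.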
Detect Connections to mongos Instances
To detect if the MongoDB instance that your client is connected to is mongos, use the isMaster command. When
a client connects to a mongos, isMaster returns a document with a msg field that holds the string isdbgrid. For
example:
{
"ismaster" : true,
"msg" : "isdbgrid",
"maxBsonObjectSize" : 16777216,
"ok" : 1
}
If the application is instead connected to a mongod, the returned document does not include the isdbgrid string.
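An application could wrap this check in a small function. The sketch below assumes that the reply document has already been obtained, for example with db.runCommand({ isMaster: 1 }) in the mongo shell. The function name isMongos and the abbreviated mongod reply are illustrative only.

```javascript
// Return true when an isMaster reply document comes from a mongos.
function isMongos(isMasterReply) {
  return isMasterReply.msg === "isdbgrid";
}

// Reply from a mongos, as shown above.
const mongosReply = {
  ismaster: true,
  msg: "isdbgrid",
  maxBsonObjectSize: 16777216,
  ok: 1,
};

// Abbreviated, illustrative reply from a mongod: there is no msg field
// holding "isdbgrid".
const mongodReply = { ismaster: true, maxBsonObjectSize: 16777216, ok: 1 };

console.log(isMongos(mongosReply)); // true
console.log(isMongos(mongodReply)); // false
```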
Broadcast Operations and Targeted Operations
Broadcast Operations mongos instances broadcast queries to all shards for the collection unless the mongos can
determine which shard or subset of shards stores this data.
Figure 9.12: Read operations to a sharded cluster. Query criteria does not include the shard key. The query router
mongos must broadcast query to all shards for the collection.
Multi-update operations are always broadcast operations.
The remove() operation is always a broadcast operation, unless the operation specifies the shard key in full.
Targeted Operations All insert() operations target to one shard.
All single update() (including upsert operations) and remove() operations must target to one shard.
Important: All single update() and remove() operations must include the shard key or the _id field in
the query specification. update() or remove() operations that affect a single document in a sharded collection
without the shard key or the _id field return an error.
For queries that include the shard key or portion of the shard key, mongos can target the query at a specific shard or
set of shards. This is the case only if the portion of the shard key included in the query is a prefix of the shard key. For
example, if the shard key is:
{ a: 1, b: 1, c: 1 }
The mongos program can route queries that include the full shard key or either of the following shard key prefixes to
a specific shard or set of shards:
{ a: 1 }
{ a: 1, b: 1 }
Figure 9.13: Read operations to a sharded cluster. Query criteria includes the shard key. The query router mongos
can target the query to the appropriate shard or shards.
Depending on the distribution of data in the cluster and the selectivity of the query, mongos may still have to contact
multiple shards 5 to fulfill these queries.
Sharded and Non-Sharded Data
Sharding operates on the collection level. You can shard multiple collections within a database or have multiple
databases with sharding enabled. 6 However, in production deployments, some databases and collections will use
sharding, while other databases and collections will only reside on a single shard.
5 mongos will route some queries, even some that include the shard key, to all shards, if needed.
6 As you configure sharding, you will use the enableSharding command to enable sharding for a database. This simply makes it possible
to use the shardCollection command on a collection within that database.
Figure 9.14: Diagram of a primary shard. A primary shard contains non-sharded collections as well as chunks of
documents from sharded collections. Shard A is the primary shard.
Regardless of the data architecture of your sharded cluster, ensure that all queries and operations use the mongos
router to access the data cluster. Use the mongos even for operations that do not impact the sharded data.
Related
Sharded Cluster Security (page 241)
Figure 9.15: Diagram of applications/drivers issuing queries to mongos for unsharded collection as well as sharded
collection. Config servers not shown.
Sharded Collection Balancing
Balancing is the process MongoDB uses to distribute data of a sharded collection evenly across a sharded cluster.
When a shard has too many of a sharded collection's chunks compared to other shards, MongoDB automatically
balances the chunks across the shards. The balancing procedure for sharded clusters is entirely transparent to the
user and application layer.
Cluster Balancer
The balancer process is responsible for redistributing the chunks of a sharded collection evenly among the shards for
every sharded collection. By default, the balancer process is always enabled.
Any mongos instance in the cluster can start a balancing round. When a balancer process is active, the responsible
mongos acquires a lock by modifying a document in the lock collection in the Config Database (page 547).
Note: Changed in version 2.0: Before MongoDB version 2.0, large differences in timekeeping (i.e. clock skew)
between mongos instances could lead to failed distributed locks. This carries the possibility of data loss, particularly
with skews larger than 5 minutes. Always use the network time protocol (NTP) by running ntpd on your servers to
minimize clock skew.
To address uneven chunk distribution for a sharded collection, the balancer migrates chunks (page 502) from shards
with more chunks to shards with fewer chunks. The balancer migrates the chunks, one at a time, until
there is an even dispersion of chunks for the collection across the shards.
Chunk migrations carry some overhead in terms of bandwidth and workload, both of which can impact database
performance. The balancer attempts to minimize the impact by:
Moving only one chunk at a time. See also Chunk Migration Queuing (page 503).
Starting a balancing round only when the difference in the number of chunks between the shard with the greatest
number of chunks for a sharded collection and the shard with the lowest number of chunks for that collection
reaches the migration threshold (page 502).
You may disable the balancer temporarily for maintenance. See Disable the Balancer (page 533) for details.
You can also limit the window during which the balancer runs to prevent it from impacting production traffic. See
Schedule the Balancing Window (page 532) for details.
Note: The specification of the balancing window is relative to the local time zone of all individual mongos instances
in the cluster.
See also:
Manage Sharded Cluster Balancer (page 531).
Migration Thresholds
To minimize the impact of balancing on the cluster, the balancer will not begin balancing until the distribution of
chunks for a sharded collection has reached certain thresholds. The thresholds apply to the difference in number
of chunks between the shard with the most chunks for the collection and the shard with the fewest chunks for that
collection. The balancer has the following thresholds:
Changed in version 2.2: The following thresholds appear first in 2.2. Prior to this release, a balancing round would
only start if the shard with the most chunks had 8 more chunks than the shard with the least number of chunks.
Number of Chunks     Migration Threshold
Fewer than 20        2
21-80                4
Greater than 80      8
Once a balancing round starts, the balancer will not stop until, for the collection, the difference between the number
of chunks on any two shards for that collection is less than two or a chunk migration fails.
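The threshold rules can be expressed as a short simulation. This sketch rests on two assumptions that the table does not state: that "Number of Chunks" means the collection's total chunk count, and that a count of exactly 20, which the table does not list, falls in the middle band. All function names are hypothetical, and failed migrations are not modeled.

```javascript
// Migration threshold for a collection, following the table above.
// Assumption: a total of exactly 20 chunks is treated like 21-80.
function migrationThreshold(totalChunks) {
  if (totalChunks < 20) return 2;
  if (totalChunks <= 80) return 4;
  return 8;
}

// A balancing round starts when the gap between the shard with the most
// chunks and the shard with the fewest chunks reaches the threshold.
function shouldStartRound(chunksPerShard) {
  const total = chunksPerShard.reduce((sum, n) => sum + n, 0);
  const gap = Math.max(...chunksPerShard) - Math.min(...chunksPerShard);
  return gap >= migrationThreshold(total);
}

// Once started, move one chunk at a time from the fullest shard to the
// emptiest shard until the gap is less than two.
function runRound(chunksPerShard) {
  const counts = [...chunksPerShard];
  let migrations = 0;
  while (Math.max(...counts) - Math.min(...counts) >= 2) {
    counts[counts.indexOf(Math.max(...counts))] -= 1;
    counts[counts.indexOf(Math.min(...counts))] += 1;
    migrations += 1;
  }
  return { counts, migrations };
}

console.log(shouldStartRound([6, 1, 1]));    // true: gap 5, threshold 2
console.log(shouldStartRound([30, 28, 27])); // false: gap 3, threshold 8
console.log(runRound([6, 1, 1]));            // { counts: [ 3, 3, 2 ], migrations: 3 }
```

Note the asymmetry the text describes: a round starts only at the threshold, but once it is running it continues until the shards are within one chunk of each other.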
Shard Size
By default, MongoDB will attempt to fill all available disk space with data on every shard as the data set grows. To
ensure that the cluster always has the capacity to handle data growth, monitor disk usage as well as other performance
metrics.
When adding a shard, you may set a maximum size for that shard. This prevents the balancer from migrating chunks
to the shard when the value of mapped exceeds the maximum size. Use the maxSize parameter of the addShard
command to set the maximum size for the shard.
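As a sketch, the command document for addShard with a maximum size might look like the following. The host name is a placeholder, the value is assumed to be in megabytes, and acceptsMigrations is a hypothetical model of the balancer's check rather than MongoDB code.

```javascript
// addShard command document with a maximum size. The host name is a
// placeholder, and maxSize is assumed to be expressed in megabytes.
// In the mongo shell, connected to a mongos, the document would be passed
// to db.runCommand() on the admin database.
const addShardCommand = {
  addShard: "mongodb0.example.net:27017",
  maxSize: 125,
};

// Hypothetical model of the balancer's check: a shard stops receiving
// migrated chunks once its mapped size exceeds maxSize.
function acceptsMigrations(mappedMegabytes, maxSizeMegabytes) {
  return mappedMegabytes <= maxSizeMegabytes;
}

console.log(acceptsMigrations(100, addShardCommand.maxSize)); // true
console.log(acceptsMigrations(126, addShardCommand.maxSize)); // false
```

The limit only affects where the balancer sends chunks. It does not stop the shard's own data from growing through inserts and updates.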
See also:
Change the Maximum Storage Size for a Given Shard (page 530) and Monitoring for MongoDB (page 138).
Chunk Migration Across Shards
Chunk migration moves the chunks of a sharded collection from one shard to another and is part of the balancer
(page 501) process.
Figure 9.16: Diagram of a collection distributed across three shards. For this collection, the difference in the number
of chunks between the shards reaches the migration thresholds (in this case, 2) and triggers migration.
Chunk Migration
MongoDB migrates chunks in a sharded cluster to distribute the chunks of a sharded collection evenly among shards.
Migrations may be either:
Manual. Only use manual migration in limited cases, such as to distribute data during bulk inserts. See Migrating
Chunks Manually (page 538) for more details.
Automatic. The balancer (page 501) process automatically migrates chunks when there is an uneven distribution
of a sharded collection's chunks across the shards. See Migration Thresholds (page 502) for more details.
All chunk migrations use the following procedure:
1. The balancer process sends the moveChunk command to the source shard.
2. The source starts the move with an internal moveChunk command. During the migration process, operations
to the chunk route to the source shard. The source shard is responsible for incoming write operations for the
chunk.
3. The destination shard begins requesting documents in the chunk and starts receiving copies of the data.
4. After receiving the final document in the chunk, the destination shard starts a synchronization process to ensure
that it has the changes to the migrated documents that occurred during the migration.
5. When fully synchronized, the destination shard connects to the config database and updates the cluster metadata
with the new location for the chunk.
6. After the destination shard completes the update of the metadata, and once there are no open cursors on the
chunk, the source shard deletes its copy of the documents.
Changed in version 2.4: If the balancer needs to perform additional chunk migrations from the source shard,
the balancer can start the next chunk migration without waiting for the current migration process to finish this
deletion step. See Chunk Migration Queuing (page 503).
The migration process ensures consistency and maximizes the availability of chunks during balancing.
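The six steps above can be traced with a toy model. Everything in this sketch is hypothetical: shards are plain objects that map document ids to documents, metadata maps a chunk name to the shard that owns it, and writes during the copy are assumed to be updates to documents already in the chunk. Open cursors and failures are ignored.

```javascript
// Toy model of the chunk migration procedure described above.
function migrateChunk(chunkName, docIds, source, destination, metadata, writesDuringCopy) {
  // Steps 1-2: the source starts the move and still owns the chunk, so
  // operations on the chunk keep routing to the source shard.
  const changedIds = [];

  // Step 3: the destination receives copies of the documents in the chunk.
  for (const id of docIds) destination.docs[id] = { ...source.docs[id] };

  // Writes that arrive during the copy are applied on the source.
  for (const write of writesDuringCopy) {
    source.docs[write.id] = { ...source.docs[write.id], ...write.set };
    changedIds.push(write.id);
  }

  // Step 4: synchronize the changes that occurred during the migration.
  for (const id of changedIds) destination.docs[id] = { ...source.docs[id] };

  // Step 5: record the chunk's new location in the cluster metadata.
  metadata[chunkName] = destination.name;

  // Step 6: the source deletes its copy of the documents.
  for (const id of docIds) delete source.docs[id];
}

const shardA = { name: "shardA", docs: { 1: { qty: 5 }, 2: { qty: 7 } } };
const shardB = { name: "shardB", docs: {} };
const clusterMetadata = { chunk1: "shardA" };

migrateChunk("chunk1", [1, 2], shardA, shardB, clusterMetadata, [
  { id: 2, set: { qty: 8 } },
]);

console.log(clusterMetadata.chunk1); // "shardB"
console.log(shardB.docs);            // { '1': { qty: 5 }, '2': { qty: 8 } }
```

The ordering is the point of the sketch: the metadata changes only after the destination holds every document and every change, and the source deletes its copy only after the metadata update.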
Chunk Migration Queuing Changed in version 2.4.
To migrate multiple chunks from a shard, the balancer migrates the chunks one at a time. However, the balancer does
not wait for the current migration's delete phase to complete before starting the next chunk migration. See Chunk
Migration (page 503) for the chunk migration process and the delete phase.
This queuing behavior allows shards to unload chunks more quickly in heavily imbalanced clusters, such as
when performing initial data loads without pre-splitting and when adding new shards.
This behavior also affects the moveChunk command, and migration scripts that use the moveChunk command may
proceed more quickly.
In some cases, the delete phases may persist longer. If multiple delete phases are queued but not yet complete, a crash
of the replica set's primary can orphan data from multiple migrations.
Chunk Migration Write Concern Changed in version 2.4: While copying and deleting data during migrations, the
balancer waits for replication to secondaries (page 49) for every document. This slows the potential speed of a chunk
migration but ensures that a large number of chunk migrations cannot affect the availability of a sharded cluster.
See also Secondary Throttle in the v2.2 Manual 7.
Chunk Splits in a Sharded Cluster
As chunks grow beyond the specified chunk size (page 504) a mongos instance will attempt to split the chunk in half.
Splits may lead to an uneven distribution of the chunks for a collection across the shards. In such cases, the mongos
instances will initiate a round of migrations to redistribute chunks across shards. See Sharded Collection Balancing
(page 501) for more details on balancing chunks across shards.
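A split can be sketched as dividing a chunk's range at a key near the middle of its documents. The model below is an assumption-laden illustration: the chunk layout and the maybeSplit function are hypothetical, the split point is taken as the median key, and sizes are estimated as if all documents were the same size.

```javascript
// Default chunk size in megabytes.
const CHUNK_SIZE_MB = 64;

// Split a chunk at the median of its shard key values once its size
// exceeds the chunk size. Returns one chunk (no split) or two chunks.
function maybeSplit(chunk) {
  if (chunk.sizeMB <= CHUNK_SIZE_MB || chunk.keys.length < 2) return [chunk];

  const sorted = [...chunk.keys].sort((a, b) => a - b);
  const splitPoint = sorted[Math.floor(sorted.length / 2)];
  const lowerKeys = sorted.filter((k) => k < splitPoint);
  const upperKeys = sorted.filter((k) => k >= splitPoint);

  // If every document shares one key value, no split point exists.
  if (lowerKeys.length === 0) return [chunk];

  const perDocMB = chunk.sizeMB / sorted.length;
  return [
    { min: chunk.min, max: splitPoint, keys: lowerKeys, sizeMB: perDocMB * lowerKeys.length },
    { min: splitPoint, max: chunk.max, keys: upperKeys, sizeMB: perDocMB * upperKeys.length },
  ];
}

const bigChunk = { min: 0, max: 100, keys: [10, 20, 30, 40], sizeMB: 70 };
console.log(maybeSplit(bigChunk).map((c) => [c.min, c.max])); // [ [ 0, 30 ], [ 30, 100 ] ]
```

The two resulting ranges share the split point as upper and lower bound, so together they still cover the original range without overlapping.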
Figure 9.17: Diagram of a shard with a chunk that exceeds the default chunk size of 64 MB and triggers a split of the
chunk into two chunks.
Chunk Size
The default chunk size in MongoDB is 64 megabytes. You can increase or reduce the chunk size (page 539), mindful
of its effect on the cluster's efficiency.
1. Small chunks lead to a more even distribution of data at the expense of more frequent migrations. This creates
expense at the query routing (mongos) layer.
2. Large chunks lead to fewer migrations. This is more efficient both from the networking perspective and in terms
of internal overhead at the query routing layer. But, these efficiencies come at the expense of a potentially more
uneven distribution of data.
For many deployments, it makes sense to avoid frequent and potentially spurious migrations at the expense of a slightly
less evenly distributed data set.
7 http://docs.mongodb.org/v2.2/tutorial/configure-sharded-cluster-balancer/#sharded-cluster-config-secondary-throttle
Limitations
Changing the chunk size affects when chunks split but there are some limitations to its effects.
Automatic splitting only occurs during inserts or updates. If you lower the chunk size, it may take time for all
chunks to split to the new size.
Splits cannot be undone. If you increase the chunk size, existing chunks must grow through inserts or updates
until they reach the new size.
Note: Chunk ranges are inclusive of the lower boundary and exclusive of the upper boundary.
1, username:
1 }:
2. When MongoDB finishes building the index, you can safely drop the existing index on { zipcode: 1 }:
db.people.dropIndex( { zipcode: 1 } );
Since the index on the shard key cannot be a multikey index, the index { zipcode: 1, username: 1 }
can only replace the index { zipcode: 1 } if there are no array values for the username field.
If you drop the last valid index for the shard key, recover by recreating an index on just the shard key.
For restrictions on shard key indexes, see limits-shard-keys.
Sharded Cluster Metadata
Config servers (page 488) store the metadata for a sharded cluster. The metadata reflects state and organization of the
sharded data sets and system. The metadata includes the list of chunks on every shard and the ranges that define the
chunks. The mongos instances cache this data and use it to route read and write operations to shards.
Config servers store the metadata in the Config Database (page 547).
Important: Always back up the config database before doing any maintenance on the config server.
To access the config database, issue the following command from the mongo shell:
use config
In general, you should never edit the content of the config database directly. The config database contains the
following collections:
changelog (page 548)
chunks (page 549)
collections (page 550)
databases (page 550)
lockpings (page 550)
locks (page 550)
mongos (page 551)
settings (page 551)
shards (page 552)
version (page 552)
For more information on these collections and their role in sharded clusters, see Config Database (page 547). See Read
and Write Operations on Config Servers (page 488) for more information about reads and updates to the metadata.
Sharded Cluster Data Management (page 536) Practices that address common issues in managing large sharded
data sets.
Troubleshoot Sharded Clusters (page 545) Presents solutions to common issues and concerns relevant to the administration and use of sharded clusters. Refer to FAQ: MongoDB Diagnostics (page 587) for general diagnostic
information.
The config server processes are mongod instances that store the cluster's metadata. You designate a mongod as a
config server using the --configsvr option. Each config server stores a complete copy of the cluster's metadata.
In production deployments, you must deploy exactly three config server instances, each running on different servers
to assure good uptime and data safety. In test environments, you can run all three instances on a single server.
Important: All members of a sharded cluster must be able to connect to all other members of a sharded cluster,
including all shards and all config servers. Ensure that the network and security systems including all interfaces and
firewalls, allow these connections.
1. Create data directories for each of the three config server instances. By default, a config server stores its data
files in the /data/configdb directory. You can choose a different location. To create a data directory, issue a
command similar to the following:
mkdir /data/configdb
2. Start the three config server instances. Start each by issuing a command using the following syntax:
mongod --configsvr --dbpath <path> --port <port>
The default port for config servers is 27019. You can specify a different port. The following example starts a
config server using the default port and default data directory:
mongod --configsvr --dbpath /data/configdb --port 27019
The mongos instances are lightweight and do not require data directories. You can run a mongos instance on a
system that runs other cluster components, such as on an application server or a server running a mongod process. By
default, a mongos instance runs on port 27017.
When you start the mongos instance, specify the hostnames of the three config servers, either in the configuration file
or as command line parameters.
Tip
To avoid downtime, give each config server a logical DNS name (unrelated to the server's physical or virtual hostname). Without logical DNS names, moving or renaming a config server requires shutting down every mongod and
mongos instance in the sharded cluster.
To start a mongos instance, issue a command using the following syntax:
mongos --configdb <config server hostnames>
For example, to start a mongos that connects to config server instances running on the following hosts and on the
default ports:
cfg0.example.net
cfg1.example.net
cfg2.example.net
You would issue the following command:
mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019
Each mongos in a sharded cluster must use the same configdb string, with identical host names listed in identical
order.
If you start a mongos instance with a string that does not exactly match the string used by the other mongos instances
in the cluster, the mongos returns a Config Database String Error (page 545) and refuses to start.
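As a rough illustration of the matching rule above, the following JavaScript sketch builds one configdb string from an ordered host list and compares two strings exactly. The helper names buildConfigdbString and sameConfigdb are hypothetical and not part of MongoDB; the hosts are the ones from the example above.

```javascript
// Hypothetical helpers (not part of MongoDB): build and compare configdb strings.

// Join config server hosts, in the given order, into a --configdb value.
function buildConfigdbString(hosts, port) {
  return hosts.map(function (h) { return h + ":" + port; }).join(",");
}

// Every mongos must list the same hosts in the same order, so compare exactly.
function sameConfigdb(a, b) {
  return a === b;
}

var hosts = ["cfg0.example.net", "cfg1.example.net", "cfg2.example.net"];
var configdb = buildConfigdbString(hosts, 27019);
var reordered = buildConfigdbString(hosts.slice().reverse(), 27019);

console.log("mongos --configdb " + configdb);
console.log(sameConfigdb(configdb, reordered)); // false: same hosts, different order
```

Generating the string once and reusing it for every mongos is one way to avoid accidental differences in host order.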
A shard can be a standalone mongod or a replica set. In a production environment, each shard should be a replica set.
1. From a mongo shell, connect to the mongos instance. Issue a command using the following syntax:
mongo --host <hostname of machine running mongos> --port <port mongos listens on>
For example, if a mongos is accessible at mongos0.example.net on port 27017, issue the following
command:
mongo --host mongos0.example.net --port 27017
2. Add each shard to the cluster using the sh.addShard() method, as shown in the examples below. Issue
sh.addShard() separately for each shard. If the shard is a replica set, specify the name of the replica set and
specify a member of the set. In production deployments, all shards should be replica sets.
Optional
You can instead use the addShard database command, which lets you specify a name and maximum size for
the shard. If you do not specify these, MongoDB automatically assigns a name and maximum size. To use the
database command, see addShard.
The following are examples of adding a shard with sh.addShard():
To add a shard for a replica set named rs1 with a member running on port 27017 on
mongodb0.example.net, issue the following command:
sh.addShard( "rs1/mongodb0.example.net:27017" )
sh.addShard( "rs1/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net
To add a shard for a standalone mongod on port 27017 of mongodb0.example.net, issue the following command:
sh.addShard( "mongodb0.example.net:27017" )
Note: It might take some time for chunks to migrate to the new shard.
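The two address forms used in the examples above can be sketched in plain JavaScript. The helper shardAddress is hypothetical and only formats a string; it does not contact a cluster.

```javascript
// Hypothetical helper (not part of MongoDB): format the argument for sh.addShard().
function shardAddress(members, replSetName) {
  var hostList = members.join(",");
  // Replica set shards are written as "<setName>/<host:port>[,...]";
  // a standalone mongod is just "<host:port>".
  return replSetName ? replSetName + "/" + hostList : hostList;
}

var replicaSetShard = shardAddress(["mongodb0.example.net:27017"], "rs1");
var standaloneShard = shardAddress(["mongodb0.example.net:27017"]);

console.log(replicaSetShard); // rs1/mongodb0.example.net:27017
console.log(standaloneShard); // mongodb0.example.net:27017
```

In a mongo shell connected to a mongos, a string built this way could be passed as sh.addShard(replicaSetShard).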
Before you can shard a collection, you must enable sharding for the collection's database. Enabling sharding for a
database does not redistribute data but makes it possible to shard the collections in that database.
Once you enable sharding for a database, MongoDB assigns a primary shard for that database where MongoDB stores
all data before sharding begins.
1. From a mongo shell, connect to the mongos instance. Issue a command using the following syntax:
mongo --host <hostname of machine running mongos> --port <port mongos listens on>
2. Issue the sh.enableSharding() method, specifying the name of the database for which to enable sharding.
Use the following syntax:
sh.enableSharding("<database>")
Optionally, you can enable sharding for a database using the enableSharding command, which uses the following
syntax:
db.runCommand( { enableSharding: <database> } )
Replace the <database>.<collection> string with the full namespace of your collection, which consists
of the name of your database, a dot (i.e. .), and the full name of the collection. The shard-key-pattern
represents your shard key, which you specify in the same form as you would an index key pattern.
Example
The following sequence of commands shards four collections:
sh.shardCollection("records.people", { "zipcode": 1, "name": 1 } )
sh.shardCollection("people.addresses", { "state": 1, "_id": 1 } )
sh.shardCollection("assets.chairs", { "type": 1, "_id": 1 } )
db.alerts.ensureIndex( { _id : "hashed" } )
sh.shardCollection("events.alerts", { "_id": "hashed" } )
(a) The people collection in the records database using the shard key { "zipcode": 1, "name": 1 }.
This shard key distributes documents by the value of the zipcode field. If a number of documents have
the same value for this field, then that chunk will be splittable (page 512) by the values of the name field.
(b) The addresses collection in the people database using the shard key { "state": 1, "_id": 1 }.
This shard key distributes documents by the value of the state field. If a number of documents have the
same value for this field, then that chunk will be splittable (page 512) by the values of the _id field.
(c) The chairs collection in the assets database using the shard key { "type": 1, "_id": 1 }.
This shard key distributes documents by the value of the type field. If a number of documents have the
same value for this field, then that chunk will be splittable (page 512) by the values of the _id field.
(d) The alerts collection in the events database using the shard key { "_id": "hashed" }.
This shard key distributes documents by a hash of the value of the _id field. MongoDB computes the hash
of the _id field for the hashed index (page 343), which should provide an even distribution of documents
across a cluster.
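To see why a second field keeps chunks splittable, consider how a compound key orders documents. The following is a minimal sketch under stated assumptions: compareByShardKey is a hypothetical helper, it handles only ascending fields with string or number values, and it is not MongoDB's BSON comparison. The sample documents are made up.

```javascript
// Illustrative only: order documents by a compound shard key pattern such as
// { zipcode: 1, name: 1 }. Simplified comparison, not MongoDB's BSON ordering.
function compareByShardKey(pattern, a, b) {
  var fields = Object.keys(pattern);
  for (var i = 0; i < fields.length; i++) {
    var f = fields[i];
    if (a[f] < b[f]) return -1;
    if (a[f] > b[f]) return 1;
    // Equal on this field: fall through to the next field in the key.
  }
  return 0;
}

// Made-up sample documents; two of them share a zipcode.
var docs = [
  { zipcode: "10001", name: "Tracy" },
  { zipcode: "02134", name: "Eliot" },
  { zipcode: "10001", name: "Bill" }
];

docs.sort(function (a, b) {
  return compareByShardKey({ zipcode: 1, name: 1 }, a, b);
});

var ordered = docs.map(function (d) { return d.zipcode + "/" + d.name; });
console.log(ordered); // [ '02134/Eliot', '10001/Bill', '10001/Tracy' ]
```

Documents with the same zipcode still have distinct positions in the key order because of name, so a range boundary can fall between them.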
Considerations for Selecting Shard Keys
Choosing a Shard Key
For many collections there may be no single, naturally occurring key that possesses all the qualities of a good shard
key. The following strategies may help construct a useful shard key from existing data:
1. Compute a more ideal shard key in your application layer, and store this in all of your documents, potentially in
the _id field.
2. Use a compound shard key that uses two or three values from all documents that provide the right mix of
cardinality with scalable write operations and query isolation.
3. Determine that the impact of using a less than ideal shard key is insignificant in your use case, given:
limited write volume,
expected data size, or
application query patterns.
4. New in version 2.4: Use a hashed shard key. Choose a field that has high cardinality and create a hashed index
(page 343) on that field. MongoDB uses these hashed index values as shard key values, which ensures an even
distribution of documents across the shards.
Tip
MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do
not need to compute hashes.
Considerations for Selecting Shard Key Choosing the correct shard key can have a great impact on the performance, capability, and functioning of your database and cluster. Appropriate shard key choice depends on the schema
of your data and the way that your applications query and write data.
Create a Shard Key that is Easily Divisible
An easily divisible shard key makes it easy for MongoDB to distribute content among the shards. Shard keys that have
a limited number of possible values can result in chunks that are unsplittable.
See also:
Cardinality (page 512)
Create a Shard Key that has High Degree of Randomness
A shard key with high degree of randomness prevents any single shard from becoming a bottleneck and will distribute
write operations among the cluster.
See also:
Write Scaling (page 493)
A shard key that targets a single shard makes it possible for the mongos program to return most query operations
directly from a single specific mongod instance. Your shard key should be the primary field used by your queries.
Fields with a high degree of randomness make it difficult to target operations to specific shards.
See also:
Query Isolation (page 494)
Shard Using a Compound Shard Key
The challenge when selecting a shard key is that there is not always an obvious choice. Often, an existing field in your
collection may not be the optimal key. In those situations, computing a special purpose shard key into an additional
field or using a compound shard key may help produce one that is more ideal.
Cardinality
Cardinality, in the context of MongoDB, refers to the ability of the system to partition data into chunks. For example,
consider a collection of data such as an address book that stores address records:
Consider the use of a state field as a shard key:
The state key's value holds the US state for a given address document. This field has a low cardinality as all
documents that have the same value in the state field must reside on the same shard, even if a particular state's
chunk exceeds the maximum chunk size.
Since there are a limited number of possible values for the state field, MongoDB may distribute data unevenly
among a small number of fixed chunks. This may have a number of effects:
If MongoDB cannot split a chunk because all of its documents have the same shard key, migrations involving these un-splittable chunks will take longer than other migrations, and it will be more difficult for your
data to stay balanced.
If you have a fixed maximum number of chunks, you will never be able to use more than that number of
shards for this collection.
Consider the use of a zipcode field as a shard key:
While this field has a large number of possible values, and thus has potentially higher cardinality, it's possible
that a large number of users could have the same value for the shard key, which would make this chunk of users
un-splittable.
In these cases, cardinality depends on the data. If your address book stores records for a geographically distributed contact list (e.g. dry cleaning businesses in America), then a value like zipcode would be sufficient.
However, if your address book is more geographically concentrated (e.g. ice cream stores in Boston, Massachusetts), then you may have a much lower cardinality.
Consider the use of a phone-number field as a shard key:
Phone number has a high cardinality. Because users will generally have a unique value for this field, MongoDB
will be able to split as many chunks as needed.
While high cardinality is necessary for ensuring an even distribution of data, having a high cardinality does not
guarantee sufficient query isolation (page 494) or appropriate write scaling (page 493).
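One way to compare candidate fields like the ones discussed above is to count distinct values, and the size of the largest group, in a sample of documents. This is a sketch only: keyCardinality is a hypothetical helper, not a MongoDB API, and the sample address book is made up.

```javascript
// Illustrative helper (hypothetical): summarize how many distinct values a
// candidate shard key field takes in a sample of documents.
function keyCardinality(docs, field) {
  var counts = {};
  docs.forEach(function (d) {
    var v = String(d[field]);
    counts[v] = (counts[v] || 0) + 1;
  });
  var values = Object.keys(counts);
  var largest = Math.max.apply(null, values.map(function (v) { return counts[v]; }));
  return { distinct: values.length, largestGroup: largest };
}

// Made-up address book sample.
var sample = [
  { state: "MA", zipcode: "02134", phone: "555-0100" },
  { state: "MA", zipcode: "02134", phone: "555-0101" },
  { state: "MA", zipcode: "02139", phone: "555-0102" },
  { state: "NY", zipcode: "10001", phone: "555-0103" }
];

console.log(keyCardinality(sample, "state"));   // { distinct: 2, largestGroup: 3 }
console.log(keyCardinality(sample, "zipcode")); // { distinct: 3, largestGroup: 2 }
console.log(keyCardinality(sample, "phone"));   // { distinct: 4, largestGroup: 1 }
```

A large largestGroup relative to the sample size hints that many documents would share one shard key value and land in a chunk that cannot be split.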
To shard a collection using a hashed shard key, use an operation in the mongo shell that resembles the following:
sh.shardCollection( "records.active", { a: "hashed" } )
This operation shards the active collection in the records database, using a hash of the a field as the shard key.
Specify the Initial Number of Chunks
If you shard an empty collection using a hashed shard key, MongoDB automatically creates and migrates empty chunks
so that each shard has two chunks. To control how many chunks MongoDB creates when sharding the collection, use
shardCollection with the numInitialChunks parameter.
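A minimal sketch of the command document that the paragraph above describes follows. The helper shardCollectionCommand is hypothetical and only builds a plain object; the value 8 is an arbitrary example, not a recommendation.

```javascript
// Hypothetical helper: build the document for the shardCollection database
// command. Building the document does not contact a server.
function shardCollectionCommand(namespace, key, numInitialChunks) {
  var cmd = { shardCollection: namespace, key: key };
  if (numInitialChunks !== undefined) {
    cmd.numInitialChunks = numInitialChunks;
  }
  return cmd;
}

// 8 is an arbitrary example value.
var command = shardCollectionCommand("records.active", { a: "hashed" }, 8);
console.log(JSON.stringify(command));
```

In a mongo shell connected to a mongos, a document of this shape would be passed to db.runCommand() while using the admin database, as with the other shardCollection examples in this chapter.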
Important: MongoDB 2.4 adds support for hashed shard keys. After sharding a collection with a hashed shard key,
you must use the MongoDB 2.4 or higher mongos and mongod instances in your sharded cluster.
Warning: MongoDB hashed indexes truncate floating point numbers to 64-bit integers before hashing. For
example, a hashed index would store the same value for a field that held a value of 2.3, 2.2, and 2.9. To
prevent collisions, do not use a hashed index for floating point numbers that cannot be reliably converted to
64-bit integers (and then back to floating point). MongoDB hashed indexes do not support floating point values
larger than 2^53.
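The collision described in this warning can be reproduced with plain arithmetic. The sketch below emulates only the truncation step; it does not compute MongoDB's actual hash, and truncatedForHashing is a hypothetical name.

```javascript
// Emulates only the truncation to an integer described in the warning;
// it does not compute MongoDB's actual hash.
function truncatedForHashing(value) {
  return Math.trunc(value);
}

var inputs = [2.3, 2.2, 2.9];
var truncated = inputs.map(truncatedForHashing);
console.log(truncated); // [ 2, 2, 2 ]: all three values would hash identically

// Past 2^53 a double-precision number can no longer represent every integer.
var limit = Math.pow(2, 53);
console.log(limit + 1 === limit); // true
```

The last check shows why very large floating point values are a problem: adjacent integers above 2^53 are indistinguishable as doubles, so a round trip through an integer cannot be reliable.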
Procedure
1. Generate a key file to store authentication information, as described in the Generate a Key File (page 261)
section.
2. On each component in the sharded cluster, enable authentication by doing one of the following:
In the configuration file, set the keyFile option to the key file's path and then start the component, as in
the following example:
keyFile = /srv/mongodb/keyfile
When starting the component, set the --keyFile option, which is an option for both mongos instances and
mongod instances. Set the --keyFile to the key file's path.
Note: The keyFile setting implies auth, which means in most cases you do not need to set auth explicitly.
3. Add the first administrative user and then add subsequent users. See Create a User Administrator (page 258).
Add Shards to a Cluster
You add shards to a sharded cluster after you create the cluster or anytime that you need to add capacity to the cluster.
If you have not created a sharded cluster, see Deploy a Sharded Cluster (page 507).
When adding a shard to a cluster, you should always ensure that the cluster has enough capacity to support the
migration without affecting legitimate production traffic.
In production environments, all shards should be replica sets.
Add a Shard to a Cluster
2. Add a shard to the cluster using the sh.addShard() method, as shown in the examples below. Issue
sh.addShard() separately for each shard. If the shard is a replica set, specify the name of the replica
set and specify a member of the set. In production deployments, all shards should be replica sets.
Optional
You can instead use the addShard database command, which lets you specify a name and maximum size for
the shard. If you do not specify these, MongoDB automatically assigns a name and maximum size. To use the
database command, see addShard.
The following are examples of adding a shard with sh.addShard():
To add a shard for a replica set named rs1 with a member running on port 27017 on
mongodb0.example.net, issue the following command:
sh.addShard( "rs1/mongodb0.example.net:27017" )
sh.addShard( "rs1/mongodb0.example.net:27017,mongodb1.example.net:27017,mongodb2.example.net
To add a shard for a standalone mongod on port 27017 of mongodb0.example.net, issue the following command:
sh.addShard( "mongodb0.example.net:27017" )
Note: It might take some time for chunks to migrate to the new shard.
3. Start all three config servers, using the same invocation that you used for the single config server.
mongod --configsvr
Following this tutorial, you will convert a single 3-member replica set to a cluster that consists of 2 shards. Each shard
will consist of an independent 3-member replica set.
The tutorial uses a test environment running on a local UNIX-like system. You should feel encouraged to
follow along at home. If you need to perform this process in a production environment, notes throughout the
document indicate procedural differences.
The procedure, from a high level, is as follows:
1. Create or select a 3-member replica set and insert some data into a collection.
2. Start the config databases and create a cluster with a single shard.
3. Create a second replica set with three new mongod instances.
4. Add the second replica set as a shard in the cluster.
5. Enable sharding on the desired collection or collections.
Process
Install MongoDB according to the instructions in the MongoDB Installation Tutorial (page 3).
Deploy a Replica Set with Test Data If you have an existing MongoDB replica set deployment, you can omit this
step and continue from Deploy Sharding Infrastructure (page 517).
Use the following sequence of steps to configure and deploy a replica set and to insert test data.
1. Create the following directories for the first replica set instance, named firstset:
/data/example/firstset1
/data/example/firstset2
/data/example/firstset3
To create directories, issue the following command:
mkdir -p /data/example/firstset1 /data/example/firstset2 /data/example/firstset3
2. In a separate terminal window or GNU Screen window, start three mongod instances by running each of the
following commands:
mongod --dbpath /data/example/firstset1 --port 10001 --replSet firstset --oplogSize 700 --rest
mongod --dbpath /data/example/firstset2 --port 10002 --replSet firstset --oplogSize 700 --rest
mongod --dbpath /data/example/firstset3 --port 10003 --replSet firstset --oplogSize 700 --rest
Note: The --oplogSize 700 option restricts the size of the operation log (i.e. oplog) for each mongod
instance to 700MB. Without the --oplogSize option, each mongod reserves approximately 5% of the free
disk space on the volume. By limiting the size of the oplog, each instance starts more quickly. Omit this setting
in production environments.
3. In a mongo shell session in a new terminal, connect to the mongod instance on port 10001 by running the
following command. If you are in a production environment, first read the note below.
mongo localhost:10001/admin
Note: Above and hereafter, if you are running in a production environment or are testing this process with
mongod instances on multiple systems, replace localhost with a resolvable domain, hostname, or the IP
address of your system.
4. In the mongo shell, initialize the first replica set by issuing the following command:
db.runCommand({"replSetInitiate" :
{"_id" : "firstset", "members" : [{"_id" : 1, "host" : "localhost:10001"},
{"_id" : 2, "host" : "localhost:10002"},
{"_id" : 3, "host" : "localhost:10003"}
]}})
{
"info" : "Config now saved locally.
"ok" : 1
}
5. In the mongo shell, create and populate a new collection by issuing the following sequence of JavaScript
operations:
use test
switched to db test
people = ["Marc", "Bill", "George", "Eliot", "Matt", "Trey", "Tracy", "Greg", "Steve", "Kristina
for(var i=0; i<1000000; i++){
name = people[Math.floor(Math.random()*people.length)];
user_id = i;
boolean = [true, false][Math.floor(Math.random()*2)];
added_at = new Date();
number = Math.floor(Math.random()*10001);
db.test_collection.save({"name":name, "user_id":user_id, "boolean": boolean, "added_at":added_at
}
The above operations add one million documents to the collection test_collection. This can take several
minutes, depending on your system.
The script adds the documents in the following form:
Deploy Sharding Infrastructure This procedure creates the three config databases that store the cluster's metadata.
Note: For development and testing environments, a single config database is sufficient. In production environments,
use three config databases. Because config instances store only the metadata for the sharded cluster, they have minimal
resource requirements.
1. Create the following data directories for three config database instances:
/data/example/config1
/data/example/config2
/data/example/config3
Issue the following command at the system prompt:
mkdir -p /data/example/config1 /data/example/config2 /data/example/config3
2. In a separate terminal window or GNU Screen window, start the config databases by running the following
commands:
mongod --configsvr --dbpath /data/example/config1 --port 20001
mongod --configsvr --dbpath /data/example/config2 --port 20002
mongod --configsvr --dbpath /data/example/config3 --port 20003
3. In a separate terminal window or GNU Screen window, start a mongos instance by running the following command:
mongos --configdb localhost:20001,localhost:20002,localhost:20003 --port 27017 --chunkSize 1
Note: If you are using the collection created earlier or are just experimenting with sharding, you can use a
small --chunkSize (1MB works well). The default chunkSize of 64MB means that your cluster must
have 64MB of data before MongoDB's automatic sharding begins working.
In production environments, do not use a small chunk size.
The configdb option specifies the configuration databases (e.g. localhost:20001,
localhost:20002, and localhost:20003). The mongos instance runs on the default MongoDB port (i.e. 27017), while the config databases run on ports in the 20001 series and the replica set members on ports in the 10001 series. In this
example, you may omit the --port 27017 option, as 27017 is the default port.
4. Add the first shard in mongos. In a new terminal window or GNU Screen session, add the first shard, according
to the following procedure:
(a) Connect to the mongos with the following command:
mongo localhost:27017/admin
(b) Add the first shard to the cluster by issuing the addShard command:
db.runCommand( { addShard : "firstset/localhost:10001,localhost:10002,localhost:10003" } )
Deploy a Second Replica Set This procedure deploys a second replica set. This closely mirrors the process used to
establish the first replica set above, omitting the test data.
1. Create the following data directories for the members of the second replica set, named secondset:
/data/example/secondset1
/data/example/secondset2
/data/example/secondset3
2. In three new terminal windows, start three instances of mongod with the following commands:
mongod --dbpath /data/example/secondset1 --port 10004 --replSet secondset --oplogSize 700 --rest
mongod --dbpath /data/example/secondset2 --port 10005 --replSet secondset --oplogSize 700 --rest
mongod --dbpath /data/example/secondset3 --port 10006 --replSet secondset --oplogSize 700 --rest
Note: As above, the second replica set uses the smaller oplogSize configuration. Omit this setting in
production environments.
3. In the mongo shell, connect to one mongod instance by issuing the following command:
mongo localhost:10004/admin
4. In the mongo shell, initialize the second replica set by issuing the following command:
db.runCommand({"replSetInitiate" :
{"_id" : "secondset",
"members" : [{"_id" : 1, "host" : "localhost:10004"},
{"_id" : 2, "host" : "localhost:10005"},
{"_id" : 3, "host" : "localhost:10006"}
]}})
{
"info" : "Config now saved locally.
"ok" : 1
}
5. Add the second replica set to the cluster. Connect to the mongos instance created in the previous procedure and
issue the following sequence of commands:
use admin
db.runCommand( { addShard : "secondset/localhost:10004,localhost:10005,localhost:10006" } )
6. Verify that both shards are properly configured by running the listShards command. The command and example
output follow:
db.runCommand({listShards:1})
{
"shards" : [
{
"_id" : "firstset",
"host" : "firstset/localhost:10001,localhost:10003,localhost:10002"
},
{
"_id" : "secondset",
"host" : "secondset/localhost:10004,localhost:10006,localhost:10005"
}
],
"ok" : 1
}
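Both replica sets in this tutorial are initiated with configuration documents of the same shape. The following sketch builds such a document from a set name and a host list; replSetConfig is a hypothetical helper that only constructs a plain object, numbering members from 1 as the examples above do.

```javascript
// Hypothetical helper: build the configuration document passed to
// replSetInitiate, numbering members from 1 as in the tutorial's examples.
function replSetConfig(name, hosts) {
  return {
    _id: name,
    members: hosts.map(function (host, i) {
      return { _id: i + 1, host: host };
    })
  };
}

var secondset = replSetConfig("secondset", [
  "localhost:10004", "localhost:10005", "localhost:10006"
]);
console.log(JSON.stringify(secondset, null, 2));
```

In the mongo shell, a document of this shape would be passed as db.runCommand({ replSetInitiate: secondset }).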
Enable Sharding MongoDB must have sharding enabled on both the database and collection levels.
Enabling Sharding on the Database Level Issue the enableSharding command. The following example enables sharding on the test database:
db.runCommand( { enableSharding : "test" } )
{ "ok" : 1 }
Create an Index on the Shard Key MongoDB uses the shard key to distribute documents between shards. Once
selected, you cannot change the shard key. Good shard keys:
have values that are evenly distributed among all documents,
group documents that are often accessed at the same time into contiguous chunks, and
allow for effective distribution of activity among shards.
Typically shard keys are compound, comprising some sort of hash and some sort of other primary key. Selecting
a shard key depends on your data set, application architecture, and usage pattern, and is beyond the scope of this
document. For the purposes of this example, we will shard the number key. This typically would not be a good
shard key for production deployments.
Create the index with the following procedure:
use test
db.test_collection.ensureIndex({number:1})
See also:
The Shard Key Overview (page 492) and Shard Key (page 492) sections.
Shard the Collection Issue the following command:
use admin
db.runCommand( { shardCollection : "test.test_collection", key : {"number":1} })
{ "collectionsharded" : "test.test_collection", "ok" : 1 }
"storageSize" : 152453120,
"numExtents" : 23,
"indexes" : 4,
"indexSize" : 59071600,
"fileSize" : 1207959552,
"ok" : 1
}
In a few moments you can run these commands for a second time to demonstrate that chunks are migrating from
firstset to secondset.
When this procedure is complete, you will have converted a replica set into a cluster where each shard is itself a replica
set.
Convert Sharded Cluster to Replica Set
This tutorial describes the process for converting a sharded cluster to a non-sharded replica set. To convert a replica set
into a sharded cluster, see Convert a Replica Set to a Replicated Sharded Cluster (page 515). See the Sharding (page 479)
documentation for more information on sharded clusters.
Convert a Cluster with a Single Shard into a Replica Set
In the case of a sharded cluster with only one shard, that shard contains the full data set. Use the following procedure
to convert that cluster into a non-sharded replica set:
1. Reconfigure the application to connect to the primary member of the replica set hosting the single shard. That
system will be the new replica set.
2. Optionally remove the --shardsvr option, if your mongod started with this option.
Tip
Changing the --shardsvr option will change the port on which mongod listens for incoming connections.
The single-shard cluster is now a non-sharded replica set that will accept read and write operations on the data set.
You may now decommission the remaining sharding infrastructure.
Use the following procedure to transition from a sharded cluster with more than one shard to an entirely new replica
set.
1. With the sharded cluster running, deploy a new replica set (page 420) in addition to your sharded cluster. The
replica set must have sufficient capacity to hold all of the data files from all of the current shards combined. Do
not configure the application to connect to the new replica set until the data transfer is complete.
2. Stop all writes to the sharded cluster. You may reconfigure your application or stop all mongos instances.
If you stop all mongos instances, the applications will not be able to read from the database. If you stop all
mongos instances, start a temporary mongos instance that applications cannot access for the data migration
procedure.
3. Use mongodump and mongorestore (page 195) to migrate the data from the mongos instance to the new replica
set.
Note: Not all collections on all databases are necessarily sharded. Do not solely migrate the sharded collections.
Ensure that all databases and all collections migrate correctly.
4. Reconfigure the application to use the non-sharded replica set instead of the mongos instance.
The application will now use the un-sharded replica set for reads and writes. You may now decommission the remaining unused sharded cluster infrastructure.
To list the databases that have sharding enabled, query the databases collection in the Config Database (page 547).
A database has sharding enabled if the value of the partitioned field is true. Connect to a mongos instance
with a mongo shell, and run the following operation to get a full list of databases with sharding enabled:
use config
db.databases.find( { "partitioned": true } )
Example
You can use the following sequence of commands to return a list of all databases in the cluster:
use config
db.databases.find()
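The effect of the partitioned filter can be illustrated without a cluster. In the sketch below, the array stands in for documents in the databases collection; the database names and shard identifiers are made up for illustration.

```javascript
// Made-up documents shaped like entries in the config database's
// databases collection; names and shard ids are illustrative only.
var databases = [
  { _id: "admin", partitioned: false, primary: "config" },
  { _id: "records", partitioned: true, primary: "shard0000" },
  { _id: "test", partitioned: false, primary: "shard0001" }
];

// Equivalent in spirit to db.databases.find( { "partitioned": true } ).
var sharded = databases.filter(function (d) {
  return d.partitioned === true;
});

var names = sharded.map(function (d) { return d._id; });
console.log(names); // [ 'records' ]
```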
List Shards
To list the current set of configured shards, use the listShards command, as follows:
use admin
db.runCommand( { listShards : 1 } )
To view cluster details, issue db.printShardingStatus() or sh.status(). Both methods return the same
output.
Example
In the following example output from sh.status():
sharding version displays the version number of the shard metadata.
shards displays a list of the mongod instances used as shards in the cluster.
databases displays all databases in the cluster, including databases that do not have sharding enabled.
The chunks information for the foo database displays how many chunks are on each shard and displays the
range of each chunk.
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "m0.example.net:30001" }
{ "_id" : "shard0001", "host" : "m3.example2.net:50000" }
databases:
4. Start the config server instance on the new system. The default invocation is:
mongod --configsvr
When you start the third config server, your cluster will become writable and it will be able to create new splits and
migrate chunks as needed.
Migrate Config Servers with Different Hostnames
This procedure migrates a config server (page 488) in a sharded cluster (page 484) to a new server that uses a different
hostname. Use this procedure only if the config server will not be accessible via the same hostname.
Changing a config server's (page 488) hostname requires downtime and requires restarting every process in the
sharded cluster. If possible, avoid changing the hostname so that you can instead use the procedure to migrate a config
server and use the same hostname (page 524).
To migrate all the config servers in a cluster, perform this procedure for each config server separately and migrate the
config servers in reverse order from how they are listed in the mongos instances' configdb string. Start with the
last config server listed in the configdb string.
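The reverse ordering described above can be sketched as a small string operation. The helper migrationOrder is hypothetical, and the configdb string reuses example hostnames rather than values from a real deployment.

```javascript
// Hypothetical helper: list config servers in the order the text recommends
// migrating them, i.e. the reverse of the configdb string.
function migrationOrder(configdb) {
  return configdb.split(",").reverse();
}

var order = migrationOrder(
  "cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019"
);
console.log(order[0]); // cfg2.example.net:27019 (the last one listed goes first)
```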
1. Disable the cluster balancer process temporarily. See Disable the Balancer (page 533) for more information.
2. Shut down the config server.
This renders all config data for the sharded cluster read only.
3. Copy the contents of dbpath from the old config server to the new config server.
Example
To copy the contents of dbpath to a machine named mongodb.config2.example.net, use a command
that resembles the following:
rsync -az /data/configdb mongodb.config2.example.net:/data/configdb
4. Start the config server instance on the new system. The default invocation is:
mongod --configsvr
4. Restart the config server process that you used in the previous step to copy the data files to the new config server
instance.
5. Start the new config server instance. The default invocation is:
mongod --configsvr
6. Re-enable the balancer to allow the cluster to resume normal balancing operations. See the Disable the Balancer
(page 533) section for more information on managing the balancer process.
Note: In the course of this procedure never remove a config server from the configdb parameter on any of the
mongos instances. If you need to change the name of a config server, always make sure that all mongos instances
have three config servers specified in the configdb setting at all times.
Disable the balancer to stop chunk migration (page 502) and do not perform any metadata write operations until the
process finishes. If a migration is in progress, the balancer will complete the in-progress migration before stopping.
To disable the balancer, connect to one of the cluster's mongos instances and issue the following method:
sh.stopBalancer()
Migrate each config server (page 488) by starting with the last config server listed in the configdb string. Proceed
in reverse order of the configdb string. Migrate and restart a config server before proceeding to the next. Do not
rename a config server during this process.
Note: If the name or address that a sharded cluster uses to connect to a config server changes, you must restart every
mongod and mongos instance in the sharded cluster. Avoid downtime by using CNAMEs to identify config servers
within the MongoDB deployment.
See Migrate Config Servers with Different Hostnames (page 524) for more information.
Important: Start with the last config server listed in configdb.
4. Start the config server instance on the new system. The default invocation is:
mongod --configsvr
If the configdb string will change as part of the migration, you must shut down all mongos instances before
changing the configdb string. This avoids errors in the sharded cluster over configdb string conflicts.
If the configdb string will remain the same, you can migrate the mongos instances sequentially or all at once.
1. Shut down the mongos instances using the shutdown command. If the configdb string is changing, shut
down all mongos instances.
2. If the hostname has changed for any of the config servers, update the configdb string for each mongos
instance. The mongos instances must all use the same configdb string. The strings must list identical host
names in identical order.
Tip
To avoid downtime, give each config server a logical DNS name (unrelated to the server's physical or virtual
hostname). Without logical DNS names, moving or renaming a config server requires shutting down every
mongod and mongos instance in the sharded cluster.
3. Restart the mongos instances, being sure to use the updated configdb string if hostnames have changed.
For more information, see Start the mongos Instances (page 508).
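The requirement that every mongos use the same configdb string can be checked mechanically before restarting. The following is a minimal sketch in plain JavaScript; the function name and the sample host names are hypothetical and are not part of MongoDB.

```javascript
// Hypothetical helper: returns true only when every configdb string is
// identical, i.e. lists the same host names in the same order.
function configdbStringsMatch(configdbStrings) {
    if (configdbStrings.length === 0) {
        return true;
    }
    var reference = configdbStrings[0];
    for (var i = 1; i < configdbStrings.length; i++) {
        if (configdbStrings[i] !== reference) {
            return false;
        }
    }
    return true;
}

// Sample values (assumed host names), one string per mongos instance.
var sameOrder = [
    "cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019",
    "cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019"
];
var differentOrder = [
    "cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019",
    "cfg1.example.net:27019,cfg0.example.net:27019,cfg2.example.net:27019"
];
var sameOrderOk = configdbStringsMatch(sameOrder);           // true
var differentOrderOk = configdbStringsMatch(differentOrder); // false
```

The comparison is deliberately a plain string comparison: two strings that list the same hosts in a different order are treated as a mismatch, as the procedure requires.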
Migrate the Shards
Migrate the shards one at a time. For each shard, follow the appropriate procedure in this section.
Migrate a Replica Set Shard
To migrate a replica set shard, migrate each member separately. First migrate the
non-primary members, and then migrate the primary last.
If the replica set has two voting members, add an arbiter (page 389) to the replica set to ensure the set keeps a majority
of its votes available during the migration. You can remove the arbiter after completing the migration.
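The reason for the arbiter is simple vote arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes that a set can elect a primary only while more than half of its voting members are reachable.

```javascript
// Number of votes needed to elect a primary in a set with `voters` voting members.
function majority(voters) {
    return Math.floor(voters / 2) + 1;
}

// True if the set can still elect a primary with `down` voting members offline.
function canElectPrimary(voters, down) {
    return (voters - down) >= majority(voters);
}

// Two voting members, one taken offline for migration: 1 of 2 votes is not a majority.
var twoVotersOneDown = canElectPrimary(2, 1);
// With an arbiter added (three voters), one offline: 2 of 3 votes is a majority.
var threeVotersOneDown = canElectPrimary(3, 1);
```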
For more information on updating the configuration document, see Example Reconfiguration Operations
(page 470).
6. To confirm the new configuration, issue rs.conf().
7. Wait for the member to recover. To check the member's state, issue rs.status().
Migrate the Primary in a Replica Set Shard
While migrating the replica set's primary, the set must elect a new primary. This failover process renders the
replica set unavailable to perform reads or accept writes for the duration of the election, which typically
completes quickly. If possible, plan the migration during a maintenance window.
1. Step down the primary to allow the normal failover (page 397) process. To step down the primary, connect to
the primary and issue either the replSetStepDown command or the rs.stepDown() method. The
following example shows the rs.stepDown() method:
rs.stepDown()
2. Once the primary has stepped down and another member has assumed the PRIMARY (page 474) state, migrate
the stepped-down primary by following the Migrate a Member of a Replica Set Shard (page 528) procedure.
You can check the output of rs.status() to confirm the change in status.
Migrate a Standalone Shard
The ideal procedure for migrating a standalone shard is to convert the standalone to a
replica set (page 432) and then use the procedure for migrating a replica set shard (page 527). In production clusters,
all shards should be replica sets, which provides continued availability during maintenance windows.
Migrating a shard as standalone is a multi-step process during which part of the shard may be unavailable. If the shard
is the primary shard for a database, the process includes the movePrimary command. While the movePrimary command
runs, you should stop modifying data in that database. To migrate the standalone shard, use the Remove Shards from
an Existing Sharded Cluster (page 534) procedure.
Re-Enable the Balancer
To complete the migration, re-enable the balancer to resume chunk migrations (page 502).
Connect to one of the cluster's mongos instances and pass true to the sh.setBalancerState() method:
sh.setBalancerState(true)
You can schedule a window of time during which the balancer can migrate chunks, as described in the following
procedures:
Schedule the Balancing Window (page 532)
Remove a Balancing Window Schedule (page 532).
The mongos instances use their own local time zones when respecting the balancer window.
8 While one of the three config servers is unavailable, the cluster cannot split any chunks nor can it migrate chunks between shards. Your
application will be able to write data to the cluster. See Config Servers (page 488) for more information.
The default chunk size for a sharded cluster is 64 megabytes. In most situations, the default size is appropriate for
splitting and migrating chunks. For information on how chunk size affects deployments, see Chunk Size
(page 504).
Changing the default chunk size affects chunks that are processed during migrations and auto-splits but does not
retroactively affect all chunks.
To configure default chunk size, see Modify Chunk Size in a Sharded Cluster (page 539).
Change the Maximum Storage Size for a Given Shard
The maxSize field in the shards (page 552) collection in the config database (page 547) sets the maximum size
for a shard, allowing you to control whether the balancer will migrate chunks to a shard. If mapped size 9 is above
a shard's maxSize, the balancer will not move chunks to the shard. Also, the balancer will not move chunks off an
overloaded shard. This must happen manually. The maxSize value only affects the balancer's selection of destination
shards.
By default, maxSize is not specified, allowing shards to consume the total amount of available space on their machines if necessary.
You can set maxSize both when adding a shard and once a shard is running.
To set maxSize when adding a shard, set the addShard command's maxSize parameter to the maximum size in
megabytes. For example, the following command run in the mongo shell adds a shard with a maximum size of 125
megabytes:
db.runCommand( { addshard : "example.net:34008", maxSize : 125 } )
To set maxSize on an existing shard, insert or update the maxSize field in the shards (page 552) collection in the
config database (page 547). Set the maxSize in megabytes.
Example
Assume you have the following shard without a maxSize field:
{ "_id" : "shard0000", "host" : "example.net:34001" }
Run the following sequence of commands in the mongo shell to insert a maxSize of 125 megabytes:
use config
db.shards.update( { _id : "shard0000" }, { $set : { maxSize : 125 } } )
To later increase the maxSize setting to 250 megabytes, run the following:
use config
db.shards.update( { _id : "shard0000" }, { $set : { maxSize : 250 } } )
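The effect of maxSize on destination selection can be pictured with a small model. This is only a sketch of the rule stated in this section, not the balancer's implementation; the function name is hypothetical, and the real balancer weighs other factors as well.

```javascript
// Sketch of the stated rule: a shard is a candidate destination unless its
// mapped size (in megabytes) is above its maxSize. A shard document without
// a maxSize field has no limit.
function isCandidateDestination(shardDoc, mappedSizeMB) {
    if (shardDoc.maxSize === undefined) {
        return true;
    }
    return mappedSizeMB <= shardDoc.maxSize;
}

var unlimitedShard = { _id: "shard0000", host: "example.net:34001" };
var limitedShard = { _id: "shard0000", host: "example.net:34001", maxSize: 125 };

var noLimit = isCandidateDestination(unlimitedShard, 500);  // no maxSize: candidate
var underLimit = isCandidateDestination(limitedShard, 100); // 100 MB <= 125 MB: candidate
var overLimit = isCandidateDestination(limitedShard, 130);  // 130 MB > 125 MB: skipped
```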
New in version 2.2.1: _secondaryThrottle became an option to the balancer and to command moveChunk.
_secondaryThrottle makes it possible to require the balancer to wait for replication to secondaries during migrations.
9 This value includes the mapped size of all data files, including the local and admin databases. Account for this when setting maxSize.
Changed in version 2.4: _secondaryThrottle became the default mode for all balancer and moveChunk operations.
Before 2.2.1, the write operations required to migrate chunks between shards did not need to replicate to secondaries in
order to succeed. However, you can configure the balancer to require migration related write operations to replicate to
secondaries. This throttles or slows the migration process and in doing so reduces the potential impact of migrations
on a sharded cluster.
You can throttle migrations by enabling the balancer's _secondaryThrottle parameter. When enabled, secondary throttle requires a { w : 2 } write concern on delete and insertion operations, so that every operation
propagates to at least one secondary before the balancer issues the next operation.
Starting with version 2.4, the default _secondaryThrottle value is true. To disable secondary throttle, set
_secondaryThrottle to false.
You enable or disable _secondaryThrottle directly in the settings (page 551) collection in the config
database (page 547) by running the following commands from a mongo shell, connected to a mongos instance:
use config
db.settings.update( { "_id" : "balancer" } , { $set : { "_secondaryThrottle" : true } } , { upsert : true } )
You also can enable secondary throttle when issuing the moveChunk command by setting _secondaryThrottle
to true. For more information, see moveChunk.
Manage Sharded Cluster Balancer
This page describes common administrative procedures related to balancing. For an introduction to balancing, see
Sharded Collection Balancing (page 501). For lower level information on balancing, see Cluster Balancer (page 501).
See also:
Configure Behavior of Balancer Process in Sharded Clusters (page 529)
Check the Balancer State
The following command checks if the balancer is enabled (i.e. that the balancer is allowed to run). The command does
not check if the balancer is active (i.e. if it is actively balancing chunks).
To see if the balancer is enabled in your cluster, issue the following command, which returns a boolean:
sh.getBalancerState()
When this command returns, you will see output like the following:
{
"_id" : "balancer",
"process" : "mongos0.example.net:1292810611:1804289383",
"state" : 2,
"ts" : ObjectId("4d0f872630c42d1978be8a2e"),
"when" : "Mon Dec 20 2010 11:41:10 GMT-0500 (EST)",
"who" : "mongos0.example.net:1292810611:1804289383:Balancer:846930886",
"why" : "doing balance round" }
The process field shows that the lock belongs to the mongos running on the system with the hostname
mongos0.example.net.
The value in the state field indicates that a mongos has the lock. For version 2.0 and later, the value of an
active lock is 2; for earlier versions the value is 1.
Schedule the Balancing Window
In some situations, particularly when your data set grows slowly and a migration can impact performance, it's useful
to be able to ensure that the balancer is active only at certain times. Use the following procedure to specify a window
during which the balancer will be able to migrate chunks:
1. Connect to any mongos in the cluster using the mongo shell.
2. Issue the following command to switch to the Config Database (page 547):
use config
3. Use an operation modeled on the following example update() operation to modify the balancer's window:
Replace <start-time> and <end-time> with time values using two digit hour and minute values (e.g.
HH:MM) that describe the beginning and end boundaries of the balancing window. These times will be evaluated
relative to the time zone of each individual mongos instance in the sharded cluster. If your mongos instances
are physically located in different time zones, use a common time zone (e.g. GMT) to ensure that the balancer
window is interpreted correctly.
For instance, running the following will force the balancer to run between 11PM and 6AM local time only:
Note: The balancer window must be sufficient to complete the migration of all data inserted during the day.
As data insert rates can change based on activity and usage patterns, it is important to ensure that the balancing window
you select will be sufficient to support the needs of your deployment.
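As a sketch of the 11PM to 6AM window described above, the helpers below validate HH:MM values and build an update document for the balancer's activeWindow setting. The helper names are hypothetical, and the start and stop subfield names are stated here as an assumption, since this section names only the activeWindow field itself. The isInsideWindow() function is an illustration of how a window that spans midnight can be evaluated, not the balancer's own logic.

```javascript
// Parse "HH:MM" into minutes after midnight; throws on malformed input.
function parseHHMM(value) {
    var match = /^([01][0-9]|2[0-3]):([0-5][0-9])$/.exec(value);
    if (match === null) {
        throw new Error("expected HH:MM, got: " + value);
    }
    return parseInt(match[1], 10) * 60 + parseInt(match[2], 10);
}

// Build the update document for the balancer's activeWindow setting
// (assumed subfields: start and stop).
function buildWindowUpdate(start, stop) {
    parseHHMM(start);
    parseHHMM(stop);
    return { $set: { activeWindow: { start: start, stop: stop } } };
}

// True if `now` (HH:MM) falls inside the window. A window whose start is
// later than its stop is treated as spanning midnight.
function isInsideWindow(now, start, stop) {
    var n = parseHHMM(now), s = parseHHMM(start), e = parseHHMM(stop);
    if (s <= e) {
        return n >= s && n < e;
    }
    return n >= s || n < e;
}

var windowUpdate = buildWindowUpdate("23:00", "06:00");
```

In the mongo shell, with the config database selected, the resulting document could be applied with db.settings.update( { _id : "balancer" }, windowUpdate, true ).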
If you have set the balancing window (page 532) and wish to remove the schedule so that the balancer is always
running, issue the following sequence of operations:
use config
db.settings.update({ _id : "balancer" }, { $unset : { activeWindow : true } })
By default the balancer may run at any time and only moves chunks as needed. To disable the balancer for a short
period of time and prevent all migration, use the following procedure:
1. Connect to any mongos in the cluster using the mongo shell.
2. Issue the following operation to disable the balancer:
sh.setBalancerState(false)
If a migration is in progress, the system will complete the in-progress migration before stopping.
3. To verify that the balancer has stopped, issue the following command, which returns false if the balancer is
stopped:
sh.getBalancerState()
Optionally, to verify no migrations are in progress after disabling, issue the following operation in the mongo
shell:
use config
while( sh.isBalancerRunning() ) {
print("waiting...");
sleep(1000);
}
Note: To disable the balancer from a driver that does not have the sh.stopBalancer() helper, issue the
following command from the config database:
db.settings.update( { _id: "balancer" }, { $set : { stopped: true } } , true )
Use this procedure if you have disabled the balancer and are ready to re-enable it:
1. Connect to any mongos in the cluster using the mongo shell.
2. Issue one of the following operations to enable the balancer:
From the mongo shell, issue:
sh.setBalancerState(true)
From a driver that does not have the sh.startBalancer() helper, issue the following from the config
database:
db.settings.update( { _id: "balancer" }, { $set : { stopped: false } } , true )
If MongoDB migrates a chunk during a backup (page 136), you can end up with an inconsistent snapshot of your sharded
cluster. Never run a backup while the balancer is active. To ensure that the balancer is inactive during your backup
operation:
Set the balancing window (page 532) so that the balancer is inactive during the backup. Ensure that the backup
can complete while you have the balancer disabled.
Manually disable the balancer (page 533) for the duration of the backup procedure.
If you turn the balancer off while it is in the middle of a balancing round, the shut down is not instantaneous. The
balancer completes the chunk move in-progress and then ceases all further balancing rounds.
Before starting a backup operation, confirm that the balancer is not active. The following expression returns true only
when the balancer is disabled and no balancing round is in progress:
!sh.getBalancerState() && !sh.isBalancerRunning()
When the backup procedure is complete you can reactivate the balancer process.
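A backup script may prefer to wait for that condition instead of checking it once. The following is a minimal sketch with a hypothetical function name; it takes the helper object and the sleep function as arguments so that it can be exercised with stand-ins outside the shell.

```javascript
// Returns true once the balancer is both disabled and idle, polling up to
// maxTries times. `shHelper` must provide getBalancerState() and
// isBalancerRunning(); `sleepFn` pauses between polls.
function waitUntilBalancerIdle(shHelper, sleepFn, maxTries) {
    for (var i = 0; i < maxTries; i++) {
        if (!shHelper.getBalancerState() && !shHelper.isBalancerRunning()) {
            return true;
        }
        sleepFn(1000);
    }
    return false;
}

// Stand-in for the shell's sh object (hypothetical): the balancer is already
// disabled and finishes its in-progress round after two polls.
var pollsLeft = 2;
var fakeSh = {
    getBalancerState: function () { return false; },
    isBalancerRunning: function () { return pollsLeft-- > 0; }
};
var sleeps = 0;
var idle = waitUntilBalancerIdle(fakeSh, function () { sleeps++; }, 10);
```

In the mongo shell, the same function could be called as waitUntilBalancerIdle(sh, sleep, 60), using the sh helpers and the sleep() function that appear elsewhere in this section.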
Remove Shards from an Existing Sharded Cluster
To remove a shard you must ensure the shard's data is migrated to the remaining shards in the cluster. This procedure
describes how to safely migrate data and how to remove a shard.
This procedure describes how to safely remove a single shard. Do not use this procedure to migrate an entire cluster
to new hardware. To migrate an entire shard to new hardware, migrate individual shards as if they were independent
replica sets.
To remove a shard, first connect to one of the cluster's mongos instances using the mongo shell. Then use the sequence
of tasks in this document to remove a shard from the cluster.
Ensure the Balancer Process is Enabled
To successfully migrate data from a shard, the balancer process must be enabled. Check the balancer state using
the sh.getBalancerState() helper in the mongo shell. For more information, see the section on balancer
operations (page 533).
Determine the Name of the Shard to Remove
To determine the name of the shard, connect to a mongos instance with the mongo shell and either:
Use the listShards command, as in the following:
db.adminCommand( { listShards: 1 } )
Run the removeShard command. This begins draining chunks from the shard you are removing to other shards
in the cluster. For example, for a shard named mongodb0, run:
db.runCommand( { removeShard: "mongodb0" } )
Depending on your network capacity and the amount of data, this operation can take from a few minutes to several
days to complete.
To check the progress of the migration at any stage in the process, run removeShard. For example, for a shard
named mongodb0, run:
db.runCommand( { removeShard: "mongodb0" } )
{ msg: "draining ongoing" , state: "ongoing" , remaining: { chunks: NumberLong(42), dbs : NumberLong(
In the output, the remaining document displays the remaining number of chunks that MongoDB must migrate to
other shards and the number of MongoDB databases that have primary status on this shard.
Continue checking the status of the removeShard command until the number of chunks remaining is 0. Then proceed
to the next step.
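The check on the remaining document can be written as a small function. This is a sketch under stated assumptions: the function name is hypothetical, the sample responses use plain numbers in place of NumberLong values, and the dbs count in the sample is illustrative data only.

```javascript
// Interpret a removeShard response document: true when no chunks remain
// to be drained from the shard.
function chunksDrained(response) {
    if (response.state === "completed") {
        return true;
    }
    if (response.remaining === undefined) {
        return false;
    }
    return Number(response.remaining.chunks) === 0;
}

// Sample responses (illustrative values, plain numbers instead of NumberLong).
var stillDraining = { msg: "draining ongoing", state: "ongoing",
                      remaining: { chunks: 42, dbs: 1 }, ok: 1 };
var readyForNextStep = { msg: "draining ongoing", state: "ongoing",
                         remaining: { chunks: 0, dbs: 1 }, ok: 1 };

var first = chunksDrained(stillDraining);     // false: 42 chunks remain
var second = chunksDrained(readyForNextStep); // true: proceed to the next step
```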
Move Unsharded Data
If the shard is the primary shard for one or more databases in the cluster, then the shard will have unsharded data. If
the shard is not the primary shard for any databases, skip to the next task, Finalize the Migration (page 535).
In a cluster, a database with unsharded collections stores those collections only on a single shard. That shard becomes
the primary shard for that database. (Different databases in a cluster can have different primary shards.)
Warning: Do not perform this procedure until you have finished draining the shard.
1. To determine if the shard you are removing is the primary shard for any of the cluster's databases, issue one of
the following methods:
sh.status()
db.printShardingStatus()
In the resulting document, the databases field lists each database and its primary shard. For example, the
following database field shows that the products database uses mongodb0 as the primary shard:
{
"_id" : "products",
"partitioned" : true,
"primary" : "mongodb0" }
2. To move a database to another shard, use the movePrimary command. For example, to migrate all remaining
unsharded data from mongodb0 to mongodb1, issue the following command:
db.runCommand( { movePrimary: "products", to: "mongodb1" })
This command does not return until MongoDB completes moving all data, which may take a long time. The
response from this command will resemble the following:
{ "primary" : "mongodb1", "ok" : 1 }
To clean up all metadata information and finalize the removal, run removeShard again. For example, for a shard
named mongodb0, run:
db.runCommand( { removeShard: "mongodb0" } )
Once the value of the state field is "completed", you may safely stop the processes comprising the mongodb0
shard.
See also:
Backup and Restore Sharded Clusters (page 198)
1. Split empty chunks in your collection by manually performing the split command on chunks.
Example
To create chunks for documents in the myapp.users collection using the email field as the shard key, use
the following operation in the mongo shell:
for ( var x=97; x<97+26; x++ ){
for( var y=97; y<97+26; y+=6 ) {
var prefix = String.fromCharCode(x) + String.fromCharCode(y);
db.runCommand( { split : "myapp.users" , middle : { email : prefix } } );
}
}
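To see which split points the loops above generate without touching a cluster, the same iteration can collect the prefixes into an array. This is a sketch; the function name is hypothetical.

```javascript
// Collect the two-letter prefixes that the nested loops use as split points.
function splitPrefixes() {
    var prefixes = [];
    for (var x = 97; x < 97 + 26; x++) {
        for (var y = 97; y < 97 + 26; y += 6) {
            prefixes.push(String.fromCharCode(x) + String.fromCharCode(y));
        }
    }
    return prefixes;
}

// 26 first letters (a to z) times 5 second letters (a, g, m, s, y).
var prefixes = splitPrefixes();
```

The outer loop covers every first letter, while the inner loop steps through the second letter six characters at a time, so the collection is split at 130 points in total.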
The following command splits the chunk that contains the value of 63109 for the zipcode field in the people
collection of the records database:
sh.splitFind( "records.people", { "zipcode": 63109 } )
Use splitAt() to split a chunk in two, using the queried document as the lower bound in the new chunk:
Example
The following command splits the chunk that contains the value of 63109 for the zipcode field in the people
collection of the records database.
sh.splitAt( "records.people", { "zipcode": 63109 } )
Note: splitAt() does not necessarily split the chunk into two equally sized chunks. The split occurs at the location
of the document matching the query, regardless of where that document is in the chunk.
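The difference between the two helpers can be illustrated with a toy model in which a chunk is a sorted array of shard key values. This is a simplification for illustration only; the function names and the sample zip codes are hypothetical, and the model assumes that splitFind() splits at the chunk's median point.

```javascript
// Model of splitAt(): the given key becomes the lower bound of the new chunk.
function modelSplitAt(chunkKeys, key) {
    var lower = [], upper = [];
    for (var i = 0; i < chunkKeys.length; i++) {
        if (chunkKeys[i] < key) {
            lower.push(chunkKeys[i]);
        } else {
            upper.push(chunkKeys[i]);
        }
    }
    return [lower, upper];
}

// Model of splitFind(): split at the median key of the chunk.
function modelSplitFind(chunkKeys) {
    var median = chunkKeys[Math.floor(chunkKeys.length / 2)];
    return modelSplitAt(chunkKeys, median);
}

var zipcodes = [10001, 20500, 30301, 60601, 63109, 94102];
var atKey = modelSplitAt(zipcodes, 63109); // uneven: 4 keys below, 2 from 63109 up
var atMedian = modelSplitFind(zipcodes);   // even: 3 keys in each half
```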
This command moves the chunk that includes the shard key value smith to the shard named
mongodb-shard3.example.net. The command will block until the migration is complete.
Tip
To return a list of shards, use the listShards command.
Example
Evenly migrate chunks
To evenly migrate chunks for the myapp.users collection, put each prefix chunk on the next shard from the other
and run the following commands in the mongo shell:
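A minimal sketch of such a round-robin assignment follows. It only builds the moveChunk command documents; the shard names and the function name are hypothetical, and the prefixes stand for split points created earlier.

```javascript
// Hypothetical shard names; replace with the names reported by listShards.
var shardNames = ["shard0000", "shard0001", "shard0002"];

// Build one moveChunk command document per prefix, cycling through the shards.
function buildMoveCommands(prefixes, shards) {
    var commands = [];
    for (var i = 0; i < prefixes.length; i++) {
        commands.push({
            moveChunk: "myapp.users",
            find: { email: prefixes[i] },
            to: shards[i % shards.length]
        });
    }
    return commands;
}

var moveCommands = buildMoveCommands(["aa", "ag", "am", "as"], shardNames);
```

Each document in moveCommands could then be passed to db.adminCommand() from a mongo shell connected to a mongos.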
See Create Chunks in a Sharded Cluster (page 536) for an introduction to pre-splitting.
New in version 2.2: The moveChunk command has the _secondaryThrottle parameter. When set to true,
MongoDB ensures that changes to shards as part of chunk migrations replicate to secondaries throughout the migration
operation. For more information, see Require Replication before Chunk Migration (Secondary Throttle) (page 530).
Changed in version 2.4: In 2.4, _secondaryThrottle is true by default.
Warning: The moveChunk command may produce the following error message:
The collection's metadata lock is already taken.
This occurs when clients have too many open cursors that access the migrating chunk. You may either wait until
the cursors complete their operations or close the cursors manually.
3. Issue the following save() operation to store the global chunk size configuration value:
db.settings.save( { _id:"chunksize", value: <size> } )
Note: The chunkSize and --chunkSize options, passed at runtime to the mongos, do not affect the chunk size
after you have initialized the cluster.
To avoid confusion, always set the chunk size using the above procedure instead of the runtime options.
Modifying the chunk size has several limitations:
Automatic splitting only occurs on insert or update.
If you lower the chunk size, it may take time for all chunks to split to the new size.
Splits cannot be undone.
If you increase the chunk size, existing chunks grow only through insertion or updates until they reach the new
size.
Shard key range tags are distinct from replica set member tags (page 408).
Hash-based sharding does not support tag-aware sharding.
Behavior and Operations
The balancer migrates chunks of documents in a sharded collection to the shards associated with a tag that has a shard
key range with an upper bound greater than the chunk's lower bound.
During balancing rounds, if the balancer detects that any chunks violate configured tags, the balancer migrates chunks
in tagged ranges to shards associated with those tags.
After configuring tags with a shard key range, and associating it with a shard or shards, the cluster may take some time
to balance the data among the shards. This depends on the division of chunks and the current distribution of data in
the cluster.
Once configured, the balancer respects tag ranges during future balancing rounds (page 501).
See also:
Manage Shard Tags (page 541)
Chunks that Span Multiple Tag Ranges
A single chunk may contain data with shard key values that fall into ranges associated with more than one tag. To
accommodate these situations, the balancer may migrate chunks to shards that contain shard key values that exceed
the upper bound of the selected tag range.
Example
Given a sharded collection with two configured tag ranges:
Shard key values between 100 and 200 have tags to direct corresponding chunks to shards tagged NYC.
Shard key values between 200 and 300 have tags to direct corresponding chunks to shards tagged SFO.
For this collection, the balancer will migrate a chunk with shard key values ranging between 150 and 220 to
a shard tagged NYC, since 150 is closer to 200 than 300.
To ensure that your collection has no potentially ambiguously tagged chunks, create splits on your tag boundaries
(page 537). You can then manually migrate chunks to the appropriate shards, or wait for the balancer to automatically
migrate these chunks.
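The NYC and SFO example can be modeled in a few lines. This is a simplified sketch of the rule stated in this section, not the balancer's code; the function name is hypothetical, and the additional check that the chunk's lower bound lies at or above the range's lower bound is an assumption.

```javascript
// Tag ranges from the example: [min, max) mapped to a tag, in ascending order.
var tagRanges = [
    { min: 100, max: 200, tag: "NYC" },
    { min: 200, max: 300, tag: "SFO" }
];

// Pick the first range whose upper bound is greater than the chunk's lower
// bound, provided the chunk's lower bound is not below that range
// (the second condition is an assumption of this sketch).
function tagForChunk(chunkMin, ranges) {
    for (var i = 0; i < ranges.length; i++) {
        if (ranges[i].max > chunkMin) {
            return chunkMin >= ranges[i].min ? ranges[i].tag : null;
        }
    }
    return null;
}

// A chunk covering 150 to 220 starts inside the NYC range, so it is
// assigned to NYC even though part of it lies in the SFO range.
var spanningChunkTag = tagForChunk(150, tagRanges);
```

Splitting the chunk at 200, as recommended above, yields two chunks whose lower bounds (150 and 200) map to NYC and SFO respectively.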
Associate tags with a particular shard using the sh.addShardTag() method when connected to a mongos instance. A single shard may have multiple tags, and multiple shards may also have the same tag.
Example
The following example adds the tag NYC to two shards, and the tags SFO and NRT to a third shard:
sh.addShardTag("shard0000", "NYC")
sh.addShardTag("shard0001", "NYC")
sh.addShardTag("shard0002", "SFO")
sh.addShardTag("shard0002", "NRT")
You may remove tags from a particular shard using the sh.removeShardTag() method when connected to a
mongos instance, as in the following example, which removes the NRT tag from a shard:
sh.removeShardTag("shard0002", "NRT")
To assign a tag to a range of shard keys use the sh.addTagRange() method when connected to a mongos instance.
Any given shard key range may only have one assigned tag. You cannot overlap defined ranges, or tag the same range
more than once.
Example
Given a collection named users in the records database, sharded by the zipcode field, the following operations
assign:
two ranges of zip codes in Manhattan and Brooklyn the NYC tag
one range of zip codes in San Francisco the SFO tag
sh.addTagRange("records.users", { zipcode: "10001" }, { zipcode: "10281" }, "NYC")
sh.addTagRange("records.users", { zipcode: "11201" }, { zipcode: "11240" }, "NYC")
sh.addTagRange("records.users", { zipcode: "94102" }, { zipcode: "94135" }, "SFO")
Note: Shard ranges are always inclusive of the lower value and exclusive of the upper boundary.
The mongod does not provide a helper for removing a tag range. You may delete a tag assignment from a shard key
range by removing the corresponding document from the tags (page 552) collection of the config database.
Each document in the tags (page 552) collection holds the namespace of the sharded collection and a minimum shard key
value.
Example
The following example removes the NYC tag assignment for the range of zip codes within Manhattan:
use config
db.tags.remove({ _id: { ns: "records.users", min: { zipcode: "10001" }}, tag: "NYC" })
The output from sh.status() lists tags associated with a shard, if any, for each shard. A shard's tags exist in the
shard's document in the shards (page 552) collection of the config database. To return all shards with a specific
tag, use a sequence of operations that resemble the following, which will return only those shards tagged with NYC:
use config
db.shards.find({ tags: "NYC" })
You can find tag ranges for all namespaces in the tags (page 552) collection of the config database. The output of
sh.status() displays all tag ranges. To return all shard key ranges tagged with NYC, use the following sequence
of operations:
use config
db.tags.find({ tag: "NYC" })
The unique constraint on indexes ensures that only one document can have a value for a field in a collection. For
sharded collections these unique indexes cannot enforce uniqueness because insert and indexing operations are local
to each shard.
MongoDB does not support creating new unique indexes in sharded clusters and will not allow you to shard collections
with unique indexes on fields other than the _id field.
If you need to ensure that a field is always unique in all collections in a sharded environment, there are three options:
1. Enforce uniqueness of the shard key (page 492).
MongoDB can enforce uniqueness for the shard key. For compound shard keys, MongoDB will enforce uniqueness on the entire key combination, and not for a specific component of the shard key.
You cannot specify a unique constraint on a hashed index (page 333).
2. Use a secondary collection to enforce uniqueness.
Create a minimal collection that only contains the unique field and a reference to a document in the main
collection. If you always insert into a secondary collection before inserting to the main collection, MongoDB
will produce an error if you attempt to use a duplicate key.
If you have a small data set, you may not need to shard this collection and you can create multiple unique
indexes. Otherwise you can shard on a single unique key.
3. Use guaranteed unique identifiers.
Universally unique identifiers (i.e. UUID) like the ObjectId are guaranteed to be unique.
Procedures
Remember that the _id field index is always unique. By default, MongoDB inserts an ObjectId into the _id field.
However, you can manually insert your own value into the _id field and use this as the shard key. To use the _id
field as the shard key, use the following operation:
db.runCommand( { shardCollection : "test.users" , key : { _id : 1 } } )
Limitations
You can only enforce uniqueness on one single field in the collection using this method.
If you use a compound shard key, you can only enforce uniqueness on the combination of component keys in
the shard key.
In most cases, the best shard keys are compound keys that include elements that permit write scaling (page 493) and
query isolation (page 494), as well as high cardinality (page 512). These ideal shard keys are not often the same keys
that require uniqueness and enforcing unique values in these collections requires a different approach.
Unique Constraints on Arbitrary Fields
If you cannot use a unique field as the shard key or if you need to enforce
uniqueness over multiple fields, you must create another collection to act as a proxy collection. This collection must
contain both a reference to the original document (i.e. its ObjectId) and the unique key.
If you must shard this proxy collection, then shard on the unique key using the above procedure (page 543); otherwise, you can simply create multiple unique indexes on the collection.
Process
Consider the following for the proxy collection:
{
"_id" : ObjectId("..."),
"email" : "..."
}
The _id field holds the ObjectId of the document it reflects, and the email field is the field on which you want to
ensure uniqueness.
To shard this collection, use the following operation using the email field as the shard key:
db.runCommand( { shardCollection : "records.proxy" , key : { email : 1 } , unique : true } );
If you do not need to shard the proxy collection, use the following command to create a unique index on the email
field:
db.proxy.ensureIndex( { "email" : 1 }, {unique : true} )
You may create multiple unique indexes on this collection if you do not plan to shard the proxy collection.
To insert documents, use the following procedure in the JavaScript shell:
use records;
var primary_id = ObjectId();
db.proxy.insert({
"_id" : primary_id,
"email" : "[email protected]"
})
// if: the above operation returns successfully,
// then continue:
db.information.insert({
"_id" : primary_id,
"email": "[email protected]"
// additional information...
})
You must insert a document into the proxy collection first. If this operation succeeds, the email field is unique, and
you may continue by inserting the actual document into the information collection.
See the full documentation of ensureIndex() and shardCollection.
Considerations
Your application must catch errors when inserting documents into the proxy collection and must enforce
consistency between the two collections.
If the proxy collection requires sharding, you must shard on the single field on which you want to enforce
uniqueness.
To enforce uniqueness on more than one field using sharded proxy collections, you must have one proxy collection for every field for which to enforce uniqueness. If you create multiple unique indexes on a single proxy
collection, you will not be able to shard proxy collections.
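The insert-into-the-proxy-first pattern, including the error handling that the application must provide, can be sketched without a server. Everything here is a stand-in: UniqueCollection is a hypothetical in-memory substitute for a collection with a unique index, the main collection is a plain array, and the email values are sample data.

```javascript
// In-memory stand-in for a collection with a unique index on one field
// (hypothetical; a real deployment relies on the server's unique index).
function UniqueCollection(uniqueField) {
    this.uniqueField = uniqueField;
    this.seen = {};
    this.docs = [];
}
UniqueCollection.prototype.insert = function (doc) {
    var value = doc[this.uniqueField];
    if (Object.prototype.hasOwnProperty.call(this.seen, value)) {
        throw new Error("duplicate key: " + value);
    }
    this.seen[value] = true;
    this.docs.push(doc);
};

// Insert into the proxy first; write the main document only if that succeeds.
function insertUnique(proxy, main, id, email, extra) {
    try {
        proxy.insert({ _id: id, email: email });
    } catch (e) {
        return false; // duplicate email: nothing is written to main
    }
    extra._id = id;
    extra.email = email;
    main.push(extra);
    return true;
}

var proxy = new UniqueCollection("email");
var information = [];
var firstInsert = insertUnique(proxy, information, 1, "a@example.net", { name: "A" });
var secondInsert = insertUnique(proxy, information, 2, "a@example.net", { name: "B" });
```

The second call is rejected by the proxy, so the main collection never receives a document with a duplicate email. A production version would also need to handle a failure between the two inserts, which can leave a proxy entry without a matching main document.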
Use Guaranteed Unique Identifier
The best way to ensure a field has unique values is to generate universally
unique identifiers (UUID), such as MongoDB's ObjectId values.
This approach is particularly useful for the _id field, which must be unique: for collections where you are not
sharding by the _id field, the application is responsible for ensuring that the _id field is unique.
Shard GridFS Data Store
When sharding a GridFS store, consider the following:
files Collection
Most deployments will not need to shard the files collection. The files collection is typically small, and only
contains metadata. None of the required keys for GridFS lend themselves to an even distribution in a sharded situation.
If you must shard the files collection, use the _id field possibly in combination with an application field.
Leaving files unsharded means that all the file metadata documents live on one shard. For production GridFS stores
you must store the files collection on a replica set.
chunks Collection
To shard the chunks collection by { files_id : 1 , n : 1 }, issue commands similar to the following:
db.fs.chunks.ensureIndex( { files_id : 1 , n : 1 } )
db.runCommand( { shardCollection : "test.fs.chunks" , key : { files_id : 1 , n : 1 } } )
You may also want to shard using just the files_id field, as in the following operation:
db.runCommand( { shardCollection : "test.fs.chunks" , key : { files_id : 1 } } )
Important: { files_id : 1 , n : 1 } and { files_id : 1 } are the only supported shard keys
for the chunks collection of a GridFS store.
And:
mongos specified a different config database string
To solve the issue, restart the mongos with the correct string.
Cursor Fails Because of Stale Config Data
A query returns the following warning when one or more of the mongos instances has not yet updated its cache of
the cluster's metadata from the config database:
could not initialize cursor across all shards because : stale config detected
This warning should not propagate back to your application. The warning will repeat until all the mongos instances
refresh their caches. To force an instance to refresh its cache, run the flushRouterConfig command.
Name                 Description
flushRouterConfig    Forces an update to the cluster metadata cached by a mongos.
addShard             Adds a shard to a sharded cluster.
checkShardingIndex   Internal command that validates index on shard key.
enableSharding       Enables sharding on a specific database.
listShards           Returns a list of configured shards.
removeShard          Starts the process of removing a shard from a sharded cluster.
getShardMap          Internal command that reports on the state of a sharded cluster.
getShardVersion      Internal command that returns the config server version.
setShardVersion      Internal command that sets the config server version.
shardCollection      Enables the sharding functionality for a collection, allowing the collection to be sharded.
shardingState        Reports whether the mongod is a member of a sharded cluster.
unsetSharding        Internal command that affects connections between instances in a MongoDB deployment.
split                Creates a new chunk.
splitChunk           Internal command to split chunk. Instead use the methods sh.splitFind() and sh.splitAt().
splitVector          Internal command that determines split points.
medianKey            Deprecated internal command. See splitVector.
moveChunk            Internal command that migrates chunks between shards.
movePrimary          Reassigns the primary shard when removing a shard from a sharded cluster.
isdbgrid             Verifies that a process is a mongos.
You can return a list of the collections, with the following helper:
show collections
Collections
config
config.changelog
Each document in the changelog (page 548) collection contains the following fields:
config.changelog._id
The value of changelog._id is: <hostname>-<timestamp>-<increment>.
config.changelog.server
The hostname of the server that holds this data.
config.changelog.clientAddr
A string that holds the address of the client, a mongos instance that initiates this change.
config.changelog.time
An ISODate timestamp that reflects when the change occurred.
config.changelog.what
Reflects the type of change recorded. Possible values are:
dropCollection
dropCollection.start
dropDatabase
dropDatabase.start
moveChunk.start
moveChunk.commit
split
multi-split
config.changelog.ns
Namespace where the change occurred.
config.changelog.details
A document that contains additional details regarding the change. The structure of the details
(page 549) document depends on the type of change.
config.chunks
"shard" : "shard0004"
}
These documents store the range of values for the shard key that describe the chunk in the min and max fields.
Additionally the shard field identifies the shard in the cluster that owns the chunk.
config.collections
config.databases
config.lockpings
config.locks
If a mongos holds the balancer lock, the state field has a value of 2, which means that the balancer is active.
The when field indicates when the balancer began the current operation.
Changed in version 2.0: The value of the state field was 1 before MongoDB 2.0.
config.mongos
config.settings
config.shards
If the shard has tags (page 540) assigned, this document has a tags field that holds an array of the tags, as in
the following example:
{ "_id" : "shard0001", "host" : "localhost:30001", "tags": [ "NYC" ] }
config.tags
config.version
Note: Like all databases in MongoDB, the config database contains a system.indexes (page 227) collection that
contains metadata for all indexes in the database. For information on indexes, see Indexes (page 313).
CHAPTER 10
Collections are containers for documents that share one or more indexes. Databases are groups of collections stored
on disk using a single set of data files.
For an example acme.users namespace, acme is the database name and users is the collection name. Period
characters can occur in collection names, so that acme.user.history is a valid namespace, with acme as the
database name, and user.history as the collection name.
While data models like this appear to support nested collections, the collection namespace is flat, and there is no
difference from the perspective of MongoDB between acme, acme.users, and acme.records.
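The naming rule above can be sketched in a few lines of plain JavaScript. This is an illustrative helper, not part of the mongo shell: the function name splitNamespace is hypothetical, and the sketch assumes only what the text states, namely that the database name ends at the first period and that everything after it is the collection name.

```javascript
// Sketch: split a namespace string into database and collection names.
// Assumption: the database name ends at the first period; any further
// periods belong to the collection name.
function splitNamespace(ns) {
  var dot = ns.indexOf(".");
  if (dot === -1) {
    // No period: the string names a database only.
    return { db: ns, collection: null };
  }
  return { db: ns.slice(0, dot), collection: ns.slice(dot + 1) };
}

splitNamespace("acme.user.history"); // { db: "acme", collection: "user.history" }
```

Because the split happens only at the first period, user.history is treated as one flat collection name rather than a collection nested inside user.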
10.2.2 How do you copy all objects from one collection to another?
In the mongo shell, you can use the following operation to duplicate the entire collection:
db.source.copyTo(newCollection)
3 http://blog.mongodb.org/post/137788967/32-bit-limitations
4 https://groups.google.com/forum/?fromgroups#!forum/mongodb-user
Warning: When using db.collection.copyTo() check field types to ensure that the operation does
not remove type information from documents during the translation from BSON to JSON. Consider using
cloneCollection() to maintain type fidelity.
Also consider the cloneCollection command that may provide some of this functionality.
If you shorten the field named last_name to lname and the field named best_score to score, as follows,
you could save 9 bytes per document.
{ lname : "Smith", score : 3.9 }
Shortening field names reduces expressiveness and does not provide considerable benefit for larger documents or where document overhead is not a significant concern. Shorter field names do not reduce the size of
indexes, because indexes have a predefined structure.
In general it is not necessary to use short field names.
Embed documents.
In some cases you may want to embed documents in other documents and save on the per-document overhead.
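The 9-byte figure quoted above for shortening field names can be checked with a short calculation. The sketch below is illustrative only: the helper name bytesSaved is hypothetical, and it assumes ASCII field names, so that each character of a name costs one byte in every document.

```javascript
// Sketch: bytes saved per document by renaming fields.
// Assumption: field names are ASCII, so each character costs one byte.
function bytesSaved(renames) {
  return Object.keys(renames).reduce(function (saved, oldName) {
    return saved + (oldName.length - renames[oldName].length);
  }, 0);
}

bytesSaved({ last_name: "lname", best_score: "score" }); // 4 + 5 = 9
```

The saving is fixed per document, which is why it matters less as documents grow larger.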
Here, my_query will then have a value such as { name : "Joe" }. If my_query contained special characters, for example ,, :, and {, the query simply wouldn't match any documents. For example, users cannot hijack a
query and convert it to a delete.
10.2. FAQ: MongoDB for Application Developers
JavaScript
Note: You can disable all server-side execution of JavaScript, by passing the --noscripting option on the
command line or setting noscripting in a configuration file.
All of the following MongoDB operations permit you to run arbitrary JavaScript expressions directly on the server:
$where
db.eval()
mapReduce
group
You must exercise care in these cases to prevent users from submitting malicious JavaScript.
Fortunately, you can express most queries in MongoDB without JavaScript and for queries that require JavaScript, you
can mix JavaScript and non-JavaScript in a single query. Place all the user-supplied fields directly in a BSON field and
pass JavaScript code to the $where field.
If you need to pass user-supplied values in a $where clause, you may escape these values with the
CodeWScope mechanism. When you set user-submitted values as variables in the scope document, you can
avoid evaluating them on the database server.
If you need to use db.eval() with user-supplied values, you can either use a CodeWScope or you can supply
extra arguments to your function. For instance:
db.eval(function(userVal){...},
user_value);
This will ensure that your application sends user_value to the database server as data rather than code.
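The advice above about mixing JavaScript and non-JavaScript criteria in one query can be sketched as follows. This is a minimal illustration, not a driver API: the helper name buildQuery and the credits and debits fields are hypothetical. The point is that the user-supplied value only ever appears as the value of an ordinary field, while the $where string is fixed by the application.

```javascript
// Sketch: keep user input in the data position of a query document.
// buildQuery, credits, and debits are hypothetical names for illustration.
function buildQuery(userSuppliedName) {
  return {
    // The user value is stored as plain data and is never concatenated
    // into JavaScript source.
    name: userSuppliedName,
    // Fixed, application-authored expression.
    $where: "this.credits > this.debits"
  };
}
```

A document built this way could be passed to find(); even if the user submits text that looks like code, it is only compared against the name field.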
Dollar Sign Operator Escaping
Field names in MongoDB's query language have semantic meaning. The dollar sign (i.e. $) is a reserved character used
to represent operators (e.g. $inc). Thus, you should ensure that your application's users cannot inject operators
into their inputs.
In some cases, you may wish to build a BSON object with a user-provided key. In these situations, keys will need
to substitute the reserved $ and . characters. Any character is sufficient, but consider using the Unicode full width
equivalents: U+FF04 (i.e. ＄) and U+FF0E (i.e. ．).
Consider the following example:
BSONObj my_object = BSON( a_key << a_name );
The user may have supplied a $ value in the a_key value. At the same time, my_object might be { $where :
"things" }. Consider the following cases:
Insert. Inserting this into the database does no harm. The insert process does not evaluate the object as a query.
Note: MongoDB client drivers, if properly implemented, check for reserved characters in keys on inserts.
Update. The update() operation permits $ operators in the update argument but does not support the
$where operator. Still, some users may be able to inject operators that can manipulate a single document
only. Therefore your application should escape keys, as mentioned above, if reserved characters are possible.
Query. Generally this is not a problem for queries that resemble { x : user_obj }: dollar signs are
not top level and have no effect. Theoretically it may be possible for the user to build a query themselves.
But checking the user-submitted content for $ characters in key names may help protect against this kind of
injection.
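The key substitution suggested above (replacing the reserved $ and . characters with their Unicode full width equivalents) can be sketched in plain JavaScript. The helper name escapeKey is hypothetical; your driver or application framework may already provide an equivalent.

```javascript
// Sketch: replace reserved characters in a user-provided key before using
// it as a field name. escapeKey is a hypothetical helper name.
function escapeKey(key) {
  return key
    .replace(/\$/g, "\uFF04")  // $ -> U+FF04 (full width dollar sign)
    .replace(/\./g, "\uFF0E"); // . -> U+FF0E (full width full stop)
}

var userKey = "$where";
var doc = {};
doc[escapeKey(userKey)] = "things"; // the key no longer starts with a reserved $
```

The same substitution has to be applied when reading the field back, since the stored key differs from the text the user entered.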
Driver-Specific Issues
See the PHP MongoDB Driver Security Notes8 page in the PHP driver documentation for more information.
When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest to
highest:
1. MinKey (internal type)
2. Null
3. Numbers (ints, longs, doubles)
4. Symbol, String
5. Object
6. Array
7. BinData
8. ObjectID
9. Boolean
10. Date, Timestamp
11. Regular Expression
12. MaxKey (internal type)
MongoDB treats some types as equivalent for comparison purposes. For instance, numeric types undergo conversion
before comparison.
The comparison treats a non-existent field as it would an empty BSON Object. As such, a sort on the a field in
documents { } and { a: null } would treat the documents as equivalent in sort order.
8 http://us.php.net/manual/en/mongo.security.php
With arrays, a less-than comparison or an ascending sort compares the smallest element of arrays, and a greater-than
comparison or a descending sort compares the largest element of the arrays. As such, when comparing a field whose
value is a single-element array (e.g. [ 1 ]) with non-array fields (e.g. 2), the comparison is between 1 and 2. A
comparison of an empty array (e.g. [ ]) treats the empty array as less than null or a missing field.
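The array rule above can be illustrated with a small sketch. The helper name arraySortKey is hypothetical, and the sketch assumes non-empty arrays of numbers; it does not model the empty-array case or mixed-type arrays.

```javascript
// Sketch: the element of an array that takes part in a comparison.
// Assumption: arrays are non-empty and contain only numbers.
function arraySortKey(value, ascending) {
  if (!Array.isArray(value)) {
    return value; // non-array fields compare as themselves
  }
  return ascending
    ? Math.min.apply(null, value)  // ascending sort or less-than: smallest element
    : Math.max.apply(null, value); // descending sort or greater-than: largest element
}

arraySortKey([ 1 ], true) < arraySortKey(2, true); // true: the comparison is between 1 and 2
```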
Consider the following mongo example:
db.test.insert( { x : 3 } );
db.test.insert( { x : 2.9 } );
db.test.insert( { x : new Date() } );
db.test.insert( { x : true } );
db.test.find().sort( { x : 1 } );
{ "_id" : ObjectId("4b03155dce8de6586fb002c7"), "x" : 2.9 }
{ "_id" : ObjectId("4b03154cce8de6586fb002c6"), "x" : 3 }
{ "_id" : ObjectId("4b031566ce8de6586fb002c9"), "x" : true }
{ "_id" : ObjectId("4b031563ce8de6586fb002c8"), "x" : "Tue Nov 17 2009 16:28:03 GMT-0500 (EST)" }
The $type operator provides access to BSON type comparison in the MongoDB query syntax. See the documentation
on BSON types and the $type operator for additional information.
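The comparison order listed above can be approximated in plain JavaScript for the types that have a direct JavaScript counterpart. This is a rough sketch, not MongoDB's implementation: the names typeRank and compareValues are hypothetical, the rank numbers simply reuse the positions in the list, arrays are rejected because they follow the element-wise rule described earlier, and types such as MinKey, MaxKey, BinData, ObjectID, and Timestamp are left out.

```javascript
// Sketch: rank a JavaScript value by its position in the comparison order.
// Assumption: only null, numbers, strings, plain objects, booleans, dates,
// and regular expressions are covered.
function typeRank(v) {
  if (Array.isArray(v)) {
    throw new Error("arrays compare by element and are outside this sketch");
  }
  if (v === null || v === undefined) return 2; // Null
  if (typeof v === "number") return 3;         // Numbers
  if (typeof v === "string") return 4;         // String
  if (typeof v === "boolean") return 9;        // Boolean
  if (v instanceof Date) return 10;            // Date
  if (v instanceof RegExp) return 11;          // Regular Expression
  return 5;                                    // Object
}

// Comparator usable with Array.prototype.sort for an ascending sort.
function compareValues(a, b) {
  var ra = typeRank(a), rb = typeRank(b);
  if (ra !== rb) return ra - rb;               // different types: order by rank
  if (ra === 3 || ra === 10) return a - b;     // numbers and dates: by value
  if (ra === 4) return a < b ? -1 : (a > b ? 1 : 0);
  if (ra === 9) return (a ? 1 : 0) - (b ? 1 : 0);
  return 0;                                    // not ordered further in this sketch
}

[ 3, 2.9, new Date(), true ].sort(compareValues); // 2.9, 3, true, then the date
```

The resulting order matches the mongo example above: both numbers first, then the boolean, then the date.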
Warning: Storing values of the different types in the same field in a collection is strongly discouraged.
See also:
The Tailable Cursors (page 74) page for an example of a C++ use of MinKey.
The { cancelDate : { $type: 10 } } query matches only documents that contain the cancelDate
field with a null value; i.e. the value of the cancelDate field is of BSON Type Null (i.e. 10):
db.test.find( { cancelDate : { $type: 10 } } )
The query returns only the document that contains the null value:
{ "_id" : 1, "cancelDate" : null }
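The { cancelDate : { $exists: false } } query matches documents that do not contain the
cancelDate field:
db.test.find( { cancelDate : { $exists: false } } )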
The query returns only the document that does not contain the cancelDate field:
{ "_id" : 2 }
See also:
The reference documentation for the $type and $exists operators.
db.getCollection("_foo").insert( { a : 1 } )
As a cursor returns documents, other operations may interleave with the query. If some of these operations are updates (page 42) that cause the
document to move (in the case of a table scan, caused by document growth) or that change the indexed field on the index used by the query, then
the cursor will return the same document more than once.
11 MongoDB does not permit changes to the value of the _id field; it is not possible for a cursor that traverses this index to pass the same
document more than once.
Warning:
You cannot use snapshot() with sharded collections.
You cannot use snapshot() with sort() or hint() cursor methods.
As an alternative, if your collection has a field or fields that are never modified, you can use a unique index on this
field or these fields to achieve a result similar to that of snapshot(). Query with hint() to explicitly force the query
to use that index.
See also:
Padding Factor (page 57)
You can exit the line continuation mode if you enter two blank lines, as in the following example:
> if (x > 0
...
...
>
10.3.3 Does the mongo shell support tab completion and other keyboard shortcuts?
The mongo shell supports keyboard shortcuts. For example,
Use the up/down arrow keys to scroll through command history. See .dbshell documentation for more information on the .dbshell file.
Use <Tab> to autocomplete or to list the completion possibilities, as in the following example which uses
<Tab> to complete the method name starting with the letter c:
db.myCollection.c<Tab>
Because there are many collection methods starting with the letter c, the <Tab> will list the various methods
that start with c.
For a full list of the shortcuts, see Shell Keyboard Shortcuts.
The mongo shell prompt should now reflect the new prompt:
<database>@<hostname>>
The mongo shell prompt should now reflect the new prompt:
Uptime:1052 Documents:25024787 >
You can add the logic for the prompt in the .mongorc.js file to set the prompt each time you start up the mongo shell.
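As a rough sketch of such prompt logic, the functions below build prompt strings from a db object. The helper names makeHostPrompt and makeStatsPrompt are hypothetical; the sketch assumes that db.serverStatus() exposes host and uptime fields, that db.stats() exposes an objects field, and that converting db to a string yields the current database name, as in the mongo shell.

```javascript
// Sketch: prompt builders that take the shell's db object as an argument.
// makeHostPrompt and makeStatsPrompt are hypothetical helper names.
function makeHostPrompt(db) {
  var host = db.serverStatus().host; // looked up once, when the prompt is defined
  return function () {
    return db + "@" + host + "> ";   // for example: <database>@<hostname>>
  };
}

function makeStatsPrompt(db) {
  return function () {
    // Evaluated on every prompt, so the numbers stay current.
    return "Uptime:" + db.serverStatus().uptime +
           " Documents:" + db.stats().objects + " > ";
  };
}

// In .mongorc.js, one of these could be assigned to the shell's prompt variable:
// prompt = makeStatsPrompt(db);
```

Note that the statistics prompt queries the server each time the prompt is drawn, which may be noticeable on a busy or remote deployment.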
10.3.5 Can I edit long shell operations with an external text editor?
You can use your own editor in the mongo shell by setting the EDITOR environment variable before starting the
mongo shell. Once in the mongo shell, you can edit with the specified editor by typing edit <variable> or
edit <function>, as in the following example:
1. Set the EDITOR variable from the command line prompt:
EDITOR=vim
The command should open the vim edit session. Remember to save your changes.
5. Type myFunction to see the function definition:
myFunction
You may be familiar with a "readers-writer" lock as a "multi-reader" or "shared exclusive" lock. See the Wikipedia page on Readers-Writer
Locks (http://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock) for more information.
Operation                     Lock Type
Issue a query                 Read lock
Get more data from a cursor   Read lock
Insert data                   Write lock
Remove data                   Write lock
Update data                   Write lock
Map-reduce                    Read lock and write lock, unless operations are specified as non-atomic. Portions of
                              map-reduce jobs can run concurrently.
Create an index               Building an index in the foreground, which is the default, locks the database for extended
                              periods of time.
db.eval()                     Write lock. db.eval() blocks all other JavaScript processes.
eval                          Write lock. If used with the nolock lock option, the eval option does not take a write
                              lock and cannot write data to the database.
aggregate()                   Read lock
10.4.7 Does a MongoDB operation ever lock more than one database?
The following MongoDB operations lock multiple databases:
db.copyDatabase() must lock the entire mongod instance at once.
10.4. FAQ: Concurrency
Journaling, which is an internal operation, locks all databases for short intervals. All databases share a single
journal.
User authentication (page 239) locks the admin database as well as the database the user is accessing.
All writes to a replica set's primary lock both the database receiving the writes and then the local database for
a short time. The lock for the local database allows the mongod to write to the primary's oplog and accounts
for a small portion of the total time of the operation.
10.4.11 What kind of concurrency does MongoDB provide for JavaScript operations?
Changed in version 2.4: The V8 JavaScript engine added in 2.4 allows multiple JavaScript operations to run at the
same time. Prior to 2.4, a single mongod could only run a single JavaScript operation at once.
10.5.10 How does MongoDB ensure unique _id field values when using a shard
key other than _id?
If you do not use _id as the shard key, then your application/client layer must be responsible for keeping the _id
field unique. It is problematic for collections to have duplicate _id values.
If you're not sharding your collection by the _id field, then you should be sure to store a globally unique identifier in
that field. The default BSON ObjectId (page 129) works well in this case.
10.5.11 I've enabled sharding and added a second shard, but all the data is still on
one server. Why?
First, ensure that you've declared a shard key for your collection. Until you have configured the shard key, MongoDB
will not create chunks, and sharding will not occur.
Next, keep in mind that the default chunk size is 64 MB. As a result, in most situations, the collection needs to have at
least 64 MB of data before a migration will occur.
Additionally, the system which balances chunks among the servers attempts to avoid superfluous migrations. Depending on the number of shards, your shard key, and the amount of data, systems often require at least 10 chunks of data
to trigger migrations.
You can run db.printShardingStatus() to see all the chunks present in your cluster.
10.5.18 What is the process for moving, renaming, or changing the number of config servers?
See Sharded Cluster Tutorials (page 506) for information on migrating and replacing config servers.
10.5.20 Is it possible to quickly update mongos servers after updating a replica set
configuration?
The mongos instances will detect these changes without intervention over time. However, if you want to force the
mongos to reload its configuration, run the flushRouterConfig command against each mongos directly.
When this happens, the primary member of the shard's replica set then terminates to protect data consistency. If a
secondary member can access the config database, data on the shard becomes accessible again after an election.
The user will need to resolve the chunk migration failure independently. If you encounter this issue, contact the
MongoDB User Group23 or MongoDB Support24 to address this issue.
10.5.27 How does draining a shard affect the balancing of uneven chunk distribution?
The sharded cluster balancing process controls both migrating chunks from decommissioned shards (i.e. draining) and
normal cluster balancing activities. Consider the following behaviors for different versions of MongoDB in situations
where you remove a shard in a cluster with an uneven chunk distribution:
After MongoDB 2.2, the balancer first removes the chunks from the draining shard and then balances the remaining uneven chunk distribution.
Before MongoDB 2.2, the balancer handles the uneven chunk distribution and then removes the chunks from
the draining shard.
23 http://groups.google.com/group/mongodb-user
24 http://www.mongodb.org/about/support
10.6.5 Does replication work over the Internet and WAN connections?
Yes.
For example, a deployment may maintain a primary and secondary in an East-coast data center along with a secondary
member for disaster recovery in a West-coast data center.
25 https://groups.google.com/forum/?fromgroups#!forum/mongodb-user
See also:
Deploy a Geographically Redundant Replica Set (page 425)
10.6.8 What is the preferred replication method: replica sets or replica pairs?
Deprecated since version 1.6.
Replica sets replaced replica pairs in version 1.6. Replica sets are the preferred replication mechanism in MongoDB.
10.6.10 Are write operations durable if write concern does not acknowledge
writes?
Yes.
However, if you want confirmation that a given write has arrived at the server, use write concern (page 47). The
getLastError command provides the facility for write concern. However, after the default write concern change
(page 634), the default write concern acknowledges all write operations, and unacknowledged writes must be explicitly configured. See the MongoDB Drivers and Client Libraries (page 95) documentation for your driver for more
information.
10.6.12 What information do arbiters exchange with the rest of the replica set?
Arbiters never receive the contents of a collection but do exchange the following data with the rest of the replica set:
Credentials used to authenticate the arbiter with the replica set. All MongoDB processes within a replica set use
keyfiles. These exchanges are encrypted.
Replica set configuration data and voting data. This information is not encrypted. Only credential exchanges
are encrypted.
If your MongoDB deployment uses SSL, then all communications between arbiters and the other members of the
replica set are secure. See the documentation for Connect to MongoDB with SSL (page 252) for more information.
Run all arbiters on secure networks, as with all MongoDB components.
See the overview of Arbiter Members of Replica Sets (page ??).
See also:
Replica Set Elections (page 397)
10.6.15 Is it normal for replica set members to use different amounts of disk space?
Yes.
Factors including different oplog sizes, different levels of storage fragmentation, and MongoDB's data file preallocation can lead to some variation in storage utilization between nodes. Storage use disparities will be most pronounced when you add members at different times.
10.7.5 What is the difference between soft and hard page faults?
Page faults occur when MongoDB needs access to data that isn't currently in active memory. A hard page fault
refers to situations when MongoDB must access a disk to access the data. A soft page fault, by contrast, merely
moves memory pages from one list to another, such as from an operating system file cache. In production, MongoDB
will rarely encounter soft page faults.
10.7.8 Why are the files in my data directory larger than the data in my database?
The data files in your data directory, which is the /data/db directory in default configurations, might be larger than
the data set inserted into the database. Consider the following possible causes:
Preallocated data files.
In the data directory, MongoDB preallocates data files to a particular size, in part to prevent file system fragmentation. MongoDB names the first data file <databasename>.0, the next <databasename>.1, etc.
The first file mongod allocates is 64 megabytes, the next 128 megabytes, and so on, up to 2 gigabytes, at which
point all subsequent files are 2 gigabytes. The data files include files with allocated space but that hold no data.
mongod may allocate a 1 gigabyte data file that may be 90% empty. For most larger databases, unused allocated
space is small compared to the database.
On Unix-like systems, mongod preallocates an additional data file and initializes the disk space to 0. Preallocating data files in the background prevents significant delays when a new database file is next allocated.
You can disable preallocation with the noprealloc run time option. However noprealloc is not intended
for use in production environments: only use noprealloc for testing and with small data sets where you
frequently drop databases.
On Linux systems you can use hdparm to get an idea of how costly allocation might be:
time hdparm --fallocate $((1024*1024)) testfile
The oplog.
If this mongod is a member of a replica set, the data directory includes the oplog.rs file, which is a preallocated
capped collection in the local database. The default allocation is approximately 5% of disk space on 64-bit
installations. See Oplog Sizing (page 411) for more information. In most cases, you should not need to resize
the oplog. However, if you do, see Change the Size of the Oplog (page 445).
The journal.
The data directory contains the journal files, which store write operations on disk prior to MongoDB applying
them to databases. See Journaling Mechanics (page 234).
Empty records.
MongoDB maintains lists of empty records in data files when deleting documents and collections. MongoDB
can reuse this space, but will never return this space to the operating system.
To de-fragment allocated storage, use compact. By de-fragmenting storage, MongoDB can effectively use the allocated space. compact requires up to 2 gigabytes of extra disk space
to run. Do not use compact if you are critically low on disk space.
Important: compact only removes fragmentation from MongoDB data files and does not return any disk
space to the operating system.
To reclaim deleted space, use repairDatabase, which rebuilds the database, de-fragments the storage,
and may release space to the operating system. repairDatabase requires up to 2 gigabytes of extra disk
space to run. Do not use repairDatabase if you are critically low on disk space.
Warning: repairDatabase requires enough free disk space to hold both the old and new database files
while the repair is running. Be aware that repairDatabase will block all other operations and may take
a long time to complete.
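The preallocation schedule described above under "Preallocated data files" (64 megabytes, then 128 megabytes, and so on, up to 2 gigabytes) can be sketched as a short calculation. The helper name dataFileSizesMB is hypothetical, and the sketch assumes that "and so on" means each file is exactly double the previous one until the 2 gigabyte cap.

```javascript
// Sketch: sizes, in megabytes, of the first `count` preallocated data files.
// Assumption: each file doubles the previous size, capped at 2048 MB.
function dataFileSizesMB(count) {
  var sizes = [];
  var size = 64; // the first data file is 64 megabytes
  for (var i = 0; i < count; i++) {
    sizes.push(size);
    size = Math.min(size * 2, 2048);
  }
  return sizes;
}

dataFileSizesMB(7); // [ 64, 128, 256, 512, 1024, 2048, 2048 ]
```

Summing the returned values gives a rough upper bound on the disk space allocated for a database with that many data files, which helps explain why the data directory can be noticeably larger than the data it holds.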
Also, the following scripts print the statistics for each database and collection:
View the size of the data allocated for the orders.$_id_ index with the following sequence of operations:
use test
db.orders.$_id_.stats().indexSizes
10.7.11 How do I know when the server runs out of disk space?
If your server runs out of disk space for data files, you will see something like this in the log:
Thu Aug 11 13:06:09 [FileAllocator]
Thu Aug 11 13:06:09 [FileAllocator]
Thu Aug 11 13:06:09 [FileAllocator]
Thu Aug 11 13:06:19 [FileAllocator]
Thu Aug 11 13:06:19 [FileAllocator]
Thu Aug 11 13:06:19 [FileAllocator]
The server remains in this state forever, blocking all writes including deletes. However, reads still work. To delete
some data and compact, using the compact command, you must restart the server first.
If your server runs out of disk space for journal files, the server process will exit. By default, mongod creates journal
files in a sub-directory of dbpath named journal. You may elect to put the journal files on another storage device
using a filesystem mount or a symlink.
Note: If you place the journal files on a separate storage device you will not be able to use a file system snapshot tool
to capture a valid snapshot of your data files and journal files.
10.8.11 Can I use a multi-key index to support a query for a whole array?
Not entirely. The index can partially support these queries because it can speed the selection of the first element of
the array; however, comparing all subsequent items in the array cannot use the index and must scan the documents
individually.
10.8.12 How can I effectively use indexes for attribute lookups?
For simple attribute lookups that don't require sorted result sets or range queries, consider creating a field that contains
an array of documents where each document has a field (e.g. attrib) that holds a specific type of attribute. You can
index this attrib field.
For example, the attrib field in the following document allows you to add an unlimited number of attribute types:
{ _id : ObjectId(...),
attrib : [
{ k: "color", v: "red" },
{ k: "shape", v: "rectangle" },
{ k: "color", v: "blue" },
{ k: "avail", v: true }
]
}
1, "attrib.v":
1 } index:
10.9.1 Where can I find information about a mongod process that stopped running
unexpectedly?
If mongod shuts down unexpectedly on a UNIX or UNIX-based platform, and if mongod fails to log a shutdown or
error message, then check your system logs for messages pertaining to MongoDB. For example, for logs located in
/var/log/messages, use the following commands:
sudo grep mongod /var/log/messages
sudo grep score /var/log/messages
10.9.2 Does TCP keepalive time affect sharded clusters and replica sets?
If you experience socket errors between members of a sharded cluster or replica set, that do not have other reasonable causes, check the TCP keep alive value, which Linux systems store as the tcp_keepalive_time value. A
common keep alive period is 7200 seconds (2 hours); however, different distributions and OS X may have different
settings. For MongoDB, you will have better experiences with shorter keepalive periods, on the order of 300 seconds
(five minutes).
On Linux systems you can use the following operation to check the value of tcp_keepalive_time:
cat /proc/sys/net/ipv4/tcp_keepalive_time
You can change the tcp_keepalive_time value with the following operation:
echo 300 > /proc/sys/net/ipv4/tcp_keepalive_time
The new tcp_keepalive_time value takes effect without requiring you to restart the mongod or mongos
servers. When you reboot or restart your system you will need to set the new tcp_keepalive_time value, or
see your operating system's documentation for setting the TCP keepalive value persistently.
For OS X systems, issue the following command to view the keep alive setting:
sysctl net.inet.tcp.keepinit
If your replica set or sharded cluster experiences keepalive-related issues, you must alter the
tcp_keepalive_time value on all machines hosting MongoDB processes. This includes all machines
hosting mongos or mongod servers.
29 https://groups.google.com/forum/?fromgroups#!forum/mongodb-user
Windows users should consider the Windows Server Technet Article on KeepAliveTime configuration30 for more
information on setting keep alive for MongoDB deployments on Windows systems.
The operating system's cache strategy for LRU (Least Recently Used)
The impact of journaling (page 234)
The number or rate of page faults and other MMS gauges to detect when you need more RAM
MongoDB defers to the operating system when loading data into memory from disk. It simply memory maps
(page 581) all its data files and relies on the operating system to cache data. The OS typically evicts the least-recently-used data from RAM when it runs low on memory. For example, if clients access indexes more frequently
than documents, then indexes will more likely stay in RAM, but it depends on your particular usage.
To calculate how much RAM you need, you must calculate your working set size, or the portion of your data that
clients use most often. This depends on your access patterns, what indexes you have, and the size of your documents.
If page faults are infrequent, your working set fits in RAM. If fault rates rise higher than that, you risk performance
degradation. This is less critical with SSD drives than with spinning disks.
How do I read memory statistics in the UNIX top command?
Because mongod uses memory-mapped files (page 581), the memory statistics in top require interpretation in a
special way. On a large database, VSIZE (virtual bytes) tends to be the size of the entire database. If the mongod
doesn't have other processes running, RSIZE (resident bytes) is the total memory of the machine, as this counts file
system cache contents.
For Linux systems, use the vmstat command to help determine how the system uses memory. On OS X systems use
vm_stat.
Finally, if your shard key has a low cardinality (page 512), MongoDB may not be able to create sufficient splits among
the data.
Why would one shard receive a disproportionate amount of traffic in a sharded cluster?
In some situations, a single shard or a subset of the cluster will receive a disproportionate portion of the traffic and
workload. In almost all cases this is the result of a shard key that does not effectively allow write scaling (page 493).
It's also possible that you have hot chunks. In this case, you may be able to solve the problem by splitting and then
migrating parts of these chunks.
In the worst case, you may have to consider re-sharding your data and choosing a different shard key (page 511) to
correct this pattern.
What can prevent a sharded cluster from balancing?
If you have just deployed your sharded cluster, you may want to consider the troubleshooting suggestions for a new
cluster where data remains on a single shard (page 589).
If the cluster was initially balanced, but later developed an uneven distribution of data, consider the following possible
causes:
You have deleted or removed a significant amount of data from the cluster. If you have added additional data, it
may have a different distribution with regard to its shard key.
Your shard key has low cardinality (page 512) and MongoDB cannot split the chunks any further.
Your data set is growing faster than the balancer can distribute data around the cluster. This is uncommon and
typically is the result of:
a balancing window (page 532) that is too short, given the rate of data growth.
an uneven distribution of write operations (page 493) that requires more data migration. You may have to
choose a different shard key to resolve this issue.
poor network connectivity between shards, which may lead to chunk migrations that take too long to
complete. Investigate your network configuration and interconnections between shards.
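To see whether data has become unevenly distributed, it can help to count chunks per shard. The following is a minimal sketch, assuming an array of documents shaped like those in the config.chunks collection, each carrying a shard field. The function name countChunksByShard and the sample shard names are hypothetical.

```javascript
// Hypothetical helper: tally how many chunk documents each shard owns.
function countChunksByShard(chunks) {
  var counts = {};
  chunks.forEach(function (chunk) {
    if (!Object.prototype.hasOwnProperty.call(counts, chunk.shard)) {
      counts[chunk.shard] = 0;
    }
    counts[chunk.shard] += 1;
  });
  return counts;
}

// Illustrative input only; real documents also carry ns, min, and max.
var sampleChunks = [
  { shard: 'shard0000' },
  { shard: 'shard0000' },
  { shard: 'shard0001' }
];
var distribution = countChunksByShard(sampleChunks);
```

A large and persistent gap between the highest and lowest counts suggests that one of the causes listed above is in play.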
Why do chunk migrations affect sharded cluster performance?
If migrations impact your cluster's or application's performance, consider the following options, depending on the nature
of the impact:
1. If migrations only interrupt your cluster sporadically, you can limit the balancing window (page 532) to prevent
balancing activity during peak hours. Ensure that there is enough time remaining to keep the data from becoming
out of balance again.
2. If the balancer is always migrating chunks to the detriment of overall cluster performance:
You may want to attempt decreasing the chunk size (page 539) to limit the size of the migration.
Your cluster may be over capacity, and you may want to attempt to add one or two shards (page 514) to
the cluster to distribute load.
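When limiting the balancing window, it is worth checking how much time the window actually leaves for migrations. The following is a minimal sketch, assuming start and stop times written as "HH:MM" strings in the style of the balancer's activeWindow setting. The helper names toMinutes and windowMinutes are hypothetical.

```javascript
// Convert an "HH:MM" string to minutes since midnight.
function toMinutes(hhmm) {
  var parts = hhmm.split(':');
  return parseInt(parts[0], 10) * 60 + parseInt(parts[1], 10);
}

// Hypothetical helper: length of a balancing window in minutes.
// Handles windows that cross midnight, such as 23:00 to 6:00.
function windowMinutes(start, stop) {
  var day = 24 * 60;
  return (toMinutes(stop) - toMinutes(start) + day) % day;
}

var overnight = windowMinutes('23:00', '6:00');
var daytime = windowMinutes('1:00', '5:30');
```

If the window is short relative to the rate of data growth, the balancer may not keep up, as described under the causes of an unbalanced cluster.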
It's also possible that your shard key causes your application to direct all writes to a single shard. This kind of activity
pattern can require the balancer to migrate most data soon after writing it. Consider redeploying your cluster with a
shard key that provides better write scaling (page 493).
CHAPTER 11
Release Notes
Always install the latest, stable version of MongoDB. See MongoDB Version Numbers (page 634) for more information.
See the following release notes for an account of the changes in major versions. Release notes also include instructions
for upgrade.
Fix for instances where mongos incorrectly reports a successful write SERVER-12146 [1].
Make non-primary read preferences consistent with slaveOK versioning logic SERVER-11971 [2].
Allow new sharded cluster connections to read from secondaries when primary is down SERVER-7246 [3].
All 2.4.9 improvements [4].
2.4.8 November 1, 2013
Fix for possible loss of documents during the chunk migration process if a document in the chunk is very large
SERVER-10478 [13].
Fix for C++ client shutdown issues SERVER-8891 [14].
Improved replication robustness in presence of high network latency SERVER-10085 [15].
Improved Solaris support SERVER-9832 [16], SERVER-9786 [17], and SERVER-7080 [18].
All 2.4.6 improvements [19].
2.4.5 July 3, 2013
Fix for CVE-2013-4650: improperly granted user system privileges on databases other than local SERVER-9983 [20].
Fix for CVE-2013-3969: remotely triggered segmentation fault in JavaScript engine SERVER-9878 [21].
Fix to prevent identical background indexes from being built SERVER-9856 [22].
Config server performance improvements SERVER-9864 [23] and SERVER-5442 [24].
Improved initial sync resilience to network failure SERVER-9853 [25].
6 https://jira.mongodb.org/browse/SERVER-11421
7 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.8%22%20AND%20project%20%3D%20SERVER
8 https://jira.mongodb.org/browse/SERVER-10596
9 https://jira.mongodb.org/browse/SERVER-9907
10 https://jira.mongodb.org/browse/SERVER-11021
11 https://jira.mongodb.org/browse/SERVER-10554
12 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.7%22%20AND%20project%20%3D%20SERVER
13 https://jira.mongodb.org/browse/SERVER-10478
14 https://jira.mongodb.org/browse/SERVER-8891
15 https://jira.mongodb.org/browse/SERVER-10085
16 https://jira.mongodb.org/browse/SERVER-9832
17 https://jira.mongodb.org/browse/SERVER-9786
18 https://jira.mongodb.org/browse/SERVER-7080
19 https://jira.mongodb.org/issues/?jql=fixVersion%20%3D%20%222.4.6%22%20AND%20project%20%3D%20SERVER
20 https://jira.mongodb.org/browse/SERVER-9983
21 https://jira.mongodb.org/browse/SERVER-9878
22 https://jira.mongodb.org/browse/SERVER-9856
23 https://jira.mongodb.org/browse/SERVER-9864
24 https://jira.mongodb.org/browse/SERVER-5442
25 https://jira.mongodb.org/browse/SERVER-9853
Fix for mongo shell ignoring a modified object's _id field SERVER-9385 [31].
Fix for race condition in log rotation SERVER-4739 [32].
Fix for copydb command with authorization in a sharded cluster SERVER-9093 [33].
All 2.4.3 improvements [34].
2.4.2 April 17, 2013
Text Search
Add support for text search of content in MongoDB databases as a beta feature. See Text Indexes (page 332) for more
information.
Geospatial Support Enhancements
Add new 2dsphere index (page 328). The new index supports GeoJSON [42] objects Point, LineString, and
Polygon. See 2dsphere Indexes (page 328) and Geospatial Indexes and Queries (page 326).
Introduce operators $geometry, $geoWithin and $geoIntersects to work with the GeoJSON data.
Hashed Index
Add new hashed index (page 333) to index documents using hashes of field values. When used to index a shard key,
the hashed index ensures an evenly distributed shard key. See also Hashed Shard Keys (page 493).
Improvements to the Aggregation Framework
Improve support for geospatial queries. See the $geoWithin operator and the $geoNear pipeline stage.
Improve sort efficiency when the $sort stage immediately precedes a $limit in the pipeline.
Add new operators $millisecond and $concat and modify how $min operator processes null values.
Changes to Update Operators
The mapReduce command, group command, and the $where operator expressions cannot access certain global
functions or properties, such as db, that are available in the mongo shell. See the individual command or operator for
details.
Improvements to serverStatus Command
Provide additional metrics and customization for the serverStatus command. See db.serverStatus() and
serverStatus for more information.
42 http://geojson.org/geojson-spec.html
Security Enhancements
Introduce a role-based access control system User Privilege Roles in MongoDB (page 267) using new system.users Privilege Documents (page 272).
Enforce uniqueness of the user in user privilege documents per database. Previous versions of MongoDB did
not enforce this requirement, and existing databases may have duplicates.
Support encrypted connections using SSL certificates signed by a Certificate Authority. See Connect to MongoDB with SSL (page 252).
For more information on security and risk management strategies, see MongoDB Security Practices and Procedures
(page 237).
Performance Improvements
V8 JavaScript Engine
JavaScript Changes in MongoDB 2.4 Consider the following impacts of V8 JavaScript Engine (page 595) in MongoDB 2.4:
Tip
Use the new interpreterVersion() method in the mongo shell and the javascriptEngine field in the
output of db.serverBuildInfo() to determine which JavaScript engine a MongoDB binary uses.
Improved Concurrency Previously, MongoDB operations that required the JavaScript interpreter had to acquire
a lock, and a single mongod could only run a single JavaScript operation at a time. The switch to V8 improves
concurrency by permitting multiple JavaScript operations to run at the same time.
Modernized JavaScript Implementation (ES5) The 5th edition of ECMAScript [43], abbreviated as ES5, adds many
new language features, including:
standardized JSON [44],
strict mode [45],
function.bind() [46],
array extensions [47], and
getters and setters.
With V8, MongoDB supports the ES5 implementation of JavaScript with the following exceptions.
Note: The following features do not work as expected on documents returned from MongoDB queries:
Object.seal() throws an exception on documents returned from MongoDB queries.
Object.freeze() throws an exception on documents returned from MongoDB queries.
Object.preventExtensions() incorrectly allows the addition of new properties on documents returned
from MongoDB queries.
43 http://www.ecma-international.org/publications/standards/Ecma-262.htm
44 http://www.ecma-international.org/ecma-262/5.1/#sec-15.12.1
45 http://www.ecma-international.org/ecma-262/5.1/#sec-4.2.2
46 http://www.ecma-international.org/ecma-262/5.1/#sec-15.3.4.5
47 http://www.ecma-international.org/ecma-262/5.1/#sec-15.4.4.16
enumerable properties, when added to documents returned from MongoDB queries, are not saved during
write operations.
See SERVER-8216 [48], SERVER-8223 [49], SERVER-8215 [50], and SERVER-8214 [51] for more information.
For objects that have not been returned from MongoDB queries, the features work as expected.
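As an illustration of the expected behavior on an ordinary object (one created in JavaScript rather than returned from a query), the following sketch freezes a plain object literal and inspects it with standard ES5 functions. The variable names are illustrative only.

```javascript
// A plain object created in JavaScript, not returned from a query.
var plain = { name: 'MongoDB', version: 2.4 };

// Freezing a plain object succeeds under ES5.
Object.freeze(plain);

var frozen = Object.isFrozen(plain);         // the object is now frozen
var extensible = Object.isExtensible(plain); // frozen objects are not extensible
```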
Removed Non-Standard SpiderMonkey Features V8 does not support the following non-standard SpiderMonkey [52] JavaScript extensions, previously supported by MongoDB's use of SpiderMonkey as its JavaScript engine.
E4X Extensions V8 does not support the non-standard E4X [53] extensions. E4X provides a native XML [54] object to
the JavaScript language and adds the syntax for embedding literal XML documents in JavaScript code.
You need to use alternative XML processing if you used any of the following constructors/methods:
XML()
Namespace()
QName()
XMLList()
isXMLName()
Destructuring Assignment V8 does not support the non-standard destructuring assignments. Destructuring assignment "extract[s] data from arrays or objects using a syntax that mirrors the construction of array and object literals." (Mozilla docs [55])
Example
The following destructuring assignment is invalid with V8 and throws a SyntaxError:
var a = [4, 8, 15];
var [b, ,c] = a; // <== destructuring assignment
print(b) // 4
print(c) // 15
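A portable replacement is plain indexed access, which is valid ES5 and therefore works under V8. The following sketch is an illustration, not taken from the original notes, and defines its own sample array.

```javascript
// ES5-compatible alternative to destructuring: read each element by index.
var a = [4, 8, 15];
var b = a[0]; // first element
var c = a[2]; // third element; the second element is skipped
```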
Iterator(), StopIteration(), and Generators V8 does not support Iterator(), StopIteration(), and generators [56].
InternalError() V8 does not support InternalError(). Use Error() instead.
48 https://jira.mongodb.org/browse/SERVER-8216
49 https://jira.mongodb.org/browse/SERVER-8223
50 https://jira.mongodb.org/browse/SERVER-8215
51 https://jira.mongodb.org/browse/SERVER-8214
52 https://developer.mozilla.org/en-US/docs/SpiderMonkey
53 https://developer.mozilla.org/en-US/docs/E4X
54 https://developer.mozilla.org/en-US/docs/E4X/Processing_XML_with_E4X
55 https://developer.mozilla.org/en-US/docs/JavaScript/New_in_JavaScript/1.7#Destructuring_assignment_(Merge_into_own_page.2Fsection)
56 https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Iterators_and_Generators
for each...in Construct V8 does not support the for each...in [57] construct. Use the for (var x in y)
construct instead.
Example
The following for each (var x in y) construct is invalid with V8:
var o = { name: 'MongoDB', version: 2.4 };
for each (var value in o) {
print(value);
}
Instead, in version 2.4, you can use the for (var x in y) construct:
var o = { name: 'MongoDB', version: 2.4 };
for (var prop in o) {
var value = o[prop];
print(value);
}
You can also use the array instance method forEach() with the ES5 method Object.keys():
Object.keys(o).forEach(function (key) {
var value = o[key];
print(value);
});
Instead, you can implement using the Array instance method forEach() and the ES5 method Object.keys():
var a = { w: 1, x: 2, y: 3, z: 4 }
var arr = [];
Object.keys(a).forEach(function (key) {
var val = a[key];
if (val > 2) arr.push(val * val);
})
printjson(arr)
Note: The new logic uses the Array instance method forEach() and not the generic method
Array.forEach(); V8 does not support Array generic methods. See Array Generic Methods (page 599) for
more information.
57 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Statements/for_each...in
58 https://developer.mozilla.org/en-US/docs/JavaScript/Guide/Predefined_Core_Objects#Array_comprehensions
Multiple Catch Blocks V8 does not support multiple catch blocks and will throw a SyntaxError.
Example
The following multiple catch blocks are invalid with V8 and will throw "SyntaxError: Unexpected token if":
try {
something()
} catch (err if err instanceof SomeError) {
print('some error')
} catch (err) {
print('standard error')
}
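A single catch block that branches on the error type is valid ES5 and gives the same effect. The following is a minimal sketch. SomeError is defined here as a stand-in for an application error type, and classify is a hypothetical wrapper used only to make the example self-contained.

```javascript
// Stand-in custom error type for illustration.
function SomeError(message) {
  this.name = 'SomeError';
  this.message = message;
}
SomeError.prototype = Object.create(Error.prototype);
SomeError.prototype.constructor = SomeError;

// Run fn and report which branch handled any thrown error.
function classify(fn) {
  try {
    fn();
    return 'no error';
  } catch (err) {
    // One catch block; dispatch on the error type with instanceof.
    if (err instanceof SomeError) {
      return 'some error';
    }
    return 'standard error';
  }
}
```

For example, classify(function () { throw new SomeError('x'); }) takes the first branch, while any other thrown Error falls through to the second.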
Conditional Function Definition V8 will produce different outcomes than SpiderMonkey with conditional function
definitions [59].
Example
The following conditional function definition produces different outcomes in SpiderMonkey versus V8:
function test () {
if (false) {
function go () {};
}
print(typeof go)
}
With SpiderMonkey, the conditional function outputs undefined, whereas with V8, the conditional function outputs
function.
If your code defines functions this way, it is highly recommended that you refactor the code. The following example
refactors the conditional function definition to work in both SpiderMonkey and V8.
function test () {
var go;
if (false) {
go = function () {}
}
print(typeof go)
}
if (false) {
function go () {}
}
}
SyntaxError: In strict mode code, functions can only be declared at top level or immediately within a
String Generic Methods V8 does not support String generics [61]. String generics are a set of methods on the String
class that mirror instance methods.
Example
The following use of the generic method String.toLowerCase() is invalid with V8:
var name = 'MongoDB';
var lower = String.toLowerCase(name);
With V8, use the String instance method toLowerCase() available through an instance of the String class
instead:
var name = 'MongoDB';
var lower = name.toLowerCase();
print(name + ' becomes ' + lower);
With V8, use the String instance methods instead of the following generic methods:
String.charAt()
String.charCodeAt()
String.concat()
String.endsWith()
String.indexOf()
String.lastIndexOf()
String.localeCompare()
String.match()
String.quote()
String.replace()
String.search()
String.slice()
String.split()
String.startsWith()
String.substr()
String.substring()
String.toLocaleLowerCase()
String.toLocaleUpperCase()
String.toLowerCase()
String.toUpperCase()
String.trim()
String.trimLeft()
String.trimRight()
Array Generic Methods V8 does not support Array generic methods [62]. Array generics are a set of methods on the
Array class that mirror instance methods.
Example
The following use of the generic method Array.every() is invalid with V8:
var arr = [4, 8, 15, 16, 23, 42];
function isEven (val) {
return 0 === val % 2;
}
61 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/String#String_generic_methods
62 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array#Array_generic_methods
With V8, use the Array instance method every() available through an instance of the Array class instead:
var allEven = arr.every(isEven);
print(allEven);
With V8, use the Array instance methods instead of the following generic methods:
Array.concat()
Array.every()
Array.filter()
Array.forEach()
Array.indexOf()
Array.join()
Array.lastIndexOf()
Array.map()
Array.pop()
Array.push()
Array.reverse()
Array.shift()
Array.slice()
Array.some()
Array.sort()
Array.splice()
Array.unshift()
Array Instance Method toSource() V8 does not support the Array instance method toSource() [63]. Use the
Array instance method toString() instead.
uneval() V8 does not support the non-standard method uneval(). Use the standardized JSON.stringify() [64]
method instead.
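As a brief illustration of the JSON.stringify() replacement, the sketch below serializes a sample object and reads it back with JSON.parse(). The object contents are illustrative only. Note that JSON output is not JavaScript source in the way uneval() output was: properties whose values are functions or undefined are omitted.

```javascript
// Serialize an object to a JSON string and parse it back.
var doc = { name: 'MongoDB', tags: ['db', 'nosql'], version: 2.4 };
var text = JSON.stringify(doc);
var copy = JSON.parse(text);
```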
Change default JavaScript engine from SpiderMonkey to V8. The change provides improved concurrency for
JavaScript operations, modernized JavaScript implementation, and the removal of non-standard SpiderMonkey features, and affects all JavaScript behavior including the commands mapReduce, group, and eval and the query
operator $where.
See JavaScript Changes in MongoDB 2.4 (page 595) for more information about all changes.
BSON Document Validation Enabled by Default for mongod and mongorestore
Enable basic BSON object validation for mongod and mongorestore when writing to MongoDB data files. See
objcheck for details.
Index Build Enhancements
Add support for multiple concurrent index builds in the background by a single mongod instance. See building
indexes in the background (page 336) for more information on background index builds.
Allow the db.killOp() method to terminate a foreground index build.
Improve index validation during index creation. See Compatibility and Index Type Changes in MongoDB 2.4
(page 608) for more information.
Set Parameters as Command Line Options
Provide --setParameter as a command line option for mongos and mongod. See mongod and mongos for
a list of available options for setParameter.
63 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/Array/toSource
64 https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/JSON/stringify
The default write concern for insert and delete operations that occur as part of a chunk migration in a sharded cluster
now ensures that at least one secondary acknowledges each insert and deletion operation. See Chunk Migration Write
Concern (page 504).
Improved Chunk Migration Queue Behavior
Increase performance for moving multiple chunks off an overloaded shard. The balancer no longer waits for the
current migration's delete phase to complete before starting the next chunk migration. See Chunk Migration Queuing
(page 503) for details.
Enterprise
The following changes are specific to MongoDB Enterprise Editions:
SASL Library Change
In 2.4.4, MongoDB Enterprise uses Cyrus SASL. Earlier 2.4 Enterprise versions use GNU SASL (libgsasl). To
upgrade to 2.4.4 MongoDB Enterprise or greater, you must install all package dependencies related to this change,
including the appropriate Cyrus SASL GSSAPI library. See Install MongoDB Enterprise (page 16) for details of the
dependencies.
New Modular Authentication System with Support for Kerberos
In 2.4, MongoDB Enterprise supports authentication via a Kerberos mechanism. See Deploy MongoDB with
Kerberos Authentication (page 261) for more information. For drivers that provide support for Kerberos authentication
to MongoDB, refer to Use MongoDB Drivers to Authenticate with Kerberos (page 264).
For more information on security and risk management strategies, see MongoDB Security Practices and Procedures
(page 237).
Additional Information
Platform Notes
For OS X, MongoDB 2.4 only supports OS X versions 10.6 (Snow Leopard) and later. There are no other platform
support changes in MongoDB 2.4. See the downloads page [65] for more information on platform support.
Upgrade Process
Upgrade MongoDB to 2.4 In the general case, the upgrade from MongoDB 2.2 to 2.4 is a binary-compatible drop-in upgrade: shut down the mongod instances and replace them with mongod instances running 2.4. However, before
you attempt any upgrade please familiarize yourself with the content of this document, particularly the procedure for
upgrading sharded clusters (page 602) and the considerations for reverting to 2.2 after running 2.4 (page 607).
65 http://www.mongodb.org/downloads/
Overview Upgrading a sharded cluster from MongoDB version 2.2 to 2.4 (or 2.3) requires that you run a 2.4
mongos with the --upgrade option, described in this procedure. The upgrade process does not require downtime.
The upgrade to MongoDB 2.4 adds epochs to the meta-data for all collections and chunks in the existing cluster.
MongoDB 2.2 processes are capable of handling epochs, even though 2.2 did not require them. This procedure applies
only to upgrades from version 2.2. Earlier versions of MongoDB do not correctly handle epochs. See Cluster Metadata Upgrade (page 603) for more information.
66 http://www.mongodb.org/downloads
After completing the meta-data upgrade you can fully upgrade the components of the cluster. With the balancer
disabled:
Upgrade all mongos instances in the cluster.
Upgrade all 3 mongod config server instances.
Upgrade the mongod instances for each shard, one at a time.
See Upgrade Sharded Cluster Components (page 606) for more information.
Cluster Meta-data Upgrade
Considerations Beware of the following properties of the cluster upgrade process:
Before you start the upgrade, ensure that the amount of free space on the filesystem for the config database
(page 547) is at least 4 to 5 times the amount of space currently used by the config database (page 547) data
files.
Additionally, ensure that all indexes in the config database (page 547) are {v:1} indexes. If a critical index is
a {v:0} index, chunk splits can fail due to known issues with the {v:0} format. {v:0} indexes are present
on clusters created with MongoDB 2.0 or earlier.
The duration of the metadata upgrade depends on the network latency between the node performing the upgrade
and the three config servers. Ensure low latency between the upgrade process and the config servers.
While the upgrade is in progress, you cannot make changes to the collection meta-data. For example, during the
upgrade, do not perform:
sh.enableSharding(),
sh.shardCollection(),
sh.addShard(),
db.createCollection(),
db.collection.drop(),
db.dropDatabase(),
any operation that creates a database, or
any other operation that modifies the cluster meta-data in any way. See Sharding Reference (page 546) for
a complete list of sharding commands. Note, however, that not all commands on the Sharding Reference
(page 546) page modify the cluster meta-data.
Once you upgrade to 2.4 and complete the upgrade procedure, do not use 2.0 mongod and mongos processes
in your cluster. 2.0 processes may re-introduce old meta-data formats into cluster meta-data.
The upgraded config database will require more storage space than before, to make backup and working copies of the
config.chunks (page 549) and config.collections (page 550) collections. As always, if storage requirements increase, the mongod might need to pre-allocate additional data files. See What tools can I use to investigate
storage use in MongoDB? (page 582) for more information.
Meta-data Upgrade Procedure Changes to the meta-data format for sharded clusters, stored in the config database
(page 547), require a special meta-data upgrade procedure when moving to 2.4.
Do not perform operations that modify meta-data while performing this procedure. See Upgrade a Sharded Cluster
from MongoDB 2.2 to MongoDB 2.4 (page 602) for examples of prohibited operations.
1. Before you start the upgrade, ensure that the amount of free space on the filesystem for the config database
(page 547) is at least 4 to 5 times the amount of space currently used by the config database (page 547) data
files.
Additionally, ensure that all indexes in the config database (page 547) are {v:1} indexes. If a critical index is
a {v:0} index, chunk splits can fail due to known issues with the {v:0} format. {v:0} indexes are present
on clusters created with MongoDB 2.0 or earlier.
The duration of the metadata upgrade depends on the network latency between the node performing the upgrade
and the three config servers. Ensure low latency between the upgrade process and the config servers.
To check the version of your indexes, use db.collection.getIndexes().
If any index on the config database is {v:0}, you should rebuild those indexes by connecting to the mongos
and either: rebuild all indexes using the db.collection.reIndex() method, or drop and rebuild specific
indexes using db.collection.dropIndex() and then db.collection.ensureIndex(). If you
need to upgrade the _id index to {v:1} use db.collection.reIndex().
You may have {v:0} indexes on other databases in the cluster.
2. Turn off the balancer (page 501) in the sharded cluster, as described in Disable the Balancer (page 533).
Optional
For additional security during the upgrade, you can make a backup of the config database using mongodump
or other backup tools.
3. Ensure there are no version 2.0 mongod or mongos processes still active in the sharded cluster. The automated
upgrade process checks for 2.0 processes, but network availability can prevent a definitive check. Wait 5 minutes
after stopping or upgrading version 2.0 mongos processes to confirm that none are still active.
4. Start a single 2.4 mongos process with configdb pointing to the sharded cluster's config servers (page 488)
and with the --upgrade option. The upgrade process happens before the process becomes a daemon (i.e.
before --fork.)
You can upgrade an existing mongos instance to 2.4 or you can start a new mongos instance that can reach all
config servers if you need to avoid reconfiguring a production mongos.
Start the mongos with a command that resembles the following:
mongos --configdb <config servers> --upgrade
Without the --upgrade option 2.4 mongos processes will fail to start until the upgrade process is complete.
The upgrade will prevent any chunk moves or splits from occurring during the upgrade process. If there are
very many sharded collections or there are stale locks held by other failed processes, acquiring the locks for all
collections can take seconds or minutes. See the log for progress updates.
5. When the mongos process starts successfully, the upgrade is complete. If the mongos process fails to start,
check the log for more information.
If the mongos terminates or loses its connection to the config servers during the upgrade, you may always
safely retry the upgrade.
However, if the upgrade failed during the short critical section, the mongos will exit and report that the upgrade will require manual intervention. To continue the upgrade process, you must follow the Resync after an
Interruption of the Critical Section (page 605) procedure.
Optional
If the mongos logs show the upgrade waiting for the upgrade lock, a previous upgrade process may still be
active or may have ended abnormally. After 15 minutes of no remote activity mongos will force the upgrade
lock. If you can verify that there are no running upgrade processes, you may connect to a 2.2 mongos process
and force the lock manually:
mongo <mongos.example.net>
db.getMongo().getCollection("config.locks").findOne({ _id : "configUpgrade" })
If the process specified in the process field of this document is verifiably offline, run the following operation
to force the lock.
It is always safer to wait for the mongos to verify that the lock is inactive, if you have any doubts about
the activity of another upgrade operation. In addition to the configUpgrade, the mongos may need to wait
for specific collection locks. Do not force the specific collection locks.
6. Upgrade and restart other mongos processes in the sharded cluster, without the --upgrade option.
See Upgrade Sharded Cluster Components (page 606) for more information.
7. Re-enable the balancer (page 533). You can now perform operations that modify cluster meta-data.
Once you have upgraded, do not introduce version 2.0 MongoDB processes into the sharded cluster. This can reintroduce old meta-data formats into the config servers. The meta-data change made by this upgrade process will help
prevent errors caused by cross-version incompatibilities in future versions of MongoDB.
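The index-version check in step 1 of the procedure above can be sketched as a filter over index specification documents. This is a minimal illustration, assuming an array shaped like the result of db.collection.getIndexes(), where each document carries v and name fields. The function name findV0Indexes and the sample index documents are hypothetical.

```javascript
// Hypothetical helper: return the names of indexes still in the {v:0} format.
function findV0Indexes(indexes) {
  return indexes
    .filter(function (index) { return index.v === 0; })
    .map(function (index) { return index.name; });
}

// Illustrative index documents only.
var sampleIndexes = [
  { v: 0, name: '_id_', key: { _id: 1 } },
  { v: 1, name: 'ns_1_min_1', key: { ns: 1, min: 1 } }
];
var needRebuild = findV0Indexes(sampleIndexes);
```

Any names returned would be candidates for the rebuild described in step 1. An empty array means no {v:0} indexes were found in the input.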
Resync after an Interruption of the Critical Section During the short critical section of the upgrade that applies
changes to the meta-data, it is unlikely but possible that a network interruption can prevent all three config servers
from verifying or modifying data. If this occurs, the config servers (page 488) must be re-synced, and there may be
problems starting new mongos processes. The sharded cluster will remain accessible, but avoid all cluster metadata changes until you resync the config servers. Operations that change meta-data include: adding shards, dropping
databases, and dropping collections.
Note: Only perform the following procedure if something (e.g. network, power, etc.) interrupts the upgrade process
during the short critical section of the upgrade. Remember, you may always safely attempt the meta data upgrade
procedure (page 604).
To resync the config servers:
1. Turn off the balancer (page 501) in the sharded cluster and stop all meta-data operations. If you are in the
middle of an upgrade process (Upgrade a Sharded Cluster from MongoDB 2.2 to MongoDB 2.4 (page 602)),
you have already disabled the balancer.
2. Shut down two of the three config servers, preferably the last two listed in the configdb string. For example, if your configdb string is configA:27019,configB:27019,configC:27019, shut down
configB and configC. Shutting down the last two config servers ensures that most mongos instances will
have uninterrupted access to cluster meta-data.
3. mongodump the data files of the active config server (configA).
4. Move the data files of the deactivated config servers (configB and configC) to a backup location.
5. Create new, empty data directories.
6. Restart the disabled config servers with --dbpath pointing to the now-empty data directory and --port
pointing to an alternate port (e.g. 27020).
7. Use mongorestore to repopulate the data files of the disabled config servers with the data dumped from the
active config server (configA), restoring to the restarted config servers on the new port
(configB:27020,configC:27020). These config servers are now re-synced.
8. Restart the restored config servers on the old port, resetting the port back to the old settings (configB:27019
and configC:27019).
9. In some cases connection pooling may cause spurious failures, as the mongos disables old connections only
after attempted use. 2.4 fixes this problem, but to avoid this issue in version 2.2, you can restart all mongos
instances (one-by-one, to avoid downtime) and use the rs.stepDown() method before restarting each of the
shard replica set primaries.
10. The sharded cluster is now fully resynced; however before you attempt the upgrade process again, you must
manually reset the upgrade state using a version 2.2 mongos. Begin by connecting to the 2.2 mongos with the
mongo shell:
mongo <mongos.example.net>
11. Finally retry the upgrade process, as in Upgrade a Sharded Cluster from MongoDB 2.2 to MongoDB 2.4
(page 602).
Upgrade Sharded Cluster Components After you have successfully completed the meta-data upgrade process
described in Meta-data Upgrade Procedure (page 604), and the 2.4 mongos instance starts, you can upgrade the
other processes in your MongoDB deployment.
While the balancer is still disabled, upgrade the components of your sharded cluster in the following order:
Upgrade all mongos instances in the cluster, in any order.
Upgrade all 3 mongod config server instances, upgrading the first system in the mongos --configdb argument last.
Upgrade each shard, one at a time, upgrading the mongod secondaries before running replSetStepDown
and upgrading the primary of each shard.
When this process is complete, you can now re-enable the balancer (page 533).
Rolling Upgrade Limitation for 2.2.0 Deployments Running with auth Enabled MongoDB cannot support
deployments that mix 2.2.0 and 2.4.0, or greater, components. MongoDB version 2.2.1 and later processes can exist in
mixed deployments with 2.4-series processes. Therefore you cannot perform a rolling upgrade from MongoDB 2.2.0
to MongoDB 2.4.0. To upgrade a cluster with 2.2.0 components, use one of the following procedures.
1. Perform a rolling upgrade of all 2.2.0 processes to the latest 2.2-series release (e.g. 2.2.3) so that there are no
processes in the deployment that predate 2.2.1. When there are no 2.2.0 processes in the deployment, perform a
rolling upgrade to 2.4.0.
2. Stop all processes in the cluster. Upgrade all processes to a 2.4-series release of MongoDB, and start all processes at the same time.
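The version rule above can be checked mechanically before starting a rolling upgrade. The following is a minimal sketch in plain JavaScript; parseVersion and canRollToTwoFour are hypothetical names, and the sketch assumes plain "major.minor.patch" version strings without release-candidate suffixes.

```javascript
// Parses "major.minor.patch" into numbers. Suffixes such as "-rc1" are not handled.
function parseVersion(version) {
  const [major, minor, patch] = version.split(".").map(Number);
  return { major, minor, patch };
}

// Returns true when every listed process can coexist with 2.4-series processes,
// that is, when each one runs 2.2.1 or later in the 2.2 series, or a 2.4 release.
function canRollToTwoFour(versions) {
  return versions.every((version) => {
    const { major, minor, patch } = parseVersion(version);
    if (major === 2 && minor === 4) return true; // already upgraded
    return major === 2 && minor === 2 && patch >= 1; // 2.2.1 or later
  });
}
```

If the function returns false, at least one process predates 2.2.1 and one of the two procedures listed above applies.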
Upgrade from 2.3 to 2.4 If you used a mongod from the 2.3 or 2.4-rc (release candidate) series, you can safely
transition these databases to 2.4.0 or later; however, if you created 2dsphere or text indexes using a mongod
before v2.4-rc2, you will need to rebuild these indexes. For example:
db.records.dropIndex( { loc: "2dsphere" } )
db.records.dropIndex( "records_text" )
db.records.ensureIndex( { loc: "2dsphere" } )
db.records.ensureIndex( { records: "text" } )
Downgrade MongoDB from 2.4 to Previous Versions In some cases the on-disk format of data files used by 2.4
and 2.2 mongod is compatible, and you can upgrade and downgrade if needed. However, several new features in 2.4
are incompatible with previous versions:
2dsphere indexes are incompatible with 2.2 and earlier mongod instances.
text indexes are incompatible with 2.2 and earlier mongod instances.
Using a hashed index as a shard key is incompatible with 2.2 and earlier mongos instances.
hashed indexes are incompatible with 2.0 and earlier mongod instances.
Important: Collections sharded using hashed shard keys should not use 2.2 mongod instances, which cannot
correctly support cluster operations for these collections.
If you completed the meta-data upgrade for a sharded cluster (page 602), you can safely downgrade to 2.2 MongoDB
processes. Do not use 2.0 processes after completing the upgrade procedure.
Note: In sharded clusters, once you have completed the meta-data upgrade procedure (page 602), you cannot use 2.0
mongod or mongos instances in the same cluster.
If you complete the meta-data upgrade, you can safely downgrade components in any order. When you upgrade again,
always upgrade mongos instances before mongod instances.
Do not create 2dsphere or text indexes in a cluster that has 2.2 components.
Considerations and Compatibility If you upgrade to MongoDB 2.4, and then need to run MongoDB 2.2 with the
same data files, consider the following limitations.
If you use a hashed index as the shard key index, which is only possible under 2.4, you will not be able to
query data in this sharded collection. Furthermore, a 2.2 mongos cannot properly route an insert operation
for a collection sharded using a hashed index for the shard key index: any data that you insert using a 2.2
mongos will not arrive on the correct shard and will not be reachable by future queries.
If you never create a 2dsphere or text index, you can move between a 2.4 and 2.2 mongod for a given
data set; however, after you create the first 2dsphere or text index with a 2.4 mongod, you will need to run
a 2.2 mongod with the --upgrade option and drop any 2dsphere or text index.
Upgrade and Downgrade Procedures
Basic Downgrade and Upgrade
Except as described below, moving between 2.2 and 2.4 is a drop-in replacement:
Then, you will need to drop any existing 2dsphere or text indexes using db.collection.dropIndex(),
for example:
db.records.dropIndex( { loc: "2dsphere" } )
db.records.dropIndex( "records_text" )
Warning: --upgrade will run repairDatabase on any database where you have created a 2dsphere or
text index, which will rebuild all indexes.
Troubleshooting Upgrade/Downgrade Operations If you do not use --upgrade, when you attempt to start a
2.2 mongod and you have created a 2dsphere or text index, mongod will return the following message:
'need to upgrade database index_plugin_upgrade with pdfile version 4.6, new version: 4.5 Not upgradin
While running 2.4, to check the data file version of a MongoDB database, use the following operation in the shell:
db.getSiblingDB('<databasename>').stats().dataFileVersion
The major data file67 version for both 2.2 and 2.4 is 4; the minor data file version for 2.2 is 5, and the minor data file
version for 2.4 is 6 after you create a 2dsphere or text index.
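To make the version numbers above concrete, the sketch below interprets a data file version. It is plain JavaScript for illustration: canOpenWith22 is a hypothetical name, and the { major, minor } input shape is an assumption about what stats().dataFileVersion reports.

```javascript
// Hypothetical helper: decides whether a 2.2 mongod can open the data files
// without --upgrade. Assumes dataFileVersion looks like { major: 4, minor: 5 }.
function canOpenWith22(dataFileVersion) {
  // 2.2 understands pdfile version 4.5. Version 4.6 appears only after a
  // 2dsphere or text index has been created with a 2.4 mongod.
  return dataFileVersion.major === 4 && dataFileVersion.minor <= 5;
}
```

In the mongo shell the argument would come from db.getSiblingDB('<databasename>').stats().dataFileVersion, as shown above.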
Compatibility and Index Type Changes in MongoDB 2.4 In 2.4 MongoDB includes two new features related to
indexes that users upgrading to version 2.4 must consider, particularly with regard to possible downgrade paths. For
more information on downgrades, see Downgrade MongoDB from 2.4 to Previous Versions (page 607).
New Index Types In 2.4 MongoDB adds two new index types: 2dsphere and text. These index types do not
exist in 2.2, and for each database, creating a 2dsphere or text index, will upgrade the data-file version and make
that database incompatible with 2.2.
If you intend to downgrade, you should always drop all 2dsphere and text indexes before moving to 2.2.
You can use the downgrade procedure (page 607) to downgrade these databases and run 2.2 if needed, however this
will run a full database repair (as with repairDatabase) for all affected databases.
Index Type Validation In MongoDB 2.2 and earlier you could specify invalid index types that did not exist. In
these situations, MongoDB would create an ascending (e.g. 1) index. Invalid indexes include index types specified by
strings that do not refer to an existing index type, and all numbers other than 1 and -1. 68
67 The data file version (i.e. pdfile version) is independent and unrelated to the release version of MongoDB.
68 In 2.4, indexes that specify a type of "1" or "-1" (the strings "1" and "-1") will continue to exist, despite a warning on start-up. However,
a secondary in a replica set cannot complete an initial sync from a primary that has a "1" or "-1" index. Avoid all indexes with invalid types.
In 2.4, creating any invalid index will result in an error. Furthermore, you cannot create a 2dsphere or text index
on a collection if its containing database has any invalid index types. 1
Example
If you attempt to add an invalid index in MongoDB 2.4, as in the following, MongoDB returns an error:
db.coll.ensureIndex( { field: "1" } )
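The validation rule described in this section can be approximated outside the server, for example to audit index definitions before an upgrade. The sketch below is plain JavaScript, not MongoDB's own validation code. The function names are hypothetical, and the list of recognized type strings is an assumption of the sketch.

```javascript
// Index type strings assumed to be valid in 2.4 (an assumption of this sketch).
const KNOWN_INDEX_TYPES = ["2d", "2dsphere", "geoHaystack", "hashed", "text"];

// A type is valid if it is the number 1 or -1, or a recognized type string.
function isValidIndexType(value) {
  if (typeof value === "number") return value === 1 || value === -1;
  if (typeof value === "string") return KNOWN_INDEX_TYPES.includes(value);
  return false;
}

// Returns the names of the fields in a key pattern that carry an invalid type.
function invalidIndexFields(keyPattern) {
  return Object.keys(keyPattern).filter(
    (field) => !isValidIndexType(keyPattern[field])
  );
}
```

For the key pattern { field: "1" } used above, invalidIndexFields reports field, because the string "1" is not the number 1.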
See Upgrade MongoDB to 2.4 (page 601) for full upgrade instructions.
Other Resources
MongoDB Downloads69 .
All JIRA issues resolved in 2.470 .
All Backwards incompatible changes71 .
All Third Party License Notices72 .
See http://docs.mongodb.org/manual/release-notes/2.4-changes for an overview of all changes
in 2.4.
See also:
See MongoDB 2.5 Series Development Release Notes for more information about the upcoming release of MongoDB.
70 https://jira.mongodb.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SERVER+AND+fixVersion+in+%28%222.3.2%22,+%222.3.1%22,+%222
rc0%22,+%222.4.0-rc1%22,+%222.4.0-rc2%22,+%222.4.0-rc3%22%29
71 https://jira.mongodb.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SERVER+AND+fixVersion+in+%28%222.3.2%22%2C+%222.3.1%22%2
rc0%22%2C+%222.4.0-rc1%22%2C+%222.4.0-rc2%22%2C+%222.4.0-rc3%22%29+AND+%22Backward+Breaking%22+in+%28+Rarely+%2C+sometimes%2C+yes
72 https://github.com/mongodb/mongo/blob/v2.4/distsrc/THIRD-PARTY-NOTICES
Synopsis
1. Download binaries of the latest release in the 2.2 series from the MongoDB Download Page73 .
2. Shut down your mongod instance. Replace the existing binary with the 2.2 mongod binary and restart MongoDB.
Upgrading a Replica Set
You can upgrade to 2.2 by performing a rolling upgrade of the set by upgrading the members individually while the
other members are available to minimize downtime. Use the following procedure:
1. Upgrade the secondary members of the set one at a time by shutting down the mongod and replacing the 2.0
binary with the 2.2 binary. After upgrading a mongod instance, wait for the member to recover to SECONDARY
state before upgrading the next instance. To check the member's state, issue rs.status() in the mongo
shell.
2. Use the mongo shell method rs.stepDown() to step down the primary to allow the normal failover
(page 397) procedure. rs.stepDown() expedites the failover procedure and is preferable to shutting down
the primary directly.
Once the primary has stepped down and another member has assumed PRIMARY state, as observed in the output
of rs.status(), shut down the previous primary and replace mongod binary with the 2.2 binary and start
the new process.
Note: Replica set failover is not instant, and it renders the set unavailable for reads and writes until the
failover process completes. Typically this takes 10 seconds or more. You may wish to plan the upgrade during
a predefined maintenance window.
73 http://downloads.mongodb.org/
Changes
Major Features
Aggregation Framework The aggregation framework makes it possible to do aggregation operations without needing to use map-reduce. The aggregate command exposes the aggregation framework, and the aggregate()
helper in the mongo shell provides an interface to these operations. Consider the following resources for background
on the aggregation framework and its use:
Documentation: Aggregation Concepts (page 281)
Reference: Aggregation Reference (page 308)
Examples: Aggregation Examples (page 292)
TTL Collections TTL collections remove expired data from a collection, using a special index and a background
thread that deletes expired documents every minute. These collections are useful as an alternative to capped collections
in some cases, such as for data warehousing and caching cases, including: machine generated event data, logs, and
session information that needs to persist in a database for only a limited period of time.
For more information, see the Expire Data from Collections by Setting TTL (page 163) tutorial.
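The expiry test that the background thread applies can be sketched as follows. This is plain JavaScript for illustration, not the server implementation: isExpired is a hypothetical name, and the sketch assumes the TTL index stores its threshold in an expireAfterSeconds option on a field that holds a date.

```javascript
// Hypothetical helper: returns true when a document is old enough to be removed.
// field is the indexed date field; expireAfterSeconds is the assumed TTL option.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  // Documents whose indexed field does not hold a date are never expired.
  if (!(value instanceof Date)) return false;
  return value.getTime() + expireAfterSeconds * 1000 < now.getTime();
}
```

Because the background thread runs once a minute, a document may remain in the collection for a short time after this test first becomes true.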
Concurrency Improvements MongoDB 2.2 increases the server's capacity for concurrent operations with the following improvements:
1. DB Level Locking75
2. Improved Yielding on Page Faults76
3. Improved Page Fault Detection on Windows77
To reflect these changes, MongoDB now provides changed and improved reporting for concurrency and use, see locks
and server-status-record-stats in server status and see db.currentOp(), mongotop, and mongostat.
74 https://jira.mongodb.org/browse/SERVER-6902
75 https://jira.mongodb.org/browse/SERVER-4328
76 https://jira.mongodb.org/browse/SERVER-3357
77 https://jira.mongodb.org/browse/SERVER-4538
Improved Data Center Awareness with Tag Aware Sharding MongoDB 2.2 adds additional support for geographic distribution or other custom partitioning for sharded collections in clusters. By using this tag aware sharding, you can automatically ensure that data in a sharded database system is always on specific shards. For example,
with tag aware sharding, you can ensure that data is closest to the application servers that use that data most frequently.
Shard tagging controls data location, and is complementary but separate from replica set tagging, which controls
read preference (page 406) and write concern (page 47). For example, shard tagging can pin all USA data to one
or more logical shards, while replica set tagging can control which mongod instances (e.g. production or
reporting) the application uses to service requests.
See the documentation for the following helpers in the mongo shell that support tagged sharding configuration:
sh.addShardTag()
sh.addTagRange()
sh.removeShardTag()
Also, see Tag Aware Sharding (page 540) and Manage Shard Tags (page 541).
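Conceptually, a tag range maps a span of shard key values to a tag, and the tag in turn names the shards that may hold that data. The sketch below models only the lookup step in plain JavaScript. tagForKey and the sample ranges are hypothetical, and the treatment of the lower bound as inclusive and the upper bound as exclusive is an assumption of the sketch.

```javascript
// Hypothetical model of tag ranges: each entry covers [min, max) of the shard key.
function tagForKey(tagRanges, value) {
  const hit = tagRanges.find((range) => value >= range.min && value < range.max);
  return hit ? hit.tag : null; // null: no tag applies, any shard may hold the data
}

// Sample ranges over a zip code shard key (illustrative values only).
const sampleRanges = [
  { min: "10001", max: "10281", tag: "NYC" },
  { min: "94102", max: "94135", tag: "SFO" }
];
```

With these ranges, tagForKey(sampleRanges, "10100") yields "NYC", while a key outside every range yields null.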
Fully Supported Read Preference Semantics All MongoDB clients and drivers now support full read preferences
(page 406), including consistent support for a full range of read preference modes (page 476) and tag sets (page 408).
This support extends to the mongos and applies identically to single replica sets and to the replica sets for each shard
in a sharded cluster.
Additional read preference support now exists in the mongo shell using the readPref() cursor method.
Compatibility Changes
Authentication Changes MongoDB 2.2 provides more reliable and robust support for authentication clients, including drivers and mongos instances.
If your cluster runs with authentication:
For all drivers, use the latest release of your driver and check its release notes.
In sharded environments, to ensure that your cluster remains available during the upgrade process you must use
the upgrade procedure for sharded clusters (page 611).
findAndModify Returns Null Value for Upserts that Perform Inserts In version 2.2, for upserts that perform
inserts with the new option set to false, findAndModify commands will now return the following output:
{ 'ok': 1.0, 'value': null }
In the mongo shell, upsert findAndModify operations that perform inserts (with new set to false) only output a
null value.
In version 2.0 these operations would return an empty document, e.g. { }.
See: SERVER-622678 for more information.
mongodump 2.2 Output Incompatible with Pre-2.2 mongorestore If you use the mongodump tool from the
2.2 distribution to create a dump of a database, you must use a 2.2 (or later) version of mongorestore to restore
that dump.
See: SERVER-696179 for more information.
78 https://jira.mongodb.org/browse/SERVER-6226
79 https://jira.mongodb.org/browse/SERVER-6961
the toString() method on the
ObjectId("507c7f79bcf86cd7994f6c0e").toString()
the valueOf() method on the
ObjectId("507c7f79bcf86cd7994f6c0e").valueOf()
The maximum size of a collection name is 128 characters, including the name of the database. However, for
maximum flexibility, collections should have names less than 80 characters.
Collection names may contain any other valid UTF-8 string.
See the SERVER-444280 and the Are there any restrictions on the names of Collections? (page 565) FAQ item.
80 https://jira.mongodb.org/browse/SERVER-4442
Restrictions on Database Names for Windows Database names running on Windows can no longer contain the
following characters:
/\. "*<>:|?
The names of the data files include the database name. If you attempt to upgrade a database instance with one or more
of these characters, mongod will refuse to start.
Change the name of these databases before upgrading. See SERVER-458481 and SERVER-672982 for more information.
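Before upgrading a Windows deployment, database names can be screened for the restricted characters listed above. The following plain JavaScript sketch is illustrative; forbiddenCharsIn is a hypothetical helper, and the character set is copied from the list in this section.

```javascript
// Characters that database names on Windows can no longer contain (from the list above):
// slash, backslash, dot, space, double quote, asterisk, <, >, colon, pipe, question mark.
const WINDOWS_FORBIDDEN = '/\\. "*<>:|?';

// Returns the distinct forbidden characters found in a database name,
// in order of first appearance. An empty array means the name is acceptable.
function forbiddenCharsIn(dbName) {
  return [...new Set(dbName)].filter((ch) => WINDOWS_FORBIDDEN.includes(ch));
}
```

In the mongo shell, the names to check could be taken from the output of show dbs.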
_id Fields and Indexes on Capped Collections All capped collections now have an _id field by default, if they
exist outside of the local database, and now have indexes on the _id field. This change only affects capped
collections created with 2.2 instances and does not affect existing capped collections.
See: SERVER-551683 for more information.
New $elemMatch Projection Operator The $elemMatch operator allows applications to narrow the data
returned from queries so that the query operation will only return the first matching element in an array.
See the http://docs.mongodb.org/manual/reference/operator/projection/elemMatch documentation and the SERVER-223884 and SERVER-82885 issues for more information.
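The effect of the projection can be pictured with a small model in plain JavaScript. This is not how the server evaluates the operator: projectFirstMatch is a hypothetical name, the sketch supports only simple equality conditions, and the sample field names are illustrative.

```javascript
// Hypothetical model of an $elemMatch projection: keeps _id and, for the named
// array field, only the first element that satisfies every equality condition.
function projectFirstMatch(doc, arrayField, condition) {
  const items = Array.isArray(doc[arrayField]) ? doc[arrayField] : [];
  const first = items.find((item) =>
    Object.keys(condition).every((key) => item[key] === condition[key])
  );
  const result = { _id: doc._id };
  // When no element matches, the array field is left out of the result.
  if (first !== undefined) result[arrayField] = [first];
  return result;
}
```

For a document with several students at school 102, the model returns only the first of them, which mirrors the "first matching element" behavior described above.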
Windows Specific Changes
Windows XP is Not Supported As of 2.2, MongoDB does not support Windows XP. Please upgrade to a more
recent version of Windows to use the latest releases of MongoDB. See SERVER-564886 for more information.
Service Support for mongos.exe You may now run mongos.exe instances as a Windows Service. See the
http://docs.mongodb.org/manual/reference/program/mongos.exe reference and MongoDB as a
Windows Service (page 15) and SERVER-158987 for more information.
Log Rotate Command Support MongoDB for Windows now supports log rotation by way of the logRotate
database command. See SERVER-261288 for more information.
New Build Using SlimReadWrite Locks for Windows Concurrency Labeled 2008+ on the Downloads Page89 ,
this build for 64-bit versions of Windows Server 2008 R2 and for Windows 7 or newer, offers increased performance
over the standard 64-bit Windows build of MongoDB. See SERVER-384490 for more information.
81 https://jira.mongodb.org/browse/SERVER-4584
82 https://jira.mongodb.org/browse/SERVER-6729
83 https://jira.mongodb.org/browse/SERVER-5516
84 https://jira.mongodb.org/browse/SERVER-2238
85 https://jira.mongodb.org/browse/SERVER-828
86 https://jira.mongodb.org/browse/SERVER-5648
87 https://jira.mongodb.org/browse/SERVER-1589
88 https://jira.mongodb.org/browse/SERVER-2612
89 http://www.mongodb.org/downloads
90 https://jira.mongodb.org/browse/SERVER-3844
Tool Improvements
Index Definitions Handled by mongodump and mongorestore When you specify the --collection option
to mongodump, mongodump will now back up the definitions for all indexes that exist on the source database. When
you attempt to restore this backup with mongorestore, the target mongod will rebuild all indexes. See SERVER-80891 for more information.
mongorestore now includes the --noIndexRestore option to provide the preceding behavior. Use
--noIndexRestore to prevent mongorestore from building previous indexes.
mongooplog for Replaying Oplogs The mongooplog tool makes it possible to pull oplog entries
from one mongod instance and apply them to another mongod instance. You can use mongooplog
to achieve point-in-time backup of a MongoDB data set. See the SERVER-387392 case and the
http://docs.mongodb.org/manual/reference/program/mongooplog documentation.
Authentication Support for mongotop and mongostat mongotop and mongostat now contain support for
username/password authentication. See SERVER-387593 and SERVER-387194 for more information regarding this
change. Also consider the documentation of the following options for additional information:
mongotop --username
mongotop --password
mongostat --username
mongostat --password
Write Concern Support for mongoimport and mongorestore mongoimport now provides an option to
halt the import if the operation encounters an error, such as a network interruption, a duplicate key exception, or a
write error. The --stopOnError option will produce an error rather than silently continue importing data. See
SERVER-393795 for more information.
In mongorestore, the --w option provides support for configurable write concern.
mongodump Support for Reading from Secondaries You can now run mongodump when connected to a secondary member of a replica set. See SERVER-385496 for more information.
mongoimport Support for full 16MB Documents Previously, mongoimport would only import documents
that were less than 4 megabytes in size. This issue is now corrected, and you may use mongoimport to import
documents that are up to 16 megabytes in size. See SERVER-459397 for more information.
Timestamp() Extended JSON format MongoDB extended JSON now includes a new Timestamp() type to
represent the Timestamp type that MongoDB uses for timestamps in the oplog among other contexts.
This permits tools like mongooplog and mongodump to query for specific timestamps. Consider the following
mongodump operation:
91 https://jira.mongodb.org/browse/SERVER-808
92 https://jira.mongodb.org/browse/SERVER-3873
93 https://jira.mongodb.org/browse/SERVER-3875
94 https://jira.mongodb.org/browse/SERVER-3871
95 https://jira.mongodb.org/browse/SERVER-3937
96 https://jira.mongodb.org/browse/SERVER-3854
97 https://jira.mongodb.org/browse/SERVER-4593
Improved Shell User Interface 2.2 includes a number of changes that improve the overall quality and consistency
of the user interface for the mongo shell:
Full Unicode support.
Bash-like line editing features. See SERVER-431299 for more information.
Multi-line command support in shell history. See SERVER-3470100 for more information.
Windows support for the edit command. See SERVER-3998101 for more information.
Helper to load Server-Side Functions The db.loadServerScripts() method loads the contents of the current
database's system.js collection into the current mongo shell session. See SERVER-1651102 for more information.
Support for Bulk Inserts If you pass an array of documents to the insert() method, the mongo shell will now
perform a bulk insert operation. See SERVER-3819103 and SERVER-2395104 for more information.
Note: For bulk inserts on sharded clusters, the getLastError command alone is insufficient to verify success.
Applications must verify the success of bulk inserts in application logic.
Operations
Support for Logging to Syslog See the SERVER-2957105 case and the documentation of the syslog run-time
option or the mongod --syslog and mongos --syslog command line-options.
touch Command Added the touch command to read the data and/or indexes from a collection into memory. See:
SERVER-2023106 and touch for more information.
indexCounters No Longer Report Sampled Data indexCounters now report actual counters that reflect
index use and state. In previous versions, these data were sampled. See SERVER-5784107 and indexCounters for
more information.
Padding Specifiable on compact Command See the documentation of the compact command and the SERVER-4018108
issue for more information.
98 https://jira.mongodb.org/browse/SERVER-3483
99 https://jira.mongodb.org/browse/SERVER-4312
100 https://jira.mongodb.org/browse/SERVER-3470
101 https://jira.mongodb.org/browse/SERVER-3998
102 https://jira.mongodb.org/browse/SERVER-1651
103 https://jira.mongodb.org/browse/SERVER-3819
104 https://jira.mongodb.org/browse/SERVER-2395
105 https://jira.mongodb.org/browse/SERVER-2957
106 https://jira.mongodb.org/browse/SERVER-2023
107 https://jira.mongodb.org/browse/SERVER-5784
108 https://jira.mongodb.org/browse/SERVER-4018
Added Build Flag to Use System Libraries The Boost library, version 1.49, is now embedded in the MongoDB
code base.
If you want to build MongoDB binaries using system Boost libraries, you can pass scons the
--use-system-boost flag, as follows:
scons --use-system-boost
When building MongoDB, you can also pass scons a flag to compile MongoDB using only system libraries rather
than the included versions of the libraries. For example:
scons --use-system-all
Improved Logging for Replica Set Lag When secondary members of a replica set fall behind in replication,
mongod now provides better reporting in the log. This makes it possible to track replication in general and identify what process may produce errors or halt replication. See SERVER-3575114 for more information.
Replica Set Members can Sync from Specific Members The new replSetSyncFrom command and new
rs.syncFrom() helper in the mongo shell make it possible for you to manually configure from which member of the set a replica will poll oplog entries. Use these commands to override the default selection logic if needed.
Always exercise caution with replSetSyncFrom when overriding the default behavior.
Replica Set Members will not Sync from Members Without Indexes Unless buildIndexes: false To
prevent inconsistency between members of replica sets, if the member of a replica set has buildIndexes (page 468)
set to false, other members of the replica set will not sync from this member, unless they also have buildIndexes
(page 468) set to false. See SERVER-4160115 for more information.
New Option To Configure Index Pre-Fetching during Replication By default, when replicating operations, secondaries will pre-fetch Indexes (page 313) associated with a query to improve replication throughput in most cases. The
replIndexPrefetch setting and --replIndexPrefetch option allow administrators to disable this feature
or allow the mongod to pre-fetch only the index on the _id field. See SERVER-6718116 for more information.
Map Reduce Improvements
Index on Shard Keys Can Now Be a Compound Index If your shard key uses the prefix of an existing index,
then you do not need to maintain a separate index for your shard key in addition to your existing index. This index,
however, cannot be a multi-key index. See the Shard Key Indexes (page 505) documentation and SERVER-1506119
for more information.
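Whether an existing index can serve as the shard key index comes down to a prefix test on the key patterns. The sketch below shows that test in plain JavaScript. isPrefixOfIndex is a hypothetical helper, it relies on JavaScript preserving the insertion order of object keys, and it does not check the multi-key restriction mentioned above.

```javascript
// Hypothetical helper: true when the shard key pattern is a leading prefix of
// the index key pattern (same fields, same order, same direction).
function isPrefixOfIndex(shardKey, indexKey) {
  const shardFields = Object.keys(shardKey);
  const indexFields = Object.keys(indexKey);
  if (shardFields.length > indexFields.length) return false;
  return shardFields.every(
    (field, i) => indexFields[i] === field && indexKey[field] === shardKey[field]
  );
}
```

For example, a shard key of { zipcode: 1 } is a prefix of the index { zipcode: 1, username: 1 }, so no separate shard key index would be needed in that case.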
Migration Thresholds Modified The migration thresholds (page 502) have changed in 2.2 to permit more even
distribution of chunks in collections that have smaller quantities of data. See the Migration Thresholds (page 502)
documentation for more information.
Licensing Changes
Added License notice for Google Perftools (TCMalloc Utility). See the License Notice120 and the SERVER-4683121
for more information.
Resources
MongoDB Downloads122 .
All JIRA issues resolved in 2.2123 .
All backwards incompatible changes124 .
All third party license notices125 .
What's New in MongoDB 2.2 Online Conference126 .
123 https://jira.mongodb.org/secure/IssueNavigator.jspa?reset=true&jqlQuery=project+%3D+SERVER+AND+fixVersion+in+%28%222.1.0%22%2C+%222.1.1%22%2
rc0%22%2C+%222.2.0-rc1%22%2C+%222.2.0-rc2%22%29+ORDER+BY+component+ASC%2C+key+DESC
124 https://jira.mongodb.org/secure/IssueNavigator.jspa?requestId=11225
125 https://github.com/mongodb/mongo/blob/v2.2/distsrc/THIRD-PARTY-NOTICES
126 http://www.mongodb.com/events/webinar/mongodb-online-conference-sept
Preparation
Read through all release notes before upgrading, and ensure that no changes will affect your deployment.
If you create new indexes in 2.0, then downgrading to 1.8 is possible but you must reindex the new collections.
mongoimport and mongoexport now correctly adhere to the CSV spec for handling CSV input/output. This
may break existing import/export workflows that relied on the previous behavior. For more information see SERVER-1097127 .
Journaling128 is enabled by default in 2.0 for 64-bit builds. If you still prefer to run without journaling, start mongod
with the --nojournal run-time option. Otherwise, MongoDB creates journal files during startup. The first time
you start mongod with journaling, you will see a delay as mongod creates new files. In addition, you may see reduced
write throughput.
2.0 mongod instances are interoperable with 1.8 mongod instances; however, for best results, upgrade your deployments using the following procedures:
Upgrading a Standalone mongod
1. Upgrade the secondary members of the set one at a time by shutting down the mongod and replacing the 1.8
binary with the 2.0.x binary from the MongoDB Download Page130 .
2. To avoid losing the last few updates on failover you can temporarily halt your application (failover should take
less than 10 seconds), or you can set write concern (page 47) in your application code to confirm that each
update reaches multiple servers.
3. Use rs.stepDown() to step down the primary to allow the normal failover (page 397) procedure.
rs.stepDown() and replSetStepDown provide for shorter and more consistent failover procedures than
simply shutting down the primary directly.
When the primary has stepped down, shut down its instance and upgrade by replacing the mongod binary with
the 2.0.x binary.
Upgrading a Sharded Cluster
1. Upgrade all config server instances first, in any order. Since config servers use two-phase commit, shard configuration metadata updates will halt until all are up and running.
2. Upgrade mongos routers in any order.
127 https://jira.mongodb.org/browse/SERVER-1097
128 http://www.mongodb.org/display/DOCS/Journaling
129 http://downloads.mongodb.org/
130 http://downloads.mongodb.org/
Changes
Compact Command
A compact command is now available for compacting a single collection and its indexes. Previously, the only way
to compact was to repair the entire database.
Concurrency Improvements
When going to disk, the server will yield the write lock when writing data that is not likely to be in memory. The
initial implementation of this feature now exists:
See SERVER-2563131 for more information.
The specific operations that yield in 2.0 are:
Updates by _id
Removes
Long cursor iterations
Default Stack Size
MongoDB 2.0 reduces the default stack size. This change can reduce total memory usage when there are many (e.g.,
1000+) client connections, as there is a thread per connection. While portions of a thread's stack can be swapped out
if unused, some operating systems do this slowly enough that it might be an issue. The default stack size is the lesser of
the system setting or 1MB.
Index Performance Enhancements
v2.0 includes significant improvements to the index (page 346). Indexes are often 25% smaller and 25% faster (depends
on the use case). When upgrading from previous versions, the benefits of the new index type are realized only if you
create a new index or re-index an old one.
Dates are now signed, and the max index key size has increased slightly from 819 to 1024 bytes.
All operations that create a new index will result in a 2.0 index by default. For example:
Reindexing an older-version index results in a 2.0 index. However, reindexing on a secondary does
not work in versions prior to 2.0. Do not reindex on a secondary. For a workaround, see SERVER-3866132 .
The repairDatabase command converts indexes to 2.0 indexes.
To convert all indexes for a given collection to the 2.0 type (page 620), invoke the compact command.
Once you create new indexes, downgrading to 1.8.x will require a re-index of any indexes created using 2.0. See Build
Old Style Indexes (page 346).
Sharding Authentication
Replica Sets
Hidden Nodes in Sharded Clusters In 2.0, mongos instances can now determine when a member of a replica set
becomes hidden without requiring a restart. In 1.8, if you reconfigured a member as hidden, you had to
restart mongos to prevent queries from reaching the hidden member.
Priorities Each replica set member can now have a priority value consisting of a floating-point number from 0 to 1000,
inclusive. Priorities let you control which member of the set you prefer to have as primary: the member with the
highest priority that can see a majority of the set will be elected primary.
For example, suppose you have a replica set with three members, A, B, and C, and suppose that their priorities are set
as follows:
A's priority is 2.
B's priority is 3.
C's priority is 1.
During normal operation, the set will always choose B as primary. If B becomes unavailable, the set will elect A as
primary.
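The selection rule in this example can be modeled in a few lines of plain JavaScript. This is a simplified sketch, not the election protocol: preferredPrimary is a hypothetical name, every reachable member is assumed to see every other reachable member, and factors such as how current each member's data is are ignored.

```javascript
// Hypothetical model: picks the reachable member with the highest priority,
// provided the reachable members form a majority of the whole set.
function preferredPrimary(members) {
  const reachable = members.filter((m) => m.up);
  // Without a majority of the set, no primary can be elected.
  if (reachable.length * 2 <= members.length) return null;
  // Members with priority 0 are never eligible to become primary.
  const candidates = reachable.filter((m) => m.priority > 0);
  if (candidates.length === 0) return null;
  return candidates.reduce((best, m) => (m.priority > best.priority ? m : best)).name;
}
```

With A, B, and C set to priorities 2, 3, and 1, the model selects B while all members are up and A once B is marked as down.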
For more information, see the priority (page 469) documentation.
Data-Center Awareness You can now tag replica set members to indicate their location. You can use these tags
to design custom write rules (page 47) across data centers, racks, specific servers, or any other architecture choice.
For example, an administrator can define rules such as "very important write" or "customerData" or "audit-trail" to
replicate to certain servers, racks, data centers, etc. Then in the application code, the developer would say:
db.foo.insert(doc, {w : "very important write"})
which would succeed if it fulfilled the conditions the DBA defined for "very important write".
For more information, see Tagging133 .
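As an illustration of how a tag-based write rule might be evaluated, the following sketch models members, tags, and a rule in plain JavaScript. The host names, the dc tag, and the satisfies function are assumptions made for this example; the rule shape follows the getLastErrorModes style, where a number means "at least this many distinct values of the tag".

```javascript
// Hypothetical replica set members with location tags.
const taggedMembers = [
  { host: "a.example.net", tags: { dc: "ny" } },
  { host: "b.example.net", tags: { dc: "ny" } },
  { host: "c.example.net", tags: { dc: "sf" } },
];

// The write must reach members carrying at least 2 distinct values of "dc".
const writeModes = { "very important write": { dc: 2 } };

// Returns true when the hosts that acknowledged the write satisfy the rule.
function satisfies(modeName, ackedHosts) {
  const rule = writeModes[modeName];
  return Object.keys(rule).every(tag => {
    const values = new Set(
      taggedMembers
        .filter(m => ackedHosts.includes(m.host) && m.tags[tag] !== undefined)
        .map(m => m.tags[tag])
    );
    return values.size >= rule[tag];
  });
}

// Two members in the same data center do not satisfy the rule.
console.log(satisfies("very important write", ["a.example.net", "b.example.net"])); // false
// One member in each data center does.
console.log(satisfies("very important write", ["a.example.net", "c.example.net"])); // true
```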
Drivers may also support tag-aware reads. Instead of specifying slaveOk, you specify slaveOk with tags indicating
which data-centers to read from. For details, see the MongoDB Drivers and Client Libraries (page 95) documentation.
w : majority You can also set w to majority to ensure that the write propagates to a majority of nodes, effectively committing it. The value for majority will automatically adjust as you add or remove nodes from the
set.
For more information, see Write Concern (page 47).
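The way the majority value adjusts with set size can be shown with a little arithmetic. This sketch assumes that every member counts toward the majority; the majorityOf function is a hypothetical helper, not a shell method.

```javascript
// A majority is more than half of the members.
function majorityOf(memberCount) {
  return Math.floor(memberCount / 2) + 1;
}

for (const n of [3, 4, 5]) {
  console.log(n + " members -> majority of " + majorityOf(n));
}
// 3 members -> majority of 2
// 4 members -> majority of 3
// 5 members -> majority of 3
```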
Reconfiguration with a Minority Up If the majority of servers in a set has been permanently lost, you can now
force a reconfiguration of the set to bring it back online.
For more information see Reconfigure a Replica Set with Unavailable Members (page 454).
Primary Checks for a Caught up Secondary before Stepping Down To minimize time without a primary, the
rs.stepDown() method will now fail if the primary does not see a secondary within 10 seconds of its latest
optime. You can force the primary to step down anyway, but by default it will return an error message.
See also Force a Member to Become Primary (page 447).
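The step-down check can be modeled as follows. This is a sketch only: optimes are represented as plain numbers of seconds, and canStepDown is a hypothetical function that mirrors the rule described above rather than the server's implementation.

```javascript
const MAX_LAG_SECONDS = 10;

// Returns true if the primary may step down: either a secondary is within
// 10 seconds of the primary's latest optime, or the caller forces it.
function canStepDown(primaryOptime, secondaryOptimes, force = false) {
  if (force) return true;
  return secondaryOptimes.some(t => primaryOptime - t <= MAX_LAG_SECONDS);
}

console.log(canStepDown(1000, [985, 995]));  // true: one secondary lags by 5 seconds
console.log(canStepDown(1000, [980]));       // false: the only secondary lags by 20 seconds
console.log(canStepDown(1000, [980], true)); // true: forced
```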
133 http://www.mongodb.org/display/DOCS/Data+Center+Awareness#DataCenterAwareness-Tagging%28version2.0%29
Extended Shutdown on the Primary to Minimize Interruption When you call the shutdown command, the
primary will refuse to shut down unless there is a secondary whose optime is within 10 seconds of the primary. If such
a secondary isn't available, the primary will step down and wait up to a minute for the secondary to be fully caught up
before shutting down.
Note that to get this behavior, you must issue the shutdown command explicitly; sending a signal to the process will
not trigger this behavior.
You can also force the primary to shut down, even without an up-to-date secondary available.
Maintenance Mode When repair or compact runs on a secondary, the secondary will automatically drop into
recovering mode until the operation finishes. This prevents clients from trying to read from it while it's busy.
Geospatial Features
Multi-Location Documents Indexing is now supported on documents which have multiple location objects, embedded either inline or in nested sub-documents. Additional command options are also supported, allowing results to
return with not only distance but the location used to generate the distance.
For more information, see Multi-location Documents134 .
Polygon searches Polygonal $within queries are also now supported for simple polygon shapes. For details, see
the $within operator documentation.
Journaling Enhancements
Journaling is now enabled by default for 64-bit platforms. Use the --nojournal command line option to
disable it.
The journal is now compressed for faster commits to disk.
A new --journalCommitInterval run-time option exists for specifying your own group commit interval.
The default settings do not change.
A new j option for getLastError, as in { getLastError: 1, j: true }, is available to wait for the group commit. The
group commit will happen sooner when a client is waiting on {j: true}. If journaling is disabled, {j: true} is a
no-op.
New ContinueOnError Option for Bulk Insert
Set the continueOnError option for bulk inserts, in the driver (page 95), so that bulk insert will continue to insert
any remaining documents even if an insert fails, as is the case with duplicate key exceptions or network interruptions.
The getLastError command will report whether any inserts have failed, not just the last one. If multiple errors
occur, the client will only receive the most recent getLastError results.
See OP_INSERT135 .
Note: For bulk inserts on sharded clusters, the getLastError command alone is insufficient to verify success.
Applications must verify the success of bulk inserts in application logic.
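The difference that continueOnError makes can be sketched with an in-memory stand-in for a collection. The bulkInsert function and its return shape are hypothetical; the sketch only models duplicate key failures on _id and shows that the caller sees the most recent error only.

```javascript
// In-memory stand-in for a collection with a unique index on _id.
function bulkInsert(existingIds, docs, continueOnError) {
  const ids = new Set(existingIds);
  let inserted = 0;
  let lastError = null;
  for (const doc of docs) {
    if (ids.has(doc._id)) {
      lastError = "duplicate key: " + doc._id;
      if (!continueOnError) break; // default: stop at the first failure
      continue;                    // option set: skip the document and keep going
    }
    ids.add(doc._id);
    inserted++;
  }
  return { inserted, lastError };
}

const batch = [{ _id: 1 }, { _id: 2 }, { _id: 2 }, { _id: 3 }, { _id: 3 }, { _id: 4 }];

console.log(bulkInsert([], batch, false)); // { inserted: 2, lastError: 'duplicate key: 2' }
console.log(bulkInsert([], batch, true));  // { inserted: 4, lastError: 'duplicate key: 3' }
```

With the option set, two inserts failed but only the most recent failure is reported, which is why applications should verify the result themselves.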
134 http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-MultilocationDocuments
135 http://www.mongodb.org/display/DOCS/Mongo+Wire+Protocol#MongoWireProtocol-OPINSERT
Map Reduce
Output to a Sharded Collection Using the new sharded flag, it is possible to send the result of a map/reduce to
a sharded collection. Combined with the reduce or merge flags, it is possible to keep adding data to very large
collections from map/reduce jobs.
See MapReduce Output Options136 .
Additional regex options: s Allows the dot (.) to match all characters including new lines. This is in addition to
the currently supported i, m and x. See Regular Expressions137 and $regex.
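The effect of the s option can be demonstrated with JavaScript's own RegExp, which has an equivalent s flag. This is an analogy only: MongoDB evaluates $regex with its own regular expression engine, and the sample text here is made up.

```javascript
const sampleText = "first line\nsecond line";

// Without s, the dot does not match the newline between the two lines.
const withoutS = /line.second/;
// With s, the dot matches any character, including the newline.
const withS = /line.second/s;

console.log(withoutS.test(sampleText)); // false
console.log(withS.test(sampleText));    // true
```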
$and A special boolean $and query operator is now available.
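To show what an $and query expresses, the following toy matcher evaluates a query document against a plain object. It is an illustration under stated assumptions, not MongoDB's query engine: the matches function is hypothetical and supports only equality, $gt, $lt, and $and.

```javascript
// Toy matcher: supports equality, $gt, $lt, and $and only.
function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    if (key === "$and") return cond.every(sub => matches(doc, sub));
    const value = doc[key];
    if (cond !== null && typeof cond === "object") {
      return Object.entries(cond).every(([op, arg]) => {
        if (op === "$gt") return value > arg;
        if (op === "$lt") return value < arg;
        throw new Error("unsupported operator: " + op);
      });
    }
    return value === cond;
  });
}

// Every clause in the $and array must hold for the document to match.
const andQuery = { $and: [{ qty: { $gt: 10 } }, { qty: { $lt: 20 } }, { type: "widget" }] };

console.log(matches({ qty: 15, type: "widget" }, andQuery)); // true
console.log(matches({ qty: 25, type: "widget" }, andQuery)); // false
```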
Command Output Changes
The output of the validate command and the documents in the system.profile collection have both been
enhanced to return information as BSON objects with keys for each value rather than as free-form strings.
Shell Features
Custom Prompt You can define a custom prompt for the mongo shell. You can change the prompt at any time by
setting the prompt variable to a string or a custom JavaScript function returning a string. For examples, see Custom
Prompt138 .
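The two forms of the prompt variable can be sketched as follows. The session object and the renderPrompt helper are hypothetical stand-ins so that the example runs outside the shell; in the mongo shell you would only assign prompt.

```javascript
// Hypothetical shell state; in the mongo shell this would come from the session.
const session = { dbName: "test", commandCount: 0 };

// The prompt can be a plain string ...
let prompt = "mongo> ";

// ... or a function returning a string, evaluated each time the prompt is shown.
prompt = function () {
  session.commandCount += 1;
  return session.dbName + ":" + session.commandCount + "> ";
};

// Stand-in for what the shell does before reading each command.
function renderPrompt(p) {
  return typeof p === "function" ? p() : p;
}

console.log(renderPrompt(prompt)); // test:1>
console.log(renderPrompt(prompt)); // test:2>
```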
Default Shell Init Script On startup, the shell will check for a .mongorc.js file in the user's home directory.
The shell will execute this file after connecting to the database and before displaying the prompt.
If you would like the shell not to run the .mongorc.js file automatically, start the shell with --norc.
For more information, see http://docs.mongodb.org/manual/reference/program/mongo.
Most Commands Require Authentication
In 2.0, when running with authentication (e.g. auth) all database commands require authentication, except the following commands.
isMaster
136 http://www.mongodb.org/display/DOCS/MapReduce#MapReduce-Outputoptions
137 http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-RegularExpressions
138 http://www.mongodb.org/display/DOCS/Overview+-+The+MongoDB+Interactive+Shell#Overview-TheMongoDBInteractiveShellCustomPrompt
authenticate
getnonce
buildInfo
ping
isdbgrid
Resources
MongoDB Downloads139
All JIRA Issues resolved in 2.0140
All Backward Incompatible Changes141
Read through all release notes before upgrading and ensure that no changes will affect your deployment.
Upgrading a Standalone mongod
"ismaster" : false,
"secondary" : true,
"hosts" : [
"ubuntu:27017",
"ubuntu:27018"
],
"arbiters" : [
"ubuntu:27019"
],
"primary" : "ubuntu:27018",
"ok" : 1
}
// for each secondary
config.members[0].priority = 0
config.members[3].priority = 0
config.members[4].priority = 0
rs.reconfig(config)
5. Shut down the primary (the final 1.6 server), and then restart it with the 1.8.x binary from the MongoDB
Download Page145 .
Upgrading a Sharded Cluster
Returning to 1.6
If for any reason you must move back to 1.6, follow the steps above in reverse. Make sure that you have not
inserted any documents larger than 4MB while running on 1.8 (where the maximum size has increased to 16MB). If you
have, you will get errors when the server tries to read those documents.
Journaling Returning to 1.6 after using 1.8 Journaling (page 234) works fine, as journaling does not change anything
about the data file format. Suppose you are running 1.8.x with journaling enabled and you decide to switch back to
1.6. There are two scenarios:
If you shut down cleanly with 1.8.x, just restart with the 1.6 mongod binary.
If 1.8.x shut down uncleanly, start 1.8.x up again and let the journal files run to fix any damage (incomplete
writes) that may have existed at the crash. Then shut down 1.8.x cleanly and restart with the 1.6 mongod binary.
Changes
Journaling
MongoDB now supports write-ahead Journaling Mechanics (page 234) to facilitate fast crash recovery and durability
in the storage engine. With journaling enabled, a mongod can be quickly restarted following a crash without needing
to repair the collections.
Sparse and Covered Indexes
Sparse Indexes (page 335) are indexes that only include documents that contain the fields specified in the index.
Documents missing the field will not appear in the index at all. This can significantly reduce index size for indexes on
fields that appear in only a subset of the documents within a collection.
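The size difference can be seen in a small in-memory model. The buildIndex function and the sample documents are hypothetical; the sketch only counts index entries and does not reflect the on-disk index format.

```javascript
const people = [
  { _id: 1, name: "ann", twitter: "@ann" },
  { _id: 2, name: "bob" },
  { _id: 3, name: "cy", twitter: "@cy" },
];

// Builds a list of index entries for one field. A regular index stores a
// null key for documents that lack the field; a sparse index skips them.
function buildIndex(docs, field, sparse) {
  const entries = [];
  for (const doc of docs) {
    const present = Object.prototype.hasOwnProperty.call(doc, field);
    if (sparse && !present) continue;
    entries.push({ key: present ? doc[field] : null, id: doc._id });
  }
  return entries;
}

console.log(buildIndex(people, "twitter", false).length); // 3
console.log(buildIndex(people, "twitter", true).length);  // 2
```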
Covered Indexes (page 314) enable MongoDB to answer queries entirely from the index when the query only selects
fields that the index contains.
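Whether an index can cover a query can be checked with a simple rule of thumb. This sketch is an assumption-laden simplification: isCovered is a hypothetical helper, it handles only inclusion projections, and it adds one detail not stated above, namely that _id must be excluded from the projection unless the index contains it.

```javascript
// indexFields: fields in the index; queryFields: fields used in the query
// condition; projection: fields to return, in { field: 1, _id: 0 } form.
function isCovered(indexFields, queryFields, projection) {
  const inIndex = f => indexFields.includes(f);
  const projected = Object.keys(projection).filter(f => projection[f]);
  // _id is returned by default, so it must be excluded or be in the index.
  const idHandled = projection._id === 0 || inIndex("_id");
  return queryFields.every(inIndex) && projected.every(inIndex) && idHandled;
}

const userScoreIndex = ["user", "score"];

console.log(isCovered(userScoreIndex, ["user"], { user: 1, score: 1, _id: 0 })); // true
console.log(isCovered(userScoreIndex, ["user"], { user: 1, email: 1, _id: 0 })); // false
console.log(isCovered(userScoreIndex, ["user"], { user: 1 }));                   // false
```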
Incremental MapReduce Support
The mapReduce command supports new options that enable incrementally updating existing collections. Previously,
a MapReduce job could output either to a temporary collection or to a named permanent collection, which it would
overwrite with new data.
You now have several options for the output of your MapReduce jobs:
148 http://downloads.mongodb.org/
You can merge MapReduce output into an existing collection. Output from the Reduce phase will replace
existing keys in the output collection if it already exists. Other keys will remain in the collection.
You can now re-reduce your output with the contents of an existing collection. Each key output by the reduce
phase will be reduced with the existing document in the output collection.
You can replace the existing output collection with the new results of the MapReduce job (equivalent to setting
a permanent output collection in previous releases)
You can compute MapReduce inline and return results to the caller without persisting the results of the job. This
is similar to the temporary collections generated in previous releases, except results are limited to 8MB.
For more information, see the out field options in the mapReduce document.
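The three output modes that write to a collection can be compared with a small model in which a collection is a plain object of key/value pairs. The applyOutput function is a hypothetical illustration, not the mapReduce command; inline output is not modeled because it returns results without writing to a collection.

```javascript
// existing: current contents of the output collection, keyed by _id.
// results: output of the new MapReduce job. reduceFn: the job's reduce function.
function applyOutput(mode, existing, results, reduceFn) {
  switch (mode) {
    case "replace": // discard the old contents entirely
      return { ...results };
    case "merge": // new keys overwrite matching old keys; other keys remain
      return { ...existing, ...results };
    case "reduce": { // matching keys are re-reduced with the existing value
      const out = { ...existing };
      for (const [key, value] of Object.entries(results)) {
        out[key] = key in out ? reduceFn(key, [out[key], value]) : value;
      }
      return out;
    }
    default:
      throw new Error("unknown output mode: " + mode);
  }
}

const existingOutput = { a: 1, b: 2 };
const newResults = { b: 10, c: 20 };
const sum = (key, values) => values.reduce((x, y) => x + y, 0);

console.log(applyOutput("replace", existingOutput, newResults, sum)); // { b: 10, c: 20 }
console.log(applyOutput("merge", existingOutput, newResults, sum));   // { a: 1, b: 10, c: 20 }
console.log(applyOutput("reduce", existingOutput, newResults, sum));  // { a: 1, b: 12, c: 20 }
```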
Additional Changes and Enhancements
1.8.1
Sharding migrate fix when moving larger chunks.
Durability fix with background indexing.
Fixed mongos concurrency issue with many incoming connections.
1.8.0
All changes from 1.7.x series.
1.7.6
Bug fixes.
1.7.5
Journaling (page 234).
Extent allocation improvements.
Improved replica set connectivity for mongos.
getLastError improvements for sharding.
1.7.4
mongos routes slaveOk queries to secondaries in replica sets.
New mapReduce output options.
Sparse Indexes (page 335).
1.7.3
Initial covered index (page 314) support.
Distinct can use data from indexes when possible.
mapReduce can merge or reduce results into an existing collection.
mongod tracks and mongostat displays network usage. See mongostat.
1.8.1 [149], 1.8.0 [150]
1.7.6 [151], 1.7.5 [152], 1.7.4 [153], 1.7.3 [154], 1.7.2 [155], 1.7.1 [156], 1.7.0 [157]
149 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/v09MbhEm62Y
150 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/JeHQOnam6Qk
151 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/3t6GNZ1qGYc
152 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/S5R0Tx9wkEg
153 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/9Om3Vuw-y9c
154 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/DfNUrdbmflI
155 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/df7mwK6Xixo
156 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/HUR9zYtTpA8
157 https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/TUnJCg9161A
Resources
MongoDB Downloads158
All JIRA Issues resolved in 1.8159
Geo
2d geospatial search (page 331)
geo $center and $box searches
See the documentation of Write Concern (page 47) for more information about write concern in MongoDB.
Please migrate to the new MongoClient class expeditiously.
Releases
The following driver releases will include the changes outlined in Changes (page 634). See each driver's release notes
for a full account of each release as well as other related driver-specific changes.
C#, version 1.7
Java, version 2.10.0
Node.js, version 1.2
Perl, version 0.501.1
PHP, version 1.4
Python, version 2.4
Ruby, version 1.8
Generally, changes in the release series (e.g. 2.2 to 2.4) mark the introduction of new features that may break backwards compatibility. Changes to the revision number mark the release of bug fixes and backwards-compatible changes.
Important: Always upgrade to the latest stable revision of your release series.
The version numbering system for MongoDB differs from the system used for the MongoDB drivers. Drivers use only
the first number to indicate a major version. For details, see Driver Version Numbers (page 96).
Example
Version numbers
2.0.0 : Stable release.
2.0.1 : Revision.
2.1.0 : Development release for testing only. Includes new features and changes for testing. Interfaces and
stability may not be compatible in development releases.
2.2.0 : Stable release. This is a culmination of the 2.1.x development series.
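The examples above suggest a simple way to read a version number. The following sketch assumes, based on those examples, that an even second number marks a stable release series and an odd one marks a development series; describeVersion is a hypothetical helper.

```javascript
// Splits "X.Y.Z" into its release series (X.Y) and revision (Z), and
// labels the series as stable (even Y) or development (odd Y).
function describeVersion(version) {
  const [major, minor, revision] = version.split(".").map(Number);
  return {
    releaseSeries: major + "." + minor,
    kind: minor % 2 === 0 ? "stable" : "development",
    revision: revision,
  };
}

console.log(describeVersion("2.0.1")); // { releaseSeries: '2.0', kind: 'stable', revision: 1 }
console.log(describeVersion("2.1.0")); // { releaseSeries: '2.1', kind: 'development', revision: 0 }
```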
CHAPTER 12
The MongoDB Manual1 contains comprehensive documentation on the MongoDB document-oriented database management system. This page describes the manual's licensing, editions, and versions, and describes how to make a
change request and how to contribute to the manual.
For more information on MongoDB, see MongoDB: A Document Oriented Database2 . To download MongoDB, see
the downloads page3 .
12.1 License
This manual is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported4 (i.e.
CC-BY-NC-SA) license.
The MongoDB Manual is copyright 2011-2014 MongoDB, Inc.
12.2 Editions
In addition to the MongoDB Manual5 , you can also access this content in the following editions:
ePub Format6
Single HTML Page7
PDF Format8 (without reference.)
HTML tar.gz9
You also can access PDF files that contain subsets of the MongoDB Manual:
MongoDB Reference Manual10
MongoDB CRUD Operations11
1 http://docs.mongodb.org/manual/#
2 http://www.mongodb.org/about/
3 http://www.mongodb.org/downloads
4 http://creativecommons.org/licenses/by-nc-sa/3.0/
5 http://docs.mongodb.org/manual/#
6 http://docs.mongodb.org/v2.4/MongoDB-manual.epub
7 http://docs.mongodb.org/v2.4/single/
8 http://docs.mongodb.org/v2.4/MongoDB-manual.pdf
9 http://docs.mongodb.org/v2.4/manual.tar.gz
10 http://docs.mongodb.org/v2.4/MongoDB-reference-manual.pdf
11 http://docs.mongodb.org/v2.4/MongoDB-crud-guide.pdf
and stable version of the manual is always available at
In this direction, the MongoDB Documentation project uses the service provided by Smartling24 to translate the MongoDB documentation into additional non-English languages. This translation project is largely supported by the work
of volunteer translators from the MongoDB community who contribute to the translation effort.
If you would like to volunteer to help translate the MongoDB documentation, please:
complete the MongoDB Contributor Agreement25 , and
create an account on Smartling at translate.docs.mongodb.org26 .
Please use the same email address to sign the contributor agreement as you use to create your Smartling account.
The mongodb-translators27 user group exists to facilitate collaboration between translators and the documentation
team at large. You can join the Google Group without signing the contributor agreement.
We currently have the following languages configured:
Arabic28
Chinese29
Czech30
French31
German32
Hungarian33
Indonesian34
Italian35
Japanese36
Korean37
Lithuanian38
Polish39
Portuguese40
Romanian41
Russian42
Spanish43
24 http://smartling.com/
25 http://www.mongodb.com/legal/contributor-agreement
26 http://translate.docs.mongodb.org/
27 http://groups.google.com/group/mongodb-translators
28 http://ar.docs.mongodb.org
29 http://cn.docs.mongodb.org
30 http://cs.docs.mongodb.org
31 http://fr.docs.mongodb.org
32 http://de.docs.mongodb.org
33 http://hu.docs.mongodb.org
34 http://id.docs.mongodb.org
35 http://it.docs.mongodb.org
36 http://jp.docs.mongodb.org
37 http://ko.docs.mongodb.org
38 http://lt.docs.mongodb.org
39 http://pl.docs.mongodb.org
40 http://pt.docs.mongodb.org
41 http://ro.docs.mongodb.org
42 http://ru.docs.mongodb.org
43 http://es.docs.mongodb.org
Turkish44
Ukrainian45
If you would like to initiate a translation project to an additional language, please report this issue using the Report a
Problem link above or by posting to the mongodb-translators46 list.
Currently the translation project only publishes rendered translations. While the translation effort is currently focused
on the web site, we are evaluating how to retrieve the translated phrases for use in other media.
See also:
Contribute to the Documentation (page 638)
Style Guide and Documentation Conventions (page 640)
MongoDB Manual Organization (page 649)
MongoDB Documentation Practices and Processes (page 646)
MongoDB Documentation Build System (page 650)
The entire documentation source for this manual is available in the mongodb/docs repository47 , which is one of the
MongoDB project repositories on GitHub48 .
To contribute to the documentation, you can open a GitHub account49 , fork the mongodb/docs repository50 , make a
change, and issue a pull request.
In order for the documentation team to accept your change, you must complete the MongoDB Contributor Agreement51 .
You can clone the repository by issuing the following command at your system shell:
git clone git://github.com/mongodb/docs.git
Document History
2011-09-27: Document created with a (very) rough list of style guidelines, conventions, and questions.
2012-01-12: Document revised based on slight shifts in practice, and as part of an effort of making it easier for people
outside of the documentation team to contribute to documentation.
2012-03-21: Merged in content from the Jargon, and cleaned up style in light of recent experiences.
2012-08-10: Addition to the Referencing section.
2013-02-07: Migrated this document to the manual. Added map-reduce terminology convention. Other edits.
2013-11-15: Added new table of preferred terms.
Naming Conventions
This section contains guidelines on naming files, sections, documents and other document elements.
File naming Convention:
For Sphinx, all files should have a .txt extension.
Separate words in file names with hyphens (i.e. -.)
For most documents, file names should have a terse one or two word name that describes the material
covered in the document. Allow the path of the file within the document tree to add some of the
required context/categorization. For example it's acceptable to have
http://docs.mongodb.org/manual/core/sharding.rst and
http://docs.mongodb.org/manual/administration/sharding.rst.
For tutorials, the full title of the document should be in the file name. For example,
http://docs.mongodb.org/manual/tutorial/replace-one-configuration-server-in-a-shard
Phrase headlines and titles so users can determine what questions the text will answer, and material that will
be addressed, without needing them to read the content. This shortens the amount of time that people spend
looking for answers, and improves search/scanning, and possibly SEO.
Prefer titles and headers in the form of "Using foo" over "How to Foo."
When using target references (i.e. :ref: references in documents), use names that include enough context to
be intelligible through all documentation. For example, use replica-set-secondary-only-node as
opposed to secondary-only-node. This makes the source more usable and easier to maintain.
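The file naming rules above (hyphens between words, a .txt extension, and the full title for tutorials) can be sketched as a small helper. The fileNameFor function and the sample title are hypothetical; the lowercasing step is an assumption based on the example names shown above.

```javascript
// Turns a tutorial title into a file name: lowercase, words separated by
// hyphens, with the .txt extension that Sphinx source files use.
function fileNameFor(title) {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // any run of other characters becomes one hyphen
    .replace(/^-+|-+$/g, "");    // no leading or trailing hyphens
  return slug + ".txt";
}

console.log(fileNameFor("Deploy a Replica Set")); // deploy-a-replica-set.txt
```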
Style Guide
This section covers the typesetting, English usage, grammatical conventions, and preferences that all documents in the manual
should use. The goal here is to choose good standards that are clear and have a stylistic minimalism that does not
interfere with or distract from the content. A uniform style will improve user experience and minimize the effect of a
multi-authored document.
Punctuation
Use the Oxford comma.
Oxford commas are the commas in a list of things (e.g. "something, something else, and another thing") before
the conjunction (e.g. "and" or "or").
Do not add two spaces after terminal punctuation, such as periods.
workflow
Use "unavailable," "offline," or "unreachable" to refer to a mongod instance that cannot be accessed. Do not
use the colloquialism "down."
Always write out units (e.g. megabytes) rather than using abbreviations (e.g. MB.)
Structural Formulations
There should be at least two headings at every nesting level. Within an h2 block, there should be either: no
h3 blocks, 2 h3 blocks, or more than 2 h3 blocks.
Section headers are in title case (capitalize first, last, and all important words) and should effectively describe
the contents of the section. In a single document you should strive to have section titles that are not redundant
and that are grammatically consistent with each other.
Use paragraphs and paragraph breaks to increase clarity and flow. Avoid burying critical information in the
middle of long paragraphs. Err on the side of shorter paragraphs.
Prefer shorter sentences to longer sentences. Use complex formations only as a last resort, if at all (e.g. compound complex structures that require semi-colons).
Avoid paragraphs that consist of single sentences as they often represent a sentence that has unintentionally
become too complex or incomplete. However, sometimes such paragraphs are useful for emphasis, summary,
or introductions.
As a corollary, most sections should have multiple paragraphs.
For longer lists and more complex lists, use bulleted items rather than integrating them inline into a sentence.
Do not expect that the content of any example (inline or blocked) will be self explanatory. Even when it feels
redundant, make sure that the function and use of every example is clearly described.
ReStructured Text and Typesetting
Place spaces between nested parentheticals and elements in JavaScript examples. For example, prefer { [ a,
a, a ] } over {[a,a,a]}.
For underlines associated with headers in RST, use:
= for heading level 1 or h1s. Use underlines and overlines for document titles.
- for heading level 2 or h2s.
~ for heading level 3 or h3s.
for heading level 4 or h4s.
Use hyphens (-) to indicate items of an ordered list.
Place footnotes and other references, if you use them, at the end of a section rather than the end of a file.
Use the footnote format that includes automatic numbering and a target name for ease of use. For instance a
footnote tag may look like: [#note]_ with the corresponding directive holding the body of the footnote that
resembles the following: .. [#note].
Do not include .. code-block:: [language] in footnotes.
As it makes sense, use the .. code-block:: [language] form to insert literal blocks into the text.
While the double colon, ::, is functional, the .. code-block:: [language] form makes the source
easier to read and understand.
For all mentions of referenced types (i.e. commands, operators, expressions, functions, statuses, etc.) use the
reference types to ensure uniform formatting and cross-referencing.
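The heading underline conventions listed above can be sketched as a helper that generates reStructuredText headings. The rstHeading function is hypothetical and covers only the first three heading levels.

```javascript
// Underline characters for heading levels 1 to 3.
const UNDERLINES = { 1: "=", 2: "-", 3: "~" };

// Returns the heading text with an underline of matching length. Document
// titles also get an overline, as the convention above requires.
function rstHeading(text, level, isDocumentTitle = false) {
  const ch = UNDERLINES[level];
  if (!ch) throw new Error("unsupported heading level: " + level);
  const line = ch.repeat(text.length);
  return (isDocumentTitle ? line + "\n" : "") + text + "\n" + line;
}

console.log(rstHeading("Style Guide", 1, true));
console.log(rstHeading("Punctuation", 2));
```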
12.5. Contribute to the Documentation
Preferred Term | Concept | Dispreferred Alternatives | Notes
document | | record, object, row |
instance | | process (acceptable sometimes), node (never acceptable), server |
field name | | key, column |
field/value | The name/value pair that describes a unit of data in MongoDB. | |
value | | data |
MongoDB | | mongo, mongodb, cluster |
 | | embedded document, nested document |
 | | mapReduce, map reduce, map/reduce |
cluster | A sharded cluster. | grid, shard cluster, set, deployment |
 | | shard cluster, cluster, sharded system |
 | | set, replication deployment |
 | | cluster, system |
 | | database, data |
Geo-Location
1. While MongoDB is capable of storing coordinates in sub-documents, in practice, users should only store
coordinates in arrays. (See: DOCS-41 [55].)
MongoDB Documentation Practices and Processes
This document provides an overview of the practices and processes.
Commits
When relevant, include a Jira case identifier in a commit message. Reference documentation cases when applicable,
but feel free to reference other cases from jira.mongodb.org56 .
Err on the side of creating a larger number of discrete commits rather than bundling a large set of changes into one
commit.
55 https://jira.mongodb.org/browse/DOCS-41
56 http://jira.mongodb.org/
For the sake of consistency, remove trailing whitespaces in the source file.
Hard wrap files to between 72 and 80 characters per-line.
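Both source file rules are easy to check mechanically. The following sketch is a hypothetical checker, not part of the documentation build system: lintLines flags trailing whitespace and lines longer than the upper wrap limit of 80 characters.

```javascript
// Returns a list of { line, issue } records for a source file's text.
function lintLines(source, maxLength = 80) {
  const problems = [];
  source.split("\n").forEach((text, i) => {
    if (/[ \t]+$/.test(text)) {
      problems.push({ line: i + 1, issue: "trailing whitespace" });
    }
    if (text.length > maxLength) {
      problems.push({ line: i + 1, issue: "longer than " + maxLength + " characters" });
    }
  });
  return problems;
}

const sampleSource = "A clean line.\nA line with a trailing space. \n" + "x".repeat(81);
console.log(lintLines(sampleSource));
// [ { line: 2, issue: 'trailing whitespace' },
//   { line: 3, issue: 'longer than 80 characters' } ]
```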
Standards and Practices
At least two people should vet all non-trivial changes to the documentation before publication. One of the
reviewers should have significant technical experience with the material covered in the documentation.
All development and editorial work should transpire on GitHub branches or forks that editors can then merge
into the publication branches.
Collaboration
Building the documentation is useful because Sphinx60 and docutils can catch numerous errors in the format and
syntax of the documentation. Additionally, a local build shows the documentation as it will appear to users,
which gives the review process a more effective basis. Besides Sphinx, Pygments, and Python-Docutils,
the documentation repository contains all requirements for building the documentation resource.
Talk to someone on the documentation team if you are having problems running builds yourself.
Publication
The makefile for this repository contains targets that automate the publication process. Use make html to publish
a test build of the documentation in the build/ directory of your repository. Use make publish to build the full
contents of the manual from the current branch in the ../public-docs/ directory relative to the docs repository.
Other targets include:
man - builds UNIX Manual pages for all MongoDB utilities.
push - builds and deploys the contents of the ../public-docs/.
pdfs - builds a PDF version of the manual (requires LaTeX dependencies.)
Branches
This section provides an overview of the git branches in the MongoDB documentation repository and their use.
57 https://jira.mongodb.org/browse/DOCS
58 https://github.com/
59 https://github.com/mongodb/docs
60 http://sphinx.pocoo.org/
At the present time, future work transpires in the master, with the main publication being current. As the
documentation stabilizes, the documentation team will begin to maintain branches of the documentation for specific
MongoDB releases.
Migration from Legacy Documentation
The MongoDB.org Wiki contains a wealth of information. As the transition to the Manual (i.e. this project and
resource) continues, it's critical that no information disappears or goes missing. The following process outlines how
to migrate a wiki page to the manual:
1. Read the relevant sections of the Manual, and see what the new documentation has to offer on a specific topic.
In this process you should follow cross references and gain an understanding of both the underlying information
and how the parts of the new content relate to one another.
2. Read the wiki page you wish to redirect, and take note of all of the factual assertions and examples presented by the
wiki page.
3. Test the factual assertions of the wiki page to the greatest extent possible. Ensure that example output is accurate.
In the case of commands and reference material, make sure that documented options are accurate.
4. Make corrections to the manual page or pages to reflect any missing pieces of information.
The target of the redirect need not contain every piece of information on the wiki page, if the manual as a
whole does, and relevant section(s) with the information from the wiki page are accessible from the target of the
redirection.
5. As necessary, get these changes reviewed by another writer and/or someone familiar with the area of the information in question.
At this point, update the relevant Jira case with the target that you've chosen for the redirect, and make the ticket
unassigned.
6. When someone has reviewed the changes and published those changes to the Manual, you, or preferably someone
else on the team, should make a final pass at both pages with fresh eyes and then make the redirect.
Steps 1-5 should ensure that no information is lost in the migration and that the final review in step 6 is
trivial to complete.
Review Process
Types of Review The content in the Manual undergoes many types of review, including the following:
Initial Technical Review Review by an engineer familiar with MongoDB and the topic area of the documentation.
This review focuses on technical content, and correctness of the procedures and facts presented, but can improve any
aspect of the documentation that may still be lacking. When both the initial technical review and the content review
are complete, the piece may be published.
Content Review Textual review by another writer to ensure stylistic consistency with the rest of the manual. Depending on the content, this may precede or follow the initial technical review. When both the initial technical review
and the content review are complete, the piece may be published.
Consistency Review This occurs post-publication and is content focused. The goals of consistency reviews are to
increase the internal consistency of the documentation as a whole. Insert relevant cross-references, update the style as
needed, and provide background fact-checking.
Consistency reviews should be as systematic as possible, and we should avoid encouraging stylistic and
information drift by editing only small sections at a time.
Subsequent Technical Review If the documentation needs to be updated following a change in functionality of the
server or following the resolution of a user issue, changes may be significant enough to warrant additional technical
review. These reviews follow the same form as the initial technical review, but are often less involved and cover a
smaller area.
Review Methods If you're not a usual contributor to the documentation and would like to review something, you
can submit reviews in any of the following ways:
If you're reviewing an open pull request in GitHub, the best way to comment is on the overview diff, which
you can find by clicking on the diff button in the upper left portion of the screen. You can also use the
following URL to reach this interface:
https://github.com/mongodb/docs/pull/[pull-request-id]/files
Replace [pull-request-id] with the identifier of the pull request. Make all comments inline, using
GitHub's comment system.
You may also provide comments directly on commits, or on the pull request itself, but these commit-comments
are archived in less coherent ways and generate less useful emails, while comments on the pull request lead to
less specific changes to the document.
Leave feedback on Jira cases in the DOCS61 project. These are better for more general changes that arent
necessarily tied to a specific line, or affect multiple files.
Create a fork of the repository in your GitHub account, make any required changes and then create a pull request
with your changes.
If you insert lines that begin with any of the following annotations:
.. TODO:
TODO:
.. TODO
TODO
followed by your comments, it will be easier for the original writer to locate your comments. The two-dot (..)
format is a comment in reStructuredText, which hides your comments from Sphinx and from publication, if you're
worried about that.
This format is often easier for reviewers with larger portions of content to review.
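As a rough illustration of how a writer might locate these annotations, the following Python sketch scans text for lines that begin with any of the four markers listed above. The function name find_review_comments and the sample text are hypothetical; this is not part of the documentation tooling.

```python
import re

# Matches lines that begin with "TODO" or ".. TODO", with an optional colon,
# and captures any comment text that follows the marker.
TODO_PATTERN = re.compile(r"^\s*(?:\.\.\s+)?TODO:?(?:\s+(.*))?$")


def find_review_comments(text):
    """Return (line_number, comment) pairs for each TODO annotation in text."""
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = TODO_PATTERN.match(line)
        if match:
            found.append((number, (match.group(1) or "").strip()))
    return found


# Hypothetical sample source with two reviewer annotations.
sample = """Replica Sets
============
.. TODO: add a cross-reference to the failover section
TODO clarify the election timeout
Regular paragraph text.
"""

for number, comment in find_review_comments(sample):
    print(number, comment)
```

Only the ".. TODO" forms are hidden from the published output; the sketch treats all four forms alike because the goal here is just to find them.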
MongoDB Manual Organization
This document provides an overview of the global organization of the documentation resource. Refer to the notes
below if you are having trouble understanding the reasoning behind a file's current location, or if you want to add new
documentation but aren't sure how to integrate it into the existing resource.
If you have questions, don't hesitate to open a ticket in the Documentation Jira Project62 or contact the documentation
team63 .
61 http://jira.mongodb.org/browse/DOCS
62 https://jira.mongodb.org/browse/DOCS
63 [email protected]
Global Organization
Indexes and Experience The documentation project has two index files:
http://docs.mongodb.org/manualcontents.txt and http://docs.mongodb.org/manualindex.txt.
The contents file provides the documentation's tree structure, which Sphinx uses to create the left-pane navigational
structure, to power the Next and Previous page functionality, and to provide all overarching outlines of the
resource. The index file is not included in the contents file (and thus builds will produce a warning here) and is
the page that users first land on when visiting the resource.
Having separate contents and index files provides a bit more flexibility with the organization of the resource while
also making it possible to customize the primary user experience.
Topical Organization The placement of files in the repository depends on the type of documentation rather than the
topic of the content. As with the split between contents.txt and index.txt, decoupling the organization
of the files from the organization of the information makes the documentation more flexible and better able to
address changes in the product and in users' needs.
Files in the source/ directory represent the tip of a logical tree of documents, while directories are containers of
types of content. The administration and applications directories, however, are legacy artifacts and with a
few exceptions contain sub-navigation pages.
With several exceptions in the reference/ directory, there is only one level of sub-directories in the source/
directory.
Tools
The organization of the site, like that of all Sphinx sites, derives from the toctree64 structure. However, in order to annotate
the table of contents and provide additional flexibility, the MongoDB documentation generates toctree65 structures
using data from YAML files stored in the source/includes/ directory. These files start with ref-toc or toc
and generate output in the source/includes/toc/ directory. Briefly, this system has the following behavior:
Files that start with ref-toc refer to the documentation of API objects (i.e. commands, operators, and methods),
and the build system generates files that hold toctree66 directives as well as files that hold tables that list
objects and a brief description.
Files that start with toc refer to all other documentation, and the build system generates files that hold
toctree67 directives as well as files that hold definition lists that contain links to the documents and short
descriptions of the content.
File names that have spec following toc or ref-toc generate aggregated tables or definition lists and
allow ad-hoc combinations of documents for landing pages and quick reference guides.
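To make the idea concrete, the following Python sketch renders a toctree directive and a matching definition list from a list of entries. It is only an approximation under stated assumptions: the entry keys (file, description), the function names, and the sample paths are hypothetical, and plain dictionaries stand in for data that the real build system loads from YAML.

```python
def render_toctree(entries, maxdepth=1):
    """Render a Sphinx toctree directive listing each entry's file."""
    lines = [".. toctree::", "   :maxdepth: %d" % maxdepth, ""]
    lines.extend("   " + entry["file"] for entry in entries)
    return "\n".join(lines) + "\n"


def render_definition_list(entries):
    """Render a reStructuredText definition list linking to each document."""
    lines = []
    for entry in entries:
        lines.append(":doc:`%s`" % entry["file"])      # term: link to the page
        lines.append("   " + entry["description"])     # definition: short summary
        lines.append("")
    return "\n".join(lines)


# Hypothetical entries, standing in for data loaded from a toc YAML file.
entries = [
    {"file": "/core/indexes", "description": "Overview of index types."},
    {"file": "/core/replication", "description": "Overview of replica sets."},
]

print(render_toctree(entries))
print(render_definition_list(entries))
```

Keeping the entries as data and rendering both outputs from the same list is what lets one source drive both the navigation tree and the annotated landing pages.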
MongoDB Documentation Build System
This document contains more direct instructions for building the MongoDB documentation.
Getting Started
Install Dependencies The MongoDB Documentation project depends on the following tools:
64 http://sphinx-doc.org/markup/toctree.html#directive-toctree
65 http://sphinx-doc.org/markup/toctree.html#directive-toctree
66 http://sphinx-doc.org/markup/toctree.html#directive-toctree
67 http://sphinx-doc.org/markup/toctree.html#directive-toctree
GNU Make
GNU Tar
Python
Git
Sphinx (documentation management toolchain)
Pygments (syntax highlighting)
PyYAML (for the generated tables)
Droopy (Python package for static text analysis)
Fabric (Python package for scripting and orchestration)
Inkscape (Image generation.)
python-argparse (For Python 2.6.)
LaTeX/PDF LaTeX (typically texlive; for building PDFs)
Common Utilities (rsync, tar, gzip, sed)
OS X Install Sphinx, Docutils, and their dependencies with the following easy_install command:
easy_install Sphinx Jinja2 Pygments docutils
Feel free to use pip rather than easy_install to install python packages.
To generate the images used in the documentation, download and install Inkscape68 .
Optional
To generate PDFs for the full production build, install a TeX distribution. If you do not have a
LaTeX installation, use MacTeX69 . This is only required to build PDFs.
Arch Linux Install packages from the system repositories with the following command:
pacman -S python2-sphinx python2-yaml inkscape python2-pip
Optional
To generate PDFs for the full production build, install the following packages from the system repository:
pacman -S texlive-bin texlive-core texlive-latexextra
Debian/Ubuntu Install the required system packages with the following command:
apt-get install python-sphinx python-yaml python-argparse inkscape python-pip
Optional
To generate PDFs for the full production build, install the following packages from the system repository:
apt-get install texlive-latex-recommended
Then run the bootstrap.py script in the docs/ repository, to configure the build dependencies:
python bootstrap.py
This downloads and configures the mongodb/docs-tools70 repository, which contains the authoritative build system
shared between branches of the MongoDB Manual and other MongoDB documentation projects.
You can run bootstrap.py regularly to update the build system.
Building the Documentation
The MongoDB documentation build system is entirely accessible via make targets. For example, to build an HTML
version of the documentation issue the following command:
make html
You can find the build output in build/<branch>/html, where <branch> is the name of the current branch.
In addition to the html target, the build system provides the following targets:
publish Builds and integrates all output for the production build. Build output is in
build/public/<branch>/. When you run publish in the master branch, the build will generate
some output in build/public/.
push; stage Uploads the production build to the production or staging web servers. Depends on publish. Requires access to the production or staging environment.
push-all; stage-all Uploads the entire content of build/public/ to the web servers. Depends on
publish. Not used in common practice.
push-with-delete; stage-with-delete Modifies the action of push and stage to remove remote files
that don't exist in the local build. Use with caution.
html; latex; dirhtml; epub; texinfo; man; json These are standard targets derived from the default
Sphinx Makefile, with adjusted dependencies. Additionally, for all of these targets you can append -nitpick
to increase Sphinx's verbosity, or -clean to remove all Sphinx build artifacts.
latex performs several additional post-processing steps on .tex output generated by Sphinx. This target will
also compile PDFs using pdflatex.
html and man also generate a .tar.gz file of the build outputs for inclusion in the final releases.
70 http://github.com/mongodb/docs-tools/
Internally the build system has a number of components and processes. See the docs-tools README71 for more
information on the internals. This section documents a few of these components from a very high level and lists useful
operations for contributors to the documentation.
Fabric Fabric is an orchestration and scripting package for Python. The documentation uses Fabric to handle the
deployment of the build products to the web servers and to unify a number of independent build operations. Fabric
commands have the following form:
fab <module>.<task>[:<argument>]
The <argument> is optional in most cases. Additionally some tasks are available at the root level, without a module.
To see a full list of fabric tasks, use the following command:
fab -l
You can chain fabric tasks on a single command line, although this doesn't always make sense.
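The command form above can be sketched in Python as a pair of small helpers that format individual tasks and chain them into one fab invocation. The helper names fab_task and fab_command, and the file name tutorial.txt, are hypothetical and are not part of Fabric or the docs-tools repository.

```python
def fab_task(module, task, argument=None):
    """Format one task in the form <module>.<task>[:<argument>]."""
    # Root-level tasks have no module prefix, so pass module=None for them.
    name = "%s.%s" % (module, task) if module else task
    return name if argument is None else "%s:%s" % (name, argument)


def fab_command(*tasks):
    """Chain several formatted tasks on a single fab command line."""
    return ["fab"] + list(tasks)


# Hypothetical example: bootstrap the tools, then report on one file.
command = fab_command(
    fab_task("tools", "bootstrap"),
    fab_task("stats", "report", "tutorial.txt"),
)
print(" ".join(command))
```

Returning the command as a list of arguments rather than a single string makes it straightforward to pass to subprocess.call without shell quoting concerns.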
Important fabric tasks include:
tools.bootstrap Runs the bootstrap.py script. Useful for re-initializing the repository without needing to
be in the root of the repository.
tools.dev; tools.reset tools.dev switches the origin remote of the docs-tools checkout in the build
directory to ../docs-tools to facilitate build system testing and development. tools.reset resets the
origin remote for normal operation.
tools.conf tools.conf returns the content of the configuration object for the current project. These data are
useful during development.
stats.report:<filename> Returns a collection of readability statistics. Specify file names relative to the
source/ tree.
make Provides a thin wrapper around Make calls. Allows you to start make builds from different locations in the
project repository.
process.refresh_dependencies Updates the time stamp of .txt source files with changed include files, to
facilitate Sphinx's incremental rebuild process. This task runs internally as part of the build process.
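The idea behind process.refresh_dependencies can be sketched in a few lines of Python: find each .txt file that includes another file, and update its time stamp when the included file is newer. This is a minimal sketch under stated assumptions, not the task's actual implementation. In particular, it assumes that include paths beginning with a slash are relative to the source directory, and the function name and sample files are hypothetical.

```python
import os
import re
import tempfile

# Matches reStructuredText include directives and captures the included path.
INCLUDE_PATTERN = re.compile(r"^\s*\.\. include::\s+(\S+)", re.MULTILINE)


def refresh_dependencies(source_dir):
    """Touch each .txt file that is older than a file it includes."""
    touched = []
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            if not name.endswith(".txt"):
                continue
            path = os.path.join(root, name)
            with open(path) as handle:
                includes = INCLUDE_PATTERN.findall(handle.read())
            for include in includes:
                # Assumption: a leading "/" means "relative to source_dir".
                target = os.path.join(source_dir, include.lstrip("/"))
                if (os.path.exists(target)
                        and os.path.getmtime(target) > os.path.getmtime(path)):
                    os.utime(path, None)  # set the time stamp to the current time
                    touched.append(path)
                    break
    return touched


# Hypothetical demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as source_dir:
    os.mkdir(os.path.join(source_dir, "includes"))
    include_path = os.path.join(source_dir, "includes", "fact.rst")
    page_path = os.path.join(source_dir, "page.txt")
    with open(include_path, "w") as handle:
        handle.write("A shared fact.\n")
    with open(page_path, "w") as handle:
        handle.write("Page\n====\n\n.. include:: /includes/fact.rst\n")
    os.utime(page_path, (1000, 1000))  # make the page look older than its include
    result = refresh_dependencies(source_dir)
    page_refreshed = os.path.getmtime(page_path) > 1000
    print(result == [page_path], page_refreshed)
```

Sphinx decides what to rebuild from the time stamps of the source files it reads directly, which is why a change in an included file needs to be propagated to the including page in this way.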
Buildcloth Buildcloth72 is a meta-build tool, used to generate Makefiles programmatically. This makes the build
system easier to maintain, and makes it easier to use the same fundamental code to generate various branches of the
Manual as well as related documentation projects. See makecloth/ in the docs-tools repository73 for the relevant code.
Running make with no arguments will regenerate these parts of the build system automatically.
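As a loose illustration of generating Makefile content programmatically, the following Python sketch builds one rule per Sphinx builder from a shared template. It does not use the Buildcloth API. The function names are hypothetical, and the sphinx-build recipe is an assumed stand-in for whatever commands the real build system emits.

```python
def make_rule(target, dependencies, commands):
    """Return the text of one Makefile rule; recipe lines start with a tab."""
    header = "%s: %s" % (target, " ".join(dependencies))
    lines = [header.rstrip()]
    lines.extend("\t" + command for command in commands)
    return "\n".join(lines) + "\n"


def sphinx_rules(builders, branch):
    """Generate one rule per Sphinx builder, all sharing the same template."""
    rules = [".PHONY: " + " ".join(builders) + "\n"]
    for builder in builders:
        # Output location follows the build/<branch>/<target> layout.
        output = "build/%s/%s" % (branch, builder)
        # Assumed recipe: a plain sphinx-build call for this builder.
        command = "sphinx-build -b %s source %s" % (builder, output)
        rules.append(make_rule(builder, [], [command]))
    return "\n".join(rules)


print(sphinx_rules(["html", "dirhtml", "epub"], "master"))
```

Because the rules come from one template, adding a builder or changing the output layout for another branch is a one-line change rather than an edit to every rule.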
Rstcloth Rstcloth74 is a library for generating reStructuredText programmatically. This makes it possible to generate
content for the documentation, such as tables, tables of contents, and API reference material programmatically and
transparently. See rstcloth/ in the docs-tools repository75 for the relevant code.
If you have any questions, please feel free to open a Jira Case76 .
71 https://github.com/mongodb/docs-tools/blob/master/README.rst
72 https://pypi.python.org/pypi/buildcloth/
73 https://github.com/mongodb/docs-tools/tree/master/makecloth
74 https://pypi.python.org/pypi/rstcloth
75 https://github.com/mongodb/docs-tools/tree/master/rstcloth
76 https://jira.mongodb.org/browse/DOCS
Index
Symbols
_id, 320
_id index, 320
<database>.system.indexes (MongoDB reporting output), 227
<database>.system.js (MongoDB reporting output), 227
<database>.system.namespaces (MongoDB reporting output), 227
<database>.system.profile (MongoDB reporting output), 227
<database>.system.users (MongoDB reporting output), 272
<database>.system.users.pwd (MongoDB reporting output), 273
<database>.system.users.roles (MongoDB reporting output), 273
<database>.system.users.user (MongoDB reporting output), 273
<database>.system.users.userSource (MongoDB reporting output), 273
0 (error code), 235
100 (error code), 236
12 (error code), 236
14 (error code), 236
2 (error code), 235
20 (error code), 236
3 (error code), 236
4 (error code), 236
45 (error code), 236
47 (error code), 236
48 (error code), 236
49 (error code), 236
5 (error code), 236
balancing, 500
configure, 529
internals, 501
migration, 502
operations, 531
secondary throttle, 530
D
data_binary (BSON type), 228
data_date (BSON type), 228
data_maxkey (BSON type), 230
data_minkey (BSON type), 230
data_oid (BSON type), 229
data_ref (BSON type), 229
data_regex (BSON type), 229
data_timestamp (BSON type), 228
data_undefined (BSON type), 230
database, 488
local, 472
database references, 124
dbAdmin (user role), 269
dbAdminAnyDatabase (user role), 271
DBRef, 124
development tutorials, 189
DOWN (replica set state), 475
E
EDITOR, 569
environment variable
EDITOR, 569
HOME, 212
F
failover
replica set, 397
FATAL (replica set state), 475
files._id (MongoDB reporting output), 128
files.aliases (MongoDB reporting output), 128
files.chunkSize (MongoDB reporting output), 128
files.contentType (MongoDB reporting output), 128
files.filename (MongoDB reporting output), 128
files.length (MongoDB reporting output), 128
files.md5 (MongoDB reporting output), 128
files.metadata (MongoDB reporting output), 128
files.uploadDate (MongoDB reporting output), 128
fundamentals
sharding, 489
G
geospatial queries, 355
exact, 355
H
HOME, 212
I
index
_id, 320
background creation, 336
compound, 322, 341
create, 339, 341
create in background, 345
drop duplicates, 338, 342
duplicates, 338, 342
embedded fields, 321
hashed, 333, 343
list indexes, 348
measure use, 349
monitor index building, 348
multikey, 324
name, 338
options, 336
overview, 313
rebuild, 347
remove, 347
replica set, 343
sort order, 323
sparse, 335, 342
subdocuments, 321
TTL index, 334
unique, 334, 341
index types, 319
primary key, 320
installation, 3
installation guides, 3
installation tutorials, 3
internals
config database, 547
L
load balancer, 490
local database, 472
local.oplog.$main (MongoDB reporting output), 474
local.oplog.rs (MongoDB reporting output), 473
local.replset.minvalid (MongoDB reporting output), 473
local.slaves (MongoDB reporting output), 473, 474
local.sources (MongoDB reporting output), 474
local.startup_log (MongoDB reporting output), 472
local.startup_log._id (MongoDB reporting output), 473
M
mongos, 490, 496
mongos load balancer, 490
N
namespace
local, 472
system, 227
nearest (read preference mode), 477
P
primary (read preference mode), 476
PRIMARY (replica set state), 474
primaryPreferred (read preference mode), 476
Q
query optimizer, 37
R
read (user role), 268
read operation
architecture, 38
connection pooling, 38
read operations
query, 31
read preference, 406
background, 406
behavior, 408
member selection, 408
modes, 476
mongos, 410
nearest, 408
ping time, 408
semantics, 476
sharding, 410
tag sets, 408, 450
readAnyDatabase (user role), 271
readWrite (user role), 268
readWriteAnyDatabase (user role), 271
RECOVERING (replica set state), 475
references, 124
replica set
elections, 397
failover, 397
index, 343
local database, 472
network partitions, 397
reconfiguration, 454
resync, 449, 450
rollbacks, 401
security, 240
sync, 410, 449
tag sets, 450
replica set members
arbiters, 388
delayed, 387
hidden, 387
non-voting, 400
ROLLBACK (replica set state), 475
rollbacks, 401
S
secondary (read preference mode), 476
T
tag sets, 408
configuration, 450
text search tutorials, 189
TTL index, 334
tutorials, 186
administration, 187
development patterns, 189
installation, 3
text search, 189
U
UNKNOWN (replica set state), 475
userAdmin (user role), 269
userAdminAnyDatabase (user role), 271
W
write concern, 46
write operations, 43