BSON Types
BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB. The BSON specification is located at bsonspec.org.
Each BSON type has both integer and string identifiers as listed in the following table:
Type | Number | Alias | Notes |
---|---|---|---|
Double | 1 | "double" | |
String | 2 | "string" | |
Object | 3 | "object" | |
Array | 4 | "array" | |
Binary data | 5 | "binData" | |
Undefined | 6 | "undefined" | Deprecated. |
ObjectId | 7 | "objectId" | |
Boolean | 8 | "bool" | |
Date | 9 | "date" | |
Null | 10 | "null" | |
Regular Expression | 11 | "regex" | |
DBPointer | 12 | "dbPointer" | Deprecated. |
JavaScript | 13 | "javascript" | |
Symbol | 14 | "symbol" | Deprecated. |
32-bit integer | 16 | "int" | |
Timestamp | 17 | "timestamp" | |
64-bit integer | 18 | "long" | |
Decimal128 | 19 | "decimal" | |
Min key | -1 | "minKey" | |
Max key | 127 | "maxKey" | |
The $type operator supports using these values to query fields by their BSON type. $type also supports the number alias, which matches the integer, decimal, double, and long BSON types.
The $type aggregation operator returns the BSON type of its argument.
The $isNumber aggregation operator returns true if its argument is a BSON integer, decimal, double, or long.
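As a rough illustration of how the number alias relates to the table above, the following plain JavaScript sketch maps type numbers to aliases and expands number to the four numeric types. The names BSON_TYPE_ALIASES, NUMBER_ALIAS_TYPES, and matchesTypeAlias are hypothetical helpers for this example only. They are not part of MongoDB or its drivers, and the server's actual matching logic may differ.

```javascript
// Type numbers and aliases copied from the table above.
const BSON_TYPE_ALIASES = new Map([
  [1, "double"], [2, "string"], [3, "object"], [4, "array"],
  [5, "binData"], [6, "undefined"], [7, "objectId"], [8, "bool"],
  [9, "date"], [10, "null"], [11, "regex"], [12, "dbPointer"],
  [13, "javascript"], [14, "symbol"], [16, "int"], [17, "timestamp"],
  [18, "long"], [19, "decimal"], [-1, "minKey"], [127, "maxKey"],
]);

// The "number" alias covers the integer, decimal, double, and long types.
const NUMBER_ALIAS_TYPES = new Set(["int", "long", "double", "decimal"]);

// Hypothetical helper: does a value with this type number match the alias?
function matchesTypeAlias(typeNumber, alias) {
  const actual = BSON_TYPE_ALIASES.get(typeNumber);
  if (actual === undefined) return false;
  return alias === "number" ? NUMBER_ALIAS_TYPES.has(actual) : actual === alias;
}

console.log(matchesTypeAlias(16, "number")); // 32-bit integer is numeric
console.log(matchesTypeAlias(2, "number"));  // string is not
```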
To determine a field's type, see Type Checking.
If you convert BSON to JSON, see the Extended JSON reference.
The following sections describe special considerations for particular BSON types.
Binary Data
A BSON binary binData value is a byte array. A binData value has a subtype that indicates how to interpret the binary data. The following table shows the subtypes:
Number | Description |
---|---|
0 | Generic binary subtype |
1 | Function data |
2 | Binary (old) |
3 | UUID (old) |
4 | UUID |
5 | MD5 |
6 | Encrypted BSON value |
7 | Compressed time series data. New in version 5.2. |
8 | Sensitive data, such as a key or secret. MongoDB does not log literal values for binary data with subtype 8. Instead, MongoDB logs a placeholder value. |
9 | Vector data, which is densely packed arrays of numbers of the same type. |
128 | Custom data |
ObjectId
ObjectIds are small, likely unique, fast to generate, and ordered. ObjectId values are 12 bytes in length, consisting of:
A 4-byte timestamp, representing the ObjectId's creation, measured in seconds since the Unix epoch.
A 5-byte random value generated once per process. This random value is unique to the machine and process.
A 3-byte incrementing counter, initialized to a random value.
For timestamp and counter values, the most significant bytes appear first in the byte sequence (big-endian). This is unlike other BSON values, where the least significant bytes appear first (little-endian).
If an integer value is used to create an ObjectId, the integer replaces the timestamp.
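The byte layout described above can be sketched in plain JavaScript. This example assumes the ObjectId is given as its 24-character hexadecimal string (12 bytes, two hex digits per byte). The function name parseObjectIdHex is hypothetical and is not a driver or mongosh API.

```javascript
// Hypothetical helper: split a 24-character hex ObjectId into its three parts.
function parseObjectIdHex(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError("expected a 24-character hexadecimal string");
  }
  // Bytes 0-3: big-endian seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return {
    timestamp: new Date(seconds * 1000),
    // Bytes 4-8: the per-process random value, kept as hex.
    random: hex.slice(8, 18).toLowerCase(),
    // Bytes 9-11: big-endian incrementing counter.
    counter: parseInt(hex.slice(18, 24), 16),
  };
}

const parts = parseObjectIdHex("000000010000000000000002");
console.log(parts.timestamp.toISOString()); // 1970-01-01T00:00:01.000Z
console.log(parts.counter);                 // 2
```

Because the timestamp and counter are big-endian, reading the hex digits left to right gives the most significant digits first, which is why parseInt works directly on each slice.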
In MongoDB, each document stored in a standard collection requires a unique _id field that acts as a primary key. If an inserted document omits the _id field, the MongoDB driver automatically generates an ObjectId for the _id field.
This also applies to documents inserted through update operations with upsert: true.
MongoDB clients should add an _id field with a unique ObjectId. Using ObjectIds for the _id field provides the following additional benefits:
You can access ObjectId creation time in mongosh using the ObjectId.getTimestamp() method.
ObjectIds are approximately ordered by creation time, but are not perfectly ordered. Sorting a collection on an _id field containing ObjectId values is roughly equivalent to sorting by creation time.
Important
While ObjectId values should increase over time, they are not necessarily monotonic. This is because they:
Only contain one second of temporal resolution, so ObjectId values created within the same second do not have a guaranteed ordering, and
Are generated by clients, which may have differing system clocks.
Use the ObjectId() methods to set and retrieve ObjectId values.
Starting in MongoDB 5.0, mongosh replaces the legacy mongo shell. The ObjectId() methods work differently in mongosh than in the legacy mongo shell. For more information on the legacy methods, see Legacy mongo Shell.
String
BSON strings are UTF-8. In general, drivers for each programming language convert from the language's string format to UTF-8 when serializing and deserializing BSON. This makes it possible to store most international characters in BSON strings with ease. [1] In addition, MongoDB $regex queries support UTF-8 in the regex string.
[1] | Given strings using UTF-8 character sets, using sort() on strings will be reasonably correct. However, because sort() internally uses the C++ strcmp API, the sort order may handle some characters incorrectly. |
Timestamps
BSON has a special timestamp type for internal MongoDB use that is not associated with the regular Date type. This internal timestamp type is a 64-bit value where:
the most significant 32 bits are a time_t value (seconds since the Unix epoch)
the least significant 32 bits are an incrementing ordinal for operations within a given second.
While the BSON format is little-endian, and therefore stores the least significant bits first, the mongod instance always compares the time_t value before the ordinal value on all platforms, regardless of endianness.
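The 64-bit layout and the comparison order can be illustrated with BigInt arithmetic. This is a minimal sketch, not how mongod or any driver implements timestamps. The names packTimestamp and unpackTimestamp are hypothetical.

```javascript
const UINT32_MAX = 0xffffffff;

// Hypothetical helper: place seconds in the high 32 bits, ordinal in the low 32.
function packTimestamp(seconds, ordinal) {
  for (const part of [seconds, ordinal]) {
    if (!Number.isInteger(part) || part < 0 || part > UINT32_MAX) {
      throw new RangeError("expected an unsigned 32-bit integer");
    }
  }
  return (BigInt(seconds) << 32n) | BigInt(ordinal);
}

// Hypothetical helper: recover both halves from the packed 64-bit value.
function unpackTimestamp(value) {
  return {
    seconds: Number(value >> 32n),
    ordinal: Number(value & 0xffffffffn),
  };
}

// Comparing packed values compares seconds first, then the ordinal.
const earlier = packTimestamp(1, UINT32_MAX);
const later = packTimestamp(2, 0);
console.log(later > earlier); // true
```

Packing the seconds into the high bits means a single numeric comparison orders by time_t first and falls back to the ordinal only within the same second.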
In replication, the oplog has a ts field. The values in this field reflect the operation time, which uses a BSON timestamp value.
Within a single mongod instance, timestamp values in the oplog are always unique.
Note
The BSON timestamp type is for internal MongoDB use. For most cases, in application development, you will want to use the BSON date type. See Date for more information.
Date
BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). This results in a representable date range of about 290 million years into the past and future.
The official BSON specification refers to the BSON Date type as the UTC datetime.
The BSON Date type is signed. [2] Negative values represent dates before 1970.
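The millisecond encoding can be tried out with the plain JavaScript Date object, which also counts milliseconds since the Unix epoch. The sketch below shows a zero and a negative value, then estimates the size of a signed 64-bit range in years. The estimate assumes 365-day years and integer division, so it is only approximate.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Zero milliseconds is the Unix epoch itself.
const epoch = new Date(0);
// A negative count lands before 1970: exactly one day earlier here.
const dayBefore = new Date(-MS_PER_DAY);

console.log(epoch.toISOString());     // 1970-01-01T00:00:00.000Z
console.log(dayBefore.toISOString()); // 1969-12-31T00:00:00.000Z

// Approximate years covered in each direction by a signed 64-bit count.
const MS_PER_YEAR = 365n * BigInt(MS_PER_DAY);
const yearsEachWay = 2n ** 63n / MS_PER_YEAR;
console.log(yearsEachWay > 290000000n); // true
```

Note that the JavaScript Date object itself accepts a narrower range than a signed 64-bit integer, so the last step uses BigInt rather than constructing a Date.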
To construct a Date in mongosh, you can use the new Date() or ISODate() constructor.
Construct a Date With the new Date() Constructor
To construct a Date with the new Date() constructor, run the following command:
var mydate1 = new Date()
The mydate1 variable outputs a date and time wrapped as an ISODate:
mydate1
ISODate("2020-05-11T20:14:14.796Z")
Construct a Date With the ISODate() Constructor
To construct a Date using the ISODate() constructor, run the following command:
var mydate2 = ISODate()
The mydate2 variable stores a date and time wrapped as an ISODate:
mydate2
ISODate("2020-05-11T20:14:14.796Z")
Convert a Date to a String
To print the Date in a string format, use the toString() method:
mydate1.toString()
Mon May 11 2020 13:14:14 GMT-0700 (Pacific Daylight Time)
Return the Month Portion of a Date
You can also return the month portion of the Date value. Months are zero-indexed, so that January is month 0.
mydate1.getMonth()
4
[2] | Prior to version 2.0, Date values were incorrectly interpreted as unsigned integers, which affected sorts, range queries, and indexes on Date fields. Because indexes are not recreated when upgrading, please re-index if you created an index on Date values with an earlier version, and dates before 1970 are relevant to your application. |
decimal128 BSON Data Type
decimal128 is a 128-bit decimal representation for storing very large or very precise numbers, whenever rounding decimals is important. It was defined as part of the IEEE 754-2008 revision of the floating-point standard, published in August 2008. When you need high precision when working with BSON data types, you should use decimal128.
decimal128 supports 34 decimal digits of precision (the significand), along with an exponent range of -6143 to +6144. The significand is not normalized in the decimal128 standard, allowing for multiple possible representations: 10 x 10^-1 = 1 x 10^0 = .1 x 10^1 = .01 x 10^2, etc. The ability to store maximum and minimum values on the order of 10^6144 and 10^-6143, respectively, gives decimal128 a very wide range in addition to its precision.
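To make the idea of multiple representations concrete, the sketch below models a decimal as an integer coefficient and a power-of-ten exponent, then checks whether two such pairs denote the same value. The object shape and the name sameDecimalValue are assumptions for illustration. This is not the decimal128 bit encoding, and it ignores the 34-digit and exponent limits.

```javascript
// Hypothetical model: value = coefficient x 10^exponent,
// with coefficient as a BigInt and exponent as an integer Number.
function sameDecimalValue(a, b) {
  // Rescale both coefficients to the smaller exponent so they are comparable.
  const minExp = Math.min(a.exponent, b.exponent);
  const scale = (d) => d.coefficient * 10n ** BigInt(d.exponent - minExp);
  return scale(a) === scale(b);
}

const tenTenths = { coefficient: 10n, exponent: -1 }; // 10 x 10^-1
const oneUnit = { coefficient: 1n, exponent: 0 };     // 1 x 10^0
const oneTen = { coefficient: 1n, exponent: 1 };      // 1 x 10^1

console.log(sameDecimalValue(tenTenths, oneUnit)); // true
console.log(sameDecimalValue(oneUnit, oneTen));    // false
```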
Use decimal128 With the NumberDecimal() Constructor
In MongoDB, you can store data in decimal128 format using the NumberDecimal() constructor. If you pass in the decimal value as a string, MongoDB stores the value in the database as follows:
NumberDecimal("9823.1297")
You can also pass in the decimal value as a double:
NumberDecimal(1234.99999999999)
You should also consider the usage and support your programming language has for decimal128. The following languages don't natively support this feature and require a plugin or additional package to get the functionality:
Python: The decimal.Decimal class can be used for decimal floating-point arithmetic.
Java: The Java BigDecimal class provides support for decimal128 numbers.
Node.js: There are several packages that provide support, such as js-big-decimal or node.js bigdecimal available on npm.
Use Cases
When you perform mathematical calculations programmatically, you can sometimes receive unexpected results. The following example in Node.js yields incorrect results:
> 0.1
0.1
> 0.2
0.2
> 0.1 * 0.2
0.020000000000000004
> 0.1 * 0.1
0.010000000000000002
Similarly, the following example in Java produces incorrect output:
class Main {
  public static void main(String[] args) {
    System.out.println("0.1 * 0.2:");
    System.out.println(0.1 * 0.2);
  }
}
0.1 * 0.2:
0.020000000000000004
The same computations in Python, Ruby, Rust, and other languages produce the same results. This happens because binary floating-point numbers do not represent base 10 values well.
For example, the 0.1 used in the above examples is represented in binary as approximately 0.0001100110011001101. Most of the time, this does not cause any significant issues. However, in applications such as finance or banking where precision is important, use decimal128 as your data type.
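The following plain JavaScript sketch contrasts the binary floating-point result with an exact base-10 calculation. The exact version simply counts in tenths using integers, which is one common workaround and only a stand-in for illustration. It is not decimal128 and does not use any MongoDB API.

```javascript
// Binary floating point: 0.1 has no finite base-2 expansion,
// so the stored value and the product are approximations.
const floatProduct = 0.1 * 0.2;
const binaryOfTenth = (0.1).toString(2);

console.log(floatProduct === 0.02); // false
console.log(binaryOfTenth);         // begins 0.000110011001100110011...

// Exact alternative: treat 0.1 and 0.2 as 1 and 2 tenths.
// Multiplying tenths by tenths yields a count of hundredths.
const hundredths = 1n * 2n;
const exactProduct = `0.${hundredths.toString().padStart(2, "0")}`;

console.log(exactProduct); // 0.02
```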