Query execution reference
This page explains the output of a query executed with Query Explain. To learn how to execute a query with Query Explain, see Analyze query execution with Query Explain.Common Concepts
The following common concepts and terms are used throughout the execution tree.
Rows and records
The terms row and record are used to generically refer to a document or index entry.
Variables
The following internal variables can appear in the execution nodes:
__key__
-the key is an internal identifier for a document. This is an absolute, unique identifier with the project, database, and the full path of the document.__id__
-the ID is a unique identifier for a document within its collection. This is unique within a single collection.__$0__…__$N__
-these are context specific variables which are created or referenced in the execution tree. These variables are typically used to refer to the contents of a document or the value of an expression evaluated during the execution of a query.
Consider an example where an extend node is used to extract the
__id__
from the document __key__
:
Extend
| expressions: [_id(__key__) AS __id__]
| records returned: 1
Constraints and ranges
Some scan nodes use constraints
and ranges
attributes to describe the range
of values that are scanned. These attributes use a range tree format which
contains a list of values. These values correspond to the ordered list of keys
which appear in the index definition. For example, the first range which appears
in the tree, here (1..5]
, corresponds to the constraints on the first key,
here a
, in the ordered list of keys:
| index: type=CollectionGroupIndex, id=CICAgOjXh#EK, keys=[a ASC, b ASC, __key__ ASC]
| constraints: /
|----(1..5]
|----[1L]
Each level of indentation indicates the constraint applying to the next key in
the list. Square brackets represent an inclusive range, rounded brackets are an
exclusive range. In this case, the constraint translates to 1 < "a" <= 5
, and
"b" = 1
.
In the following example with multiple branches for a
,
the constraint corresponds to 1 < a <= 5 OR a = 10
:
| constraints: /
|----(1L, 5L]
|----[10L]
Key Variables
In some scan nodes (such as SequentialScan
), there is both a list of keys as
part of the index
attribute, and a separate keys
attribute in the Scan
node. The
keys
attribute in the Scan
node denotes the variable name of each key in the
index definition, in order. The variables can be used to reference the runtime
values of the scanned field further up in the execution tree.
In the following example, the value of the user
field for the current document
maps to variable __$6__
and the value of date_placed
to __$7__
.
index: type=CollectionGroupIndex, id=CICAgOjXh4EK, keys=[user ASC, date_placed ASC, __path__ ASC]
keys: [__$6__ ASC, __$7__ ASC, __path__ ASC]
Execution Nodes
A query execution tree can contain the following nodes.
SeekingScan
Represents a dynamic scan where the rows returned may not be along a single sequential range of the index, and multiple distinct scans must be performed to satisfy the query.
For example, a query where a
exists and b
equals 1 working on an
index of ["a" ASC, "b" ASC]
, would need to scan and return a separate,
potentially non-sequential range for each distinct value of a
.
This is more efficient than a full TableScan
, but less efficient than a single
SequentialScan
on a composite index of ["b" ASC, "a" ASC]
.
• SeekingScan
| constraints: /
|----(-∞..+∞)
|----[1L]
| index: type=CollectionGroupIndex, id=CAE, keys=[user ASC, quantity ASC, __key__ ASC]
| keys: [__$1__ ASC, __$2__ ASC, __key__ ASC]
| properties: Selection { user }
| records returned: 1
| records scanned: 1
SequentialScan
Represents a scan of a static, sequential range of rows in storage that can be performed in a single read operation.
The key ordering length
refers to the number of keys that must be preserved
and returned in original key order. For a schema of [k1, k2, k3]
, a key
ordering length of 0 means the scan can return in any order, 1 means order by
k1, but rows with the same k1 value can come with any order, 3 returns documents
in exact sorted order.
• SequentialScan
| index: type=CollectionGroupIndex, id=CAE, keys=[user ASC, date_placed ASC, __key__ ASC]
| key ordering length: 3
| keys: [__$1__ ASC, __$2__ ASC, __key__ ASC]
| limit: 10
| properties: Selection { a }
| ranges: /
| records returned: 1
| records scanned: 1
UniqueScan
Represents a scan of a static, sequential range of rows in storage with in-memory deduplication of rows.
• UniqueScan
| index: type=CollectionGroupIndex, id=CAE, keys=[user ASC, date_placed ASC, __key__ ASC]
| keys: [__$1__ ASC, __$2__ ASC, __key__ ASC]
| properties: Selection { a }
| ranges: /
|----(-∞..+∞)
| records returned: 1
| records scanned: 1
TableAccess
Back-joins the supplied row's identifier to the actual row contents from primary
storage. TableAccess
is required if a parent node (or the final query
result) requires a subset of fields from the documents.
• TableAccess
| order: PRESERVE_INPUT_ORDER
| peak memory usage: 4.00 KiB (4,096 B)
| properties: *
| records returned: 1
TableScan
A full, unordered scan of a collection. Used when a query is run without an associated index.
Order can be either STABLE
or UNDEFINED
, with STABLE
denoting a
deterministic ordering.
• TableScan
| order: STABLE
| properties: *
| records returned: 1
| records scanned: 1
| source: (default)#/**/collection
HashAggregate
Hash-backed implementation of aggregate operations. Requires materializing the full group in-memory before returning the result and must not exceed the the query memory limit.
• HashAggregate
| aggregations: [sum(__$0__) AS total]
| groups: [a]
| peak memory usage: 4.00 KiB (4,096 B)
| records returned: 0
StreamAggregate
Specialized aggregate node which only maintains state for a single group at a time, reducing peak memory usage. Used when the underlying child node will return groups sequentially. For example, when grouping by distinct values of a field while using an index on that field.
• StreamAggregate
| keys: [foo ASC, bar ASC]
| properties: Selection { baz }
| aggregations: [$sum(foo) AS baz]
MajorSort
Performs a sort operation on a fixed set of properties. Materializes all records in memory at once and returns the sorted values in order, the size of the sort set is limited by the query memory limit.
When a subsequent limit is provided, a top-k sorting algorithm is used to reduce the memory usage. With it, sorts can be performed on an arbitrarily large set of records so long as the memory used by storing the k considered elements does not exceed the limit.
• MajorSort
| fields: [a ASC, b DESC]
| limit: 10
| peak memory usage: 4.00 KiB (4,096 B)
| records returned: 1
Concat
Concatenates the results of multiple child nodes and returns the result to the parent node. This node does not deduplicate results that appear in multiple children, and the order of returned results is nondeterministic.
• Concat
├── • TableAccess
...
├── • TableAccess
Restrict
Projects a set of properties to be passed to its parent node. Reduces memory usage by preventing propagation of unused fields as soon as they are irrelevant to the rest of the query.
• Restrict
| expressions: [foo AS foo, bar AS bar]
| records returned: 0
Extend
Adds a set of fields to each row in the result set. Used when adding fields to a document and as an intermediate node to support more complex operations.
• Extend
| expressions: ["bar" AS foo]
| records returned: 1
Filter
Selectively returns rows if and only if they match the supplied expression.
• Filter
| expression: $eq(foo, "bar")
| records returned: 1
Values
Produces a sequence of literal values to work on. Used primarily when a set list of documents is provided as the input to a query.
• Values
| expression: [{__key__=/col/1}, {__key__=/col/2}]
ReplaceWith
Replaces the fields of rows produced by the child node with the fields from
the supplied map
expression.
• ReplaceWith
| map: map("full_name", str_concat(first_name, " ", last_name)), current_context())
| records returned: 1
Unnest
Unnests the value produced by the child node.
• Unnest
| expression: foo AS unnested_foo
Limit
Limits the number of rows returned to the parent node.
• Limit
| limit: 10
| records returned: 1
Offset
Skips a set number of rows produced by the child node.
• Offset
| offset: 10
| records returned: 1
Drop
Drops a specified set of fields from results produced by the child node.
• Drop
| fields to drop: [__key__]
| records returned: 1