Window
A variant of ring buffer or backtrace logging implemented as a sliding window. Events are kept in a buffer until the flush_when condition is matched. When the buffer is full, the oldest events are dropped.
Configuration
Example configurations
{
"transforms": {
"my_transform_id": {
"type": "window",
"inputs": [
"my-source-or-transform-id"
]
}
}
}
[transforms.my_transform_id]
type = "window"
inputs = [ "my-source-or-transform-id" ]
transforms:
my_transform_id:
type: window
inputs:
- my-source-or-transform-id
{
"transforms": {
"my_transform_id": {
"type": "window",
"inputs": [
"my-source-or-transform-id"
],
"num_events_before": 100
}
}
}
[transforms.my_transform_id]
type = "window"
inputs = [ "my-source-or-transform-id" ]
num_events_before = 100
transforms:
my_transform_id:
type: window
inputs:
- my-source-or-transform-id
num_events_before: 100
flush_when
required condition
A condition used to flush the events.
If the condition resolves to true for an event, the whole window is immediately flushed, including the event itself, and any following events if num_events_after is more than zero.
The condition syntax is selected with type.
Available syntaxes
Syntax | Description | Example |
---|---|---|
vrl | A Vector Remap Language (VRL) Boolean expression. | .status_code != 200 && !includes(["info", "debug"], .severity) |
datadog_search | A Datadog Search query string. | *stack |
is_log | Whether the incoming event is a log. | |
is_metric | Whether the incoming event is a metric. | |
is_trace | Whether the incoming event is a trace. | |
Shorthand for VRL
If you opt for the vrl syntax for this condition, you can set the condition as a string directly on the flush_when parameter, without needing to specify both a source and a type. The table below shows some examples:
Config format | Example |
---|---|
YAML | flush_when: .status == 200 |
TOML | flush_when = ".status == 200" |
JSON | "flush_when": ".status == 200" |
Condition config examples
Standard VRL
flush_when:
type: "vrl"
source: ".status == 500"
flush_when = { type = "vrl", source = ".status == 500" }
"flush_when": {
"type": "vrl",
"source": ".status == 500"
}
forward_when
optional condition
A condition used to pass events through the transform without buffering.
If the condition resolves to true for an event, the event is immediately forwarded without buffering and without preserving the original order of events. Use with caution if the sink cannot handle out-of-order events.
The condition syntax is selected with type.
Available syntaxes
Syntax | Description | Example |
---|---|---|
vrl | A Vector Remap Language (VRL) Boolean expression. | .status_code != 200 && !includes(["info", "debug"], .severity) |
datadog_search | A Datadog Search query string. | *stack |
is_log | Whether the incoming event is a log. | |
is_metric | Whether the incoming event is a metric. | |
is_trace | Whether the incoming event is a trace. | |
Shorthand for VRL
If you opt for the vrl syntax for this condition, you can set the condition as a string directly on the forward_when parameter, without needing to specify both a source and a type. The table below shows some examples:
Config format | Example |
---|---|
YAML | forward_when: .status == 200 |
TOML | forward_when = ".status == 200" |
JSON | "forward_when": ".status == 200" |
Condition config examples
Standard VRL
forward_when:
type: "vrl"
source: ".status == 500"
forward_when = { type = "vrl", source = ".status == 500" }
"forward_when": {
"type": "vrl",
"source": ".status == 500"
}
graph
optional object
Extra graph configuration.
Configures the output for this component when generated with the graph command.
graph.node_attributes
optional object
Node attributes to add to this component’s node in the resulting graph.
They are added to the node as provided.
graph.node_attributes.*
required string literal
inputs
required [string]
A list of upstream source or transform IDs.
Wildcards (*) are supported.
See configuration for more info.
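num_events_after
optional uint
The maximum number of events to keep after the event matched by the flush_when condition.
num_events_before
optional uint
The maximum number of events to keep before the event matched by the flush_when condition.
Default: 100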
Outputs
<component_id>
Telemetry
Metrics
component_discarded_events_total
counter
... filter transform, or false if due to an error.
component_errors_total
counter
component_received_event_bytes_total
counter
component_received_events_count
histogram
A histogram of the number of events passed in each internal batch in Vector’s internal topology.
Note that this is separate from sink-level batching. It is mostly useful for low-level debugging of performance issues in Vector due to small internal batches.
component_received_events_total
counter
component_sent_event_bytes_total
counter
component_sent_events_total
counter
stale_events_flushed_total
counter
utilization
gauge
Examples
Flush recent events when an error happens
Given these events...
[
  {"log":{"level":"info","message":"A01"}},
  {"log":{"level":"debug","message":"A02"}},
  {"log":{"level":"info","message":"A03"}},
  {"log":{"level":"debug","message":"A04"}},
  {"log":{"level":"error","message":"A05"}},
  {"log":{"level":"debug","message":"A06"}},
  {"log":{"level":"warning","message":"A07"}},
  {"log":{"level":"info","message":"A08"}},
  {"log":{"level":"debug","message":"A09"}},
  {"log":{"level":"info","message":"A10"}}
]
...and this configuration...
transforms:
my_transform_id:
type: window
inputs:
- my-source-or-transform-id
flush_when: .level == "error"
num_events_before: 2
num_events_after: 2
[transforms.my_transform_id]
type = "window"
inputs = [ "my-source-or-transform-id" ]
flush_when = '.level == "error"'
num_events_before = 2
num_events_after = 2
{
"transforms": {
"my_transform_id": {
"type": "window",
"inputs": [
"my-source-or-transform-id"
],
"flush_when": ".level == \"error\"",
"num_events_before": 2,
"num_events_after": 2
}
}
}
...these events are produced:
[
  {"log":{"level":"info","message":"A03"}},
  {"log":{"level":"debug","message":"A04"}},
  {"log":{"level":"error","message":"A05"}},
  {"log":{"level":"debug","message":"A06"}},
  {"log":{"level":"warning","message":"A07"}}
]
Pass events through without preserving the order
Given these events...
[
  {"log":{"level":"info","message":"A01"}},
  {"log":{"level":"debug","message":"A02"}},
  {"log":{"level":"info","message":"A03"}},
  {"log":{"level":"debug","message":"A04"}},
  {"log":{"level":"error","message":"A05"}},
  {"log":{"level":"debug","message":"A06"}},
  {"log":{"level":"warning","message":"A07"}},
  {"log":{"level":"info","message":"A08"}},
  {"log":{"level":"debug","message":"A09"}},
  {"log":{"level":"info","message":"A10"}}
]
...and this configuration...
transforms:
my_transform_id:
type: window
inputs:
- my-source-or-transform-id
flush_when: .level == "error"
forward_when: .level == "info"
num_events_before: 2
num_events_after: 2
[transforms.my_transform_id]
type = "window"
inputs = [ "my-source-or-transform-id" ]
flush_when = '.level == "error"'
forward_when = '.level == "info"'
num_events_before = 2
num_events_after = 2
{
"transforms": {
"my_transform_id": {
"type": "window",
"inputs": [
"my-source-or-transform-id"
],
"flush_when": ".level == \"error\"",
"forward_when": ".level == \"info\"",
"num_events_before": 2,
"num_events_after": 2
}
}
}
...these events are produced:
[
  {"log":{"level":"info","message":"A01"}},
  {"log":{"level":"info","message":"A03"}},
  {"log":{"level":"debug","message":"A02"}},
  {"log":{"level":"debug","message":"A04"}},
  {"log":{"level":"error","message":"A05"}},
  {"log":{"level":"debug","message":"A06"}},
  {"log":{"level":"warning","message":"A07"}},
  {"log":{"level":"info","message":"A08"}},
  {"log":{"level":"info","message":"A10"}}
]
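The reordering in the output above can be reproduced with a small Python model. This is only a sketch, not Vector's implementation: the function name window_with_forward and the plain-dict event shape are made up for illustration, and it assumes that forward_when is checked before flush_when and that forwarded events do not count toward num_events_after.

```python
from collections import deque


def window_with_forward(events, flush_when, forward_when,
                        num_events_before, num_events_after):
    """Toy model of the window transform with forward_when (illustrative only)."""
    buffer = deque(maxlen=num_events_before)  # holds at most num_events_before events
    remaining_after = 0                       # events still to emit after a match
    output = []
    for event in events:
        if forward_when(event):
            # Assumption: forwarded events bypass the buffer entirely,
            # so they can overtake older events that are still buffered.
            output.append(event)
        elif flush_when(event):
            output.extend(buffer)             # buffered context, oldest first
            buffer.clear()
            output.append(event)              # the matching event itself
            remaining_after = num_events_after
        elif remaining_after > 0:
            output.append(event)
            remaining_after -= 1
        else:
            buffer.append(event)              # a full deque drops its oldest item
    return output


levels = ["info", "debug", "info", "debug", "error",
          "debug", "warning", "info", "debug", "info"]
events = [{"level": lvl, "message": f"A{i:02d}"}
          for i, lvl in enumerate(levels, start=1)]

result = window_with_forward(
    events,
    flush_when=lambda e: e["level"] == "error",
    forward_when=lambda e: e["level"] == "info",
    num_events_before=2,
    num_events_after=2,
)
forwarded_order = [e["message"] for e in result]
print(forwarded_order)
# ['A01', 'A03', 'A02', 'A04', 'A05', 'A06', 'A07', 'A08', 'A10']
```

In this model, A03 is emitted before A02 because A02 is still sitting in the buffer when A03 is forwarded, and A09 never appears because no later match flushes it.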
How it works
Advantages of Use
A common way to reduce log volume from a verbose system is to filter out less important messages and ingest only errors and warnings. However, an error message by itself may not be sufficient to determine the cause, as surrounding events often contain important context leading up to the failure.
The window transform reduces log volume by filtering out logs while the system is healthy, but preserving detailed logs when they are most relevant.
Sliding Window
As the stream of events passes through the transform, it is observed through a “window” that spans from num_events_before events before to num_events_after events after an event matched by the flush_when condition. When the condition is matched, the whole window is flushed to the output. This is also known as backtrace logging or ring buffer logging.
Past events are stored in a memory buffer with a capacity of num_events_before. When the buffer is full, the oldest events are dropped to make space for new ones. The buffer is not persistent, so in case of a hard system crash, all buffered events will be lost.
Future events are counted from the event matched by the flush_when condition until num_events_after events have been reached.
If the flush_when condition is matched again before the buffer fills up, the buffer is flushed again. If the flush condition is triggered often enough (for example, the system is constantly logging errors), the transform may always be in the flushing state, meaning that no events are filtered out. Therefore it works best for conditions that are relatively uncommon.
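The buffering and flushing behavior described in this section can be modeled in a few lines of Python. This is a minimal sketch rather than Vector's implementation: sliding_window and the plain-dict events are hypothetical, forward_when is left out, and restarting the after-count on every new match is an assumption.

```python
from collections import deque


def sliding_window(events, flush_when, num_events_before, num_events_after):
    """Toy model of the sliding window (illustrative only)."""
    buffer = deque(maxlen=num_events_before)  # ring buffer: oldest events fall off
    remaining_after = 0                       # events still to emit after a match
    output = []
    for event in events:
        if flush_when(event):
            output.extend(buffer)             # flush buffered context, oldest first
            buffer.clear()
            output.append(event)              # the matching event itself
            remaining_after = num_events_after  # assumption: each match restarts the count
        elif remaining_after > 0:
            output.append(event)
            remaining_after -= 1
        else:
            buffer.append(event)
    return output


def is_error(event):
    return event["level"] == "error"


levels = ["info", "debug", "info", "debug", "error",
          "debug", "warning", "info", "debug", "info"]
events = [{"level": lvl, "message": f"A{i:02d}"}
          for i, lvl in enumerate(levels, start=1)]

kept = [e["message"] for e in sliding_window(events, is_error, 2, 2)]
print(kept)  # ['A03', 'A04', 'A05', 'A06', 'A07']

# An error on every third event keeps the model flushing constantly.
noisy = [{"level": "error" if i % 3 == 2 else "debug", "message": f"B{i:02d}"}
         for i in range(12)]
kept_noisy = sliding_window(noisy, is_error, 2, 2)
print(len(kept_noisy), "of", len(noisy))  # 12 of 12
```

The first run matches the "Flush recent events when an error happens" example. The second run illustrates why frequent matches defeat the filtering: with a match every three events and a window of two events on either side, nothing is ever dropped.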