Elasticsearch Serverless API

Base URL
http://api.example.com

Documentation source and versions

This documentation is derived from the main branch of the elasticsearch-specification repository. It is provided under license Attribution-NonCommercial-NoDerivatives 4.0 International.

Last update on May 6, 2025.

This API is provided under license Apache 2.0.

Authentication

Api key auth (http_api_key)

Elasticsearch APIs use key-based authentication. You must create an API key and use the encoded value in the request header. For example:

curl -X GET "${ES_URL}/_cat/indices?v=true" \
  -H "Authorization: ApiKey ${API_KEY}"

For more information about where to find API keys for the Elasticsearch endpoint (${ES_URL}) for a project, go to Get started with Elasticsearch Serverless.

Get behavioral analytics collections Deprecated Technical preview

GET /_application/analytics/{name}

Path parameters

  • name array[string] Required

    A list of analytics collections to limit the returned information

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attribute Show * attribute object
      • event_data_stream object Required
        Hide event_data_stream attribute Show event_data_stream attribute object
GET /_application/analytics/{name}
curl \
 --request GET 'http://api.example.com/_application/analytics/{name}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _application/analytics/my*`
{
  "my_analytics_collection": {
      "event_data_stream": {
          "name": "behavioral_analytics-events-my_analytics_collection"
      }
  },
  "my_analytics_collection2": {
      "event_data_stream": {
          "name": "behavioral_analytics-events-my_analytics_collection2"
      }
  }
}

Create a behavioral analytics collection Deprecated Technical preview

PUT /_application/analytics/{name}

Path parameters

  • name string Required

    The name of the analytics collection to be created or updated.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

    • name string Required
PUT /_application/analytics/{name}
curl \
 --request PUT 'http://api.example.com/_application/analytics/{name}' \
 --header "Authorization: $API_KEY"

Delete a behavioral analytics collection Deprecated Technical preview

DELETE /_application/analytics/{name}

The associated data stream is also deleted.

Path parameters

  • name string Required

    The name of the analytics collection to be deleted

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_application/analytics/{name}
curl \
 --request DELETE 'http://api.example.com/_application/analytics/{name}' \
 --header "Authorization: $API_KEY"

Get behavioral analytics collections Deprecated Technical preview

GET /_application/analytics

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attribute Show * attribute object
      • event_data_stream object Required
        Hide event_data_stream attribute Show event_data_stream attribute object
GET /_application/analytics
curl \
 --request GET 'http://api.example.com/_application/analytics' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _application/analytics/my*`
{
  "my_analytics_collection": {
      "event_data_stream": {
          "name": "behavioral_analytics-events-my_analytics_collection"
      }
  },
  "my_analytics_collection2": {
      "event_data_stream": {
          "name": "behavioral_analytics-events-my_analytics_collection2"
      }
  }
}

Compact and aligned text (CAT)

The compact and aligned text (CAT) APIs aim are intended only for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, it's recommend to use a corresponding JSON API. All the cat commands accept a query string parameter help to see all the headers and info they provide, and the /_cat command alone lists all the available commands.

Get aliases

GET /_cat/aliases

Get the cluster's index aliases, including filter and routing information. This API does not return data stream aliases.

IMPORTANT: CAT APIs are only intended for human consumption using the command line or the Kibana console. They are not intended for use by applications. For application consumption, use the aliases API.

Query parameters

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • The period to wait for a connection to the master node. If the master node is not available before the timeout expires, the request fails and returns an error. To indicated that the request should never timeout, you can set it to -1.

Responses

GET /_cat/aliases
curl \
 --request GET 'http://api.example.com/_cat/aliases' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/aliases?format=json&v=true`. This response shows that `alias2` has configured a filter and `alias3` and `alias4` have routing configurations.
[
  {
    "alias": "alias1",
    "index": "test1",
    "filter": "-",
    "routing.index": "-",
    "routing.search": "-",
    "is_write_index": "true"
  },
  {
    "alias": "alias1",
    "index": "test1",
    "filter": "*",
    "routing.index": "-",
    "routing.search": "-",
    "is_write_index": "true"
  },
  {
    "alias": "alias3",
    "index": "test1",
    "filter": "-",
    "routing.index": "1",
    "routing.search": "1",
    "is_write_index": "true"
  },
  {
    "alias": "alias4",
    "index": "test1",
    "filter": "-",
    "routing.index": "2",
    "routing.search": "1,2",
    "is_write_index": "true"
  }
]

Get aliases

GET /_cat/aliases/{name}

Get the cluster's index aliases, including filter and routing information. This API does not return data stream aliases.

IMPORTANT: CAT APIs are only intended for human consumption using the command line or the Kibana console. They are not intended for use by applications. For application consumption, use the aliases API.

Path parameters

  • name string | array[string] Required

    A comma-separated list of aliases to retrieve. Supports wildcards (*). To retrieve all aliases, omit this parameter or use * or _all.

Query parameters

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • The period to wait for a connection to the master node. If the master node is not available before the timeout expires, the request fails and returns an error. To indicated that the request should never timeout, you can set it to -1.

Responses

GET /_cat/aliases/{name}
curl \
 --request GET 'http://api.example.com/_cat/aliases/{name}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/aliases?format=json&v=true`. This response shows that `alias2` has configured a filter and `alias3` and `alias4` have routing configurations.
[
  {
    "alias": "alias1",
    "index": "test1",
    "filter": "-",
    "routing.index": "-",
    "routing.search": "-",
    "is_write_index": "true"
  },
  {
    "alias": "alias1",
    "index": "test1",
    "filter": "*",
    "routing.index": "-",
    "routing.search": "-",
    "is_write_index": "true"
  },
  {
    "alias": "alias3",
    "index": "test1",
    "filter": "-",
    "routing.index": "1",
    "routing.search": "1",
    "is_write_index": "true"
  },
  {
    "alias": "alias4",
    "index": "test1",
    "filter": "-",
    "routing.index": "2",
    "routing.search": "1,2",
    "is_write_index": "true"
  }
]

Get component templates Added in 5.1.0

GET /_cat/component_templates

Get information about component templates in a cluster. Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.

IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the get component template API.

Query parameters

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • local boolean

    If true, the request computes the list of selected nodes from the local cluster state. If false the list of selected nodes are computed from the cluster state of the master node. In both cases the coordinating node will send requests for further information to each selected node.

  • The period to wait for a connection to the master node.

Responses

GET /_cat/component_templates
curl \
 --request GET 'http://api.example.com/_cat/component_templates' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/component_templates/my-template-*?v=true&s=name&format=json`.
[
  {
    "name": "my-template-1",
    "version": "null",
    "alias_count": "0",
    "mapping_count": "0",
    "settings_count": "1",
    "metadata_count": "0",
    "included_in": "[my-index-template]"
  },
    {
    "name": "my-template-2",
    "version": null,
    "alias_count": "0",
    "mapping_count": "3",
    "settings_count": "0",
    "metadata_count": "0",
    "included_in": "[my-index-template]"
  }
]

Get component templates Added in 5.1.0

GET /_cat/component_templates/{name}

Get information about component templates in a cluster. Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.

IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the get component template API.

Path parameters

  • name string Required

    The name of the component template. It accepts wildcard expressions. If it is omitted, all component templates are returned.

Query parameters

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • local boolean

    If true, the request computes the list of selected nodes from the local cluster state. If false the list of selected nodes are computed from the cluster state of the master node. In both cases the coordinating node will send requests for further information to each selected node.

  • The period to wait for a connection to the master node.

Responses

GET /_cat/component_templates/{name}
curl \
 --request GET 'http://api.example.com/_cat/component_templates/{name}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/component_templates/my-template-*?v=true&s=name&format=json`.
[
  {
    "name": "my-template-1",
    "version": "null",
    "alias_count": "0",
    "mapping_count": "0",
    "settings_count": "1",
    "metadata_count": "0",
    "included_in": "[my-index-template]"
  },
    {
    "name": "my-template-2",
    "version": null,
    "alias_count": "0",
    "mapping_count": "3",
    "settings_count": "0",
    "metadata_count": "0",
    "included_in": "[my-index-template]"
  }
]

Get a document count

GET /_cat/count

Get quick access to a document count for a data stream, an index, or an entire cluster. The document count only includes live documents, not deleted documents which have not yet been removed by the merge process.

IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the count API.

Query parameters

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • epoch number | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • Time of day, expressed as HH:MM:SS

    • count string

      the document count

GET /_cat/count
curl \
 --request GET 'http://api.example.com/_cat/count' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_cat/count/my-index-000001?v=true&format=json`. It retrieves the document count for the `my-index-000001` data stream or index.
[
  {
    "epoch": "1475868259",
    "timestamp": "15:24:20",
    "count": "120"
  }
]
A successful response from `GET /_cat/count?v=true&format=json`. It retrieves the document count for all data streams and indices in the cluster.
[
  {
    "epoch": "1475868259",
    "timestamp": "15:24:20",
    "count": "121"
  }
]

Get a document count

GET /_cat/count/{index}

Get quick access to a document count for a data stream, an index, or an entire cluster. The document count only includes live documents, not deleted documents which have not yet been removed by the merge process.

IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the count API.

Path parameters

  • index string | array[string] Required

    A comma-separated list of data streams, indices, and aliases used to limit the request. It supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • epoch number | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • Time of day, expressed as HH:MM:SS

    • count string

      the document count

GET /_cat/count/{index}
curl \
 --request GET 'http://api.example.com/_cat/count/{index}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_cat/count/my-index-000001?v=true&format=json`. It retrieves the document count for the `my-index-000001` data stream or index.
[
  {
    "epoch": "1475868259",
    "timestamp": "15:24:20",
    "count": "120"
  }
]
A successful response from `GET /_cat/count?v=true&format=json`. It retrieves the document count for all data streams and indices in the cluster.
[
  {
    "epoch": "1475868259",
    "timestamp": "15:24:20",
    "count": "121"
  }
]

Get CAT help

GET /_cat

Get help for the CAT APIs.

Responses

GET /_cat
curl \
 --request GET 'http://api.example.com/_cat' \
 --header "Authorization: $API_KEY"

Get index information

GET /_cat/indices

Get high-level information about indices in a cluster, including backing indices for data streams.

Use this request to get the following information for each index in a cluster:

  • shard count
  • document count
  • deleted document count
  • primary store size
  • total store size of all shards, including shard replicas

These metrics are retrieved directly from Lucene, which Elasticsearch uses internally to power indexing and search. As a result, all document counts include hidden nested documents. To get an accurate count of Elasticsearch documents, use the cat count or count APIs.

CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use an index endpoint.

Query parameters

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • health string

    The health status used to limit returned indices. By default, the response includes indices of any health status.

    Supported values include:

    • green (or GREEN): All shards are assigned.
    • yellow (or YELLOW): All primary shards are assigned, but one or more replica shards are unassigned. If a node in the cluster fails, some data could be unavailable until that node is repaired.
    • red (or RED): One or more primary shards are unassigned, so some data is unavailable. This can occur briefly during cluster startup as primary shards are assigned.

    Values are green, GREEN, yellow, YELLOW, red, or RED.

  • If true, the response includes information from segments that are not loaded into memory.

  • pri boolean

    If true, the response only includes information from primary shards.

  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

  • Period to wait for a connection to the master node.

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

Responses

GET /_cat/indices
curl \
 --request GET 'http://api.example.com/_cat/indices' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_cat/indices/my-index-*?v=true&s=index&format=json`.
[
  {
    "health": "yellow",
    "status": "open",
    "index": "my-index-000001",
    "uuid": "u8FNjxh8Rfy_awN11oDKYQ",
    "pri": "1",
    "rep": "1",
    "docs.count": "1200",
    "docs.deleted": "0",
    "store.size": "88.1kb",
    "pri.store.size": "88.1kb",
    "dataset.size": "88.1kb"
  },
  {
    "health": "green",
    "status": "open",
    "index": "my-index-000002",
    "uuid": "nYFWZEO7TUiOjLQXBaYJpA ",
    "pri": "1",
    "rep": "0",
    "docs.count": "0",
    "docs.deleted": "0",
    "store.size": "260b",
    "pri.store.size": "260b",
    "dataset.size": "260b"
  }
]

Get index information

GET /_cat/indices/{index}

Get high-level information about indices in a cluster, including backing indices for data streams.

Use this request to get the following information for each index in a cluster:

  • shard count
  • document count
  • deleted document count
  • primary store size
  • total store size of all shards, including shard replicas

These metrics are retrieved directly from Lucene, which Elasticsearch uses internally to power indexing and search. As a result, all document counts include hidden nested documents. To get an accurate count of Elasticsearch documents, use the cat count or count APIs.

CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use an index endpoint.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • health string

    The health status used to limit returned indices. By default, the response includes indices of any health status.

    Supported values include:

    • green (or GREEN): All shards are assigned.
    • yellow (or YELLOW): All primary shards are assigned, but one or more replica shards are unassigned. If a node in the cluster fails, some data could be unavailable until that node is repaired.
    • red (or RED): One or more primary shards are unassigned, so some data is unavailable. This can occur briefly during cluster startup as primary shards are assigned.

    Values are green, GREEN, yellow, YELLOW, red, or RED.

  • If true, the response includes information from segments that are not loaded into memory.

  • pri boolean

    If true, the response only includes information from primary shards.

  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

  • Period to wait for a connection to the master node.

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

Responses

GET /_cat/indices/{index}
curl \
 --request GET 'http://api.example.com/_cat/indices/{index}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_cat/indices/my-index-*?v=true&s=index&format=json`.
[
  {
    "health": "yellow",
    "status": "open",
    "index": "my-index-000001",
    "uuid": "u8FNjxh8Rfy_awN11oDKYQ",
    "pri": "1",
    "rep": "1",
    "docs.count": "1200",
    "docs.deleted": "0",
    "store.size": "88.1kb",
    "pri.store.size": "88.1kb",
    "dataset.size": "88.1kb"
  },
  {
    "health": "green",
    "status": "open",
    "index": "my-index-000002",
    "uuid": "nYFWZEO7TUiOjLQXBaYJpA ",
    "pri": "1",
    "rep": "0",
    "docs.count": "0",
    "docs.deleted": "0",
    "store.size": "260b",
    "pri.store.size": "260b",
    "dataset.size": "260b"
  }
]

Get data frame analytics jobs Added in 7.7.0

GET /_cat/ml/data_frame/analytics

Get configuration and usage information about data frame analytics jobs.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get data frame analytics jobs statistics API.

Query parameters

  • Whether to ignore if a wildcard expression matches no configs. (This includes _all string or when no configs have been specified)

  • bytes string

    The unit in which to display byte values

    Values are b, kb, mb, gb, tb, or pb.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • assignment_explanation (or ae): Contains messages relating to the selection of a node.
    • create_time (or ct, createTime): The time when the data frame analytics job was created.
    • description (or d): A description of a job.
    • dest_index (or di, destIndex): Name of the destination index.
    • failure_reason (or fr, failureReason): Contains messages about the reason why a data frame analytics job failed.
    • id: Identifier for the data frame analytics job.
    • model_memory_limit (or mml, modelMemoryLimit): The approximate maximum amount of memory resources that are permitted for the data frame analytics job.
    • node.address (or na, nodeAddress): The network address of the node that the data frame analytics job is assigned to.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that the data frame analytics job is assigned to.
    • node.id (or ni, nodeId): The unique identifier of the node that the data frame analytics job is assigned to.
    • node.name (or nn, nodeName): The name of the node that the data frame analytics job is assigned to.
    • progress (or p): The progress report of the data frame analytics job by phase.
    • source_index (or si, sourceIndex): Name of the source index.
    • state (or s): Current state of the data frame analytics job.
    • type (or t): The type of analysis that the data frame analytics job performs.
    • version (or v): The Elasticsearch version number in which the data frame analytics job was created.
  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • assignment_explanation (or ae): Contains messages relating to the selection of a node.
    • create_time (or ct, createTime): The time when the data frame analytics job was created.
    • description (or d): A description of a job.
    • dest_index (or di, destIndex): Name of the destination index.
    • failure_reason (or fr, failureReason): Contains messages about the reason why a data frame analytics job failed.
    • id: Identifier for the data frame analytics job.
    • model_memory_limit (or mml, modelMemoryLimit): The approximate maximum amount of memory resources that are permitted for the data frame analytics job.
    • node.address (or na, nodeAddress): The network address of the node that the data frame analytics job is assigned to.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that the data frame analytics job is assigned to.
    • node.id (or ni, nodeId): The unique identifier of the node that the data frame analytics job is assigned to.
    • node.name (or nn, nodeName): The name of the node that the data frame analytics job is assigned to.
    • progress (or p): The progress report of the data frame analytics job by phase.
    • source_index (or si, sourceIndex): Name of the source index.
    • state (or s): Current state of the data frame analytics job.
    • type (or t): The type of analysis that the data frame analytics job performs.
    • version (or v): The Elasticsearch version number in which the data frame analytics job was created.
  • time string

    Unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

GET /_cat/ml/data_frame/analytics
curl \
 --request GET 'http://api.example.com/_cat/ml/data_frame/analytics' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/ml/data_frame/analytics?v=true&format=json`.
[
  {
    "id": "classifier_job_1",
    "type": "classification",
    "create_time": "2020-02-12T11:49:09.594Z",
    "state": "stopped"
  },
    {
    "id": "classifier_job_2",
    "type": "classification",
    "create_time": "2020-02-12T11:49:14.479Z",
    "state": "stopped"
  },
  {
    "id": "classifier_job_3",
    "type": "classification",
    "create_time": "2020-02-12T11:49:16.928Z",
    "state": "stopped"
  },
  {
    "id": "classifier_job_4",
    "type": "classification",
    "create_time": "2020-02-12T11:49:19.127Z",
    "state": "stopped"
  },
  {
    "id": "classifier_job_5",
    "type": "classification",
    "create_time": "2020-02-12T11:49:21.349Z",
    "state": "stopped"
  }
]

Get data frame analytics jobs Added in 7.7.0

GET /_cat/ml/data_frame/analytics/{id}

Get configuration and usage information about data frame analytics jobs.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get data frame analytics jobs statistics API.

Path parameters

  • id string Required

    The ID of the data frame analytics to fetch

Query parameters

  • Whether to ignore if a wildcard expression matches no configs. (This includes _all string or when no configs have been specified)

  • bytes string

    The unit in which to display byte values

    Values are b, kb, mb, gb, tb, or pb.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • assignment_explanation (or ae): Contains messages relating to the selection of a node.
    • create_time (or ct, createTime): The time when the data frame analytics job was created.
    • description (or d): A description of a job.
    • dest_index (or di, destIndex): Name of the destination index.
    • failure_reason (or fr, failureReason): Contains messages about the reason why a data frame analytics job failed.
    • id: Identifier for the data frame analytics job.
    • model_memory_limit (or mml, modelMemoryLimit): The approximate maximum amount of memory resources that are permitted for the data frame analytics job.
    • node.address (or na, nodeAddress): The network address of the node that the data frame analytics job is assigned to.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that the data frame analytics job is assigned to.
    • node.id (or ni, nodeId): The unique identifier of the node that the data frame analytics job is assigned to.
    • node.name (or nn, nodeName): The name of the node that the data frame analytics job is assigned to.
    • progress (or p): The progress report of the data frame analytics job by phase.
    • source_index (or si, sourceIndex): Name of the source index.
    • state (or s): Current state of the data frame analytics job.
    • type (or t): The type of analysis that the data frame analytics job performs.
    • version (or v): The Elasticsearch version number in which the data frame analytics job was created.
  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • assignment_explanation (or ae): Contains messages relating to the selection of a node.
    • create_time (or ct, createTime): The time when the data frame analytics job was created.
    • description (or d): A description of a job.
    • dest_index (or di, destIndex): Name of the destination index.
    • failure_reason (or fr, failureReason): Contains messages about the reason why a data frame analytics job failed.
    • id: Identifier for the data frame analytics job.
    • model_memory_limit (or mml, modelMemoryLimit): The approximate maximum amount of memory resources that are permitted for the data frame analytics job.
    • node.address (or na, nodeAddress): The network address of the node that the data frame analytics job is assigned to.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that the data frame analytics job is assigned to.
    • node.id (or ni, nodeId): The unique identifier of the node that the data frame analytics job is assigned to.
    • node.name (or nn, nodeName): The name of the node that the data frame analytics job is assigned to.
    • progress (or p): The progress report of the data frame analytics job by phase.
    • source_index (or si, sourceIndex): Name of the source index.
    • state (or s): Current state of the data frame analytics job.
    • type (or t): The type of analysis that the data frame analytics job performs.
    • version (or v): The Elasticsearch version number in which the data frame analytics job was created.
  • time string

    Unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

GET /_cat/ml/data_frame/analytics/{id}
curl \
 --request GET 'http://api.example.com/_cat/ml/data_frame/analytics/{id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/ml/data_frame/analytics?v=true&format=json`.
[
  {
    "id": "classifier_job_1",
    "type": "classification",
    "create_time": "2020-02-12T11:49:09.594Z",
    "state": "stopped"
  },
    {
    "id": "classifier_job_2",
    "type": "classification",
    "create_time": "2020-02-12T11:49:14.479Z",
    "state": "stopped"
  },
  {
    "id": "classifier_job_3",
    "type": "classification",
    "create_time": "2020-02-12T11:49:16.928Z",
    "state": "stopped"
  },
  {
    "id": "classifier_job_4",
    "type": "classification",
    "create_time": "2020-02-12T11:49:19.127Z",
    "state": "stopped"
  },
  {
    "id": "classifier_job_5",
    "type": "classification",
    "create_time": "2020-02-12T11:49:21.349Z",
    "state": "stopped"
  }
]

Get datafeeds Added in 7.7.0

GET /_cat/ml/datafeeds

Get configuration and usage information about datafeeds. This API returns a maximum of 10,000 datafeeds. If the Elasticsearch security features are enabled, you must have monitor_ml, monitor, manage_ml, or manage cluster privileges to use this API.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get datafeed statistics API.

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no datafeeds that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • ae (or assignment_explanation): For started datafeeds only, contains messages relating to the selection of a node.
    • bc (or buckets.count, bucketsCount): The number of buckets processed.
    • id: A numerical character string that uniquely identifies the datafeed.
    • na (or node.address, nodeAddress): For started datafeeds only, the network address of the node where the datafeed is started.
    • ne (or node.ephemeral_id, nodeEphemeralId): For started datafeeds only, the ephemeral ID of the node where the datafeed is started.
    • ni (or node.id, nodeId): For started datafeeds only, the unique identifier of the node where the datafeed is started.
    • nn (or node.name, nodeName): For started datafeeds only, the name of the node where the datafeed is started.
    • sba (or search.bucket_avg, searchBucketAvg): The average search time per bucket, in milliseconds.
    • sc (or search.count, searchCount): The number of searches run by the datafeed.
    • seah (or search.exp_avg_hour, searchExpAvgHour): The exponential average search time per hour, in milliseconds.
    • st (or search.time, searchTime): The total time the datafeed spent searching, in milliseconds.
    • s (or state): The status of the datafeed: starting, started, stopping, or stopped. If starting, the datafeed has been requested to start but has not yet started. If started, the datafeed is actively receiving data. If stopping, the datafeed has been requested to stop gracefully and is completing its final action. If stopped, the datafeed is stopped and will not receive data until it is re-started.
  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • ae (or assignment_explanation): For started datafeeds only, contains messages relating to the selection of a node.
    • bc (or buckets.count, bucketsCount): The number of buckets processed.
    • id: A numerical character string that uniquely identifies the datafeed.
    • na (or node.address, nodeAddress): For started datafeeds only, the network address of the node where the datafeed is started.
    • ne (or node.ephemeral_id, nodeEphemeralId): For started datafeeds only, the ephemeral ID of the node where the datafeed is started.
    • ni (or node.id, nodeId): For started datafeeds only, the unique identifier of the node where the datafeed is started.
    • nn (or node.name, nodeName): For started datafeeds only, the name of the node where the datafeed is started.
    • sba (or search.bucket_avg, searchBucketAvg): The average search time per bucket, in milliseconds.
    • sc (or search.count, searchCount): The number of searches run by the datafeed.
    • seah (or search.exp_avg_hour, searchExpAvgHour): The exponential average search time per hour, in milliseconds.
    • st (or search.time, searchTime): The total time the datafeed spent searching, in milliseconds.
    • s (or state): The status of the datafeed: starting, started, stopping, or stopped. If starting, the datafeed has been requested to start but has not yet started. If started, the datafeed is actively receiving data. If stopping, the datafeed has been requested to stop gracefully and is completing its final action. If stopped, the datafeed is stopped and will not receive data until it is re-started.
  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string

      The datafeed identifier.

    • state string

      Values are started, stopped, starting, or stopping.

    • For started datafeeds only, contains messages relating to the selection of a node.

    • The number of buckets processed.

    • The number of searches run by the datafeed.

    • The total time the datafeed spent searching, in milliseconds.

    • The average search time per bucket, in milliseconds.

    • The exponential average search time per hour, in milliseconds.

    • node.id string

      The unique identifier of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • The name of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • The ephemeral identifier of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • The network address of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

GET /_cat/ml/datafeeds
curl \
 --request GET 'http://api.example.com/_cat/ml/datafeeds' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/ml/datafeeds?v=true&format=json`.
[
  {
    "id": "datafeed-high_sum_total_sales",
    "state": "stopped",
    "buckets.count": "743",
    "search.count": "7"
  },
  {
    "id": "datafeed-low_request_rate",
    "state": "stopped",
    "buckets.count": "1457",
    "search.count": "3"
  },
  {
    "id": "datafeed-response_code_rates",
    "state": "stopped",
    "buckets.count": "1460",
    "search.count": "18"
  },
  {
    "id": "datafeed-url_scanning",
    "state": "stopped",
    "buckets.count": "1460",
    "search.count": "18"
  }
]

Get datafeeds Added in 7.7.0

GET /_cat/ml/datafeeds/{datafeed_id}

Get configuration and usage information about datafeeds. This API returns a maximum of 10,000 datafeeds. If the Elasticsearch security features are enabled, you must have monitor_ml, monitor, manage_ml, or manage cluster privileges to use this API.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get datafeed statistics API.

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed.

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no datafeeds that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • ae (or assignment_explanation): For started datafeeds only, contains messages relating to the selection of a node.
    • bc (or buckets.count, bucketsCount): The number of buckets processed.
    • id: A numerical character string that uniquely identifies the datafeed.
    • na (or node.address, nodeAddress): For started datafeeds only, the network address of the node where the datafeed is started.
    • ne (or node.ephemeral_id, nodeEphemeralId): For started datafeeds only, the ephemeral ID of the node where the datafeed is started.
    • ni (or node.id, nodeId): For started datafeeds only, the unique identifier of the node where the datafeed is started.
    • nn (or node.name, nodeName): For started datafeeds only, the name of the node where the datafeed is started.
    • sba (or search.bucket_avg, searchBucketAvg): The average search time per bucket, in milliseconds.
    • sc (or search.count, searchCount): The number of searches run by the datafeed.
    • seah (or search.exp_avg_hour, searchExpAvgHour): The exponential average search time per hour, in milliseconds.
    • st (or search.time, searchTime): The total time the datafeed spent searching, in milliseconds.
    • s (or state): The status of the datafeed: starting, started, stopping, or stopped. If starting, the datafeed has been requested to start but has not yet started. If started, the datafeed is actively receiving data. If stopping, the datafeed has been requested to stop gracefully and is completing its final action. If stopped, the datafeed is stopped and will not receive data until it is re-started.
  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • ae (or assignment_explanation): For started datafeeds only, contains messages relating to the selection of a node.
    • bc (or buckets.count, bucketsCount): The number of buckets processed.
    • id: A numerical character string that uniquely identifies the datafeed.
    • na (or node.address, nodeAddress): For started datafeeds only, the network address of the node where the datafeed is started.
    • ne (or node.ephemeral_id, nodeEphemeralId): For started datafeeds only, the ephemeral ID of the node where the datafeed is started.
    • ni (or node.id, nodeId): For started datafeeds only, the unique identifier of the node where the datafeed is started.
    • nn (or node.name, nodeName): For started datafeeds only, the name of the node where the datafeed is started.
    • sba (or search.bucket_avg, searchBucketAvg): The average search time per bucket, in milliseconds.
    • sc (or search.count, searchCount): The number of searches run by the datafeed.
    • seah (or search.exp_avg_hour, searchExpAvgHour): The exponential average search time per hour, in milliseconds.
    • st (or search.time, searchTime): The total time the datafeed spent searching, in milliseconds.
    • s (or state): The status of the datafeed: starting, started, stopping, or stopped. If starting, the datafeed has been requested to start but has not yet started. If started, the datafeed is actively receiving data. If stopping, the datafeed has been requested to stop gracefully and is completing its final action. If stopped, the datafeed is stopped and will not receive data until it is re-started.
  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string

      The datafeed identifier.

    • state string

      Values are started, stopped, starting, or stopping.

    • For started datafeeds only, contains messages relating to the selection of a node.

    • The number of buckets processed.

    • The number of searches run by the datafeed.

    • The total time the datafeed spent searching, in milliseconds.

    • The average search time per bucket, in milliseconds.

    • The exponential average search time per hour, in milliseconds.

    • node.id string

      The unique identifier of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • The name of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • The ephemeral identifier of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • The network address of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

GET /_cat/ml/datafeeds/{datafeed_id}
curl \
 --request GET 'http://api.example.com/_cat/ml/datafeeds/{datafeed_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/ml/datafeeds?v=true&format=json`.
[
  {
    "id": "datafeed-high_sum_total_sales",
    "state": "stopped",
    "buckets.count": "743",
    "search.count": "7"
  },
  {
    "id": "datafeed-low_request_rate",
    "state": "stopped",
    "buckets.count": "1457",
    "search.count": "3"
  },
  {
    "id": "datafeed-response_code_rates",
    "state": "stopped",
    "buckets.count": "1460",
    "search.count": "18"
  },
  {
    "id": "datafeed-url_scanning",
    "state": "stopped",
    "buckets.count": "1460",
    "search.count": "18"
  }
]

Get anomaly detection jobs Added in 7.7.0

GET /_cat/ml/anomaly_detectors

Get configuration and usage information for anomaly detection jobs. This API returns a maximum of 10,000 jobs. If the Elasticsearch security features are enabled, you must have monitor_ml, monitor, manage_ml, or manage cluster privileges to use this API.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get anomaly detection job statistics API.

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no jobs that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • assignment_explanation (or ae): For open anomaly detection jobs only, contains messages relating to the selection of a node to run the job.
    • buckets.count (or bc, bucketsCount): The number of bucket results produced by the job.
    • buckets.time.exp_avg (or btea, bucketsTimeExpAvg): Exponential moving average of all bucket processing times, in milliseconds.
    • buckets.time.exp_avg_hour (or bteah, bucketsTimeExpAvgHour): Exponentially-weighted moving average of bucket processing times calculated in a 1 hour time window, in milliseconds.
    • buckets.time.max (or btmax, bucketsTimeMax): Maximum among all bucket processing times, in milliseconds.
    • buckets.time.min (or btmin, bucketsTimeMin): Minimum among all bucket processing times, in milliseconds.
    • buckets.time.total (or btt, bucketsTimeTotal): Sum of all bucket processing times, in milliseconds.
    • data.buckets (or db, dataBuckets): The number of buckets processed.
    • data.earliest_record (or der, dataEarliestRecord): The timestamp of the earliest chronologically input document.
    • data.empty_buckets (or deb, dataEmptyBuckets): The number of buckets which did not contain any data.
    • data.input_bytes (or dib, dataInputBytes): The number of bytes of input data posted to the anomaly detection job.
    • data.input_fields (or dif, dataInputFields): The total number of fields in input documents posted to the anomaly detection job. This count includes fields that are not used in the analysis. However, be aware that if you are using a datafeed, it extracts only the required fields from the documents it retrieves before posting them to the job.
    • data.input_records (or dir, dataInputRecords): The number of input documents posted to the anomaly detection job.
    • data.invalid_dates (or did, dataInvalidDates): The number of input documents with either a missing date field or a date that could not be parsed.
    • data.last (or dl, dataLast): The timestamp at which data was last analyzed, according to server time.
    • data.last_empty_bucket (or dleb, dataLastEmptyBucket): The timestamp of the last bucket that did not contain any data.
    • data.last_sparse_bucket (or dlsb, dataLastSparseBucket): The timestamp of the last bucket that was considered sparse.
    • data.latest_record (or dlr, dataLatestRecord): The timestamp of the latest chronologically input document.
    • data.missing_fields (or dmf, dataMissingFields): The number of input documents that are missing a field that the anomaly detection job is configured to analyze. Input documents with missing fields are still processed because it is possible that not all fields are missing.
    • data.out_of_order_timestamps (or doot, dataOutOfOrderTimestamps): The number of input documents that have a timestamp chronologically preceding the start of the current anomaly detection bucket offset by the latency window. This information is applicable only when you provide data to the anomaly detection job by using the post data API. These out of order documents are discarded, since jobs require time series data to be in ascending chronological order.
    • data.processed_fields (or dpf, dataProcessedFields): The total number of fields in all the documents that have been processed by the anomaly detection job. Only fields that are specified in the detector configuration object contribute to this count. The timestamp is not included in this count.
    • data.processed_records (or dpr, dataProcessedRecords): The number of input documents that have been processed by the anomaly detection job. This value includes documents with missing fields, since they are nonetheless analyzed. If you use datafeeds and have aggregations in your search query, the processed record count is the number of aggregation results processed, not the number of Elasticsearch documents.
    • data.sparse_buckets (or dsb, dataSparseBuckets): The number of buckets that contained few data points compared to the expected number of data points.
    • forecasts.memory.avg (or fmavg, forecastsMemoryAvg): The average memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.max (or fmmax, forecastsMemoryMax): The maximum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.min (or fmmin, forecastsMemoryMin): The minimum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.total (or fmt, forecastsMemoryTotal): The total memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.records.avg (or fravg, forecastsRecordsAvg): The average number of model_forecast` documents written for forecasts related to the anomaly detection job.
    • forecasts.records.max (or frmax, forecastsRecordsMax): The maximum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.min (or frmin, forecastsRecordsMin): The minimum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.total (or frt, forecastsRecordsTotal): The total number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.time.avg (or ftavg, forecastsTimeAvg): The average runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.max (or ftmax, forecastsTimeMax): The maximum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.min (or ftmin, forecastsTimeMin): The minimum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.total (or ftt, forecastsTimeTotal): The total runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.total (or ft, forecastsTotal): The number of individual forecasts currently available for the job.
    • id: Identifier for the anomaly detection job.
    • model.bucket_allocation_failures (or mbaf, modelBucketAllocationFailures): The number of buckets for which new entities in incoming data were not processed due to insufficient model memory.
    • model.by_fields (or mbf, modelByFields): The number of by field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.bytes (or mb, modelBytes): The number of bytes of memory used by the models. This is the maximum value since the last time the model was persisted. If the job is closed, this value indicates the latest size.
    • model.bytes_exceeded (or mbe, modelBytesExceeded): The number of bytes over the high limit for memory usage at the last allocation failure.
    • model.categorization_status (or mcs, modelCategorizationStatus): The status of categorization for the job: ok or warn. If ok, categorization is performing acceptably well (or not being used at all). If warn, categorization is detecting a distribution of categories that suggests the input data is inappropriate for categorization. Problems could be that there is only one category, more than 90% of categories are rare, the number of categories is greater than 50% of the number of categorized documents, there are no frequently matched categories, or more than 50% of categories are dead.
    • model.categorized_doc_count (or mcdc, modelCategorizedDocCount): The number of documents that have had a field categorized.
    • model.dead_category_count (or mdcc, modelDeadCategoryCount): The number of categories created by categorization that will never be assigned again because another category’s definition makes it a superset of the dead category. Dead categories are a side effect of the way categorization has no prior training.
    • model.failed_category_count (or mdcc, modelFailedCategoryCount): The number of times that categorization wanted to create a new category but couldn’t because the job had hit its model memory limit. This count does not track which specific categories failed to be created. Therefore, you cannot use this value to determine the number of unique categories that were missed.
    • model.frequent_category_count (or mfcc, modelFrequentCategoryCount): The number of categories that match more than 1% of categorized documents.
    • model.log_time (or mlt, modelLogTime): The timestamp when the model stats were gathered, according to server time.
    • model.memory_limit (or mml, modelMemoryLimit): The timestamp when the model stats were gathered, according to server time.
    • model.memory_status (or mms, modelMemoryStatus): The status of the mathematical models: ok, soft_limit, or hard_limit. If ok, the models stayed below the configured value. If soft_limit, the models used more than 60% of the configured memory limit and older unused models will be pruned to free up space. Additionally, in categorization jobs no further category examples will be stored. If hard_limit, the models used more space than the configured memory limit. As a result, not all incoming data was processed.
    • model.over_fields (or mof, modelOverFields): The number of over field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.partition_fields (or mpf, modelPartitionFields): The number of partition field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.rare_category_count (or mrcc, modelRareCategoryCount): The number of categories that match just one categorized document.
    • model.timestamp (or mt, modelTimestamp): The timestamp of the last record when the model stats were gathered.
    • model.total_category_count (or mtcc, modelTotalCategoryCount): The number of categories created by categorization.
    • node.address (or na, nodeAddress): The network address of the node that runs the job. This information is available only for open jobs.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that runs the job. This information is available only for open jobs.
    • node.id (or ni, nodeId): The unique identifier of the node that runs the job. This information is available only for open jobs.
    • node.name (or nn, nodeName): The name of the node that runs the job. This information is available only for open jobs.
    • opened_time (or ot): For open jobs only, the elapsed time for which the job has been open.
    • state (or s): The status of the anomaly detection job: closed, closing, failed, opened, or opening. If closed, the job finished successfully with its model state persisted. The job must be opened before it can accept further data. If closing, the job close action is in progress and has not yet completed. A closing job cannot accept further data. If failed, the job did not finish successfully due to an error. This situation can occur due to invalid input data, a fatal error occurring during the analysis, or an external interaction such as the process being killed by the Linux out of memory (OOM) killer. If the job had irrevocably failed, it must be force closed and then deleted. If the datafeed can be corrected, the job can be closed and then re-opened. If opened, the job is available to receive and process data. If opening, the job open action is in progress and has not yet completed.
  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • assignment_explanation (or ae): For open anomaly detection jobs only, contains messages relating to the selection of a node to run the job.
    • buckets.count (or bc, bucketsCount): The number of bucket results produced by the job.
    • buckets.time.exp_avg (or btea, bucketsTimeExpAvg): Exponential moving average of all bucket processing times, in milliseconds.
    • buckets.time.exp_avg_hour (or bteah, bucketsTimeExpAvgHour): Exponentially-weighted moving average of bucket processing times calculated in a 1 hour time window, in milliseconds.
    • buckets.time.max (or btmax, bucketsTimeMax): Maximum among all bucket processing times, in milliseconds.
    • buckets.time.min (or btmin, bucketsTimeMin): Minimum among all bucket processing times, in milliseconds.
    • buckets.time.total (or btt, bucketsTimeTotal): Sum of all bucket processing times, in milliseconds.
    • data.buckets (or db, dataBuckets): The number of buckets processed.
    • data.earliest_record (or der, dataEarliestRecord): The timestamp of the earliest chronologically input document.
    • data.empty_buckets (or deb, dataEmptyBuckets): The number of buckets which did not contain any data.
    • data.input_bytes (or dib, dataInputBytes): The number of bytes of input data posted to the anomaly detection job.
    • data.input_fields (or dif, dataInputFields): The total number of fields in input documents posted to the anomaly detection job. This count includes fields that are not used in the analysis. However, be aware that if you are using a datafeed, it extracts only the required fields from the documents it retrieves before posting them to the job.
    • data.input_records (or dir, dataInputRecords): The number of input documents posted to the anomaly detection job.
    • data.invalid_dates (or did, dataInvalidDates): The number of input documents with either a missing date field or a date that could not be parsed.
    • data.last (or dl, dataLast): The timestamp at which data was last analyzed, according to server time.
    • data.last_empty_bucket (or dleb, dataLastEmptyBucket): The timestamp of the last bucket that did not contain any data.
    • data.last_sparse_bucket (or dlsb, dataLastSparseBucket): The timestamp of the last bucket that was considered sparse.
    • data.latest_record (or dlr, dataLatestRecord): The timestamp of the latest chronologically input document.
    • data.missing_fields (or dmf, dataMissingFields): The number of input documents that are missing a field that the anomaly detection job is configured to analyze. Input documents with missing fields are still processed because it is possible that not all fields are missing.
    • data.out_of_order_timestamps (or doot, dataOutOfOrderTimestamps): The number of input documents that have a timestamp chronologically preceding the start of the current anomaly detection bucket offset by the latency window. This information is applicable only when you provide data to the anomaly detection job by using the post data API. These out of order documents are discarded, since jobs require time series data to be in ascending chronological order.
    • data.processed_fields (or dpf, dataProcessedFields): The total number of fields in all the documents that have been processed by the anomaly detection job. Only fields that are specified in the detector configuration object contribute to this count. The timestamp is not included in this count.
    • data.processed_records (or dpr, dataProcessedRecords): The number of input documents that have been processed by the anomaly detection job. This value includes documents with missing fields, since they are nonetheless analyzed. If you use datafeeds and have aggregations in your search query, the processed record count is the number of aggregation results processed, not the number of Elasticsearch documents.
    • data.sparse_buckets (or dsb, dataSparseBuckets): The number of buckets that contained few data points compared to the expected number of data points.
    • forecasts.memory.avg (or fmavg, forecastsMemoryAvg): The average memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.max (or fmmax, forecastsMemoryMax): The maximum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.min (or fmmin, forecastsMemoryMin): The minimum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.total (or fmt, forecastsMemoryTotal): The total memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.records.avg (or fravg, forecastsRecordsAvg): The average number of model_forecast` documents written for forecasts related to the anomaly detection job.
    • forecasts.records.max (or frmax, forecastsRecordsMax): The maximum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.min (or frmin, forecastsRecordsMin): The minimum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.total (or frt, forecastsRecordsTotal): The total number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.time.avg (or ftavg, forecastsTimeAvg): The average runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.max (or ftmax, forecastsTimeMax): The maximum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.min (or ftmin, forecastsTimeMin): The minimum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.total (or ftt, forecastsTimeTotal): The total runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.total (or ft, forecastsTotal): The number of individual forecasts currently available for the job.
    • id: Identifier for the anomaly detection job.
    • model.bucket_allocation_failures (or mbaf, modelBucketAllocationFailures): The number of buckets for which new entities in incoming data were not processed due to insufficient model memory.
    • model.by_fields (or mbf, modelByFields): The number of by field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.bytes (or mb, modelBytes): The number of bytes of memory used by the models. This is the maximum value since the last time the model was persisted. If the job is closed, this value indicates the latest size.
    • model.bytes_exceeded (or mbe, modelBytesExceeded): The number of bytes over the high limit for memory usage at the last allocation failure.
    • model.categorization_status (or mcs, modelCategorizationStatus): The status of categorization for the job: ok or warn. If ok, categorization is performing acceptably well (or not being used at all). If warn, categorization is detecting a distribution of categories that suggests the input data is inappropriate for categorization. Problems could be that there is only one category, more than 90% of categories are rare, the number of categories is greater than 50% of the number of categorized documents, there are no frequently matched categories, or more than 50% of categories are dead.
    • model.categorized_doc_count (or mcdc, modelCategorizedDocCount): The number of documents that have had a field categorized.
    • model.dead_category_count (or mdcc, modelDeadCategoryCount): The number of categories created by categorization that will never be assigned again because another category’s definition makes it a superset of the dead category. Dead categories are a side effect of the way categorization has no prior training.
    • model.failed_category_count (or mdcc, modelFailedCategoryCount): The number of times that categorization wanted to create a new category but couldn’t because the job had hit its model memory limit. This count does not track which specific categories failed to be created. Therefore, you cannot use this value to determine the number of unique categories that were missed.
    • model.frequent_category_count (or mfcc, modelFrequentCategoryCount): The number of categories that match more than 1% of categorized documents.
    • model.log_time (or mlt, modelLogTime): The timestamp when the model stats were gathered, according to server time.
    • model.memory_limit (or mml, modelMemoryLimit): The timestamp when the model stats were gathered, according to server time.
    • model.memory_status (or mms, modelMemoryStatus): The status of the mathematical models: ok, soft_limit, or hard_limit. If ok, the models stayed below the configured value. If soft_limit, the models used more than 60% of the configured memory limit and older unused models will be pruned to free up space. Additionally, in categorization jobs no further category examples will be stored. If hard_limit, the models used more space than the configured memory limit. As a result, not all incoming data was processed.
    • model.over_fields (or mof, modelOverFields): The number of over field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.partition_fields (or mpf, modelPartitionFields): The number of partition field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.rare_category_count (or mrcc, modelRareCategoryCount): The number of categories that match just one categorized document.
    • model.timestamp (or mt, modelTimestamp): The timestamp of the last record when the model stats were gathered.
    • model.total_category_count (or mtcc, modelTotalCategoryCount): The number of categories created by categorization.
    • node.address (or na, nodeAddress): The network address of the node that runs the job. This information is available only for open jobs.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that runs the job. This information is available only for open jobs.
    • node.id (or ni, nodeId): The unique identifier of the node that runs the job. This information is available only for open jobs.
    • node.name (or nn, nodeName): The name of the node that runs the job. This information is available only for open jobs.
    • opened_time (or ot): For open jobs only, the elapsed time for which the job has been open.
    • state (or s): The status of the anomaly detection job: closed, closing, failed, opened, or opening. If closed, the job finished successfully with its model state persisted. The job must be opened before it can accept further data. If closing, the job close action is in progress and has not yet completed. A closing job cannot accept further data. If failed, the job did not finish successfully due to an error. This situation can occur due to invalid input data, a fatal error occurring during the analysis, or an external interaction such as the process being killed by the Linux out of memory (OOM) killer. If the job had irrevocably failed, it must be force closed and then deleted. If the datafeed can be corrected, the job can be closed and then re-opened. If opened, the job is available to receive and process data. If opening, the job open action is in progress and has not yet completed.
  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string
    • state string

      Values are closing, closed, opened, failed, or opening.

    • For open jobs only, the amount of time the job has been opened.

    • For open anomaly detection jobs only, contains messages relating to the selection of a node to run the job.

    • The number of input documents that have been processed by the anomaly detection job. This value includes documents with missing fields, since they are nonetheless analyzed. If you use datafeeds and have aggregations in your search query, the processed_record_count is the number of aggregation results processed, not the number of Elasticsearch documents.

    • The total number of fields in all the documents that have been processed by the anomaly detection job. Only fields that are specified in the detector configuration object contribute to this count. The timestamp is not included in this count.

    • The number of input documents posted to the anomaly detection job.

    • The total number of fields in input documents posted to the anomaly detection job. This count includes fields that are not used in the analysis. However, be aware that if you are using a datafeed, it extracts only the required fields from the documents it retrieves before posting them to the job.

    • The number of input documents with either a missing date field or a date that could not be parsed.

    • The number of input documents that are missing a field that the anomaly detection job is configured to analyze. Input documents with missing fields are still processed because it is possible that not all fields are missing. If you are using datafeeds or posting data to the job in JSON format, a high missing_field_count is often not an indication of data issues. It is not necessarily a cause for concern.

    • The number of input documents that have a timestamp chronologically preceding the start of the current anomaly detection bucket offset by the latency window. This information is applicable only when you provide data to the anomaly detection job by using the post data API. These out of order documents are discarded, since jobs require time series data to be in ascending chronological order.

    • The number of buckets which did not contain any data. If your data contains many empty buckets, consider increasing your bucket_span or using functions that are tolerant to gaps in data such as mean, non_null_sum or non_zero_count.

    • The number of buckets that contained few data points compared to the expected number of data points. If your data contains many sparse buckets, consider using a longer bucket_span.

    • The total number of buckets processed.

    • The timestamp of the earliest chronologically input document.

    • The timestamp of the latest chronologically input document.

    • The timestamp at which data was last analyzed, according to server time.

    • The timestamp of the last bucket that did not contain any data.

    • The timestamp of the last bucket that was considered sparse.

    • Values are ok, soft_limit, or hard_limit.

    • The upper limit for model memory usage, checked on increasing values.

    • The number of by field values that were analyzed by the models. This value is cumulative for all detectors in the job.

    • The number of over field values that were analyzed by the models. This value is cumulative for all detectors in the job.

    • The number of partition field values that were analyzed by the models. This value is cumulative for all detectors in the job.

    • The number of buckets for which new entities in incoming data were not processed due to insufficient model memory. This situation is also signified by a hard_limit: memory_status property value.

    • Values are ok or warn.

    • The number of documents that have had a field categorized.

    • The number of categories created by categorization.

    • The number of categories that match more than 1% of categorized documents.

    • The number of categories that match just one categorized document.

    • The number of categories created by categorization that will never be assigned again because another category’s definition makes it a superset of the dead category. Dead categories are a side effect of the way categorization has no prior training.

    • The number of times that categorization wanted to create a new category but couldn’t because the job had hit its model_memory_limit. This count does not track which specific categories failed to be created. Therefore you cannot use this value to determine the number of unique categories that were missed.

    • The timestamp when the model stats were gathered, according to server time.

    • The timestamp of the last record when the model stats were gathered.

    • The number of individual forecasts currently available for the job. A value of one or more indicates that forecasts exist.

    • The minimum memory usage in bytes for forecasts related to the anomaly detection job.

    • The maximum memory usage in bytes for forecasts related to the anomaly detection job.

    • The average memory usage in bytes for forecasts related to the anomaly detection job.

    • The total memory usage in bytes for forecasts related to the anomaly detection job.

    • The minimum number of model_forecast documents written for forecasts related to the anomaly detection job.

    • The maximum number of model_forecast documents written for forecasts related to the anomaly detection job.

    • The average number of model_forecast documents written for forecasts related to the anomaly detection job.

    • The total number of model_forecast documents written for forecasts related to the anomaly detection job.

    • The minimum runtime in milliseconds for forecasts related to the anomaly detection job.

    • The maximum runtime in milliseconds for forecasts related to the anomaly detection job.

    • The average runtime in milliseconds for forecasts related to the anomaly detection job.

    • The total runtime in milliseconds for forecasts related to the anomaly detection job.

    • node.id string
    • The name of the assigned node.

    • The network address of the assigned node.

    • The number of bucket results produced by the job.

    • The sum of all bucket processing times, in milliseconds.

    • The minimum of all bucket processing times, in milliseconds.

    • The maximum of all bucket processing times, in milliseconds.

    • The exponential moving average of all bucket processing times, in milliseconds.

    • The exponential moving average of bucket processing times calculated in a one hour time window, in milliseconds.

GET /_cat/ml/anomaly_detectors
curl \
 --request GET 'http://api.example.com/_cat/ml/anomaly_detectors' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/ml/anomaly_detectors?h=id,s,dpr,mb&v=true&format=json`.
[
  {
    "id": "high_sum_total_sales",
    "s": "closed",
    "dpr": "14022",
    "mb": "1.5mb"
  },
  {
    "id": "low_request_rate",
    "s": "closed",
    "dpr": "1216",
    "mb": "40.5kb"
  },
  {
    "id": "response_code_rates",
    "s": "closed",
    "dpr": "28146",
    "mb": "132.7kb"
  },
  {
    "id": "url_scanning",
    "s": "closed",
    "dpr": "28146",
    "mb": "501.6kb"
  }
]

Get anomaly detection jobs Added in 7.7.0

GET /_cat/ml/anomaly_detectors/{job_id}

Get configuration and usage information for anomaly detection jobs. This API returns a maximum of 10,000 jobs. If the Elasticsearch security features are enabled, you must have monitor_ml, monitor, manage_ml, or manage cluster privileges to use this API.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get anomaly detection job statistics API.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no jobs that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • assignment_explanation (or ae): For open anomaly detection jobs only, contains messages relating to the selection of a node to run the job.
    • buckets.count (or bc, bucketsCount): The number of bucket results produced by the job.
    • buckets.time.exp_avg (or btea, bucketsTimeExpAvg): Exponential moving average of all bucket processing times, in milliseconds.
    • buckets.time.exp_avg_hour (or bteah, bucketsTimeExpAvgHour): Exponentially-weighted moving average of bucket processing times calculated in a 1 hour time window, in milliseconds.
    • buckets.time.max (or btmax, bucketsTimeMax): Maximum among all bucket processing times, in milliseconds.
    • buckets.time.min (or btmin, bucketsTimeMin): Minimum among all bucket processing times, in milliseconds.
    • buckets.time.total (or btt, bucketsTimeTotal): Sum of all bucket processing times, in milliseconds.
    • data.buckets (or db, dataBuckets): The number of buckets processed.
    • data.earliest_record (or der, dataEarliestRecord): The timestamp of the earliest chronologically input document.
    • data.empty_buckets (or deb, dataEmptyBuckets): The number of buckets which did not contain any data.
    • data.input_bytes (or dib, dataInputBytes): The number of bytes of input data posted to the anomaly detection job.
    • data.input_fields (or dif, dataInputFields): The total number of fields in input documents posted to the anomaly detection job. This count includes fields that are not used in the analysis. However, be aware that if you are using a datafeed, it extracts only the required fields from the documents it retrieves before posting them to the job.
    • data.input_records (or dir, dataInputRecords): The number of input documents posted to the anomaly detection job.
    • data.invalid_dates (or did, dataInvalidDates): The number of input documents with either a missing date field or a date that could not be parsed.
    • data.last (or dl, dataLast): The timestamp at which data was last analyzed, according to server time.
    • data.last_empty_bucket (or dleb, dataLastEmptyBucket): The timestamp of the last bucket that did not contain any data.
    • data.last_sparse_bucket (or dlsb, dataLastSparseBucket): The timestamp of the last bucket that was considered sparse.
    • data.latest_record (or dlr, dataLatestRecord): The timestamp of the latest chronologically input document.
    • data.missing_fields (or dmf, dataMissingFields): The number of input documents that are missing a field that the anomaly detection job is configured to analyze. Input documents with missing fields are still processed because it is possible that not all fields are missing.
    • data.out_of_order_timestamps (or doot, dataOutOfOrderTimestamps): The number of input documents that have a timestamp chronologically preceding the start of the current anomaly detection bucket offset by the latency window. This information is applicable only when you provide data to the anomaly detection job by using the post data API. These out of order documents are discarded, since jobs require time series data to be in ascending chronological order.
    • data.processed_fields (or dpf, dataProcessedFields): The total number of fields in all the documents that have been processed by the anomaly detection job. Only fields that are specified in the detector configuration object contribute to this count. The timestamp is not included in this count.
    • data.processed_records (or dpr, dataProcessedRecords): The number of input documents that have been processed by the anomaly detection job. This value includes documents with missing fields, since they are nonetheless analyzed. If you use datafeeds and have aggregations in your search query, the processed record count is the number of aggregation results processed, not the number of Elasticsearch documents.
    • data.sparse_buckets (or dsb, dataSparseBuckets): The number of buckets that contained few data points compared to the expected number of data points.
    • forecasts.memory.avg (or fmavg, forecastsMemoryAvg): The average memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.max (or fmmax, forecastsMemoryMax): The maximum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.min (or fmmin, forecastsMemoryMin): The minimum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.total (or fmt, forecastsMemoryTotal): The total memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.records.avg (or fravg, forecastsRecordsAvg): The average number of model_forecast` documents written for forecasts related to the anomaly detection job.
    • forecasts.records.max (or frmax, forecastsRecordsMax): The maximum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.min (or frmin, forecastsRecordsMin): The minimum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.total (or frt, forecastsRecordsTotal): The total number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.time.avg (or ftavg, forecastsTimeAvg): The average runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.max (or ftmax, forecastsTimeMax): The maximum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.min (or ftmin, forecastsTimeMin): The minimum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.total (or ftt, forecastsTimeTotal): The total runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.total (or ft, forecastsTotal): The number of individual forecasts currently available for the job.
    • id: Identifier for the anomaly detection job.
    • model.bucket_allocation_failures (or mbaf, modelBucketAllocationFailures): The number of buckets for which new entities in incoming data were not processed due to insufficient model memory.
    • model.by_fields (or mbf, modelByFields): The number of by field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.bytes (or mb, modelBytes): The number of bytes of memory used by the models. This is the maximum value since the last time the model was persisted. If the job is closed, this value indicates the latest size.
    • model.bytes_exceeded (or mbe, modelBytesExceeded): The number of bytes over the high limit for memory usage at the last allocation failure.
    • model.categorization_status (or mcs, modelCategorizationStatus): The status of categorization for the job: ok or warn. If ok, categorization is performing acceptably well (or not being used at all). If warn, categorization is detecting a distribution of categories that suggests the input data is inappropriate for categorization. Problems could be that there is only one category, more than 90% of categories are rare, the number of categories is greater than 50% of the number of categorized documents, there are no frequently matched categories, or more than 50% of categories are dead.
    • model.categorized_doc_count (or mcdc, modelCategorizedDocCount): The number of documents that have had a field categorized.
    • model.dead_category_count (or mdcc, modelDeadCategoryCount): The number of categories created by categorization that will never be assigned again because another category’s definition makes it a superset of the dead category. Dead categories are a side effect of the way categorization has no prior training.
    • model.failed_category_count (or mdcc, modelFailedCategoryCount): The number of times that categorization wanted to create a new category but couldn’t because the job had hit its model memory limit. This count does not track which specific categories failed to be created. Therefore, you cannot use this value to determine the number of unique categories that were missed.
    • model.frequent_category_count (or mfcc, modelFrequentCategoryCount): The number of categories that match more than 1% of categorized documents.
    • model.log_time (or mlt, modelLogTime): The timestamp when the model stats were gathered, according to server time.
    • model.memory_limit (or mml, modelMemoryLimit): The timestamp when the model stats were gathered, according to server time.
    • model.memory_status (or mms, modelMemoryStatus): The status of the mathematical models: ok, soft_limit, or hard_limit. If ok, the models stayed below the configured value. If soft_limit, the models used more than 60% of the configured memory limit and older unused models will be pruned to free up space. Additionally, in categorization jobs no further category examples will be stored. If hard_limit, the models used more space than the configured memory limit. As a result, not all incoming data was processed.
    • model.over_fields (or mof, modelOverFields): The number of over field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.partition_fields (or mpf, modelPartitionFields): The number of partition field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.rare_category_count (or mrcc, modelRareCategoryCount): The number of categories that match just one categorized document.
    • model.timestamp (or mt, modelTimestamp): The timestamp of the last record when the model stats were gathered.
    • model.total_category_count (or mtcc, modelTotalCategoryCount): The number of categories created by categorization.
    • node.address (or na, nodeAddress): The network address of the node that runs the job. This information is available only for open jobs.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that runs the job. This information is available only for open jobs.
    • node.id (or ni, nodeId): The unique identifier of the node that runs the job. This information is available only for open jobs.
    • node.name (or nn, nodeName): The name of the node that runs the job. This information is available only for open jobs.
    • opened_time (or ot): For open jobs only, the elapsed time for which the job has been open.
    • state (or s): The status of the anomaly detection job: closed, closing, failed, opened, or opening. If closed, the job finished successfully with its model state persisted. The job must be opened before it can accept further data. If closing, the job close action is in progress and has not yet completed. A closing job cannot accept further data. If failed, the job did not finish successfully due to an error. This situation can occur due to invalid input data, a fatal error occurring during the analysis, or an external interaction such as the process being killed by the Linux out of memory (OOM) killer. If the job had irrevocably failed, it must be force closed and then deleted. If the datafeed can be corrected, the job can be closed and then re-opened. If opened, the job is available to receive and process data. If opening, the job open action is in progress and has not yet completed.
  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • assignment_explanation (or ae): For open anomaly detection jobs only, contains messages relating to the selection of a node to run the job.
    • buckets.count (or bc, bucketsCount): The number of bucket results produced by the job.
    • buckets.time.exp_avg (or btea, bucketsTimeExpAvg): Exponential moving average of all bucket processing times, in milliseconds.
    • buckets.time.exp_avg_hour (or bteah, bucketsTimeExpAvgHour): Exponentially-weighted moving average of bucket processing times calculated in a 1 hour time window, in milliseconds.
    • buckets.time.max (or btmax, bucketsTimeMax): Maximum among all bucket processing times, in milliseconds.
    • buckets.time.min (or btmin, bucketsTimeMin): Minimum among all bucket processing times, in milliseconds.
    • buckets.time.total (or btt, bucketsTimeTotal): Sum of all bucket processing times, in milliseconds.
    • data.buckets (or db, dataBuckets): The number of buckets processed.
    • data.earliest_record (or der, dataEarliestRecord): The timestamp of the earliest chronologically input document.
    • data.empty_buckets (or deb, dataEmptyBuckets): The number of buckets which did not contain any data.
    • data.input_bytes (or dib, dataInputBytes): The number of bytes of input data posted to the anomaly detection job.
    • data.input_fields (or dif, dataInputFields): The total number of fields in input documents posted to the anomaly detection job. This count includes fields that are not used in the analysis. However, be aware that if you are using a datafeed, it extracts only the required fields from the documents it retrieves before posting them to the job.
    • data.input_records (or dir, dataInputRecords): The number of input documents posted to the anomaly detection job.
    • data.invalid_dates (or did, dataInvalidDates): The number of input documents with either a missing date field or a date that could not be parsed.
    • data.last (or dl, dataLast): The timestamp at which data was last analyzed, according to server time.
    • data.last_empty_bucket (or dleb, dataLastEmptyBucket): The timestamp of the last bucket that did not contain any data.
    • data.last_sparse_bucket (or dlsb, dataLastSparseBucket): The timestamp of the last bucket that was considered sparse.
    • data.latest_record (or dlr, dataLatestRecord): The timestamp of the latest chronologically input document.
    • data.missing_fields (or dmf, dataMissingFields): The number of input documents that are missing a field that the anomaly detection job is configured to analyze. Input documents with missing fields are still processed because it is possible that not all fields are missing.
    • data.out_of_order_timestamps (or doot, dataOutOfOrderTimestamps): The number of input documents that have a timestamp chronologically preceding the start of the current anomaly detection bucket offset by the latency window. This information is applicable only when you provide data to the anomaly detection job by using the post data API. These out of order documents are discarded, since jobs require time series data to be in ascending chronological order.
    • data.processed_fields (or dpf, dataProcessedFields): The total number of fields in all the documents that have been processed by the anomaly detection job. Only fields that are specified in the detector configuration object contribute to this count. The timestamp is not included in this count.
    • data.processed_records (or dpr, dataProcessedRecords): The number of input documents that have been processed by the anomaly detection job. This value includes documents with missing fields, since they are nonetheless analyzed. If you use datafeeds and have aggregations in your search query, the processed record count is the number of aggregation results processed, not the number of Elasticsearch documents.
    • data.sparse_buckets (or dsb, dataSparseBuckets): The number of buckets that contained few data points compared to the expected number of data points.
    • forecasts.memory.avg (or fmavg, forecastsMemoryAvg): The average memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.max (or fmmax, forecastsMemoryMax): The maximum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.min (or fmmin, forecastsMemoryMin): The minimum memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.memory.total (or fmt, forecastsMemoryTotal): The total memory usage in bytes for forecasts related to the anomaly detection job.
    • forecasts.records.avg (or fravg, forecastsRecordsAvg): The average number of model_forecast` documents written for forecasts related to the anomaly detection job.
    • forecasts.records.max (or frmax, forecastsRecordsMax): The maximum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.min (or frmin, forecastsRecordsMin): The minimum number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.records.total (or frt, forecastsRecordsTotal): The total number of model_forecast documents written for forecasts related to the anomaly detection job.
    • forecasts.time.avg (or ftavg, forecastsTimeAvg): The average runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.max (or ftmax, forecastsTimeMax): The maximum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.min (or ftmin, forecastsTimeMin): The minimum runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.time.total (or ftt, forecastsTimeTotal): The total runtime in milliseconds for forecasts related to the anomaly detection job.
    • forecasts.total (or ft, forecastsTotal): The number of individual forecasts currently available for the job.
    • id: Identifier for the anomaly detection job.
    • model.bucket_allocation_failures (or mbaf, modelBucketAllocationFailures): The number of buckets for which new entities in incoming data were not processed due to insufficient model memory.
    • model.by_fields (or mbf, modelByFields): The number of by field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.bytes (or mb, modelBytes): The number of bytes of memory used by the models. This is the maximum value since the last time the model was persisted. If the job is closed, this value indicates the latest size.
    • model.bytes_exceeded (or mbe, modelBytesExceeded): The number of bytes over the high limit for memory usage at the last allocation failure.
    • model.categorization_status (or mcs, modelCategorizationStatus): The status of categorization for the job: ok or warn. If ok, categorization is performing acceptably well (or not being used at all). If warn, categorization is detecting a distribution of categories that suggests the input data is inappropriate for categorization. Problems could be that there is only one category, more than 90% of categories are rare, the number of categories is greater than 50% of the number of categorized documents, there are no frequently matched categories, or more than 50% of categories are dead.
    • model.categorized_doc_count (or mcdc, modelCategorizedDocCount): The number of documents that have had a field categorized.
    • model.dead_category_count (or mdcc, modelDeadCategoryCount): The number of categories created by categorization that will never be assigned again because another category’s definition makes it a superset of the dead category. Dead categories are a side effect of the way categorization has no prior training.
    • model.failed_category_count (or mdcc, modelFailedCategoryCount): The number of times that categorization wanted to create a new category but couldn’t because the job had hit its model memory limit. This count does not track which specific categories failed to be created. Therefore, you cannot use this value to determine the number of unique categories that were missed.
    • model.frequent_category_count (or mfcc, modelFrequentCategoryCount): The number of categories that match more than 1% of categorized documents.
    • model.log_time (or mlt, modelLogTime): The timestamp when the model stats were gathered, according to server time.
    • model.memory_limit (or mml, modelMemoryLimit): The timestamp when the model stats were gathered, according to server time.
    • model.memory_status (or mms, modelMemoryStatus): The status of the mathematical models: ok, soft_limit, or hard_limit. If ok, the models stayed below the configured value. If soft_limit, the models used more than 60% of the configured memory limit and older unused models will be pruned to free up space. Additionally, in categorization jobs no further category examples will be stored. If hard_limit, the models used more space than the configured memory limit. As a result, not all incoming data was processed.
    • model.over_fields (or mof, modelOverFields): The number of over field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.partition_fields (or mpf, modelPartitionFields): The number of partition field values that were analyzed by the models. This value is cumulative for all detectors in the job.
    • model.rare_category_count (or mrcc, modelRareCategoryCount): The number of categories that match just one categorized document.
    • model.timestamp (or mt, modelTimestamp): The timestamp of the last record when the model stats were gathered.
    • model.total_category_count (or mtcc, modelTotalCategoryCount): The number of categories created by categorization.
    • node.address (or na, nodeAddress): The network address of the node that runs the job. This information is available only for open jobs.
    • node.ephemeral_id (or ne, nodeEphemeralId): The ephemeral ID of the node that runs the job. This information is available only for open jobs.
    • node.id (or ni, nodeId): The unique identifier of the node that runs the job. This information is available only for open jobs.
    • node.name (or nn, nodeName): The name of the node that runs the job. This information is available only for open jobs.
    • opened_time (or ot): For open jobs only, the elapsed time for which the job has been open.
    • state (or s): The status of the anomaly detection job: closed, closing, failed, opened, or opening. If closed, the job finished successfully with its model state persisted. The job must be opened before it can accept further data. If closing, the job close action is in progress and has not yet completed. A closing job cannot accept further data. If failed, the job did not finish successfully due to an error. This situation can occur due to invalid input data, a fatal error occurring during the analysis, or an external interaction such as the process being killed by the Linux out of memory (OOM) killer. If the job had irrevocably failed, it must be force closed and then deleted. If the datafeed can be corrected, the job can be closed and then re-opened. If opened, the job is available to receive and process data. If opening, the job open action is in progress and has not yet completed.
  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string
    • state string

      Values are closing, closed, opened, failed, or opening.

    • For open jobs only, the amount of time the job has been opened.

    • For open anomaly detection jobs only, contains messages relating to the selection of a node to run the job.

    • The number of input documents that have been processed by the anomaly detection job. This value includes documents with missing fields, since they are nonetheless analyzed. If you use datafeeds and have aggregations in your search query, the processed_record_count is the number of aggregation results processed, not the number of Elasticsearch documents.

    • The total number of fields in all the documents that have been processed by the anomaly detection job. Only fields that are specified in the detector configuration object contribute to this count. The timestamp is not included in this count.

    • The number of input documents posted to the anomaly detection job.

    • The total number of fields in input documents posted to the anomaly detection job. This count includes fields that are not used in the analysis. However, be aware that if you are using a datafeed, it extracts only the required fields from the documents it retrieves before posting them to the job.

    • The number of input documents with either a missing date field or a date that could not be parsed.

    • The number of input documents that are missing a field that the anomaly detection job is configured to analyze. Input documents with missing fields are still processed because it is possible that not all fields are missing. If you are using datafeeds or posting data to the job in JSON format, a high missing_field_count is often not an indication of data issues. It is not necessarily a cause for concern.

    • The number of input documents that have a timestamp chronologically preceding the start of the current anomaly detection bucket offset by the latency window. This information is applicable only when you provide data to the anomaly detection job by using the post data API. These out of order documents are discarded, since jobs require time series data to be in ascending chronological order.

    • The number of buckets which did not contain any data. If your data contains many empty buckets, consider increasing your bucket_span or using functions that are tolerant to gaps in data such as mean, non_null_sum or non_zero_count.

    • The number of buckets that contained few data points compared to the expected number of data points. If your data contains many sparse buckets, consider using a longer bucket_span.

    • The total number of buckets processed.

    • The timestamp of the earliest chronologically input document.

    • The timestamp of the latest chronologically input document.

    • The timestamp at which data was last analyzed, according to server time.

    • The timestamp of the last bucket that did not contain any data.

    • The timestamp of the last bucket that was considered sparse.

    • Values are ok, soft_limit, or hard_limit.

    • The upper limit for model memory usage, checked on increasing values.

    • The number of by field values that were analyzed by the models. This value is cumulative for all detectors in the job.

    • The number of over field values that were analyzed by the models. This value is cumulative for all detectors in the job.

    • The number of partition field values that were analyzed by the models. This value is cumulative for all detectors in the job.

    • The number of buckets for which new entities in incoming data were not processed due to insufficient model memory. This situation is also signified by a hard_limit: memory_status property value.

    • Values are ok or warn.

    • The number of documents that have had a field categorized.

    • The number of categories created by categorization.

    • The number of categories that match more than 1% of categorized documents.

    • The number of categories that match just one categorized document.

    • The number of categories created by categorization that will never be assigned again because another category’s definition makes it a superset of the dead category. Dead categories are a side effect of the way categorization has no prior training.

    • The number of times that categorization wanted to create a new category but couldn’t because the job had hit its model_memory_limit. This count does not track which specific categories failed to be created. Therefore you cannot use this value to determine the number of unique categories that were missed.

    • The timestamp when the model stats were gathered, according to server time.

    • The timestamp of the last record when the model stats were gathered.

    • The number of individual forecasts currently available for the job. A value of one or more indicates that forecasts exist.

    • The minimum memory usage in bytes for forecasts related to the anomaly detection job.

    • The maximum memory usage in bytes for forecasts related to the anomaly detection job.

    • The average memory usage in bytes for forecasts related to the anomaly detection job.

    • The total memory usage in bytes for forecasts related to the anomaly detection job.

    • The minimum number of model_forecast documents written for forecasts related to the anomaly detection job.

    • The maximum number of model_forecast documents written for forecasts related to the anomaly detection job.

    • The average number of model_forecast documents written for forecasts related to the anomaly detection job.

    • The total number of model_forecast documents written for forecasts related to the anomaly detection job.

    • The minimum runtime in milliseconds for forecasts related to the anomaly detection job.

    • The maximum runtime in milliseconds for forecasts related to the anomaly detection job.

    • The average runtime in milliseconds for forecasts related to the anomaly detection job.

    • The total runtime in milliseconds for forecasts related to the anomaly detection job.

    • node.id string
    • The name of the assigned node.

    • The network address of the assigned node.

    • The number of bucket results produced by the job.

    • The sum of all bucket processing times, in milliseconds.

    • The minimum of all bucket processing times, in milliseconds.

    • The maximum of all bucket processing times, in milliseconds.

    • The exponential moving average of all bucket processing times, in milliseconds.

    • The exponential moving average of bucket processing times calculated in a one hour time window, in milliseconds.

GET /_cat/ml/anomaly_detectors/{job_id}
curl \
 --request GET 'http://api.example.com/_cat/ml/anomaly_detectors/{job_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/ml/anomaly_detectors?h=id,s,dpr,mb&v=true&format=json`.
[
  {
    "id": "high_sum_total_sales",
    "s": "closed",
    "dpr": "14022",
    "mb": "1.5mb"
  },
  {
    "id": "low_request_rate",
    "s": "closed",
    "dpr": "1216",
    "mb": "40.5kb"
  },
  {
    "id": "response_code_rates",
    "s": "closed",
    "dpr": "28146",
    "mb": "132.7kb"
  },
  {
    "id": "url_scanning",
    "s": "closed",
    "dpr": "28146",
    "mb": "501.6kb"
  }
]

Get trained models Added in 7.7.0

GET /_cat/ml/trained_models

Get configuration and usage information about inference trained models.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get trained models statistics API.

Query parameters

  • Specifies what to do when the request: contains wildcard expressions and there are no models that match; contains the _all string or no identifiers and there are no matches; contains wildcard expressions and there are only partial matches. If true, the API returns an empty array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • h string | array[string]

    A comma-separated list of column names to display.

    Supported values include:

    • create_time (or ct): The time when the trained model was created.
    • created_by (or c, createdBy): Information on the creator of the trained model.
    • data_frame_analytics_id (or df, dataFrameAnalytics, dfid): Identifier for the data frame analytics job that created the model. Only displayed if it is still available.
    • description (or d): The description of the trained model.
    • heap_size (or hs, modelHeapSize): The estimated heap size to keep the trained model in memory.
    • id: Identifier for the trained model.
    • ingest.count (or ic, ingestCount): The total number of documents that are processed by the model.
    • ingest.current (or icurr, ingestCurrent): The total number of document that are currently being handled by the trained model.
    • ingest.failed (or if, ingestFailed): The total number of failed ingest attempts with the trained model.
    • ingest.pipelines (or ip, ingestPipelines): The total number of ingest pipelines that are referencing the trained model.
    • ingest.time (or it, ingestTime): The total time that is spent processing documents with the trained model.
    • license (or l): The license level of the trained model.
    • operations (or o, modelOperations): The estimated number of operations to use the trained model. This number helps measuring the computational complexity of the model.
    • version (or v): The Elasticsearch version number in which the trained model was created.
  • s string | array[string]

    A comma-separated list of column names or aliases used to sort the response.

    Supported values include:

    • create_time (or ct): The time when the trained model was created.
    • created_by (or c, createdBy): Information on the creator of the trained model.
    • data_frame_analytics_id (or df, dataFrameAnalytics, dfid): Identifier for the data frame analytics job that created the model. Only displayed if it is still available.
    • description (or d): The description of the trained model.
    • heap_size (or hs, modelHeapSize): The estimated heap size to keep the trained model in memory.
    • id: Identifier for the trained model.
    • ingest.count (or ic, ingestCount): The total number of documents that are processed by the model.
    • ingest.current (or icurr, ingestCurrent): The total number of document that are currently being handled by the trained model.
    • ingest.failed (or if, ingestFailed): The total number of failed ingest attempts with the trained model.
    • ingest.pipelines (or ip, ingestPipelines): The total number of ingest pipelines that are referencing the trained model.
    • ingest.time (or it, ingestTime): The total time that is spent processing documents with the trained model.
    • license (or l): The license level of the trained model.
    • operations (or o, modelOperations): The estimated number of operations to use the trained model. This number helps measuring the computational complexity of the model.
    • version (or v): The Elasticsearch version number in which the trained model was created.
  • from number

    Skips the specified number of transforms.

  • size number

    The maximum number of transforms to display.

  • time string

    Unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

GET /_cat/ml/trained_models
curl \
 --request GET 'http://api.example.com/_cat/ml/trained_models' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/ml/trained_models?v=true&format=json`.
[
  {
    "id": "ddddd-1580216177138",
    "heap_size": "0b",
    "operations": "196",
    "create_time": "2025-03-25T00:01:38.662Z",
    "type": "pytorch",
    "ingest.pipelines": "0",
    "data_frame.id": "__none__"
  },
  {
    "id": "lang_ident_model_1",
    "heap_size": "1mb",
    "operations": "39629",
    "create_time": "2019-12-05T12:28:34.594Z",
    "type": "lang_ident",
    "ingest.pipelines": "0",
    "data_frame.id": "__none__"
  }
]

Get trained models Added in 7.7.0

GET /_cat/ml/trained_models/{model_id}

Get configuration and usage information about inference trained models.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get trained models statistics API.

Path parameters

  • model_id string Required

    A unique identifier for the trained model.

Query parameters

  • Specifies what to do when the request: contains wildcard expressions and there are no models that match; contains the _all string or no identifiers and there are no matches; contains wildcard expressions and there are only partial matches. If true, the API returns an empty array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • h string | array[string]

    A comma-separated list of column names to display.

    Supported values include:

    • create_time (or ct): The time when the trained model was created.
    • created_by (or c, createdBy): Information on the creator of the trained model.
    • data_frame_analytics_id (or df, dataFrameAnalytics, dfid): Identifier for the data frame analytics job that created the model. Only displayed if it is still available.
    • description (or d): The description of the trained model.
    • heap_size (or hs, modelHeapSize): The estimated heap size to keep the trained model in memory.
    • id: Identifier for the trained model.
    • ingest.count (or ic, ingestCount): The total number of documents that are processed by the model.
    • ingest.current (or icurr, ingestCurrent): The total number of document that are currently being handled by the trained model.
    • ingest.failed (or if, ingestFailed): The total number of failed ingest attempts with the trained model.
    • ingest.pipelines (or ip, ingestPipelines): The total number of ingest pipelines that are referencing the trained model.
    • ingest.time (or it, ingestTime): The total time that is spent processing documents with the trained model.
    • license (or l): The license level of the trained model.
    • operations (or o, modelOperations): The estimated number of operations to use the trained model. This number helps measuring the computational complexity of the model.
    • version (or v): The Elasticsearch version number in which the trained model was created.
  • s string | array[string]

    A comma-separated list of column names or aliases used to sort the response.

    Supported values include:

    • create_time (or ct): The time when the trained model was created.
    • created_by (or c, createdBy): Information on the creator of the trained model.
    • data_frame_analytics_id (or df, dataFrameAnalytics, dfid): Identifier for the data frame analytics job that created the model. Only displayed if it is still available.
    • description (or d): The description of the trained model.
    • heap_size (or hs, modelHeapSize): The estimated heap size to keep the trained model in memory.
    • id: Identifier for the trained model.
    • ingest.count (or ic, ingestCount): The total number of documents that are processed by the model.
    • ingest.current (or icurr, ingestCurrent): The total number of document that are currently being handled by the trained model.
    • ingest.failed (or if, ingestFailed): The total number of failed ingest attempts with the trained model.
    • ingest.pipelines (or ip, ingestPipelines): The total number of ingest pipelines that are referencing the trained model.
    • ingest.time (or it, ingestTime): The total time that is spent processing documents with the trained model.
    • license (or l): The license level of the trained model.
    • operations (or o, modelOperations): The estimated number of operations to use the trained model. This number helps measuring the computational complexity of the model.
    • version (or v): The Elasticsearch version number in which the trained model was created.
  • from number

    Skips the specified number of transforms.

  • size number

    The maximum number of transforms to display.

  • time string

    Unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

GET /_cat/ml/trained_models/{model_id}
curl \
 --request GET 'http://api.example.com/_cat/ml/trained_models/{model_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/ml/trained_models?v=true&format=json`.
[
  {
    "id": "ddddd-1580216177138",
    "heap_size": "0b",
    "operations": "196",
    "create_time": "2025-03-25T00:01:38.662Z",
    "type": "pytorch",
    "ingest.pipelines": "0",
    "data_frame.id": "__none__"
  },
  {
    "id": "lang_ident_model_1",
    "heap_size": "1mb",
    "operations": "39629",
    "create_time": "2019-12-05T12:28:34.594Z",
    "type": "lang_ident",
    "ingest.pipelines": "0",
    "data_frame.id": "__none__"
  }
]

Get transform information Added in 7.7.0

GET /_cat/transforms

Get configuration and usage information about transforms.

CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get transform statistics API.

Query parameters

  • Specifies what to do when the request: contains wildcard expressions and there are no transforms that match; contains the _all string or no identifiers and there are no matches; contains wildcard expressions and there are only partial matches. If true, it returns an empty transforms array when there are no matches and the subset of results when there are partial matches. If false, the request returns a 404 status code when there are no matches or only partial matches.

  • from number

    Skips the specified number of transforms.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • changes_last_detection_time (or cldt): The timestamp when changes were last detected in the source indices.
    • checkpoint (or cp): The sequence number for the checkpoint.
    • checkpoint_duration_time_exp_avg (or cdtea, checkpointTimeExpAvg): Exponential moving average of the duration of the checkpoint, in milliseconds.
    • checkpoint_progress (or c, checkpointProgress): The progress of the next checkpoint that is currently in progress.
    • create_time (or ct, createTime): The time the transform was created.
    • delete_time (or dtime): The amount of time spent deleting, in milliseconds.
    • description (or d): The description of the transform.
    • dest_index (or di, destIndex): The destination index for the transform. The mappings of the destination index are deduced based on the source fields when possible. If alternate mappings are required, use the Create index API prior to starting the transform.
    • documents_deleted (or docd): The number of documents that have been deleted from the destination index due to the retention policy for this transform.
    • documents_indexed (or doci): The number of documents that have been indexed into the destination index for the transform.
    • docs_per_second (or dps): Specifies a limit on the number of input documents per second. This setting throttles the transform by adding a wait time between search requests. The default value is null, which disables throttling.
    • documents_processed (or docp): The number of documents that have been processed from the source index of the transform.
    • frequency (or f): The interval between checks for changes in the source indices when the transform is running continuously. Also determines the retry interval in the event of transient failures while the transform is searching or indexing. The minimum value is 1s and the maximum is 1h. The default value is 1m.
    • id: Identifier for the transform.
    • index_failure (or if): The number of indexing failures.
    • index_time (or itime): The amount of time spent indexing, in milliseconds.
    • index_total (or it): The number of index operations.
    • indexed_documents_exp_avg (or idea): Exponential moving average of the number of new documents that have been indexed.
    • last_search_time (or lst, lastSearchTime): The timestamp of the last search in the source indices. This field is only shown if the transform is running.
    • max_page_search_size (or mpsz): Defines the initial page size to use for the composite aggregation for each checkpoint. If circuit breaker exceptions occur, the page size is dynamically adjusted to a lower value. The minimum value is 10 and the maximum is 65,536. The default value is 500.
    • pages_processed (or pp): The number of search or bulk index operations processed. Documents are processed in batches instead of individually.
    • pipeline (or p): The unique identifier for an ingest pipeline.
    • processed_documents_exp_avg (or pdea): Exponential moving average of the number of documents that have been processed.
    • processing_time (or pt): The amount of time spent processing results, in milliseconds.
    • reason (or r): If a transform has a failed state, this property provides details about the reason for the failure.
    • search_failure (or sf): The number of search failures.
    • search_time (or stime): The amount of time spent searching, in milliseconds.
    • search_total (or st): The number of search operations on the source index for the transform.
    • source_index (or si, sourceIndex): The source indices for the transform. It can be a single index, an index pattern (for example, "my-index-*"), an array of indices (for example, ["my-index-000001", "my-index-000002"]), or an array of index patterns (for example, ["my-index-*", "my-other-index-*"]. For remote indices use the syntax "remote_name:index_name". If any indices are in remote clusters then the master node and at least one transform node must have the remote_cluster_client node role.
    • state (or s): The status of the transform, which can be one of the following values:

      • aborting: The transform is aborting.
      • failed: The transform failed. For more information about the failure, check the reason field.
      • indexing: The transform is actively processing data and creating new documents.
      • started: The transform is running but not actively indexing data.
      • stopped: The transform is stopped.
      • stopping: The transform is stopping.
    • transform_type (or tt): Indicates the type of transform: batch or continuous.

    • trigger_count (or tc): The number of times the transform has been triggered by the scheduler. For example, the scheduler triggers the transform indexer to check for updates or ingest new data at an interval specified in the frequency property.

    • version (or v): The version of Elasticsearch that existed on the node when the transform was created.

  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • changes_last_detection_time (or cldt): The timestamp when changes were last detected in the source indices.
    • checkpoint (or cp): The sequence number for the checkpoint.
    • checkpoint_duration_time_exp_avg (or cdtea, checkpointTimeExpAvg): Exponential moving average of the duration of the checkpoint, in milliseconds.
    • checkpoint_progress (or c, checkpointProgress): The progress of the next checkpoint that is currently in progress.
    • create_time (or ct, createTime): The time the transform was created.
    • delete_time (or dtime): The amount of time spent deleting, in milliseconds.
    • description (or d): The description of the transform.
    • dest_index (or di, destIndex): The destination index for the transform. The mappings of the destination index are deduced based on the source fields when possible. If alternate mappings are required, use the Create index API prior to starting the transform.
    • documents_deleted (or docd): The number of documents that have been deleted from the destination index due to the retention policy for this transform.
    • documents_indexed (or doci): The number of documents that have been indexed into the destination index for the transform.
    • docs_per_second (or dps): Specifies a limit on the number of input documents per second. This setting throttles the transform by adding a wait time between search requests. The default value is null, which disables throttling.
    • documents_processed (or docp): The number of documents that have been processed from the source index of the transform.
    • frequency (or f): The interval between checks for changes in the source indices when the transform is running continuously. Also determines the retry interval in the event of transient failures while the transform is searching or indexing. The minimum value is 1s and the maximum is 1h. The default value is 1m.
    • id: Identifier for the transform.
    • index_failure (or if): The number of indexing failures.
    • index_time (or itime): The amount of time spent indexing, in milliseconds.
    • index_total (or it): The number of index operations.
    • indexed_documents_exp_avg (or idea): Exponential moving average of the number of new documents that have been indexed.
    • last_search_time (or lst, lastSearchTime): The timestamp of the last search in the source indices. This field is only shown if the transform is running.
    • max_page_search_size (or mpsz): Defines the initial page size to use for the composite aggregation for each checkpoint. If circuit breaker exceptions occur, the page size is dynamically adjusted to a lower value. The minimum value is 10 and the maximum is 65,536. The default value is 500.
    • pages_processed (or pp): The number of search or bulk index operations processed. Documents are processed in batches instead of individually.
    • pipeline (or p): The unique identifier for an ingest pipeline.
    • processed_documents_exp_avg (or pdea): Exponential moving average of the number of documents that have been processed.
    • processing_time (or pt): The amount of time spent processing results, in milliseconds.
    • reason (or r): If a transform has a failed state, this property provides details about the reason for the failure.
    • search_failure (or sf): The number of search failures.
    • search_time (or stime): The amount of time spent searching, in milliseconds.
    • search_total (or st): The number of search operations on the source index for the transform.
    • source_index (or si, sourceIndex): The source indices for the transform. It can be a single index, an index pattern (for example, "my-index-*"), an array of indices (for example, ["my-index-000001", "my-index-000002"]), or an array of index patterns (for example, ["my-index-*", "my-other-index-*"]. For remote indices use the syntax "remote_name:index_name". If any indices are in remote clusters then the master node and at least one transform node must have the remote_cluster_client node role.
    • state (or s): The status of the transform, which can be one of the following values:

      • aborting: The transform is aborting.
      • failed: The transform failed. For more information about the failure, check the reason field.
      • indexing: The transform is actively processing data and creating new documents.
      • started: The transform is running but not actively indexing data.
      • stopped: The transform is stopped.
      • stopping: The transform is stopping.
    • transform_type (or tt): Indicates the type of transform: batch or continuous.

    • trigger_count (or tc): The number of times the transform has been triggered by the scheduler. For example, the scheduler triggers the transform indexer to check for updates or ingest new data at an interval specified in the frequency property.

    • version (or v): The version of Elasticsearch that existed on the node when the transform was created.

  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

  • size number

    The maximum number of transforms to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string
    • state string

      The status of the transform. Returned values include: aborting: The transform is aborting. failed: The transform failed. For more information about the failure, check thereasonfield. indexing: The transform is actively processing data and creating new documents. started: The transform is running but not actively indexing data. stopped: The transform is stopped. stopping`: The transform is stopping.

    • The sequence number for the checkpoint.

    • The number of documents that have been processed from the source index of the transform.

    • checkpoint_progress string | null

      The progress of the next checkpoint that is currently in progress.

    • last_search_time string | null

      The timestamp of the last search in the source indices. This field is shown only if the transform is running.

    • changes_last_detection_time string | null

      The timestamp when changes were last detected in the source indices.

    • The time the transform was created.

    • version string
    • The source indices for the transform.

    • The destination index for the transform.

    • pipeline string

      The unique identifier for the ingest pipeline.

    • The description of the transform.

    • The type of transform: batch or continuous.

    • The interval between checks for changes in the source indices when the transform is running continuously.

    • The initial page size that is used for the composite aggregation for each checkpoint.

    • The number of input documents per second.

    • reason string

      If a transform has a failed state, these details describe the reason for failure.

    • The total number of search operations on the source index for the transform.

    • The total number of search failures.

    • The total amount of search time, in milliseconds.

    • The total number of index operations done by the transform.

    • The total number of indexing failures.

    • The total time spent indexing documents, in milliseconds.

    • The number of documents that have been indexed into the destination index for the transform.

    • The total time spent deleting documents, in milliseconds.

    • The number of documents deleted from the destination index due to the retention policy for the transform.

    • The number of times the transform has been triggered by the scheduler. For example, the scheduler triggers the transform indexer to check for updates or ingest new data at an interval specified in the frequency property.

    • The number of search or bulk index operations processed. Documents are processed in batches instead of individually.

    • The total time spent processing results, in milliseconds.

    • The exponential moving average of the duration of the checkpoint, in milliseconds.

    • The exponential moving average of the number of new documents that have been indexed.

    • The exponential moving average of the number of documents that have been processed.

GET /_cat/transforms
curl \
 --request GET 'http://api.example.com/_cat/transforms' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_cat/transforms?v=true&format=json`.
[
  {
    "id" : "ecommerce_transform",
    "state" : "started",
    "checkpoint" : "1",
    "documents_processed" : "705",
    "checkpoint_progress" : "100.00",
    "changes_last_detection_time" : null
  }
]

Get transform information Added in 7.7.0

GET /_cat/transforms/{transform_id}

Get configuration and usage information about transforms.

CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get transform statistics API.

Path parameters

  • transform_id string Required

    A transform identifier or a wildcard expression. If you do not specify one of these options, the API returns information for all transforms.

Query parameters

  • Specifies what to do when the request: contains wildcard expressions and there are no transforms that match; contains the _all string or no identifiers and there are no matches; contains wildcard expressions and there are only partial matches. If true, it returns an empty transforms array when there are no matches and the subset of results when there are partial matches. If false, the request returns a 404 status code when there are no matches or only partial matches.

  • from number

    Skips the specified number of transforms.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • changes_last_detection_time (or cldt): The timestamp when changes were last detected in the source indices.
    • checkpoint (or cp): The sequence number for the checkpoint.
    • checkpoint_duration_time_exp_avg (or cdtea, checkpointTimeExpAvg): Exponential moving average of the duration of the checkpoint, in milliseconds.
    • checkpoint_progress (or c, checkpointProgress): The progress of the next checkpoint that is currently in progress.
    • create_time (or ct, createTime): The time the transform was created.
    • delete_time (or dtime): The amount of time spent deleting, in milliseconds.
    • description (or d): The description of the transform.
    • dest_index (or di, destIndex): The destination index for the transform. The mappings of the destination index are deduced based on the source fields when possible. If alternate mappings are required, use the Create index API prior to starting the transform.
    • documents_deleted (or docd): The number of documents that have been deleted from the destination index due to the retention policy for this transform.
    • documents_indexed (or doci): The number of documents that have been indexed into the destination index for the transform.
    • docs_per_second (or dps): Specifies a limit on the number of input documents per second. This setting throttles the transform by adding a wait time between search requests. The default value is null, which disables throttling.
    • documents_processed (or docp): The number of documents that have been processed from the source index of the transform.
    • frequency (or f): The interval between checks for changes in the source indices when the transform is running continuously. Also determines the retry interval in the event of transient failures while the transform is searching or indexing. The minimum value is 1s and the maximum is 1h. The default value is 1m.
    • id: Identifier for the transform.
    • index_failure (or if): The number of indexing failures.
    • index_time (or itime): The amount of time spent indexing, in milliseconds.
    • index_total (or it): The number of index operations.
    • indexed_documents_exp_avg (or idea): Exponential moving average of the number of new documents that have been indexed.
    • last_search_time (or lst, lastSearchTime): The timestamp of the last search in the source indices. This field is only shown if the transform is running.
    • max_page_search_size (or mpsz): Defines the initial page size to use for the composite aggregation for each checkpoint. If circuit breaker exceptions occur, the page size is dynamically adjusted to a lower value. The minimum value is 10 and the maximum is 65,536. The default value is 500.
    • pages_processed (or pp): The number of search or bulk index operations processed. Documents are processed in batches instead of individually.
    • pipeline (or p): The unique identifier for an ingest pipeline.
    • processed_documents_exp_avg (or pdea): Exponential moving average of the number of documents that have been processed.
    • processing_time (or pt): The amount of time spent processing results, in milliseconds.
    • reason (or r): If a transform has a failed state, this property provides details about the reason for the failure.
    • search_failure (or sf): The number of search failures.
    • search_time (or stime): The amount of time spent searching, in milliseconds.
    • search_total (or st): The number of search operations on the source index for the transform.
    • source_index (or si, sourceIndex): The source indices for the transform. It can be a single index, an index pattern (for example, "my-index-*"), an array of indices (for example, ["my-index-000001", "my-index-000002"]), or an array of index patterns (for example, ["my-index-*", "my-other-index-*"]. For remote indices use the syntax "remote_name:index_name". If any indices are in remote clusters then the master node and at least one transform node must have the remote_cluster_client node role.
    • state (or s): The status of the transform, which can be one of the following values:

      • aborting: The transform is aborting.
      • failed: The transform failed. For more information about the failure, check the reason field.
      • indexing: The transform is actively processing data and creating new documents.
      • started: The transform is running but not actively indexing data.
      • stopped: The transform is stopped.
      • stopping: The transform is stopping.
    • transform_type (or tt): Indicates the type of transform: batch or continuous.

    • trigger_count (or tc): The number of times the transform has been triggered by the scheduler. For example, the scheduler triggers the transform indexer to check for updates or ingest new data at an interval specified in the frequency property.

    • version (or v): The version of Elasticsearch that existed on the node when the transform was created.

  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • changes_last_detection_time (or cldt): The timestamp when changes were last detected in the source indices.
    • checkpoint (or cp): The sequence number for the checkpoint.
    • checkpoint_duration_time_exp_avg (or cdtea, checkpointTimeExpAvg): Exponential moving average of the duration of the checkpoint, in milliseconds.
    • checkpoint_progress (or c, checkpointProgress): The progress of the next checkpoint that is currently in progress.
    • create_time (or ct, createTime): The time the transform was created.
    • delete_time (or dtime): The amount of time spent deleting, in milliseconds.
    • description (or d): The description of the transform.
    • dest_index (or di, destIndex): The destination index for the transform. The mappings of the destination index are deduced based on the source fields when possible. If alternate mappings are required, use the Create index API prior to starting the transform.
    • documents_deleted (or docd): The number of documents that have been deleted from the destination index due to the retention policy for this transform.
    • documents_indexed (or doci): The number of documents that have been indexed into the destination index for the transform.
    • docs_per_second (or dps): Specifies a limit on the number of input documents per second. This setting throttles the transform by adding a wait time between search requests. The default value is null, which disables throttling.
    • documents_processed (or docp): The number of documents that have been processed from the source index of the transform.
    • frequency (or f): The interval between checks for changes in the source indices when the transform is running continuously. Also determines the retry interval in the event of transient failures while the transform is searching or indexing. The minimum value is 1s and the maximum is 1h. The default value is 1m.
    • id: Identifier for the transform.
    • index_failure (or if): The number of indexing failures.
    • index_time (or itime): The amount of time spent indexing, in milliseconds.
    • index_total (or it): The number of index operations.
    • indexed_documents_exp_avg (or idea): Exponential moving average of the number of new documents that have been indexed.
    • last_search_time (or lst, lastSearchTime): The timestamp of the last search in the source indices. This field is only shown if the transform is running.
    • max_page_search_size (or mpsz): Defines the initial page size to use for the composite aggregation for each checkpoint. If circuit breaker exceptions occur, the page size is dynamically adjusted to a lower value. The minimum value is 10 and the maximum is 65,536. The default value is 500.
    • pages_processed (or pp): The number of search or bulk index operations processed. Documents are processed in batches instead of individually.
    • pipeline (or p): The unique identifier for an ingest pipeline.
    • processed_documents_exp_avg (or pdea): Exponential moving average of the number of documents that have been processed.
    • processing_time (or pt): The amount of time spent processing results, in milliseconds.
    • reason (or r): If a transform has a failed state, this property provides details about the reason for the failure.
    • search_failure (or sf): The number of search failures.
    • search_time (or stime): The amount of time spent searching, in milliseconds.
    • search_total (or st): The number of search operations on the source index for the transform.
    • source_index (or si, sourceIndex): The source indices for the transform. It can be a single index, an index pattern (for example, "my-index-*"), an array of indices (for example, ["my-index-000001", "my-index-000002"]), or an array of index patterns (for example, ["my-index-*", "my-other-index-*"]. For remote indices use the syntax "remote_name:index_name". If any indices are in remote clusters then the master node and at least one transform node must have the remote_cluster_client node role.
    • state (or s): The status of the transform, which can be one of the following values:

      • aborting: The transform is aborting.
      • failed: The transform failed. For more information about the failure, check the reason field.
      • indexing: The transform is actively processing data and creating new documents.
      • started: The transform is running but not actively indexing data.
      • stopped: The transform is stopped.
      • stopping: The transform is stopping.
    • transform_type (or tt): Indicates the type of transform: batch or continuous.

    • trigger_count (or tc): The number of times the transform has been triggered by the scheduler. For example, the scheduler triggers the transform indexer to check for updates or ingest new data at an interval specified in the frequency property.

    • version (or v): The version of Elasticsearch that existed on the node when the transform was created.

  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

  • size number

    The maximum number of transforms to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string
    • state string

      The status of the transform. Returned values include: aborting: The transform is aborting. failed: The transform failed. For more information about the failure, check thereasonfield. indexing: The transform is actively processing data and creating new documents. started: The transform is running but not actively indexing data. stopped: The transform is stopped. stopping`: The transform is stopping.

    • The sequence number for the checkpoint.

    • The number of documents that have been processed from the source index of the transform.

    • checkpoint_progress string | null

      The progress of the next checkpoint that is currently in progress.

    • last_search_time string | null

      The timestamp of the last search in the source indices. This field is shown only if the transform is running.

    • changes_last_detection_time string | null

      The timestamp when changes were last detected in the source indices.

    • The time the transform was created.

    • version string
    • The source indices for the transform.

    • The destination index for the transform.

    • pipeline string

      The unique identifier for the ingest pipeline.

    • The description of the transform.

    • The type of transform: batch or continuous.

    • The interval between checks for changes in the source indices when the transform is running continuously.

    • The initial page size that is used for the composite aggregation for each checkpoint.

    • The number of input documents per second.

    • reason string

      If a transform has a failed state, these details describe the reason for failure.

    • The total number of search operations on the source index for the transform.

    • The total number of search failures.

    • The total amount of search time, in milliseconds.

    • The total number of index operations done by the transform.

    • The total number of indexing failures.

    • The total time spent indexing documents, in milliseconds.

    • The number of documents that have been indexed into the destination index for the transform.

    • The total time spent deleting documents, in milliseconds.

    • The number of documents deleted from the destination index due to the retention policy for the transform.

    • The number of times the transform has been triggered by the scheduler. For example, the scheduler triggers the transform indexer to check for updates or ingest new data at an interval specified in the frequency property.

    • The number of search or bulk index operations processed. Documents are processed in batches instead of individually.

    • The total time spent processing results, in milliseconds.

    • The exponential moving average of the duration of the checkpoint, in milliseconds.

    • The exponential moving average of the number of new documents that have been indexed.

    • The exponential moving average of the number of documents that have been processed.

GET /_cat/transforms/{transform_id}
curl \
 --request GET 'http://api.example.com/_cat/transforms/{transform_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_cat/transforms?v=true&format=json`.
[
  {
    "id" : "ecommerce_transform",
    "state" : "started",
    "checkpoint" : "1",
    "documents_processed" : "705",
    "checkpoint_progress" : "100.00",
    "changes_last_detection_time" : null
  }
]

Get cluster info Added in 8.9.0

GET /_info/{target}

Returns basic information about the cluster.

Path parameters

  • target string | array[string] Required

    Limits the information returned to the specific target. Supports a comma-separated list, such as http,ingest.

    Supported values include: _all, http, ingest, thread_pool, script

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • cluster_name string Required
    • http object
      Hide http attributes Show http attributes object
      • Current number of open HTTP connections for the node.

      • Total number of HTTP connections opened for the node.

      • clients array[object]

        Information on current and recently-closed HTTP client connections. Clients that have been closed longer than the http.client_stats.closed_channels.max_age setting will not be represented here.

        Hide clients attributes Show clients attributes object
        • id number

          Unique ID for the HTTP client.

        • agent string

          Reported agent for the HTTP client. If unavailable, this property is not included in the response.

        • Local address for the HTTP connection.

        • Remote address for the HTTP connection.

        • last_uri string

          The URI of the client’s most recent request.

        • Time at which the client opened the connection.

        • Time at which the client closed the connection if the connection is closed.

        • Time of the most recent request from this client.

        • Number of requests from this client.

        • Cumulative size in bytes of all requests from this client.

        • Value from the client’s x-opaque-id HTTP header. If unavailable, this property is not included in the response.

    • ingest object
      Hide ingest attributes Show ingest attributes object
      • Contains statistics about ingest pipelines for the node.

        Hide pipelines attribute Show pipelines attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • count number Required

            Total number of documents ingested during the lifetime of this node.

          • current number Required

            Total number of documents currently being ingested.

          • failed number Required

            Total number of failed ingest operations during the lifetime of this node.

          • processors array[object] Required

            Total number of ingest processors.

            Hide processors attribute Show processors attribute object
            • * object Additional properties
          • Time unit for milliseconds

          • ingested_as_first_pipeline_in_bytes number Required Added in 8.15.0

            Total number of bytes of all documents ingested by the pipeline. This field is only present on pipelines which are the first to process a document. Thus, it is not present on pipelines which only serve as a final pipeline after a default pipeline, a pipeline run after a reroute processor, or pipelines in pipeline processors.

          • produced_as_first_pipeline_in_bytes number Required Added in 8.15.0

            Total number of bytes of all documents produced by the pipeline. This field is only present on pipelines which are the first to process a document. Thus, it is not present on pipelines which only serve as a final pipeline after a default pipeline, a pipeline run after a reroute processor, or pipelines in pipeline processors. In situations where there are subsequent pipelines, the value represents the size of the document after all pipelines have run.

      • total object
        Hide total attributes Show total attributes object
        • count number Required

          Total number of documents ingested during the lifetime of this node.

        • current number Required

          Total number of documents currently being ingested.

        • failed number Required

          Total number of failed ingest operations during the lifetime of this node.

        • Time unit for milliseconds

    • Hide thread_pool attribute Show thread_pool attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • active number

          Number of active threads in the thread pool.

        • Number of tasks completed by the thread pool executor.

        • largest number

          Highest number of active threads in the thread pool.

        • queue number

          Number of tasks in queue for the thread pool.

        • rejected number

          Number of tasks rejected by the thread pool executor.

        • threads number

          Number of threads in the thread pool.

    • script object
      Hide script attributes Show script attributes object
GET /_info/{target}
curl \
 --request GET 'http://api.example.com/_info/{target}' \
 --header "Authorization: $API_KEY"

Ping the cluster

HEAD /

Get information about whether the cluster is running.

Responses

HEAD /
curl \
 --request HEAD 'http://api.example.com/' \
 --header "Authorization: $API_KEY"

Connector

The connector and sync jobs APIs provide a convenient way to create and manage Elastic connectors and sync jobs in an internal index. Connectors are Elasticsearch integrations for syncing content from third-party data sources, which can be deployed on Elastic Cloud or hosted on your own infrastructure. This API provides an alternative to relying solely on Kibana UI for connector and sync job management. The API comes with a set of validations and assertions to ensure that the state representation in the internal index remains valid. This API requires the manage_connector privilege or, for read-only endpoints, the monitor_connector privilege.

Check out the connector API tutorial

Check in a connector Technical preview

PUT /_connector/{connector_id}/_check_in

Update the last_seen field in the connector and set it to the current timestamp.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be checked in

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_check_in
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_check_in' \
 --header "Authorization: $API_KEY"
Response examples (200)
{
    "result": "updated"
}

Get a connector Beta

GET /_connector/{connector_id}

Get the details about a connector.

Path parameters

Query parameters

  • A flag to indicate if the desired connector should be fetched, even if it was soft-deleted.

Responses

GET /_connector/{connector_id}
curl \
 --request GET 'http://api.example.com/_connector/{connector_id}' \
 --header "Authorization: $API_KEY"

Create or update a connector Beta

PUT /_connector/{connector_id}

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be created or updated. ID is auto-generated if not provided.

application/json

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

    • id string Required
PUT /_connector/{connector_id}
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"index_name\": \"search-google-drive\",\n  \"name\": \"My Connector\",\n  \"service_type\": \"google_drive\"\n}"'
Request examples
{
  "index_name": "search-google-drive",
  "name": "My Connector",
  "service_type": "google_drive"
}
{
  "index_name": "search-google-drive",
  "name": "My Connector",
  "description": "My Connector to sync data to Elastic index from Google Drive",
  "service_type": "google_drive",
  "language": "english"
}
Response examples (200)
{
  "result": "created",
  "id": "my-connector"
}

Delete a connector Beta

DELETE /_connector/{connector_id}

Removes a connector and associated sync jobs. This is a destructive action that is not recoverable. NOTE: This action doesn’t delete any API keys, ingest pipelines, or data indices associated with the connector. These need to be removed manually.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be deleted

Query parameters

  • A flag indicating if associated sync jobs should be also removed. Defaults to false.

  • hard boolean

    A flag indicating if the connector should be hard deleted.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_connector/{connector_id}
curl \
 --request DELETE 'http://api.example.com/_connector/{connector_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
{
    "acknowledged": true
}

Get all connectors Beta

GET /_connector

Get information about all connectors.

Query parameters

  • from number

    Starting offset (default: 0)

  • size number

    Specifies a max number of results to get

  • index_name string | array[string]

    A comma-separated list of connector index names to fetch connector documents for

  • connector_name string | array[string]

    A comma-separated list of connector names to fetch connector documents for

  • service_type string | array[string]

    A comma-separated list of connector service types to fetch connector documents for

  • A flag to indicate if the desired connector should be fetched, even if it was soft-deleted.

  • query string

    A wildcard query string that filters connectors with matching name, description or index name

Responses

GET /_connector
curl \
 --request GET 'http://api.example.com/_connector' \
 --header "Authorization: $API_KEY"
application/json

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

    • id string Required
PUT /_connector
curl \
 --request PUT 'http://api.example.com/_connector' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"index_name\": \"search-google-drive\",\n  \"name\": \"My Connector\",\n  \"service_type\": \"google_drive\"\n}"'
Request examples
{
  "index_name": "search-google-drive",
  "name": "My Connector",
  "service_type": "google_drive"
}
{
  "index_name": "search-google-drive",
  "name": "My Connector",
  "description": "My Connector to sync data to Elastic index from Google Drive",
  "service_type": "google_drive",
  "language": "english"
}
Response examples (200)
{
  "result": "created",
  "id": "my-connector"
}

Create a connector Beta

POST /_connector

Connectors are Elasticsearch integrations that bring content from third-party data sources, which can be deployed on Elastic Cloud or hosted on your own infrastructure. Elastic managed connectors (Native connectors) are a managed service on Elastic Cloud. Self-managed connectors (Connector clients) are self-managed on your infrastructure.

application/json

Body

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

    • id string Required
POST /_connector
curl \
 --request POST 'http://api.example.com/_connector' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"description":"string","index_name":"string","is_native":true,"language":"string","name":"string","service_type":"string"}'

Cancel a connector sync job Beta

PUT /_connector/_sync_job/{connector_sync_job_id}/_cancel

Cancel a connector sync job, which sets the status to cancelling and updates cancellation_requested_at to the current time. The connector service is then responsible for setting the status of connector sync jobs to cancelled.

Path parameters

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/_sync_job/{connector_sync_job_id}/_cancel
curl \
 --request PUT 'http://api.example.com/_connector/_sync_job/{connector_sync_job_id}/_cancel' \
 --header "Authorization: $API_KEY"

Get a connector sync job Beta

GET /_connector/_sync_job/{connector_sync_job_id}

Path parameters

Responses

GET /_connector/_sync_job/{connector_sync_job_id}
curl \
 --request GET 'http://api.example.com/_connector/_sync_job/{connector_sync_job_id}' \
 --header "Authorization: $API_KEY"

Delete a connector sync job Beta

DELETE /_connector/_sync_job/{connector_sync_job_id}

Remove a connector sync job and its associated data. This is a destructive action that is not recoverable.

Path parameters

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_connector/_sync_job/{connector_sync_job_id}
curl \
 --request DELETE 'http://api.example.com/_connector/_sync_job/{connector_sync_job_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
{
  "acknowledged": true
}

Get all connector sync jobs Beta

GET /_connector/_sync_job

Get information about all stored connector sync jobs listed by their creation date in ascending order.

Query parameters

  • from number

    Starting offset (default: 0)

  • size number

    Specifies a max number of results to get

  • status string

    A sync job status to fetch connector sync jobs for

    Values are canceling, canceled, completed, error, in_progress, pending, or suspended.

  • A connector id to fetch connector sync jobs for

  • job_type string | array[string]

    A comma-separated list of job types to fetch the sync jobs for

    Supported values include: full, incremental, access_control

Responses

GET /_connector/_sync_job
curl \
 --request GET 'http://api.example.com/_connector/_sync_job' \
 --header "Authorization: $API_KEY"

Create a connector sync job Beta

POST /_connector/_sync_job

Create a connector sync job document in the internal index and initialize its counters and timestamps with default values.

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • id string Required
POST /_connector/_sync_job
curl \
 --request POST 'http://api.example.com/_connector/_sync_job' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"id\": \"connector-id\",\n  \"job_type\": \"full\",\n  \"trigger_method\": \"on_demand\"\n}"'
Request example
{
  "id": "connector-id",
  "job_type": "full",
  "trigger_method": "on_demand"
}

Activate the connector draft filter Technical preview

PUT /_connector/{connector_id}/_filtering/_activate

Activates the valid draft filtering for a connector.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_filtering/_activate
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_filtering/_activate' \
 --header "Authorization: $API_KEY"

Update the connector API key ID Beta

PUT /_connector/{connector_id}/_api_key_id

Update the api_key_id and api_key_secret_id fields of a connector. You can specify the ID of the API key used for authorization and the ID of the connector secret where the API key is stored. The connector secret ID is required only for Elastic managed (native) connectors. Self-managed connectors (connector clients) do not use this field.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_api_key_id
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_api_key_id' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"api_key_id\": \"my-api-key-id\",\n    \"api_key_secret_id\": \"my-connector-secret-id\"\n}"'
Request example
{
    "api_key_id": "my-api-key-id",
    "api_key_secret_id": "my-connector-secret-id"
}
Response examples (200)
{
  "result": "updated"
}

Update the connector configuration Beta

PUT /_connector/{connector_id}/_configuration

Update the configuration field in the connector document.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_configuration
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_configuration' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"values\": {\n        \"tenant_id\": \"my-tenant-id\",\n        \"tenant_name\": \"my-sharepoint-site\",\n        \"client_id\": \"foo\",\n        \"secret_value\": \"bar\",\n        \"site_collections\": \"*\"\n    }\n}"'
{
    "values": {
        "tenant_id": "my-tenant-id",
        "tenant_name": "my-sharepoint-site",
        "client_id": "foo",
        "secret_value": "bar",
        "site_collections": "*"
    }
}
{
    "values": {
        "secret_value": "foo-bar"
    }
}
Response examples (200)
{
  "result": "updated"
}

Update the connector error field Technical preview

PUT /_connector/{connector_id}/_error

Set the error field for the connector. If the error provided in the request body is non-null, the connector’s status is updated to error. Otherwise, if the error is reset to null, the connector status is updated to connected.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

  • error string | null

    One of:

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_error
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_error' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"error\": \"Houston, we have a problem!\"\n}"'
Request example
{
    "error": "Houston, we have a problem!"
}
Response examples (200)
{
  "result": "updated"
}

Update the connector filtering Beta

PUT /_connector/{connector_id}/_filtering

Update the draft filtering configuration of a connector and marks the draft validation state as edited. The filtering draft is activated once validated by the running Elastic connector service. The filtering property is used to configure sync rules (both basic and advanced) for a connector.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

  • filtering array[object]
    Hide filtering attributes Show filtering attributes object
    • active object Required
      Hide active attributes Show active attributes object
      • advanced_snippet object Required
        Hide advanced_snippet attributes Show advanced_snippet attributes object
        • created_at string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • updated_at string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • value object Required
      • rules array[object] Required
        Hide rules attributes Show rules attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • id string Required
        • order number Required
        • policy string Required

          Values are exclude or include.

        • rule string Required

          Values are contains, ends_with, equals, regex, starts_with, >, or <.

        • value string Required
      • validation object Required
        Hide validation attributes Show validation attributes object
        • errors array[object] Required
          Hide errors attributes Show errors attributes object
        • state string Required

          Values are edited, invalid, or valid.

    • domain string
    • draft object Required
      Hide draft attributes Show draft attributes object
      • advanced_snippet object Required
        Hide advanced_snippet attributes Show advanced_snippet attributes object
        • created_at string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • updated_at string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • value object Required
      • rules array[object] Required
        Hide rules attributes Show rules attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • id string Required
        • order number Required
        • policy string Required

          Values are exclude or include.

        • rule string Required

          Values are contains, ends_with, equals, regex, starts_with, >, or <.

        • value string Required
      • validation object Required
        Hide validation attributes Show validation attributes object
        • errors array[object] Required
          Hide errors attributes Show errors attributes object
        • state string Required

          Values are edited, invalid, or valid.

  • rules array[object]
    Hide rules attributes Show rules attributes object
    • created_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • id string Required
    • order number Required
    • policy string Required

      Values are exclude or include.

    • rule string Required

      Values are contains, ends_with, equals, regex, starts_with, >, or <.

    • updated_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • value string Required
  • Hide advanced_snippet attributes Show advanced_snippet attributes object
    • created_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • updated_at string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • value object Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_filtering
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_filtering' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"rules\": [\n         {\n            \"field\": \"file_extension\",\n            \"id\": \"exclude-txt-files\",\n            \"order\": 0,\n            \"policy\": \"exclude\",\n            \"rule\": \"equals\",\n            \"value\": \"txt\"\n        },\n        {\n            \"field\": \"_\",\n            \"id\": \"DEFAULT\",\n            \"order\": 1,\n            \"policy\": \"include\",\n            \"rule\": \"regex\",\n            \"value\": \".*\"\n        }\n    ]\n}"'
Request examples
{
    "rules": [
         {
            "field": "file_extension",
            "id": "exclude-txt-files",
            "order": 0,
            "policy": "exclude",
            "rule": "equals",
            "value": "txt"
        },
        {
            "field": "_",
            "id": "DEFAULT",
            "order": 1,
            "policy": "include",
            "rule": "regex",
            "value": ".*"
        }
    ]
}
{
    "advanced_snippet": {
        "value": [{
            "tables": [
                "users",
                "orders"
            ],
            "query": "SELECT users.id AS id, orders.order_id AS order_id FROM users JOIN orders ON users.id = orders.user_id"
        }]
    }
}
Response examples (200)
{
  "result": "updated"
}

Update the connector draft filtering validation Technical preview

PUT /_connector/{connector_id}/_filtering/_validation

Update the draft filtering validation info for a connector.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

  • validation object Required
    Hide validation attributes Show validation attributes object
    • errors array[object] Required
      Hide errors attributes Show errors attributes object
    • state string Required

      Values are edited, invalid, or valid.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_filtering/_validation
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_filtering/_validation' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"validation":{"errors":[{"ids":["string"],"messages":["string"]}],"state":"edited"}}'

Update the connector index name Beta

PUT /_connector/{connector_id}/_index_name

Update the index_name field of a connector, specifying the index where the data ingested by the connector is stored.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_index_name
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_index_name' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"index_name\": \"data-from-my-google-drive\"\n}"'
Request example
{
    "index_name": "data-from-my-google-drive"
}
Response examples (200)
{
  "result": "updated"
}

Update the connector name and description Beta

PUT /_connector/{connector_id}/_name

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_name
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_name' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"name\": \"Custom connector\",\n    \"description\": \"This is my customized connector\"\n}"'
Request example
{
    "name": "Custom connector",
    "description": "This is my customized connector"
}
Response examples (200)
{
  "result": "updated"
}

Update the connector is_native flag Beta

PUT /_connector/{connector_id}/_native

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_native
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_native' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"is_native":true}'

Update the connector pipeline Beta

PUT /_connector/{connector_id}/_pipeline

When you create a new connector, the configuration of an ingest pipeline is populated with default settings.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_pipeline
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_pipeline' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"pipeline\": {\n        \"extract_binary_content\": true,\n        \"name\": \"my-connector-pipeline\",\n        \"reduce_whitespace\": true,\n        \"run_ml_inference\": true\n    }\n}"'
Request example
{
    "pipeline": {
        "extract_binary_content": true,
        "name": "my-connector-pipeline",
        "reduce_whitespace": true,
        "run_ml_inference": true
    }
}
Response examples (200)
{
  "result": "updated"
}

Update the connector scheduling Beta

PUT /_connector/{connector_id}/_scheduling

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

  • scheduling object Required
    Hide scheduling attributes Show scheduling attributes object
    • Hide access_control attributes Show access_control attributes object
      • enabled boolean Required
      • interval string Required

        The interval is expressed using the crontab syntax

    • full object
      Hide full attributes Show full attributes object
      • enabled boolean Required
      • interval string Required

        The interval is expressed using the crontab syntax

    • Hide incremental attributes Show incremental attributes object
      • enabled boolean Required
      • interval string Required

        The interval is expressed using the crontab syntax

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_scheduling
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_scheduling' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"scheduling\": {\n        \"access_control\": {\n            \"enabled\": true,\n            \"interval\": \"0 10 0 * * ?\"\n        },\n        \"full\": {\n            \"enabled\": true,\n            \"interval\": \"0 20 0 * * ?\"\n        },\n        \"incremental\": {\n            \"enabled\": false,\n            \"interval\": \"0 30 0 * * ?\"\n        }\n    }\n}"'
{
    "scheduling": {
        "access_control": {
            "enabled": true,
            "interval": "0 10 0 * * ?"
        },
        "full": {
            "enabled": true,
            "interval": "0 20 0 * * ?"
        },
        "incremental": {
            "enabled": false,
            "interval": "0 30 0 * * ?"
        }
    }
}
{
    "scheduling": {
        "full": {
            "enabled": true,
            "interval": "0 10 0 * * ?"
        }
    }
}
Response examples (200)
{
  "result": "updated"
}

Update the connector service type Beta

PUT /_connector/{connector_id}/_service_type

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_service_type
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_service_type' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service_type\": \"sharepoint_online\"\n}"'
Request example
{
    "service_type": "sharepoint_online"
}
Response examples (200)
{
  "result": "updated"
}

Update the connector status Technical preview

PUT /_connector/{connector_id}/_status

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

  • status string Required

    Values are created, needs_configuration, configured, connected, or error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_status
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_status' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"status\": \"needs_configuration\"\n}"'
Request example
{
    "status": "needs_configuration"
}
Response examples (200)
{
  "result": "updated"
}

Get data streams Added in 7.9.0

GET /_data_stream/{name}

Get information about one or more data streams.

Path parameters

  • name string | array[string] Required

    Comma-separated list of data stream names used to limit the request. Wildcard (*) expressions are supported. If omitted, all data streams are returned.

Query parameters

  • expand_wildcards string | array[string]

    Type of data stream that wildcard patterns can match. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, returns all relevant default configurations for the index template.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • verbose boolean

    Whether the maximum timestamp for each data stream should be calculated and returned.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • data_streams array[object] Required
      Hide data_streams attributes Show data_streams attributes object
      • _meta object
        Hide _meta attribute Show _meta attribute object
        • * object Additional properties
      • If true, the data stream allows custom routing on write request.

      • Hide failure_store attributes Show failure_store attributes object
      • generation number Required

        Current generation for the data stream. This number acts as a cumulative count of the stream’s rollovers, starting at 1.

      • hidden boolean Required

        If true, the data stream is hidden.

      • Values are Index Lifecycle Management, Data stream lifecycle, or Unmanaged.

      • prefer_ilm boolean Required

        Indicates if ILM should take precedence over DSL in case both are configured to managed this data stream.

      • indices array[object] Required

        Array of objects containing information about the data stream’s backing indices. The last item in this array contains information about the stream’s current write index.

        Hide indices attributes Show indices attributes object
      • Hide lifecycle attributes Show lifecycle attributes object
      • name string Required
      • replicated boolean

        If true, the data stream is created and managed by cross-cluster replication and the local cluster can not write into this data stream or change its mappings.

      • rollover_on_write boolean Required

        If true, the next write to this data stream will trigger a rollover first and the document will be indexed in the new backing index. If the rollover fails the indexing request will fail too.

      • status string Required

        Values are green, GREEN, yellow, YELLOW, red, or RED.

      • system boolean

        If true, the data stream is created and managed by an Elastic stack component and cannot be modified through normal user interaction.

      • template string Required
      • timestamp_field object Required
        Hide timestamp_field attribute Show timestamp_field attribute object
        • name string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Values are standard, time_series, logsdb, or lookup.

GET /_data_stream/{name}
curl \
 --request GET 'http://api.example.com/_data_stream/{name}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response for retrieving information about a data stream.
{
  "data_streams": [
    {
      "name": "my-data-stream",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-my-data-stream-2099.03.07-000001",
          "index_uuid": "xCEhwsp8Tey0-FLNFYVwSg",
          "prefer_ilm": true,
          "ilm_policy": "my-lifecycle-policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-my-data-stream-2099.03.08-000002",
          "index_uuid": "PA_JquKGSiKcAKBA8DJ5gw",
          "prefer_ilm": true,
          "ilm_policy": "my-lifecycle-policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 2,
      "_meta": {
        "my-meta-field": "foo"
      },
      "status": "GREEN",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "template": "my-index-template",
      "ilm_policy": "my-lifecycle-policy",
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    },
    {
      "name": "my-data-stream-two",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-my-data-stream-two-2099.03.08-000001",
          "index_uuid": "3liBu2SYS5axasRt6fUIpA",
          "prefer_ilm": true,
          "ilm_policy": "my-lifecycle-policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 1,
      "_meta": {
        "my-meta-field": "foo"
      },
      "status": "YELLOW",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "template": "my-index-template",
      "ilm_policy": "my-lifecycle-policy",
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}

Create a data stream Added in 7.9.0

PUT /_data_stream/{name}

You must have a matching index template with data stream enabled.

Path parameters

  • name string Required

    Name of the data stream, which must meet the following criteria: Lowercase only; Cannot include \, /, *, ?, ", <, >, |, ,, #, :, or a space character; Cannot start with -, _, +, or .ds-; Cannot be . or ..; Cannot be longer than 255 bytes. Multi-byte characters count towards this limit faster.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_data_stream/{name}
curl \
 --request PUT 'http://api.example.com/_data_stream/{name}' \
 --header "Authorization: $API_KEY"

Delete data streams Added in 7.9.0

DELETE /_data_stream/{name}

Deletes one or more data streams and their backing indices.

Path parameters

  • name string | array[string] Required

    Comma-separated list of data streams to delete. Wildcard (*) expressions are supported.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • expand_wildcards string | array[string]

    Type of data stream that wildcard patterns can match. Supports comma-separated values,such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_data_stream/{name}
curl \
 --request DELETE 'http://api.example.com/_data_stream/{name}' \
 --header "Authorization: $API_KEY"

Get the status for a data stream lifecycle Added in 8.11.0

GET /{index}/_lifecycle/explain

Get information about an index or data stream's current data stream lifecycle status, such as time since index creation, time since rollover, the lifecycle configuration managing the index, or any errors encountered during lifecycle execution.

Path parameters

  • index string | array[string] Required

    The name of the index to explain

Query parameters

  • indicates if the API should return the default values the system uses for the index's lifecycle

  • Specify timeout for connection to master

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • indices object Required
      Hide indices attribute Show indices attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • index string Required
        • managed_by_lifecycle boolean Required
        • Time unit for milliseconds

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Time unit for milliseconds

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • error string
GET /{index}/_lifecycle/explain
curl \
 --request GET 'http://api.example.com/{index}/_lifecycle/explain' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET .ds-metrics-2023.03.22-000001/_lifecycle/explain`, which retrieves the lifecycle status for a data stream backing index. If the index is managed by a data stream lifecycle, the API will show the `managed_by_lifecycle` field set to `true` and the rest of the response will contain information about the lifecycle execution status for this index.
{
  "indices": {
    ".ds-metrics-2023.03.22-000001": {
      "index" : ".ds-metrics-2023.03.22-000001",
      "managed_by_lifecycle" : true,
      "index_creation_date_millis" : 1679475563571,
      "time_since_index_creation" : "843ms",
      "rollover_date_millis" : 1679475564293,
      "time_since_rollover" : "121ms",
      "lifecycle" : { },
      "generation_time" : "121ms"
  }
}
The API reports any errors related to the lifecycle execution for the target index.
{
  "indices": {
    ".ds-metrics-2023.03.22-000001": {
      "index" : ".ds-metrics-2023.03.22-000001",
      "managed_by_lifecycle" : true,
      "index_creation_date_millis" : 1679475563571,
      "time_since_index_creation" : "843ms",
      "lifecycle" : {
        "enabled": true
      },
      "error": "{\"type\":\"validation_exception\",\"reason\":\"Validation Failed: 1: this action would add [2] shards, but this cluster
currently has [4]/[3] maximum normal shards open;\"}"
  }
}

Get data stream lifecycles Added in 8.11.0

GET /_data_stream/{name}/_lifecycle

Get the data stream lifecycle configuration of one or more data streams.

Path parameters

  • name string | array[string] Required

    Comma-separated list of data streams to limit the request. Supports wildcards (*). To target all data streams, omit this parameter or use * or _all.

Query parameters

  • expand_wildcards string | array[string]

    Type of data stream that wildcard patterns can match. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, return all default settings in the response.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • data_streams array[object] Required
      Hide data_streams attributes Show data_streams attributes object
      • name string Required
      • Hide lifecycle attributes Show lifecycle attributes object
GET /_data_stream/{name}/_lifecycle
curl \
 --request GET 'http://api.example.com/_data_stream/{name}/_lifecycle' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _lifecycle/stats?human&pretty`.
{
  "data_streams": [
    {
      "name": "my-data-stream-1",
      "lifecycle": {
        "enabled": true,
        "data_retention": "7d"
      }
    },
    {
      "name": "my-data-stream-2",
      "lifecycle": {
        "enabled": true,
        "data_retention": "7d"
      }
    }
  ]
}

Update data stream lifecycles Added in 8.11.0

PUT /_data_stream/{name}/_lifecycle

Update the data stream lifecycle of the specified data streams.

Path parameters

  • name string | array[string] Required

    Comma-separated list of data streams used to limit the request. Supports wildcards (*). To target all data streams use * or _all.

Query parameters

  • expand_wildcards string | array[string]

    Type of data stream that wildcard patterns can match. Supports comma-separated values, such as open,hidden. Valid values are: all, hidden, open, closed, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body

  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Hide downsampling attribute Show downsampling attribute object
    • rounds array[object] Required

      The list of downsampling rounds to execute as part of this downsampling configuration

      Hide rounds attributes Show rounds attributes object
      • after string Required

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • config object Required
        Hide config attribute Show config attribute object
        • fixed_interval string Required

          A date histogram interval. Similar to Duration with additional units: w (week), M (month), q (quarter) and y (year)

  • enabled boolean

    If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_data_stream/{name}/_lifecycle
curl \
 --request PUT 'http://api.example.com/_data_stream/{name}/_lifecycle' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"data_retention\": \"7d\"\n}"'
{
  "data_retention": "7d"
}
This example configures two downsampling rounds.
{
    "downsampling": [
      {
        "after": "1d",
        "fixed_interval": "10m"
      },
      {
        "after": "7d",
        "fixed_interval": "1d"
      }
    ]
}
Response examples (200)
A successful response for configuring a data stream lifecycle.
{
  "acknowledged": true
}

Get data streams Added in 7.9.0

GET /_data_stream

Get information about one or more data streams.

Query parameters

  • expand_wildcards string | array[string]

    Type of data stream that wildcard patterns can match. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, returns all relevant default configurations for the index template.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • verbose boolean

    Whether the maximum timestamp for each data stream should be calculated and returned.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • data_streams array[object] Required
      Hide data_streams attributes Show data_streams attributes object
      • _meta object
        Hide _meta attribute Show _meta attribute object
        • * object Additional properties
      • If true, the data stream allows custom routing on write request.

      • Hide failure_store attributes Show failure_store attributes object
      • generation number Required

        Current generation for the data stream. This number acts as a cumulative count of the stream’s rollovers, starting at 1.

      • hidden boolean Required

        If true, the data stream is hidden.

      • Values are Index Lifecycle Management, Data stream lifecycle, or Unmanaged.

      • prefer_ilm boolean Required

        Indicates if ILM should take precedence over DSL in case both are configured to managed this data stream.

      • indices array[object] Required

        Array of objects containing information about the data stream’s backing indices. The last item in this array contains information about the stream’s current write index.

        Hide indices attributes Show indices attributes object
      • Hide lifecycle attributes Show lifecycle attributes object
      • name string Required
      • replicated boolean

        If true, the data stream is created and managed by cross-cluster replication and the local cluster can not write into this data stream or change its mappings.

      • rollover_on_write boolean Required

        If true, the next write to this data stream will trigger a rollover first and the document will be indexed in the new backing index. If the rollover fails the indexing request will fail too.

      • status string Required

        Values are green, GREEN, yellow, YELLOW, red, or RED.

      • system boolean

        If true, the data stream is created and managed by an Elastic stack component and cannot be modified through normal user interaction.

      • template string Required
      • timestamp_field object Required
        Hide timestamp_field attribute Show timestamp_field attribute object
        • name string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Values are standard, time_series, logsdb, or lookup.

GET /_data_stream
curl \
 --request GET 'http://api.example.com/_data_stream' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response for retrieving information about a data stream.
{
  "data_streams": [
    {
      "name": "my-data-stream",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-my-data-stream-2099.03.07-000001",
          "index_uuid": "xCEhwsp8Tey0-FLNFYVwSg",
          "prefer_ilm": true,
          "ilm_policy": "my-lifecycle-policy",
          "managed_by": "Index Lifecycle Management"
        },
        {
          "index_name": ".ds-my-data-stream-2099.03.08-000002",
          "index_uuid": "PA_JquKGSiKcAKBA8DJ5gw",
          "prefer_ilm": true,
          "ilm_policy": "my-lifecycle-policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 2,
      "_meta": {
        "my-meta-field": "foo"
      },
      "status": "GREEN",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "template": "my-index-template",
      "ilm_policy": "my-lifecycle-policy",
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    },
    {
      "name": "my-data-stream-two",
      "timestamp_field": {
        "name": "@timestamp"
      },
      "indices": [
        {
          "index_name": ".ds-my-data-stream-two-2099.03.08-000001",
          "index_uuid": "3liBu2SYS5axasRt6fUIpA",
          "prefer_ilm": true,
          "ilm_policy": "my-lifecycle-policy",
          "managed_by": "Index Lifecycle Management"
        }
      ],
      "generation": 1,
      "_meta": {
        "my-meta-field": "foo"
      },
      "status": "YELLOW",
      "next_generation_managed_by": "Index Lifecycle Management",
      "prefer_ilm": true,
      "template": "my-index-template",
      "ilm_policy": "my-lifecycle-policy",
      "hidden": false,
      "system": false,
      "allow_custom_routing": false,
      "replicated": false,
      "rollover_on_write": false
    }
  ]
}

Convert an index alias to a data stream Added in 7.9.0

POST /_data_stream/_migrate/{name}

Converts an index alias to a data stream. You must have a matching index template that is data stream enabled. The alias must meet the following criteria: The alias must have a write index; All indices for the alias must have a @timestamp field mapping of a date or date_nanos field type; The alias must not have any filters; The alias must not use custom routing. If successful, the request removes the alias and creates a data stream with the same name. The indices for the alias become hidden backing indices for the stream. The write index for the alias becomes the write index for the stream.

Path parameters

  • name string Required

    Name of the index alias to convert to a data stream.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_data_stream/_migrate/{name}
curl \
 --request POST 'http://api.example.com/_data_stream/_migrate/{name}' \
 --header "Authorization: $API_KEY"

Update data streams Added in 7.16.0

POST /_data_stream/_modify

Performs one or more data stream modification actions in a single atomic operation.

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_data_stream/_modify
curl \
 --request POST 'http://api.example.com/_data_stream/_modify' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"actions":[{"add_backing_index":{"data_stream":"string","index":"string"},"remove_backing_index":{"data_stream":"string","index":"string"}}]}'

Bulk index or delete documents

PUT /_bulk

Perform multiple index, create, delete, and update actions in a single request. This reduces overhead and can greatly increase indexing speed.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To use the create action, you must have the create_doc, create, index, or write index privilege. Data streams support only the create action.
  • To use the index action, you must have the create, index, or write index privilege.
  • To use the delete action, you must have the delete or write index privilege.
  • To use the update action, you must have the index or write index privilege.
  • To automatically create a data stream or index with a bulk API request, you must have the auto_configure, create_index, or manage index privilege.
  • To make the result of a bulk operation visible to search using the refresh parameter, you must have the maintenance or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

The index and create actions expect a source on the next line and have the same semantics as the op_type parameter in the standard index API. A create action fails if a document with the same ID already exists in the target An index action adds or replaces a document as necessary.

NOTE: Data streams support only the create action. To update or delete a document in a data stream, you must target the backing index containing the document.

An update action expects that the partial doc, upsert, and script and its options are specified on the next line.

A delete action does not expect a source on the next line and has the same semantics as the standard delete API.

NOTE: The final line of data must end with a newline character (\n). Each newline character may be preceded by a carriage return (\r). When sending NDJSON data to the _bulk endpoint, use a Content-Type header of application/json or application/x-ndjson. Because this format uses literal newline characters (\n) as delimiters, make sure that the JSON actions and sources are not pretty printed.

If you provide a target in the request path, it is used for any actions that don't explicitly specify an _index argument.

A note on the format: the idea here is to make processing as fast as possible. As some of the actions are redirected to other shards on other nodes, only action_meta_data is parsed on the receiving node side.

Client libraries using this protocol should try and strive to do something similar on the client side, and reduce buffering as much as possible.

There is no "correct" number of actions to perform in a single bulk request. Experiment with different settings to find the optimal size for your particular workload. Note that Elasticsearch limits the maximum size of a HTTP request to 100mb by default so clients must ensure that no request exceeds this size. It is not possible to index a single document that exceeds the size limit, so you must pre-process any such documents into smaller pieces before sending them to Elasticsearch. For instance, split documents into pages or chapters before indexing them, or store raw binary data in a system outside Elasticsearch and replace the raw data with a link to the external system in the documents that you send to Elasticsearch.

Client suppport for bulk requests

Some of the officially supported clients provide helpers to assist with bulk requests and reindexing:

  • Go: Check out esutil.BulkIndexer
  • Perl: Check out Search::Elasticsearch::Client::5_0::Bulk and Search::Elasticsearch::Client::5_0::Scroll
  • Python: Check out elasticsearch.helpers.*
  • JavaScript: Check out client.helpers.*
  • .NET: Check out BulkAllObservable
  • PHP: Check out bulk indexing.

Submitting bulk requests with cURL

If you're providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn't preserve newlines. For example:

$ cat requests
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
$ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@requests"; echo
{"took":7, "errors": false, "items":[{"index":{"_index":"test","_id":"1","_version":1,"result":"created","forced_refresh":false}}]}

Optimistic concurrency control

Each index and delete action within a bulk API call may include the if_seq_no and if_primary_term parameters in their respective action and meta data lines. The if_seq_no and if_primary_term parameters control how operations are run, based on the last modification to existing documents. See Optimistic concurrency control for more details.

Versioning

Each bulk item can include the version value using the version field. It automatically follows the behavior of the index or delete operation based on the _version mapping. It also support the version_type.

Routing

Each bulk item can include the routing value using the routing field. It automatically follows the behavior of the index or delete operation based on the _routing mapping.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Wait for active shards

When making bulk calls, you can set the wait_for_active_shards parameter to require a minimum number of shard copies to be active before starting to process the bulk request.

Refresh

Control when the changes made by this request are visible to search.

NOTE: Only the shards that receive the bulk request will be affected by refresh. Imagine a _bulk?refresh=wait_for request with three documents in it that happen to be routed to different shards in an index with five shards. The request will only wait for those three shards to refresh. The other two shards that make up the index do not participate in the _bulk request at all.

Query parameters

  • True or false if to include the document source in the error message in case of parsing errors.

  • If true, the response will include the ingest pipelines that were run for each index or create.

  • pipeline string

    The pipeline identifier to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, wait for a refresh to make this operation visible to search. If false, do nothing with refreshes. Valid values: true, false, wait_for.

    Values are true, false, or wait_for.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or contains a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter. If the _source parameter is false, this parameter is ignored.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • timeout string

    The period each action waits for the following operations: automatic index creation, dynamic mapping updates, and waiting for active shards. The default is 1m (one minute), which guarantees Elasticsearch waits for at least the timeout before failing. The actual wait time could be longer, particularly when multiple waits occur.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default is 1, which waits for each primary shard to be active.

  • If true, the request's actions must target an index alias.

  • If true, the request's actions must target a data stream (existing or to be created).

application/json

Body object Required

One of:
  • index object
    Hide index attributes Show index attributes object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • create object
    Hide create attributes Show create attributes object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • update object
    Hide update attributes Show update attributes object
  • delete object
    Hide delete attributes Show delete attributes object

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • errors boolean Required

      If true, one or more of the operations in the bulk request did not complete successfully.

    • items array[object] Required

      The result of each operation in the bulk request, in the order they were submitted.

      Hide items attribute Show items attribute object
    • took number Required

      The length of time, in milliseconds, it took to process the bulk request.

PUT /_bulk
curl \
 --request PUT 'http://api.example.com/_bulk' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{ \"index\" : { \"_index\" : \"test\", \"_id\" : \"1\" } }\n{ \"field1\" : \"value1\" }\n{ \"delete\" : { \"_index\" : \"test\", \"_id\" : \"2\" } }\n{ \"create\" : { \"_index\" : \"test\", \"_id\" : \"3\" } }\n{ \"field1\" : \"value3\" }\n{ \"update\" : {\"_id\" : \"1\", \"_index\" : \"test\"} }\n{ \"doc\" : {\"field2\" : \"value2\"} }"'
Run `POST _bulk` to perform multiple operations.
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
When you run `POST _bulk` and use the `update` action, you can use `retry_on_conflict` as a field in the action itself (not in the extra payload line) to specify how many times an update should be retried in the case of a version conflict.
{ "update" : {"_id" : "1", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_index" : "index1", "retry_on_conflict" : 3} }
{ "script" : { "source": "ctx._source.counter += params.param1", "lang" : "painless", "params" : {"param1" : 1}}, "upsert" : {"counter" : 1}}
{ "update" : {"_id" : "2", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
{ "update" : {"_id" : "3", "_index" : "index1", "_source" : true} }
{ "doc" : {"field" : "value"} }
{ "update" : {"_id" : "4", "_index" : "index1"} }
{ "doc" : {"field" : "value"}, "_source": true}
To return only information about failed operations, run `POST /_bulk?filter_path=items.*.error`.
{ "update": {"_id": "5", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "update": {"_id": "6", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "create": {"_id": "7", "_index": "index1"} }
{ "my_field": "foo" }
Run `POST /_bulk` to perform a bulk request that consists of index and create actions with the `dynamic_templates` parameter. The bulk request creates two new fields `work_location` and `home_location` with type `geo_point` according to the `dynamic_templates` parameter. However, the `raw_location` field is created using default dynamic mapping rules, as a text field in that case since it is supplied as a string in the JSON document.
{ "index" : { "_index" : "my_index", "_id" : "1", "dynamic_templates": {"work_location": "geo_point"}} }
{ "field" : "value1", "work_location": "41.12,-71.34", "raw_location": "41.12,-71.34"}
{ "create" : { "_index" : "my_index", "_id" : "2", "dynamic_templates": {"home_location": "geo_point"}} }
{ "field" : "value2", "home_location": "41.12,-71.34"}
Response examples (200)
{
   "took": 30,
   "errors": false,
   "items": [
      {
         "index": {
            "_index": "test",
            "_id": "1",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 0,
            "_primary_term": 1
         }
      },
      {
         "delete": {
            "_index": "test",
            "_id": "2",
            "_version": 1,
            "result": "not_found",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 404,
            "_seq_no" : 1,
            "_primary_term" : 2
         }
      },
      {
         "create": {
            "_index": "test",
            "_id": "3",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 2,
            "_primary_term" : 3
         }
      },
      {
         "update": {
            "_index": "test",
            "_id": "1",
            "_version": 2,
            "result": "updated",
            "_shards": {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "status": 200,
            "_seq_no" : 3,
            "_primary_term" : 4
         }
      }
   ]
}
If you run `POST /_bulk` with operations that update non-existent documents, the operations cannot complete successfully. The API returns a response with an `errors` property value `true`. The response also includes an error object for any failed operations. The error object contains additional information about the failure, such as the error type and reason.
{
  "took": 486,
  "errors": true,
  "items": [
    {
      "update": {
        "_index": "index1",
        "_id": "5",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "_index": "index1",
        "_id": "6",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "create": {
        "_index": "index1",
        "_id": "7",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}
An example response from `POST /_bulk?filter_path=items.*.error`, which returns only information about failed operations.
{
  "items": [
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    }
  ]
}

Bulk index or delete documents

POST /_bulk

Perform multiple index, create, delete, and update actions in a single request. This reduces overhead and can greatly increase indexing speed.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To use the create action, you must have the create_doc, create, index, or write index privilege. Data streams support only the create action.
  • To use the index action, you must have the create, index, or write index privilege.
  • To use the delete action, you must have the delete or write index privilege.
  • To use the update action, you must have the index or write index privilege.
  • To automatically create a data stream or index with a bulk API request, you must have the auto_configure, create_index, or manage index privilege.
  • To make the result of a bulk operation visible to search using the refresh parameter, you must have the maintenance or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

The index and create actions expect a source on the next line and have the same semantics as the op_type parameter in the standard index API. A create action fails if a document with the same ID already exists in the target An index action adds or replaces a document as necessary.

NOTE: Data streams support only the create action. To update or delete a document in a data stream, you must target the backing index containing the document.

An update action expects that the partial doc, upsert, and script and its options are specified on the next line.

A delete action does not expect a source on the next line and has the same semantics as the standard delete API.

NOTE: The final line of data must end with a newline character (\n). Each newline character may be preceded by a carriage return (\r). When sending NDJSON data to the _bulk endpoint, use a Content-Type header of application/json or application/x-ndjson. Because this format uses literal newline characters (\n) as delimiters, make sure that the JSON actions and sources are not pretty printed.

If you provide a target in the request path, it is used for any actions that don't explicitly specify an _index argument.

A note on the format: the idea here is to make processing as fast as possible. As some of the actions are redirected to other shards on other nodes, only action_meta_data is parsed on the receiving node side.

Client libraries using this protocol should try and strive to do something similar on the client side, and reduce buffering as much as possible.

There is no "correct" number of actions to perform in a single bulk request. Experiment with different settings to find the optimal size for your particular workload. Note that Elasticsearch limits the maximum size of a HTTP request to 100mb by default so clients must ensure that no request exceeds this size. It is not possible to index a single document that exceeds the size limit, so you must pre-process any such documents into smaller pieces before sending them to Elasticsearch. For instance, split documents into pages or chapters before indexing them, or store raw binary data in a system outside Elasticsearch and replace the raw data with a link to the external system in the documents that you send to Elasticsearch.

Client suppport for bulk requests

Some of the officially supported clients provide helpers to assist with bulk requests and reindexing:

  • Go: Check out esutil.BulkIndexer
  • Perl: Check out Search::Elasticsearch::Client::5_0::Bulk and Search::Elasticsearch::Client::5_0::Scroll
  • Python: Check out elasticsearch.helpers.*
  • JavaScript: Check out client.helpers.*
  • .NET: Check out BulkAllObservable
  • PHP: Check out bulk indexing.

Submitting bulk requests with cURL

If you're providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn't preserve newlines. For example:

$ cat requests
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
$ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@requests"; echo
{"took":7, "errors": false, "items":[{"index":{"_index":"test","_id":"1","_version":1,"result":"created","forced_refresh":false}}]}

Optimistic concurrency control

Each index and delete action within a bulk API call may include the if_seq_no and if_primary_term parameters in their respective action and meta data lines. The if_seq_no and if_primary_term parameters control how operations are run, based on the last modification to existing documents. See Optimistic concurrency control for more details.

Versioning

Each bulk item can include the version value using the version field. It automatically follows the behavior of the index or delete operation based on the _version mapping. It also support the version_type.

Routing

Each bulk item can include the routing value using the routing field. It automatically follows the behavior of the index or delete operation based on the _routing mapping.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Wait for active shards

When making bulk calls, you can set the wait_for_active_shards parameter to require a minimum number of shard copies to be active before starting to process the bulk request.

Refresh

Control when the changes made by this request are visible to search.

NOTE: Only the shards that receive the bulk request will be affected by refresh. Imagine a _bulk?refresh=wait_for request with three documents in it that happen to be routed to different shards in an index with five shards. The request will only wait for those three shards to refresh. The other two shards that make up the index do not participate in the _bulk request at all.

Query parameters

  • True or false if to include the document source in the error message in case of parsing errors.

  • If true, the response will include the ingest pipelines that were run for each index or create.

  • pipeline string

    The pipeline identifier to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, wait for a refresh to make this operation visible to search. If false, do nothing with refreshes. Valid values: true, false, wait_for.

    Values are true, false, or wait_for.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or contains a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter. If the _source parameter is false, this parameter is ignored.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • timeout string

    The period each action waits for the following operations: automatic index creation, dynamic mapping updates, and waiting for active shards. The default is 1m (one minute), which guarantees Elasticsearch waits for at least the timeout before failing. The actual wait time could be longer, particularly when multiple waits occur.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default is 1, which waits for each primary shard to be active.

  • If true, the request's actions must target an index alias.

  • If true, the request's actions must target a data stream (existing or to be created).

application/json

Body object Required

One of:
  • index object
    Hide index attributes Show index attributes object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • create object
    Hide create attributes Show create attributes object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • update object
    Hide update attributes Show update attributes object
  • delete object
    Hide delete attributes Show delete attributes object

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • errors boolean Required

      If true, one or more of the operations in the bulk request did not complete successfully.

    • items array[object] Required

      The result of each operation in the bulk request, in the order they were submitted.

      Hide items attribute Show items attribute object
    • took number Required

      The length of time, in milliseconds, it took to process the bulk request.

POST /_bulk
curl \
 --request POST 'http://api.example.com/_bulk' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{ \"index\" : { \"_index\" : \"test\", \"_id\" : \"1\" } }\n{ \"field1\" : \"value1\" }\n{ \"delete\" : { \"_index\" : \"test\", \"_id\" : \"2\" } }\n{ \"create\" : { \"_index\" : \"test\", \"_id\" : \"3\" } }\n{ \"field1\" : \"value3\" }\n{ \"update\" : {\"_id\" : \"1\", \"_index\" : \"test\"} }\n{ \"doc\" : {\"field2\" : \"value2\"} }"'
Run `POST _bulk` to perform multiple operations.
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
When you run `POST _bulk` and use the `update` action, you can use `retry_on_conflict` as a field in the action itself (not in the extra payload line) to specify how many times an update should be retried in the case of a version conflict.
{ "update" : {"_id" : "1", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_index" : "index1", "retry_on_conflict" : 3} }
{ "script" : { "source": "ctx._source.counter += params.param1", "lang" : "painless", "params" : {"param1" : 1}}, "upsert" : {"counter" : 1}}
{ "update" : {"_id" : "2", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
{ "update" : {"_id" : "3", "_index" : "index1", "_source" : true} }
{ "doc" : {"field" : "value"} }
{ "update" : {"_id" : "4", "_index" : "index1"} }
{ "doc" : {"field" : "value"}, "_source": true}
To return only information about failed operations, run `POST /_bulk?filter_path=items.*.error`.
{ "update": {"_id": "5", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "update": {"_id": "6", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "create": {"_id": "7", "_index": "index1"} }
{ "my_field": "foo" }
Run `POST /_bulk` to perform a bulk request that consists of index and create actions with the `dynamic_templates` parameter. The bulk request creates two new fields `work_location` and `home_location` with type `geo_point` according to the `dynamic_templates` parameter. However, the `raw_location` field is created using default dynamic mapping rules, as a text field in that case since it is supplied as a string in the JSON document.
{ "index" : { "_index" : "my_index", "_id" : "1", "dynamic_templates": {"work_location": "geo_point"}} }
{ "field" : "value1", "work_location": "41.12,-71.34", "raw_location": "41.12,-71.34"}
{ "create" : { "_index" : "my_index", "_id" : "2", "dynamic_templates": {"home_location": "geo_point"}} }
{ "field" : "value2", "home_location": "41.12,-71.34"}
Response examples (200)
{
   "took": 30,
   "errors": false,
   "items": [
      {
         "index": {
            "_index": "test",
            "_id": "1",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 0,
            "_primary_term": 1
         }
      },
      {
         "delete": {
            "_index": "test",
            "_id": "2",
            "_version": 1,
            "result": "not_found",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 404,
            "_seq_no" : 1,
            "_primary_term" : 2
         }
      },
      {
         "create": {
            "_index": "test",
            "_id": "3",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 2,
            "_primary_term" : 3
         }
      },
      {
         "update": {
            "_index": "test",
            "_id": "1",
            "_version": 2,
            "result": "updated",
            "_shards": {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "status": 200,
            "_seq_no" : 3,
            "_primary_term" : 4
         }
      }
   ]
}
If you run `POST /_bulk` with operations that update non-existent documents, the operations cannot complete successfully. The API returns a response with an `errors` property value `true`. The response also includes an error object for any failed operations. The error object contains additional information about the failure, such as the error type and reason.
{
  "took": 486,
  "errors": true,
  "items": [
    {
      "update": {
        "_index": "index1",
        "_id": "5",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "_index": "index1",
        "_id": "6",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "create": {
        "_index": "index1",
        "_id": "7",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}
An example response from `POST /_bulk?filter_path=items.*.error`, which returns only information about failed operations.
{
  "items": [
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    }
  ]
}

Bulk index or delete documents

PUT /{index}/_bulk

Perform multiple index, create, delete, and update actions in a single request. This reduces overhead and can greatly increase indexing speed.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To use the create action, you must have the create_doc, create, index, or write index privilege. Data streams support only the create action.
  • To use the index action, you must have the create, index, or write index privilege.
  • To use the delete action, you must have the delete or write index privilege.
  • To use the update action, you must have the index or write index privilege.
  • To automatically create a data stream or index with a bulk API request, you must have the auto_configure, create_index, or manage index privilege.
  • To make the result of a bulk operation visible to search using the refresh parameter, you must have the maintenance or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

The index and create actions expect a source on the next line and have the same semantics as the op_type parameter in the standard index API. A create action fails if a document with the same ID already exists in the target An index action adds or replaces a document as necessary.

NOTE: Data streams support only the create action. To update or delete a document in a data stream, you must target the backing index containing the document.

An update action expects that the partial doc, upsert, and script and its options are specified on the next line.

A delete action does not expect a source on the next line and has the same semantics as the standard delete API.

NOTE: The final line of data must end with a newline character (\n). Each newline character may be preceded by a carriage return (\r). When sending NDJSON data to the _bulk endpoint, use a Content-Type header of application/json or application/x-ndjson. Because this format uses literal newline characters (\n) as delimiters, make sure that the JSON actions and sources are not pretty printed.

If you provide a target in the request path, it is used for any actions that don't explicitly specify an _index argument.

A note on the format: the idea here is to make processing as fast as possible. As some of the actions are redirected to other shards on other nodes, only action_meta_data is parsed on the receiving node side.

Client libraries using this protocol should try and strive to do something similar on the client side, and reduce buffering as much as possible.

There is no "correct" number of actions to perform in a single bulk request. Experiment with different settings to find the optimal size for your particular workload. Note that Elasticsearch limits the maximum size of a HTTP request to 100mb by default so clients must ensure that no request exceeds this size. It is not possible to index a single document that exceeds the size limit, so you must pre-process any such documents into smaller pieces before sending them to Elasticsearch. For instance, split documents into pages or chapters before indexing them, or store raw binary data in a system outside Elasticsearch and replace the raw data with a link to the external system in the documents that you send to Elasticsearch.

Client suppport for bulk requests

Some of the officially supported clients provide helpers to assist with bulk requests and reindexing:

  • Go: Check out esutil.BulkIndexer
  • Perl: Check out Search::Elasticsearch::Client::5_0::Bulk and Search::Elasticsearch::Client::5_0::Scroll
  • Python: Check out elasticsearch.helpers.*
  • JavaScript: Check out client.helpers.*
  • .NET: Check out BulkAllObservable
  • PHP: Check out bulk indexing.

Submitting bulk requests with cURL

If you're providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn't preserve newlines. For example:

$ cat requests
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
$ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@requests"; echo
{"took":7, "errors": false, "items":[{"index":{"_index":"test","_id":"1","_version":1,"result":"created","forced_refresh":false}}]}

Optimistic concurrency control

Each index and delete action within a bulk API call may include the if_seq_no and if_primary_term parameters in their respective action and meta data lines. The if_seq_no and if_primary_term parameters control how operations are run, based on the last modification to existing documents. See Optimistic concurrency control for more details.

Versioning

Each bulk item can include the version value using the version field. It automatically follows the behavior of the index or delete operation based on the _version mapping. It also support the version_type.

Routing

Each bulk item can include the routing value using the routing field. It automatically follows the behavior of the index or delete operation based on the _routing mapping.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Wait for active shards

When making bulk calls, you can set the wait_for_active_shards parameter to require a minimum number of shard copies to be active before starting to process the bulk request.

Refresh

Control when the changes made by this request are visible to search.

NOTE: Only the shards that receive the bulk request will be affected by refresh. Imagine a _bulk?refresh=wait_for request with three documents in it that happen to be routed to different shards in an index with five shards. The request will only wait for those three shards to refresh. The other two shards that make up the index do not participate in the _bulk request at all.

Path parameters

  • index string Required

    The name of the data stream, index, or index alias to perform bulk actions on.

Query parameters

  • True or false if to include the document source in the error message in case of parsing errors.

  • If true, the response will include the ingest pipelines that were run for each index or create.

  • pipeline string

    The pipeline identifier to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, wait for a refresh to make this operation visible to search. If false, do nothing with refreshes. Valid values: true, false, wait_for.

    Values are true, false, or wait_for.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or contains a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter. If the _source parameter is false, this parameter is ignored.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • timeout string

    The period each action waits for the following operations: automatic index creation, dynamic mapping updates, and waiting for active shards. The default is 1m (one minute), which guarantees Elasticsearch waits for at least the timeout before failing. The actual wait time could be longer, particularly when multiple waits occur.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default is 1, which waits for each primary shard to be active.

  • If true, the request's actions must target an index alias.

  • If true, the request's actions must target a data stream (existing or to be created).

application/json

Body object Required

One of:
  • index object
    Hide index attributes Show index attributes object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • create object
    Hide create attributes Show create attributes object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • update object
    Hide update attributes Show update attributes object
  • delete object
    Hide delete attributes Show delete attributes object

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • errors boolean Required

      If true, one or more of the operations in the bulk request did not complete successfully.

    • items array[object] Required

      The result of each operation in the bulk request, in the order they were submitted.

      Hide items attribute Show items attribute object
    • took number Required

      The length of time, in milliseconds, it took to process the bulk request.

PUT /{index}/_bulk
curl \
 --request PUT 'http://api.example.com/{index}/_bulk' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{ \"index\" : { \"_index\" : \"test\", \"_id\" : \"1\" } }\n{ \"field1\" : \"value1\" }\n{ \"delete\" : { \"_index\" : \"test\", \"_id\" : \"2\" } }\n{ \"create\" : { \"_index\" : \"test\", \"_id\" : \"3\" } }\n{ \"field1\" : \"value3\" }\n{ \"update\" : {\"_id\" : \"1\", \"_index\" : \"test\"} }\n{ \"doc\" : {\"field2\" : \"value2\"} }"'
Run `POST _bulk` to perform multiple operations.
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
When you run `POST _bulk` and use the `update` action, you can use `retry_on_conflict` as a field in the action itself (not in the extra payload line) to specify how many times an update should be retried in the case of a version conflict.
{ "update" : {"_id" : "1", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_index" : "index1", "retry_on_conflict" : 3} }
{ "script" : { "source": "ctx._source.counter += params.param1", "lang" : "painless", "params" : {"param1" : 1}}, "upsert" : {"counter" : 1}}
{ "update" : {"_id" : "2", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
{ "update" : {"_id" : "3", "_index" : "index1", "_source" : true} }
{ "doc" : {"field" : "value"} }
{ "update" : {"_id" : "4", "_index" : "index1"} }
{ "doc" : {"field" : "value"}, "_source": true}
To return only information about failed operations, run `POST /_bulk?filter_path=items.*.error`.
{ "update": {"_id": "5", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "update": {"_id": "6", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "create": {"_id": "7", "_index": "index1"} }
{ "my_field": "foo" }
Run `POST /_bulk` to perform a bulk request that consists of index and create actions with the `dynamic_templates` parameter. The bulk request creates two new fields `work_location` and `home_location` with type `geo_point` according to the `dynamic_templates` parameter. However, the `raw_location` field is created using default dynamic mapping rules, as a text field in that case since it is supplied as a string in the JSON document.
{ "index" : { "_index" : "my_index", "_id" : "1", "dynamic_templates": {"work_location": "geo_point"}} }
{ "field" : "value1", "work_location": "41.12,-71.34", "raw_location": "41.12,-71.34"}
{ "create" : { "_index" : "my_index", "_id" : "2", "dynamic_templates": {"home_location": "geo_point"}} }
{ "field" : "value2", "home_location": "41.12,-71.34"}
Response examples (200)
{
   "took": 30,
   "errors": false,
   "items": [
      {
         "index": {
            "_index": "test",
            "_id": "1",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 0,
            "_primary_term": 1
         }
      },
      {
         "delete": {
            "_index": "test",
            "_id": "2",
            "_version": 1,
            "result": "not_found",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 404,
            "_seq_no" : 1,
            "_primary_term" : 2
         }
      },
      {
         "create": {
            "_index": "test",
            "_id": "3",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 2,
            "_primary_term" : 3
         }
      },
      {
         "update": {
            "_index": "test",
            "_id": "1",
            "_version": 2,
            "result": "updated",
            "_shards": {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "status": 200,
            "_seq_no" : 3,
            "_primary_term" : 4
         }
      }
   ]
}
If you run `POST /_bulk` with operations that update non-existent documents, the operations cannot complete successfully. The API returns a response with an `errors` property value `true`. The response also includes an error object for any failed operations. The error object contains additional information about the failure, such as the error type and reason.
{
  "took": 486,
  "errors": true,
  "items": [
    {
      "update": {
        "_index": "index1",
        "_id": "5",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "_index": "index1",
        "_id": "6",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "create": {
        "_index": "index1",
        "_id": "7",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}
An example response from `POST /_bulk?filter_path=items.*.error`, which returns only information about failed operations.
{
  "items": [
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    }
  ]
}

Bulk index or delete documents

POST /{index}/_bulk

Perform multiple index, create, delete, and update actions in a single request. This reduces overhead and can greatly increase indexing speed.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To use the create action, you must have the create_doc, create, index, or write index privilege. Data streams support only the create action.
  • To use the index action, you must have the create, index, or write index privilege.
  • To use the delete action, you must have the delete or write index privilege.
  • To use the update action, you must have the index or write index privilege.
  • To automatically create a data stream or index with a bulk API request, you must have the auto_configure, create_index, or manage index privilege.
  • To make the result of a bulk operation visible to search using the refresh parameter, you must have the maintenance or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

The index and create actions expect a source on the next line and have the same semantics as the op_type parameter in the standard index API. A create action fails if a document with the same ID already exists in the target An index action adds or replaces a document as necessary.

NOTE: Data streams support only the create action. To update or delete a document in a data stream, you must target the backing index containing the document.

An update action expects that the partial doc, upsert, and script and its options are specified on the next line.

A delete action does not expect a source on the next line and has the same semantics as the standard delete API.

NOTE: The final line of data must end with a newline character (\n). Each newline character may be preceded by a carriage return (\r). When sending NDJSON data to the _bulk endpoint, use a Content-Type header of application/json or application/x-ndjson. Because this format uses literal newline characters (\n) as delimiters, make sure that the JSON actions and sources are not pretty printed.

If you provide a target in the request path, it is used for any actions that don't explicitly specify an _index argument.

A note on the format: the idea here is to make processing as fast as possible. As some of the actions are redirected to other shards on other nodes, only action_meta_data is parsed on the receiving node side.

Client libraries using this protocol should try and strive to do something similar on the client side, and reduce buffering as much as possible.

There is no "correct" number of actions to perform in a single bulk request. Experiment with different settings to find the optimal size for your particular workload. Note that Elasticsearch limits the maximum size of a HTTP request to 100mb by default so clients must ensure that no request exceeds this size. It is not possible to index a single document that exceeds the size limit, so you must pre-process any such documents into smaller pieces before sending them to Elasticsearch. For instance, split documents into pages or chapters before indexing them, or store raw binary data in a system outside Elasticsearch and replace the raw data with a link to the external system in the documents that you send to Elasticsearch.

Client suppport for bulk requests

Some of the officially supported clients provide helpers to assist with bulk requests and reindexing:

  • Go: Check out esutil.BulkIndexer
  • Perl: Check out Search::Elasticsearch::Client::5_0::Bulk and Search::Elasticsearch::Client::5_0::Scroll
  • Python: Check out elasticsearch.helpers.*
  • JavaScript: Check out client.helpers.*
  • .NET: Check out BulkAllObservable
  • PHP: Check out bulk indexing.

Submitting bulk requests with cURL

If you're providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn't preserve newlines. For example:

$ cat requests
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
$ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@requests"; echo
{"took":7, "errors": false, "items":[{"index":{"_index":"test","_id":"1","_version":1,"result":"created","forced_refresh":false}}]}

Optimistic concurrency control

Each index and delete action within a bulk API call may include the if_seq_no and if_primary_term parameters in their respective action and meta data lines. The if_seq_no and if_primary_term parameters control how operations are run, based on the last modification to existing documents. See Optimistic concurrency control for more details.

Versioning

Each bulk item can include the version value using the version field. It automatically follows the behavior of the index or delete operation based on the _version mapping. It also support the version_type.

Routing

Each bulk item can include the routing value using the routing field. It automatically follows the behavior of the index or delete operation based on the _routing mapping.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Wait for active shards

When making bulk calls, you can set the wait_for_active_shards parameter to require a minimum number of shard copies to be active before starting to process the bulk request.

Refresh

Control when the changes made by this request are visible to search.

NOTE: Only the shards that receive the bulk request will be affected by refresh. Imagine a _bulk?refresh=wait_for request with three documents in it that happen to be routed to different shards in an index with five shards. The request will only wait for those three shards to refresh. The other two shards that make up the index do not participate in the _bulk request at all.

Path parameters

  • index string Required

    The name of the data stream, index, or index alias to perform bulk actions on.

Query parameters

  • True or false if to include the document source in the error message in case of parsing errors.

  • If true, the response will include the ingest pipelines that were run for each index or create.

  • pipeline string

    The pipeline identifier to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, wait for a refresh to make this operation visible to search. If false, do nothing with refreshes. Valid values: true, false, wait_for.

    Values are true, false, or wait_for.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or contains a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter. If the _source parameter is false, this parameter is ignored.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • timeout string

    The period each action waits for the following operations: automatic index creation, dynamic mapping updates, and waiting for active shards. The default is 1m (one minute), which guarantees Elasticsearch waits for at least the timeout before failing. The actual wait time could be longer, particularly when multiple waits occur.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default is 1, which waits for each primary shard to be active.

  • If true, the request's actions must target an index alias.

  • If true, the request's actions must target a data stream (existing or to be created).

application/json

Body object Required

One of:
  • index object
    Hide index attributes Show index attributes object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • create object
    Hide create attributes Show create attributes object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • update object
    Hide update attributes Show update attributes object
  • delete object
    Hide delete attributes Show delete attributes object

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • errors boolean Required

      If true, one or more of the operations in the bulk request did not complete successfully.

    • items array[object] Required

      The result of each operation in the bulk request, in the order they were submitted.

      Hide items attribute Show items attribute object
    • took number Required

      The length of time, in milliseconds, it took to process the bulk request.

POST /{index}/_bulk
curl \
 --request POST 'http://api.example.com/{index}/_bulk' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{ \"index\" : { \"_index\" : \"test\", \"_id\" : \"1\" } }\n{ \"field1\" : \"value1\" }\n{ \"delete\" : { \"_index\" : \"test\", \"_id\" : \"2\" } }\n{ \"create\" : { \"_index\" : \"test\", \"_id\" : \"3\" } }\n{ \"field1\" : \"value3\" }\n{ \"update\" : {\"_id\" : \"1\", \"_index\" : \"test\"} }\n{ \"doc\" : {\"field2\" : \"value2\"} }"'
Run `POST _bulk` to perform multiple operations.
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
When you run `POST _bulk` and use the `update` action, you can use `retry_on_conflict` as a field in the action itself (not in the extra payload line) to specify how many times an update should be retried in the case of a version conflict.
{ "update" : {"_id" : "1", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_index" : "index1", "retry_on_conflict" : 3} }
{ "script" : { "source": "ctx._source.counter += params.param1", "lang" : "painless", "params" : {"param1" : 1}}, "upsert" : {"counter" : 1}}
{ "update" : {"_id" : "2", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
{ "update" : {"_id" : "3", "_index" : "index1", "_source" : true} }
{ "doc" : {"field" : "value"} }
{ "update" : {"_id" : "4", "_index" : "index1"} }
{ "doc" : {"field" : "value"}, "_source": true}
To return only information about failed operations, run `POST /_bulk?filter_path=items.*.error`.
{ "update": {"_id": "5", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "update": {"_id": "6", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "create": {"_id": "7", "_index": "index1"} }
{ "my_field": "foo" }
Run `POST /_bulk` to perform a bulk request that consists of index and create actions with the `dynamic_templates` parameter. The bulk request creates two new fields `work_location` and `home_location` with type `geo_point` according to the `dynamic_templates` parameter. However, the `raw_location` field is created using default dynamic mapping rules, as a text field in that case since it is supplied as a string in the JSON document.
{ "index" : { "_index" : "my_index", "_id" : "1", "dynamic_templates": {"work_location": "geo_point"}} }
{ "field" : "value1", "work_location": "41.12,-71.34", "raw_location": "41.12,-71.34"}
{ "create" : { "_index" : "my_index", "_id" : "2", "dynamic_templates": {"home_location": "geo_point"}} }
{ "field" : "value2", "home_location": "41.12,-71.34"}
Response examples (200)
{
   "took": 30,
   "errors": false,
   "items": [
      {
         "index": {
            "_index": "test",
            "_id": "1",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 0,
            "_primary_term": 1
         }
      },
      {
         "delete": {
            "_index": "test",
            "_id": "2",
            "_version": 1,
            "result": "not_found",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 404,
            "_seq_no" : 1,
            "_primary_term" : 2
         }
      },
      {
         "create": {
            "_index": "test",
            "_id": "3",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 2,
            "_primary_term" : 3
         }
      },
      {
         "update": {
            "_index": "test",
            "_id": "1",
            "_version": 2,
            "result": "updated",
            "_shards": {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "status": 200,
            "_seq_no" : 3,
            "_primary_term" : 4
         }
      }
   ]
}
If you run `POST /_bulk` with operations that update non-existent documents, the operations cannot complete successfully. The API returns a response with an `errors` property value `true`. The response also includes an error object for any failed operations. The error object contains additional information about the failure, such as the error type and reason.
{
  "took": 486,
  "errors": true,
  "items": [
    {
      "update": {
        "_index": "index1",
        "_id": "5",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "_index": "index1",
        "_id": "6",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "create": {
        "_index": "index1",
        "_id": "7",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}
An example response from `POST /_bulk?filter_path=items.*.error`, which returns only information about failed operations.
{
  "items": [
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    }
  ]
}

Create a new document in the index Added in 5.0.0

PUT /{index}/_create/{id}

You can index a new JSON document with the /<target>/_doc/ or /<target>/_create/<_id> APIs Using _create guarantees that the document is indexed only if it does not already exist. It returns a 409 response when a document with a same ID already exists in the index. To update an existing document, you must use the /<target>/_doc/ API.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To add a document using the PUT /<target>/_create/<_id> or POST /<target>/_create/<_id> request formats, you must have the create_doc, create, index, or write index privilege.
  • To automatically create a data stream or index with this API request, you must have the auto_configure, create_index, or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

Automatically create data streams and indices

If the request's target doesn't exist and matches an index template with a data_stream definition, the index operation automatically creates the data stream.

If the target doesn't exist and doesn't match a data stream template, the operation automatically creates the index and applies any matching index templates.

NOTE: Elasticsearch includes several built-in index templates. To avoid naming collisions with these templates, refer to index pattern documentation.

If no mapping exists, the index operation creates a dynamic mapping. By default, new fields and objects are automatically added to the mapping if needed.

Automatic index creation is controlled by the action.auto_create_index setting. If it is true, any index can be created automatically. You can modify this setting to explicitly allow or block automatic creation of indices that match specified patterns or set it to false to turn off automatic index creation entirely. Specify a comma-separated list of patterns you want to allow or prefix each pattern with + or - to indicate whether it should be allowed or blocked. When a list is specified, the default behaviour is to disallow.

NOTE: The action.auto_create_index setting affects the automatic creation of indices only. It does not affect the creation of data streams.

Routing

By default, shard placement — or routing — is controlled by using a hash of the document's ID value. For more explicit control, the value fed into the hash function used by the router can be directly specified on a per-operation basis using the routing parameter.

When setting up explicit mapping, you can also use the _routing field to direct the index operation to extract the routing value from the document itself. This does come at the (very minimal) cost of an additional document parsing pass. If the _routing mapping is defined and set to be required, the index operation will fail if no routing value is provided or extracted.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Distributed

The index operation is directed to the primary shard based on its route and performed on the actual node containing this shard. After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.

Active shards

To improve the resiliency of writes to the system, indexing operations can be configured to wait for a certain number of active shard copies before proceeding with the operation. If the requisite number of active shard copies are not available, then the write operation must wait and retry, until either the requisite shard copies have started or a timeout occurs. By default, write operations only wait for the primary shards to be active before proceeding (that is to say wait_for_active_shards is 1). This default can be overridden in the index settings dynamically by setting index.write.wait_for_active_shards. To alter this behavior per operation, use the wait_for_active_shards request parameter.

Valid values are all or any positive integer up to the total number of configured copies per shard in the index (which is number_of_replicas+1). Specifying a negative value or a number greater than the number of shard copies will throw an error.

For example, suppose you have a cluster of three nodes, A, B, and C and you create an index index with the number of replicas set to 3 (resulting in 4 shard copies, one more copy than there are nodes). If you attempt an indexing operation, by default the operation will only ensure the primary copy of each shard is available before proceeding. This means that even if B and C went down and A hosted the primary shard copies, the indexing operation would still proceed with only one copy of the data. If wait_for_active_shards is set on the request to 3 (and all three nodes are up), the indexing operation will require 3 active shard copies before proceeding. This requirement should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard. However, if you set wait_for_active_shards to all (or to 4, which is the same in this situation), the indexing operation will not proceed as you do not have all 4 copies of each shard active in the index. The operation will timeout unless a new node is brought up in the cluster to host the fourth copy of the shard.

It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation starts. After the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary. The _shards section of the API response reveals the number of shard copies on which replication succeeded and failed.

External documentation

Path parameters

  • index string Required

    The name of the data stream or index to target. If the target doesn't exist and matches the name or wildcard (*) pattern of an index template with a data_stream definition, this request creates the data stream. If the target doesn't exist and doesn’t match a data stream template, this request creates the index.

  • id string Required

    A unique identifier for the document. To automatically generate a document ID, use the POST /<target>/_doc/ request format.

Query parameters

  • Only perform the operation if the document has this primary term.

  • Only perform the operation if the document has this sequence number.

  • True or false if to include the document source in the error message in case of parsing errors.

  • op_type string

    Set to create to only index the document if it does not already exist (put if absent). If a document with the specified _id already exists, the indexing operation will fail. The behavior is the same as using the <index>/_create endpoint. If a document ID is specified, this paramater defaults to index. Otherwise, it defaults to create. If the request targets a data stream, an op_type of create is required.

    Supported values include:

    • index: Overwrite any documents that already exist.
    • create: Only index documents that do not already exist.

    Values are index or create.

  • pipeline string

    The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, it waits for a refresh to make this operation visible to search. If false, it does nothing with refreshes.

    Values are true, false, or wait_for.

  • If true, the destination must be an index alias.

  • If true, the request's actions must target a data stream (existing or to be created).

  • routing string

    A custom value that is used to route operations to a specific shard.

  • timeout string

    The period the request waits for the following operations: automatic index creation, dynamic mapping updates, waiting for active shards. Elasticsearch waits for at least the specified timeout period before failing. The actual wait time could be longer, particularly when multiple waits occur.

    This parameter is useful for situations where the primary shard assigned to perform the operation might not be available when the operation runs. Some reasons for this might be that the primary shard is currently recovering from a gateway or undergoing relocation. By default, the operation will wait on the primary shard to become available for at least 1 minute before failing and responding with an error. The actual wait time could be longer, particularly when multiple waits occur.

  • version number

    The explicit version number for concurrency control. It must be a non-negative long number.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. You can set it to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value of 1 means it waits for each primary shard to be active.

application/json

Body Required

object object

Responses

PUT /{index}/_create/{id}
curl \
 --request PUT 'http://api.example.com/{index}/_create/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"@timestamp\": \"2099-11-15T13:12:00\",\n  \"message\": \"GET /search HTTP/1.1 200 1070000\",\n  \"user\": {\n    \"id\": \"kimchy\"\n  }\n}"'
Request example
Run `PUT my-index-000001/_create/1` to index a document into the `my-index-000001` index if no document with that ID exists.
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}

Create a new document in the index Added in 5.0.0

POST /{index}/_create/{id}

You can index a new JSON document with the /<target>/_doc/ or /<target>/_create/<_id> APIs Using _create guarantees that the document is indexed only if it does not already exist. It returns a 409 response when a document with a same ID already exists in the index. To update an existing document, you must use the /<target>/_doc/ API.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To add a document using the PUT /<target>/_create/<_id> or POST /<target>/_create/<_id> request formats, you must have the create_doc, create, index, or write index privilege.
  • To automatically create a data stream or index with this API request, you must have the auto_configure, create_index, or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

Automatically create data streams and indices

If the request's target doesn't exist and matches an index template with a data_stream definition, the index operation automatically creates the data stream.

If the target doesn't exist and doesn't match a data stream template, the operation automatically creates the index and applies any matching index templates.

NOTE: Elasticsearch includes several built-in index templates. To avoid naming collisions with these templates, refer to index pattern documentation.

If no mapping exists, the index operation creates a dynamic mapping. By default, new fields and objects are automatically added to the mapping if needed.

Automatic index creation is controlled by the action.auto_create_index setting. If it is true, any index can be created automatically. You can modify this setting to explicitly allow or block automatic creation of indices that match specified patterns or set it to false to turn off automatic index creation entirely. Specify a comma-separated list of patterns you want to allow or prefix each pattern with + or - to indicate whether it should be allowed or blocked. When a list is specified, the default behaviour is to disallow.

NOTE: The action.auto_create_index setting affects the automatic creation of indices only. It does not affect the creation of data streams.

Routing

By default, shard placement — or routing — is controlled by using a hash of the document's ID value. For more explicit control, the value fed into the hash function used by the router can be directly specified on a per-operation basis using the routing parameter.

When setting up explicit mapping, you can also use the _routing field to direct the index operation to extract the routing value from the document itself. This does come at the (very minimal) cost of an additional document parsing pass. If the _routing mapping is defined and set to be required, the index operation will fail if no routing value is provided or extracted.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Distributed

The index operation is directed to the primary shard based on its route and performed on the actual node containing this shard. After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.

Active shards

To improve the resiliency of writes to the system, indexing operations can be configured to wait for a certain number of active shard copies before proceeding with the operation. If the requisite number of active shard copies are not available, then the write operation must wait and retry, until either the requisite shard copies have started or a timeout occurs. By default, write operations only wait for the primary shards to be active before proceeding (that is to say wait_for_active_shards is 1). This default can be overridden in the index settings dynamically by setting index.write.wait_for_active_shards. To alter this behavior per operation, use the wait_for_active_shards request parameter.

Valid values are all or any positive integer up to the total number of configured copies per shard in the index (which is number_of_replicas+1). Specifying a negative value or a number greater than the number of shard copies will throw an error.

For example, suppose you have a cluster of three nodes, A, B, and C and you create an index index with the number of replicas set to 3 (resulting in 4 shard copies, one more copy than there are nodes). If you attempt an indexing operation, by default the operation will only ensure the primary copy of each shard is available before proceeding. This means that even if B and C went down and A hosted the primary shard copies, the indexing operation would still proceed with only one copy of the data. If wait_for_active_shards is set on the request to 3 (and all three nodes are up), the indexing operation will require 3 active shard copies before proceeding. This requirement should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard. However, if you set wait_for_active_shards to all (or to 4, which is the same in this situation), the indexing operation will not proceed as you do not have all 4 copies of each shard active in the index. The operation will timeout unless a new node is brought up in the cluster to host the fourth copy of the shard.

It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation starts. After the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary. The _shards section of the API response reveals the number of shard copies on which replication succeeded and failed.

External documentation

Path parameters

  • index string Required

    The name of the data stream or index to target. If the target doesn't exist and matches the name or wildcard (*) pattern of an index template with a data_stream definition, this request creates the data stream. If the target doesn't exist and doesn’t match a data stream template, this request creates the index.

  • id string Required

    A unique identifier for the document. To automatically generate a document ID, use the POST /<target>/_doc/ request format.

Query parameters

  • Only perform the operation if the document has this primary term.

  • Only perform the operation if the document has this sequence number.

  • True or false if to include the document source in the error message in case of parsing errors.

  • op_type string

    Set to create to only index the document if it does not already exist (put if absent). If a document with the specified _id already exists, the indexing operation will fail. The behavior is the same as using the <index>/_create endpoint. If a document ID is specified, this paramater defaults to index. Otherwise, it defaults to create. If the request targets a data stream, an op_type of create is required.

    Supported values include:

    • index: Overwrite any documents that already exist.
    • create: Only index documents that do not already exist.

    Values are index or create.

  • pipeline string

    The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, it waits for a refresh to make this operation visible to search. If false, it does nothing with refreshes.

    Values are true, false, or wait_for.

  • If true, the destination must be an index alias.

  • If true, the request's actions must target a data stream (existing or to be created).

  • routing string

    A custom value that is used to route operations to a specific shard.

  • timeout string

    The period the request waits for the following operations: automatic index creation, dynamic mapping updates, waiting for active shards. Elasticsearch waits for at least the specified timeout period before failing. The actual wait time could be longer, particularly when multiple waits occur.

    This parameter is useful for situations where the primary shard assigned to perform the operation might not be available when the operation runs. Some reasons for this might be that the primary shard is currently recovering from a gateway or undergoing relocation. By default, the operation will wait on the primary shard to become available for at least 1 minute before failing and responding with an error. The actual wait time could be longer, particularly when multiple waits occur.

  • version number

    The explicit version number for concurrency control. It must be a non-negative long number.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. You can set it to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value of 1 means it waits for each primary shard to be active.

application/json

Body Required

object object

Responses

POST /{index}/_create/{id}
curl \
 --request POST 'http://api.example.com/{index}/_create/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"@timestamp\": \"2099-11-15T13:12:00\",\n  \"message\": \"GET /search HTTP/1.1 200 1070000\",\n  \"user\": {\n    \"id\": \"kimchy\"\n  }\n}"'
Request example
Run `PUT my-index-000001/_create/1` to index a document into the `my-index-000001` index if no document with that ID exists.
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}

Get a document by its ID

GET /{index}/_doc/{id}

Get a document and its source or stored fields from an index.

By default, this API is realtime and is not affected by the refresh rate of the index (when data will become visible for search). In the case where stored fields are requested with the stored_fields parameter and the document has been updated but is not yet refreshed, the API will have to parse and analyze the source to extract the stored fields. To turn off realtime behavior, set the realtime parameter to false.

Source filtering

By default, the API returns the contents of the _source field unless you have used the stored_fields parameter or the _source field is turned off. You can turn off _source retrieval by using the _source parameter:

GET my-index-000001/_doc/0?_source=false

If you only need one or two fields from the _source, use the _source_includes or _source_excludes parameters to include or filter out particular fields. This can be helpful with large documents where partial retrieval can save on network overhead Both parameters take a comma separated list of fields or wildcard expressions. For example:

GET my-index-000001/_doc/0?_source_includes=*.id&_source_excludes=entities

If you only want to specify includes, you can use a shorter notation:

GET my-index-000001/_doc/0?_source=*.id

Routing

If routing is used during indexing, the routing value also needs to be specified to retrieve a document. For example:

GET my-index-000001/_doc/2?routing=user1

This request gets the document with ID 2, but it is routed based on the user. The document is not fetched if the correct routing is not specified.

Distributed

The GET operation is hashed into a specific shard ID. It is then redirected to one of the replicas within that shard ID and returns the result. The replicas are the primary shard and its replicas within that shard ID group. This means that the more replicas you have, the better your GET scaling will be.

Versioning support

You can use the version parameter to retrieve the document only if its current version is equal to the specified one.

Internally, Elasticsearch has marked the old document as deleted and added an entirely new document. The old version of the document doesn't disappear immediately, although you won't be able to access it. Elasticsearch cleans up deleted documents in the background as you continue to index more data.

Path parameters

  • index string Required

    The name of the index that contains the document.

  • id string Required

    A unique document identifier.

Query parameters

  • The node or shard the operation should be performed on. By default, the operation is randomized between the shard replicas.

    If it is set to _local, the operation will prefer to be run on a local allocated shard when possible. If it is set to a custom value, the value is used to guarantee that the same shards will be used for the same custom value. This can help with "jumping values" when hitting different shards in different refresh states. A sample value can be something like the web session ID or the user name.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes the relevant shards before retrieving the document. Setting it to true should be done after careful thought and verification that this does not cause a heavy load on the system (and slow down indexing).

  • routing string

    A custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or lists the fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter. If the _source parameter is false, this parameter is ignored.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • stored_fields string | array[string]

    A comma-separated list of stored fields to return as part of a hit. If no fields are specified, no stored fields are included in the response. If this field is specified, the _source parameter defaults to false. Only leaf fields can be retrieved with the stored_field option. Object fields can't be returned;​if specified, the request fails.

  • version number

    The version number for concurrency control. It must match the current version of the document for the request to succeed.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • _index string Required
    • fields object

      If the stored_fields parameter is set to true and found is true, it contains the document fields stored in the index.

      Hide fields attribute Show fields attribute object
      • * object Additional properties
    • _ignored array[string]
    • found boolean Required

      Indicates whether the document exists.

    • _id string Required
    • The primary term assigned to the document for the indexing operation.

    • _routing string

      The explicit routing, if set.

    • _seq_no number
    • _source object

      If found is true, it contains the document data formatted in JSON. If the _source parameter is set to false or the stored_fields parameter is set to true, it is excluded.

    • _version number
GET /{index}/_doc/{id}
curl \
 --request GET 'http://api.example.com/{index}/_doc/{id}' \
 --header "Authorization: $API_KEY"
A successful response from `GET my-index-000001/_doc/0`. It retrieves the JSON document with the `_id` 0 from the `my-index-000001` index.
{
  "_index": "my-index-000001",
  "_id": "0",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "@timestamp": "2099-11-15T14:12:12",
    "http": {
      "request": {
        "method": "get"
      },
      "response": {
        "status_code": 200,
        "bytes": 1070000
      },
      "version": "1.1"
    },
    "source": {
      "ip": "127.0.0.1"
    },
    "message": "GET /search HTTP/1.1 200 1070000",
    "user": {
      "id": "kimchy"
    }
  }
}
A successful response from `GET my-index-000001/_doc/1?stored_fields=tags,counter`, which retrieves a set of stored fields. Field values fetched from the document itself are always returned as an array. Any requested fields that are not stored (such as the counter field in this example) are ignored.
{
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "_seq_no" : 22,
  "_primary_term" : 1,
  "found": true,
  "fields": {
      "tags": [
        "production"
      ]
  }
}
A successful response from `GET my-index-000001/_doc/2?routing=user1&stored_fields=tags,counter`, which retrieves the `_routing` metadata field.
{
  "_index": "my-index-000001",
  "_id": "2",
  "_version": 1,
  "_seq_no" : 13,
  "_primary_term" : 1,
  "_routing": "user1",
  "found": true,
  "fields": {
      "tags": [
        "env2"
      ]
  }
}

Create or update a document in an index

PUT /{index}/_doc/{id}

Add a JSON document to the specified data stream or index and make it searchable. If the target is an index and the document already exists, the request updates the document and increments its version.

NOTE: You cannot use this API to send update requests for existing documents in a data stream.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To add or overwrite a document using the PUT /<target>/_doc/<_id> request format, you must have the create, index, or write index privilege.
  • To add a document using the POST /<target>/_doc/ request format, you must have the create_doc, create, index, or write index privilege.
  • To automatically create a data stream or index with this API request, you must have the auto_configure, create_index, or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

NOTE: Replica shards might not all be started when an indexing operation returns successfully. By default, only the primary is required. Set wait_for_active_shards to change this default behavior.

Automatically create data streams and indices

If the request's target doesn't exist and matches an index template with a data_stream definition, the index operation automatically creates the data stream.

If the target doesn't exist and doesn't match a data stream template, the operation automatically creates the index and applies any matching index templates.

NOTE: Elasticsearch includes several built-in index templates. To avoid naming collisions with these templates, refer to index pattern documentation.

If no mapping exists, the index operation creates a dynamic mapping. By default, new fields and objects are automatically added to the mapping if needed.

Automatic index creation is controlled by the action.auto_create_index setting. If it is true, any index can be created automatically. You can modify this setting to explicitly allow or block automatic creation of indices that match specified patterns or set it to false to turn off automatic index creation entirely. Specify a comma-separated list of patterns you want to allow or prefix each pattern with + or - to indicate whether it should be allowed or blocked. When a list is specified, the default behaviour is to disallow.

NOTE: The action.auto_create_index setting affects the automatic creation of indices only. It does not affect the creation of data streams.

Optimistic concurrency control

Index operations can be made conditional and only be performed if the last modification to the document was assigned the sequence number and primary term specified by the if_seq_no and if_primary_term parameters. If a mismatch is detected, the operation will result in a VersionConflictException and a status code of 409.

Routing

By default, shard placement — or routing — is controlled by using a hash of the document's ID value. For more explicit control, the value fed into the hash function used by the router can be directly specified on a per-operation basis using the routing parameter.

When setting up explicit mapping, you can also use the _routing field to direct the index operation to extract the routing value from the document itself. This does come at the (very minimal) cost of an additional document parsing pass. If the _routing mapping is defined and set to be required, the index operation will fail if no routing value is provided or extracted.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Distributed

The index operation is directed to the primary shard based on its route and performed on the actual node containing this shard. After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.

Active shards

To improve the resiliency of writes to the system, indexing operations can be configured to wait for a certain number of active shard copies before proceeding with the operation. If the requisite number of active shard copies are not available, then the write operation must wait and retry, until either the requisite shard copies have started or a timeout occurs. By default, write operations only wait for the primary shards to be active before proceeding (that is to say wait_for_active_shards is 1). This default can be overridden in the index settings dynamically by setting index.write.wait_for_active_shards. To alter this behavior per operation, use the wait_for_active_shards request parameter.

Valid values are all or any positive integer up to the total number of configured copies per shard in the index (which is number_of_replicas+1). Specifying a negative value or a number greater than the number of shard copies will throw an error.

For example, suppose you have a cluster of three nodes, A, B, and C and you create an index index with the number of replicas set to 3 (resulting in 4 shard copies, one more copy than there are nodes). If you attempt an indexing operation, by default the operation will only ensure the primary copy of each shard is available before proceeding. This means that even if B and C went down and A hosted the primary shard copies, the indexing operation would still proceed with only one copy of the data. If wait_for_active_shards is set on the request to 3 (and all three nodes are up), the indexing operation will require 3 active shard copies before proceeding. This requirement should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard. However, if you set wait_for_active_shards to all (or to 4, which is the same in this situation), the indexing operation will not proceed as you do not have all 4 copies of each shard active in the index. The operation will timeout unless a new node is brought up in the cluster to host the fourth copy of the shard.

It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation starts. After the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary. The _shards section of the API response reveals the number of shard copies on which replication succeeded and failed.

No operation (noop) updates

When updating a document by using this API, a new version of the document is always created even if the document hasn't changed. If this isn't acceptable use the _update API with detect_noop set to true. The detect_noop option isn't available on this API because it doesn’t fetch the old source and isn't able to compare it against the new source.

There isn't a definitive rule for when noop updates aren't acceptable. It's a combination of lots of factors like how frequently your data source sends updates that are actually noops and how many queries per second Elasticsearch runs on the shard receiving the updates.

Versioning

Each indexed document is given a version number. By default, internal versioning is used that starts at 1 and increments with each update, deletes included. Optionally, the version number can be set to an external value (for example, if maintained in a database). To enable this functionality, version_type should be set to external. The value provided must be a numeric, long value greater than or equal to 0, and less than around 9.2e+18.

NOTE: Versioning is completely real time, and is not affected by the near real time aspects of search operations. If no version is provided, the operation runs without any version checks.

When using the external version type, the system checks to see if the version number passed to the index request is greater than the version of the currently stored document. If true, the document will be indexed and the new version number used. If the value provided is less than or equal to the stored document's version number, a version conflict will occur and the index operation will fail. For example:

PUT my-index-000001/_doc/1?version=2&version_type=external
{
  "user": {
    "id": "elkbee"
  }
}

In this example, the operation will succeed since the supplied version of 2 is higher than the current document version of 1.
If the document was already updated and its version was set to 2 or higher, the indexing command will fail and result in a conflict (409 HTTP status code).

A nice side effect is that there is no need to maintain strict ordering of async indexing operations run as a result of changes to a source database, as long as version numbers from the source database are used.
Even the simple case of updating the Elasticsearch index using data from a database is simplified if external versioning is used, as only the latest version will be used if the index operations arrive out of order.
External documentation

Path parameters

  • index string Required

    The name of the data stream or index to target. If the target doesn't exist and matches the name or wildcard (*) pattern of an index template with a data_stream definition, this request creates the data stream. If the target doesn't exist and doesn't match a data stream template, this request creates the index. You can check for existing targets with the resolve index API.

  • id string Required

    A unique identifier for the document. To automatically generate a document ID, use the POST /<target>/_doc/ request format and omit this parameter.

Query parameters

  • Only perform the operation if the document has this primary term.

  • Only perform the operation if the document has this sequence number.

  • True or false if to include the document source in the error message in case of parsing errors.

  • op_type string

    Set to create to only index the document if it does not already exist (put if absent). If a document with the specified _id already exists, the indexing operation will fail. The behavior is the same as using the <index>/_create endpoint. If a document ID is specified, this paramater defaults to index. Otherwise, it defaults to create. If the request targets a data stream, an op_type of create is required.

    Supported values include:

    • index: Overwrite any documents that already exist.
    • create: Only index documents that do not already exist.

    Values are index or create.

  • pipeline string

    The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, then setting the value to _none disables the default ingest pipeline for this request. If a final pipeline is configured it will always run, regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, it waits for a refresh to make this operation visible to search. If false, it does nothing with refreshes.

    Values are true, false, or wait_for.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • timeout string

    The period the request waits for the following operations: automatic index creation, dynamic mapping updates, waiting for active shards.

    This parameter is useful for situations where the primary shard assigned to perform the operation might not be available when the operation runs. Some reasons for this might be that the primary shard is currently recovering from a gateway or undergoing relocation. By default, the operation will wait on the primary shard to become available for at least 1 minute before failing and responding with an error. The actual wait time could be longer, particularly when multiple waits occur.

  • version number

    An explicit version number for concurrency control. It must be a non-negative long number.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. You can set it to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value of 1 means it waits for each primary shard to be active.

  • If true, the destination must be an index alias.

application/json

Body Required

object object

Responses

PUT /{index}/_doc/{id}
curl \
 --request PUT 'http://api.example.com/{index}/_doc/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"@timestamp\": \"2099-11-15T13:12:00\",\n  \"message\": \"GET /search HTTP/1.1 200 1070000\",\n  \"user\": {\n    \"id\": \"kimchy\"\n  }\n}"'
Request examples
Run `POST my-index-000001/_doc/` to index a document. When you use the `POST /<target>/_doc/` request format, the `op_type` is automatically set to `create` and the index operation generates a unique ID for the document.
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
Run `PUT my-index-000001/_doc/1` to insert a JSON document into the `my-index-000001` index with an `_id` of 1.
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
Response examples (200)
A successful response from `POST my-index-000001/_doc/`, which contains an automated document ID.
{
  "_shards": {
    "total": 2,
    "failed": 0,
    "successful": 2
  },
  "_index": "my-index-000001",
  "_id": "W0tpsmIBdwcYyG50zbta",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "result": "created"
}
A successful response from `PUT my-index-000001/_doc/1`.
{
  "_shards": {
    "total": 2,
    "failed": 0,
    "successful": 2
  },
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "result": "created"
}

Create or update a document in an index

POST /{index}/_doc/{id}

Add a JSON document to the specified data stream or index and make it searchable. If the target is an index and the document already exists, the request updates the document and increments its version.

NOTE: You cannot use this API to send update requests for existing documents in a data stream.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To add or overwrite a document using the PUT /<target>/_doc/<_id> request format, you must have the create, index, or write index privilege.
  • To add a document using the POST /<target>/_doc/ request format, you must have the create_doc, create, index, or write index privilege.
  • To automatically create a data stream or index with this API request, you must have the auto_configure, create_index, or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

NOTE: Replica shards might not all be started when an indexing operation returns successfully. By default, only the primary is required. Set wait_for_active_shards to change this default behavior.

Automatically create data streams and indices

If the request's target doesn't exist and matches an index template with a data_stream definition, the index operation automatically creates the data stream.

If the target doesn't exist and doesn't match a data stream template, the operation automatically creates the index and applies any matching index templates.

NOTE: Elasticsearch includes several built-in index templates. To avoid naming collisions with these templates, refer to index pattern documentation.

If no mapping exists, the index operation creates a dynamic mapping. By default, new fields and objects are automatically added to the mapping if needed.

Automatic index creation is controlled by the action.auto_create_index setting. If it is true, any index can be created automatically. You can modify this setting to explicitly allow or block automatic creation of indices that match specified patterns or set it to false to turn off automatic index creation entirely. Specify a comma-separated list of patterns you want to allow or prefix each pattern with + or - to indicate whether it should be allowed or blocked. When a list is specified, the default behaviour is to disallow.

NOTE: The action.auto_create_index setting affects the automatic creation of indices only. It does not affect the creation of data streams.

Optimistic concurrency control

Index operations can be made conditional and only be performed if the last modification to the document was assigned the sequence number and primary term specified by the if_seq_no and if_primary_term parameters. If a mismatch is detected, the operation will result in a VersionConflictException and a status code of 409.

Routing

By default, shard placement — or routing — is controlled by using a hash of the document's ID value. For more explicit control, the value fed into the hash function used by the router can be directly specified on a per-operation basis using the routing parameter.

When setting up explicit mapping, you can also use the _routing field to direct the index operation to extract the routing value from the document itself. This does come at the (very minimal) cost of an additional document parsing pass. If the _routing mapping is defined and set to be required, the index operation will fail if no routing value is provided or extracted.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Distributed

The index operation is directed to the primary shard based on its route and performed on the actual node containing this shard. After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.

Active shards

To improve the resiliency of writes to the system, indexing operations can be configured to wait for a certain number of active shard copies before proceeding with the operation. If the requisite number of active shard copies are not available, then the write operation must wait and retry, until either the requisite shard copies have started or a timeout occurs. By default, write operations only wait for the primary shards to be active before proceeding (that is to say wait_for_active_shards is 1). This default can be overridden in the index settings dynamically by setting index.write.wait_for_active_shards. To alter this behavior per operation, use the wait_for_active_shards request parameter.

Valid values are all or any positive integer up to the total number of configured copies per shard in the index (which is number_of_replicas+1). Specifying a negative value or a number greater than the number of shard copies will throw an error.

For example, suppose you have a cluster of three nodes, A, B, and C and you create an index index with the number of replicas set to 3 (resulting in 4 shard copies, one more copy than there are nodes). If you attempt an indexing operation, by default the operation will only ensure the primary copy of each shard is available before proceeding. This means that even if B and C went down and A hosted the primary shard copies, the indexing operation would still proceed with only one copy of the data. If wait_for_active_shards is set on the request to 3 (and all three nodes are up), the indexing operation will require 3 active shard copies before proceeding. This requirement should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard. However, if you set wait_for_active_shards to all (or to 4, which is the same in this situation), the indexing operation will not proceed as you do not have all 4 copies of each shard active in the index. The operation will timeout unless a new node is brought up in the cluster to host the fourth copy of the shard.

It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation starts. After the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary. The _shards section of the API response reveals the number of shard copies on which replication succeeded and failed.

No operation (noop) updates

When updating a document by using this API, a new version of the document is always created even if the document hasn't changed. If this isn't acceptable use the _update API with detect_noop set to true. The detect_noop option isn't available on this API because it doesn’t fetch the old source and isn't able to compare it against the new source.

There isn't a definitive rule for when noop updates aren't acceptable. It's a combination of lots of factors like how frequently your data source sends updates that are actually noops and how many queries per second Elasticsearch runs on the shard receiving the updates.

Versioning

Each indexed document is given a version number. By default, internal versioning is used that starts at 1 and increments with each update, deletes included. Optionally, the version number can be set to an external value (for example, if maintained in a database). To enable this functionality, version_type should be set to external. The value provided must be a numeric, long value greater than or equal to 0, and less than around 9.2e+18.

NOTE: Versioning is completely real time, and is not affected by the near real time aspects of search operations. If no version is provided, the operation runs without any version checks.

When using the external version type, the system checks to see if the version number passed to the index request is greater than the version of the currently stored document. If true, the document will be indexed and the new version number used. If the value provided is less than or equal to the stored document's version number, a version conflict will occur and the index operation will fail. For example:

PUT my-index-000001/_doc/1?version=2&version_type=external
{
  "user": {
    "id": "elkbee"
  }
}

In this example, the operation will succeed since the supplied version of 2 is higher than the current document version of 1.
If the document was already updated and its version was set to 2 or higher, the indexing command will fail and result in a conflict (409 HTTP status code).

A nice side effect is that there is no need to maintain strict ordering of async indexing operations run as a result of changes to a source database, as long as version numbers from the source database are used.
Even the simple case of updating the Elasticsearch index using data from a database is simplified if external versioning is used, as only the latest version will be used if the index operations arrive out of order.
External documentation

Path parameters

  • index string Required

    The name of the data stream or index to target. If the target doesn't exist and matches the name or wildcard (*) pattern of an index template with a data_stream definition, this request creates the data stream. If the target doesn't exist and doesn't match a data stream template, this request creates the index. You can check for existing targets with the resolve index API.

  • id string Required

    A unique identifier for the document. To automatically generate a document ID, use the POST /<target>/_doc/ request format and omit this parameter.

Query parameters

  • Only perform the operation if the document has this primary term.

  • Only perform the operation if the document has this sequence number.

  • True or false if to include the document source in the error message in case of parsing errors.

  • op_type string

    Set to create to only index the document if it does not already exist (put if absent). If a document with the specified _id already exists, the indexing operation will fail. The behavior is the same as using the <index>/_create endpoint. If a document ID is specified, this paramater defaults to index. Otherwise, it defaults to create. If the request targets a data stream, an op_type of create is required.

    Supported values include:

    • index: Overwrite any documents that already exist.
    • create: Only index documents that do not already exist.

    Values are index or create.

  • pipeline string

    The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, then setting the value to _none disables the default ingest pipeline for this request. If a final pipeline is configured it will always run, regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, it waits for a refresh to make this operation visible to search. If false, it does nothing with refreshes.

    Values are true, false, or wait_for.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • timeout string

    The period the request waits for the following operations: automatic index creation, dynamic mapping updates, waiting for active shards.

    This parameter is useful for situations where the primary shard assigned to perform the operation might not be available when the operation runs. Some reasons for this might be that the primary shard is currently recovering from a gateway or undergoing relocation. By default, the operation will wait on the primary shard to become available for at least 1 minute before failing and responding with an error. The actual wait time could be longer, particularly when multiple waits occur.

  • version number

    An explicit version number for concurrency control. It must be a non-negative long number.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. You can set it to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value of 1 means it waits for each primary shard to be active.

  • If true, the destination must be an index alias.

application/json

Body Required

object object

Responses

POST /{index}/_doc/{id}
curl \
 --request POST 'http://api.example.com/{index}/_doc/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"@timestamp\": \"2099-11-15T13:12:00\",\n  \"message\": \"GET /search HTTP/1.1 200 1070000\",\n  \"user\": {\n    \"id\": \"kimchy\"\n  }\n}"'
Request examples
Run `POST my-index-000001/_doc/` to index a document. When you use the `POST /<target>/_doc/` request format, the `op_type` is automatically set to `create` and the index operation generates a unique ID for the document.
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
Run `PUT my-index-000001/_doc/1` to insert a JSON document into the `my-index-000001` index with an `_id` of 1.
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
Response examples (200)
A successful response from `POST my-index-000001/_doc/`, which contains an automated document ID.
{
  "_shards": {
    "total": 2,
    "failed": 0,
    "successful": 2
  },
  "_index": "my-index-000001",
  "_id": "W0tpsmIBdwcYyG50zbta",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "result": "created"
}
A successful response from `PUT my-index-000001/_doc/1`.
{
  "_shards": {
    "total": 2,
    "failed": 0,
    "successful": 2
  },
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "result": "created"
}

Delete a document

DELETE /{index}/_doc/{id}

Remove a JSON document from the specified index.

NOTE: You cannot send deletion requests directly to a data stream. To delete a document in a data stream, you must target the backing index containing the document.

Optimistic concurrency control

Delete operations can be made conditional and only be performed if the last modification to the document was assigned the sequence number and primary term specified by the if_seq_no and if_primary_term parameters. If a mismatch is detected, the operation will result in a VersionConflictException and a status code of 409.

Versioning

Each document indexed is versioned. When deleting a document, the version can be specified to make sure the relevant document you are trying to delete is actually being deleted and it has not changed in the meantime. Every write operation run on a document, deletes included, causes its version to be incremented. The version number of a deleted document remains available for a short time after deletion to allow for control of concurrent operations. The length of time for which a deleted document's version remains available is determined by the index.gc_deletes index setting.

Routing

If routing is used during indexing, the routing value also needs to be specified to delete a document.

If the _routing mapping is set to required and no routing value is specified, the delete API throws a RoutingMissingException and rejects the request.

For example:

DELETE /my-index-000001/_doc/1?routing=shard-1

This request deletes the document with ID 1, but it is routed based on the user. The document is not deleted if the correct routing is not specified.

Distributed

The delete operation gets hashed into a specific shard ID. It then gets redirected into the primary shard within that ID group and replicated (if needed) to shard replicas within that ID group.

Path parameters

  • index string Required

    The name of the target index.

  • id string Required

    A unique identifier for the document.

Query parameters

  • Only perform the operation if the document has this primary term.

  • Only perform the operation if the document has this sequence number.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, it waits for a refresh to make this operation visible to search. If false, it does nothing with refreshes.

    Values are true, false, or wait_for.

  • routing string

    A custom value used to route operations to a specific shard.

  • timeout string

    The period to wait for active shards.

    This parameter is useful for situations where the primary shard assigned to perform the delete operation might not be available when the delete operation runs. Some reasons for this might be that the primary shard is currently recovering from a store or undergoing relocation. By default, the delete operation will wait on the primary shard to become available for up to 1 minute before failing and responding with an error.

  • version number

    An explicit version number for concurrency control. It must match the current version of the document for the request to succeed.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

  • wait_for_active_shards number | string

    The minimum number of shard copies that must be active before proceeding with the operation. You can set it to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value of 1 means it waits for each primary shard to be active.

Responses

DELETE /{index}/_doc/{id}
curl \
 --request DELETE 'http://api.example.com/{index}/_doc/{id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `DELETE /my-index-000001/_doc/1`, which deletes the JSON document 1 from the `my-index-000001` index.
{
  "_shards": {
    "total": 2,
    "failed": 0,
    "successful": 2
  },
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 2,
  "_primary_term": 1,
  "_seq_no": 5,
  "result": "deleted"
}

Check a document

HEAD /{index}/_doc/{id}

Verify that a document exists. For example, check to see if a document with the _id 0 exists:

HEAD my-index-000001/_doc/0

If the document exists, the API returns a status code of 200 - OK. If the document doesn’t exist, the API returns 404 - Not Found.

Versioning support

You can use the version parameter to check the document only if its current version is equal to the specified one.

Internally, Elasticsearch has marked the old document as deleted and added an entirely new document. The old version of the document doesn't disappear immediately, although you won't be able to access it. Elasticsearch cleans up deleted documents in the background as you continue to index more data.

Path parameters

  • index string Required

    A comma-separated list of data streams, indices, and aliases. It supports wildcards (*).

  • id string Required

    A unique document identifier.

Query parameters

  • The node or shard the operation should be performed on. By default, the operation is randomized between the shard replicas.

    If it is set to _local, the operation will prefer to be run on a local allocated shard when possible. If it is set to a custom value, the value is used to guarantee that the same shards will be used for the same custom value. This can help with "jumping values" when hitting different shards in different refresh states. A sample value can be something like the web session ID or the user name.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes the relevant shards before retrieving the document. Setting it to true should be done after careful thought and verification that this does not cause a heavy load on the system (and slow down indexing).

  • routing string

    A custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or lists the fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter. If the _source parameter is false, this parameter is ignored.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • stored_fields string | array[string]

    A comma-separated list of stored fields to return as part of a hit. If no fields are specified, no stored fields are included in the response. If this field is specified, the _source parameter defaults to false.

  • version number

    Explicit version number for concurrency control. The specified version must match the current version of the document for the request to succeed.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

Responses

HEAD /{index}/_doc/{id}
HEAD my-index-000001/_doc/0
curl -I "localhost:9200/my-index-000001/_doc/0?pretty"
const response = await client.exists({
  index: "my-index-000001",
  id: 0,
});
console.log(response);
resp = client.exists(
  index="my-index-000001",
  id="0",
)
print(resp)
response = client.exists(
  index: 'my-index-000001',
  id: 0
)
puts response

Delete documents Added in 5.0.0

POST /{index}/_delete_by_query

Deletes documents that match the specified query.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or alias:

  • read
  • delete or write

You can specify the query criteria in the request URI or the request body using the same syntax as the search API. When you submit a delete by query request, Elasticsearch gets a snapshot of the data stream or index when it begins processing the request and deletes matching documents using internal versioning. If a document changes between the time that the snapshot is taken and the delete operation is processed, it results in a version conflict and the delete operation fails.

NOTE: Documents with a version equal to 0 cannot be deleted using delete by query because internal versioning does not support 0 as a valid version number.

While processing a delete by query request, Elasticsearch performs multiple search requests sequentially to find all of the matching documents to delete. A bulk delete request is performed for each batch of matching documents. If a search or bulk request is rejected, the requests are retried up to 10 times, with exponential back off. If the maximum retry limit is reached, processing halts and all failed requests are returned in the response. Any delete requests that completed successfully still stick, they are not rolled back.

You can opt to count version conflicts instead of halting and returning by setting conflicts to proceed. Note that if you opt to count version conflicts the operation could attempt to delete more documents from the source than max_docs until it has successfully deleted max_docs documents, or it has gone through every document in the source query.

Throttling delete requests

To control the rate at which delete by query issues batches of delete operations, you can set requests_per_second to any positive decimal number. This pads each batch with a wait time to throttle the rate. Set requests_per_second to -1 to disable throttling.

Throttling uses a wait time between batches so that the internal scroll requests can be given a timeout that takes the request padding into account. The padding time is the difference between the batch size divided by the requests_per_second and the time spent writing. By default the batch size is 1000, so if requests_per_second is set to 500:

target_time = 1000 / 500 per second = 2 seconds
wait_time = target_time - write_time = 2 seconds - .5 seconds = 1.5 seconds

Since the batch is issued as a single _bulk request, large batch sizes cause Elasticsearch to create many requests and wait before starting the next set. This is "bursty" instead of "smooth".

Slicing

Delete by query supports sliced scroll to parallelize the delete process. This can improve efficiency and provide a convenient way to break the request down into smaller parts.

Setting slices to auto lets Elasticsearch choose the number of slices to use. This setting will use one slice per shard, up to a certain limit. If there are multiple source data streams or indices, it will choose the number of slices based on the index or backing index with the smallest number of shards. Adding slices to the delete by query operation creates sub-requests which means it has some quirks:

  • You can see these requests in the tasks APIs. These sub-requests are "child" tasks of the task for the request with slices.
  • Fetching the status of the task for the request with slices only contains the status of completed slices.
  • These sub-requests are individually addressable for things like cancellation and rethrottling.
  • Rethrottling the request with slices will rethrottle the unfinished sub-request proportionally.
  • Canceling the request with slices will cancel each sub-request.
  • Due to the nature of slices each sub-request won't get a perfectly even portion of the documents. All documents will be addressed, but some slices may be larger than others. Expect larger slices to have a more even distribution.
  • Parameters like requests_per_second and max_docs on a request with slices are distributed proportionally to each sub-request. Combine that with the earlier point about distribution being uneven and you should conclude that using max_docs with slices might not result in exactly max_docs documents being deleted.
  • Each sub-request gets a slightly different snapshot of the source data stream or index though these are all taken at approximately the same time.

If you're slicing manually or otherwise tuning automatic slicing, keep in mind that:

  • Query performance is most efficient when the number of slices is equal to the number of shards in the index or backing index. If that number is large (for example, 500), choose a lower number as too many slices hurts performance. Setting slices higher than the number of shards generally does not improve efficiency and adds overhead.
  • Delete performance scales linearly across available resources with the number of slices.

Whether query or delete performance dominates the runtime depends on the documents being reindexed and cluster resources.

Cancel a delete by query operation

Any delete by query can be canceled using the task cancel API. For example:

POST _tasks/r1A2WoRbTwKZ516z6NEs5A:36619/_cancel

The task ID can be found by using the get tasks API.

Cancellation should happen quickly but might take a few seconds. The get task status API will continue to list the delete by query task until this task checks that it has been cancelled and terminates itself.

Path parameters

  • index string | array[string] Required

    A comma-separated list of data streams, indices, and aliases to search. It supports wildcards (*). To search all data streams or indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • analyzer string

    Analyzer to use for the query string. This parameter can be used only when the q query string parameter is specified.

  • If true, wildcard and prefix queries are analyzed. This parameter can be used only when the q query string parameter is specified.

  • What to do if delete by query hits version conflicts: abort or proceed.

    Supported values include:

    • abort: Stop reindexing if there are conflicts.
    • proceed: Continue reindexing even if there are conflicts.

    Values are abort or proceed.

  • The default operator for query string query: AND or OR. This parameter can be used only when the q query string parameter is specified.

    Values are and, AND, or, or OR.

  • df string

    The field to use as default where no field prefix is given in the query string. This parameter can be used only when the q query string parameter is specified.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • from number

    Skips the specified number of documents.

  • If false, the request returns an error if it targets a missing or closed index.

  • lenient boolean

    If true, format-based query failures (such as providing text to a numeric field) in the query string will be ignored. This parameter can be used only when the q query string parameter is specified.

  • max_docs number

    The maximum number of documents to process. Defaults to all documents. When set to a value less then or equal to scroll_size, a scroll will not be used to retrieve the results for the operation.

  • The node or shard the operation should be performed on. It is random by default.

  • refresh boolean

    If true, Elasticsearch refreshes all shards involved in the delete by query after the request completes. This is different than the delete API's refresh parameter, which causes just the shard that received the delete request to be refreshed. Unlike the delete API, it does not support wait_for.

  • If true, the request cache is used for this request. Defaults to the index-level setting.

  • The throttle for this request in sub-requests per second.

  • routing string

    A custom value used to route operations to a specific shard.

  • q string

    A query in the Lucene query string syntax.

  • scroll string

    The period to retain the search context for scrolling.

  • The size of the scroll request that powers the operation.

  • The explicit timeout for each search request. It defaults to no timeout.

  • The type of the search operation. Available options include query_then_fetch and dfs_query_then_fetch.

    Supported values include:

    • query_then_fetch: Documents are scored using local term and document frequencies for the shard. This is usually faster but less accurate.
    • dfs_query_then_fetch: Documents are scored using global term and document frequencies across all shards. This is usually slower but more accurate.

    Values are query_then_fetch or dfs_query_then_fetch.

  • slices number | string

    The number of slices this task should be divided into.

  • sort array[string]

    A comma-separated list of <field>:<direction> pairs.

  • stats array[string]

    The specific tag of the request for logging and statistical purposes.

  • The maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting.

    Use with caution. Elasticsearch applies this parameter to each shard handling the request. When possible, let Elasticsearch perform early termination automatically. Avoid specifying this parameter for requests that target data streams with backing indices across multiple data tiers.

  • timeout string

    The period each deletion request waits for active shards.

  • version boolean

    If true, returns the document version as part of a hit.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The timeout value controls how long each write request waits for unavailable shards to become available.

  • If true, the request blocks until the operation is complete. If false, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at .tasks/task/${taskId}. When you are done with a task, you should delete the task document so Elasticsearch can reclaim the space.

application/json

Body Required

  • max_docs number

    The maximum number of documents to delete.

  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • slice object
    Hide slice attributes Show slice attributes object
    • field string

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • id string Required
    • max number Required

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • batches number

      The number of scroll responses pulled back by the delete by query.

    • deleted number

      The number of documents that were successfully deleted.

    • failures array[object]

      An array of failures if there were any unrecoverable errors during the process. If this array is not empty, the request ended abnormally because of those failures. Delete by query is implemented using batches and any failures cause the entire process to end but all failures in the current batch are collected into the array. You can use the conflicts option to prevent reindex from ending on version conflicts.

      Hide failures attributes Show failures attributes object
    • noops number

      This field is always equal to zero for delete by query. It exists only so that delete by query, update by query, and reindex APIs return responses with the same structure.

    • The number of requests per second effectively run during the delete by query.

    • retries object
      Hide retries attributes Show retries attributes object
      • bulk number Required

        The number of bulk actions retried.

    • slice_id number
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Time unit for milliseconds

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Time unit for milliseconds

    • timed_out boolean

      If true, some requests run during the delete by query operation timed out.

    • took number

      Time unit for milliseconds

    • total number

      The number of documents that were successfully processed.

    • The number of version conflicts that the delete by query hit.

POST /{index}/_delete_by_query
curl \
 --request POST 'http://api.example.com/{index}/_delete_by_query' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"query\": {\n    \"match_all\": {}\n  }\n}"'
Run `POST /my-index-000001,my-index-000002/_delete_by_query` to delete all documents from multiple data streams or indices.
{
  "query": {
    "match_all": {}
  }
}
Run `POST my-index-000001/_delete_by_query` to delete a document by using a unique attribute.
{
  "query": {
    "term": {
      "user.id": "kimchy"
    }
  },
  "max_docs": 1
}
Run `POST my-index-000001/_delete_by_query` to slice a delete by query manually. Provide a slice ID and total number of slices.
{
  "slice": {
    "id": 0,
    "max": 2
  },
  "query": {
    "range": {
      "http.response.bytes": {
        "lt": 2000000
      }
    }
  }
}
Run `POST my-index-000001/_delete_by_query?refresh&slices=5` to let delete by query automatically parallelize using sliced scroll to slice on `_id`. The `slices` query parameter value specifies the number of slices to use.
{
  "query": {
    "range": {
      "http.response.bytes": {
        "lt": 2000000
      }
    }
  }
}
Response examples (200)
A successful response from `POST /my-index-000001/_delete_by_query`.
{
  "took" : 147,
  "timed_out": false,
  "total": 119,
  "deleted": 119,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1.0,
  "throttled_until_millis": 0,
  "failures" : [ ]
}

Get a document's source

GET /{index}/_source/{id}

Get the source of a document. For example:

GET my-index-000001/_source/1

You can use the source filtering parameters to control which parts of the _source are returned:

GET my-index-000001/_source/1/?_source_includes=*.id&_source_excludes=entities
External documentation

Path parameters

  • index string Required

    The name of the index that contains the document.

  • id string Required

    A unique document identifier.

Query parameters

  • The node or shard the operation should be performed on. By default, the operation is randomized between the shard replicas.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes the relevant shards before retrieving the document. Setting it to true should be done after careful thought and verification that this does not cause a heavy load on the system (and slow down indexing).

  • routing string

    A custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or lists the fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude in the response.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response.

  • stored_fields string | array[string]

    A comma-separated list of stored fields to return as part of a hit.

  • version number

    The version number for concurrency control. It must match the current version of the document for the request to succeed.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

Responses

GET /{index}/_source/{id}
curl \
 --request GET 'http://api.example.com/{index}/_source/{id}' \
 --header "Authorization: $API_KEY"

Check for a document source Added in 5.4.0

HEAD /{index}/_source/{id}

Check whether a document source exists in an index. For example:

HEAD my-index-000001/_source/1

A document's source is not available if it is disabled in the mapping.

External documentation

Path parameters

  • index string Required

    A comma-separated list of data streams, indices, and aliases. It supports wildcards (*).

  • id string Required

    A unique identifier for the document.

Query parameters

  • The node or shard the operation should be performed on. By default, the operation is randomized between the shard replicas.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes the relevant shards before retrieving the document. Setting it to true should be done after careful thought and verification that this does not cause a heavy load on the system (and slow down indexing).

  • routing string

    A custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or lists the fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude in the response.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response.

  • version number

    The version number for concurrency control. It must match the current version of the document for the request to succeed.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

Responses

HEAD /{index}/_source/{id}
curl \
 --request HEAD 'http://api.example.com/{index}/_source/{id}' \
 --header "Authorization: $API_KEY"

Create or update a document in an index

POST /{index}/_doc

Add a JSON document to the specified data stream or index and make it searchable. If the target is an index and the document already exists, the request updates the document and increments its version.

NOTE: You cannot use this API to send update requests for existing documents in a data stream.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To add or overwrite a document using the PUT /<target>/_doc/<_id> request format, you must have the create, index, or write index privilege.
  • To add a document using the POST /<target>/_doc/ request format, you must have the create_doc, create, index, or write index privilege.
  • To automatically create a data stream or index with this API request, you must have the auto_configure, create_index, or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

NOTE: Replica shards might not all be started when an indexing operation returns successfully. By default, only the primary is required. Set wait_for_active_shards to change this default behavior.

Automatically create data streams and indices

If the request's target doesn't exist and matches an index template with a data_stream definition, the index operation automatically creates the data stream.

If the target doesn't exist and doesn't match a data stream template, the operation automatically creates the index and applies any matching index templates.

NOTE: Elasticsearch includes several built-in index templates. To avoid naming collisions with these templates, refer to index pattern documentation.

If no mapping exists, the index operation creates a dynamic mapping. By default, new fields and objects are automatically added to the mapping if needed.

Automatic index creation is controlled by the action.auto_create_index setting. If it is true, any index can be created automatically. You can modify this setting to explicitly allow or block automatic creation of indices that match specified patterns or set it to false to turn off automatic index creation entirely. Specify a comma-separated list of patterns you want to allow or prefix each pattern with + or - to indicate whether it should be allowed or blocked. When a list is specified, the default behaviour is to disallow.

NOTE: The action.auto_create_index setting affects the automatic creation of indices only. It does not affect the creation of data streams.

Optimistic concurrency control

Index operations can be made conditional and only be performed if the last modification to the document was assigned the sequence number and primary term specified by the if_seq_no and if_primary_term parameters. If a mismatch is detected, the operation will result in a VersionConflictException and a status code of 409.

Routing

By default, shard placement — or routing — is controlled by using a hash of the document's ID value. For more explicit control, the value fed into the hash function used by the router can be directly specified on a per-operation basis using the routing parameter.

When setting up explicit mapping, you can also use the _routing field to direct the index operation to extract the routing value from the document itself. This does come at the (very minimal) cost of an additional document parsing pass. If the _routing mapping is defined and set to be required, the index operation will fail if no routing value is provided or extracted.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Distributed

The index operation is directed to the primary shard based on its route and performed on the actual node containing this shard. After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.

Active shards

To improve the resiliency of writes to the system, indexing operations can be configured to wait for a certain number of active shard copies before proceeding with the operation. If the requisite number of active shard copies are not available, then the write operation must wait and retry, until either the requisite shard copies have started or a timeout occurs. By default, write operations only wait for the primary shards to be active before proceeding (that is to say wait_for_active_shards is 1). This default can be overridden in the index settings dynamically by setting index.write.wait_for_active_shards. To alter this behavior per operation, use the wait_for_active_shards request parameter.

Valid values are all or any positive integer up to the total number of configured copies per shard in the index (which is number_of_replicas+1). Specifying a negative value or a number greater than the number of shard copies will throw an error.

For example, suppose you have a cluster of three nodes, A, B, and C and you create an index index with the number of replicas set to 3 (resulting in 4 shard copies, one more copy than there are nodes). If you attempt an indexing operation, by default the operation will only ensure the primary copy of each shard is available before proceeding. This means that even if B and C went down and A hosted the primary shard copies, the indexing operation would still proceed with only one copy of the data. If wait_for_active_shards is set on the request to 3 (and all three nodes are up), the indexing operation will require 3 active shard copies before proceeding. This requirement should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard. However, if you set wait_for_active_shards to all (or to 4, which is the same in this situation), the indexing operation will not proceed as you do not have all 4 copies of each shard active in the index. The operation will timeout unless a new node is brought up in the cluster to host the fourth copy of the shard.

It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation starts. After the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary. The _shards section of the API response reveals the number of shard copies on which replication succeeded and failed.

No operation (noop) updates

When updating a document by using this API, a new version of the document is always created even if the document hasn't changed. If this isn't acceptable use the _update API with detect_noop set to true. The detect_noop option isn't available on this API because it doesn’t fetch the old source and isn't able to compare it against the new source.

There isn't a definitive rule for when noop updates aren't acceptable. It's a combination of lots of factors like how frequently your data source sends updates that are actually noops and how many queries per second Elasticsearch runs on the shard receiving the updates.

Versioning

Each indexed document is given a version number. By default, internal versioning is used that starts at 1 and increments with each update, deletes included. Optionally, the version number can be set to an external value (for example, if maintained in a database). To enable this functionality, version_type should be set to external. The value provided must be a numeric, long value greater than or equal to 0, and less than around 9.2e+18.

NOTE: Versioning is completely real time, and is not affected by the near real time aspects of search operations. If no version is provided, the operation runs without any version checks.

When using the external version type, the system checks to see if the version number passed to the index request is greater than the version of the currently stored document. If true, the document will be indexed and the new version number used. If the value provided is less than or equal to the stored document's version number, a version conflict will occur and the index operation will fail. For example:

PUT my-index-000001/_doc/1?version=2&version_type=external
{
  "user": {
    "id": "elkbee"
  }
}

In this example, the operation will succeed since the supplied version of 2 is higher than the current document version of 1.
If the document was already updated and its version was set to 2 or higher, the indexing command will fail and result in a conflict (409 HTTP status code).

A nice side effect is that there is no need to maintain strict ordering of async indexing operations run as a result of changes to a source database, as long as version numbers from the source database are used.
Even the simple case of updating the Elasticsearch index using data from a database is simplified if external versioning is used, as only the latest version will be used if the index operations arrive out of order.
External documentation

Path parameters

  • index string Required

    The name of the data stream or index to target. If the target doesn't exist and matches the name or wildcard (*) pattern of an index template with a data_stream definition, this request creates the data stream. If the target doesn't exist and doesn't match a data stream template, this request creates the index. You can check for existing targets with the resolve index API.

Query parameters

  • Only perform the operation if the document has this primary term.

  • Only perform the operation if the document has this sequence number.

  • True or false if to include the document source in the error message in case of parsing errors.

  • op_type string

    Set to create to only index the document if it does not already exist (put if absent). If a document with the specified _id already exists, the indexing operation will fail. The behavior is the same as using the <index>/_create endpoint. If a document ID is specified, this paramater defaults to index. Otherwise, it defaults to create. If the request targets a data stream, an op_type of create is required.

    Supported values include:

    • index: Overwrite any documents that already exist.
    • create: Only index documents that do not already exist.

    Values are index or create.

  • pipeline string

    The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, then setting the value to _none disables the default ingest pipeline for this request. If a final pipeline is configured it will always run, regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, it waits for a refresh to make this operation visible to search. If false, it does nothing with refreshes.

    Values are true, false, or wait_for.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • timeout string

    The period the request waits for the following operations: automatic index creation, dynamic mapping updates, waiting for active shards.

    This parameter is useful for situations where the primary shard assigned to perform the operation might not be available when the operation runs. Some reasons for this might be that the primary shard is currently recovering from a gateway or undergoing relocation. By default, the operation will wait on the primary shard to become available for at least 1 minute before failing and responding with an error. The actual wait time could be longer, particularly when multiple waits occur.

  • version number

    An explicit version number for concurrency control. It must be a non-negative long number.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. You can set it to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value of 1 means it waits for each primary shard to be active.

  • If true, the destination must be an index alias.

application/json

Body Required

object object

Responses

POST /{index}/_doc
curl \
 --request POST 'http://api.example.com/{index}/_doc' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"@timestamp\": \"2099-11-15T13:12:00\",\n  \"message\": \"GET /search HTTP/1.1 200 1070000\",\n  \"user\": {\n    \"id\": \"kimchy\"\n  }\n}"'
Request examples
Run `POST my-index-000001/_doc/` to index a document. When you use the `POST /<target>/_doc/` request format, the `op_type` is automatically set to `create` and the index operation generates a unique ID for the document.
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
Run `PUT my-index-000001/_doc/1` to insert a JSON document into the `my-index-000001` index with an `_id` of 1.
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}
Response examples (200)
A successful response from `POST my-index-000001/_doc/`, which contains an automated document ID.
{
  "_shards": {
    "total": 2,
    "failed": 0,
    "successful": 2
  },
  "_index": "my-index-000001",
  "_id": "W0tpsmIBdwcYyG50zbta",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "result": "created"
}
A successful response from `PUT my-index-000001/_doc/1`.
{
  "_shards": {
    "total": 2,
    "failed": 0,
    "successful": 2
  },
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "result": "created"
}

Get multiple documents Added in 1.3.0

GET /_mget

Get multiple JSON documents by ID from one or more indices. If you specify an index in the request URI, you only need to specify the document IDs in the request body. To ensure fast responses, this multi get (mget) API responds with partial results if one or more shards fail.

Filter source fields

By default, the _source field is returned for every document (if stored). Use the _source and _source_include or source_exclude attributes to filter what fields are returned for a particular document. You can include the _source, _source_includes, and _source_excludes query parameters in the request URI to specify the defaults to use when there are no per-document instructions.

Get stored fields

Use the stored_fields attribute to specify the set of stored fields you want to retrieve. Any requested fields that are not stored are ignored. You can include the stored_fields query parameter in the request URI to specify the defaults to use when there are no per-document instructions.

Query parameters

  • Specifies the node or shard the operation should be performed on. Random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes relevant shards before retrieving documents.

  • routing string

    Custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    True or false to return the _source field or not, or a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • stored_fields string | array[string]

    If true, retrieves the document fields stored in the index rather than the document _source.

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • docs array[object] Required

      The response includes a docs array that contains the documents in the order specified in the request. The structure of the returned documents is similar to that returned by the get API. If there is a failure getting a particular document, the error is included in place of the document.

      One of:
      Hide attributes Show attributes
      • _index string Required
      • fields object

        If the stored_fields parameter is set to true and found is true, it contains the document fields stored in the index.

        Hide fields attribute Show fields attribute object
        • * object Additional properties
      • _ignored array[string]
      • found boolean Required

        Indicates whether the document exists.

      • _id string Required
      • The primary term assigned to the document for the indexing operation.

      • _routing string

        The explicit routing, if set.

      • _seq_no number
      • _source object

        If found is true, it contains the document data formatted in JSON. If the _source parameter is set to false or the stored_fields parameter is set to true, it is excluded.

      • _version number
GET /_mget
curl \
 --request GET 'http://api.example.com/_mget' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"docs\": [\n    {\n      \"_id\": \"1\"\n    },\n    {\n      \"_id\": \"2\"\n    }\n  ]\n}"'
Run `GET /my-index-000001/_mget`. When you specify an index in the request URI, only the document IDs are required in the request body.
{
  "docs": [
    {
      "_id": "1"
    },
    {
      "_id": "2"
    }
  ]
}
Run `GET /_mget`. This request sets `_source` to `false` for document 1 to exclude the source entirely. It retrieves `field3` and `field4` from document 2. It retrieves the `user` field from document 3 but filters out the `user.location` field.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "_source": false
    },
    {
      "_index": "test",
      "_id": "2",
      "_source": [ "field3", "field4" ]
    },
    {
      "_index": "test",
      "_id": "3",
      "_source": {
        "include": [ "user" ],
        "exclude": [ "user.location" ]
      }
    }
  ]
}
Run `GET /_mget`. This request retrieves `field1` and `field2` from document 1 and `field3` and `field4` from document 2.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "stored_fields": [ "field1", "field2" ]
    },
    {
      "_index": "test",
      "_id": "2",
      "stored_fields": [ "field3", "field4" ]
    }
  ]
}
Run `GET /_mget?routing=key1`. If routing is used during indexing, you need to specify the routing value to retrieve documents. This request fetches `test/_doc/2` from the shard corresponding to routing key `key1`. It fetches `test/_doc/1` from the shard corresponding to routing key `key2`.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "routing": "key2"
    },
    {
      "_index": "test",
      "_id": "2"
    }
  ]
}

Get multiple documents Added in 1.3.0

POST /_mget

Get multiple JSON documents by ID from one or more indices. If you specify an index in the request URI, you only need to specify the document IDs in the request body. To ensure fast responses, this multi get (mget) API responds with partial results if one or more shards fail.

Filter source fields

By default, the _source field is returned for every document (if stored). Use the _source and _source_include or source_exclude attributes to filter what fields are returned for a particular document. You can include the _source, _source_includes, and _source_excludes query parameters in the request URI to specify the defaults to use when there are no per-document instructions.

Get stored fields

Use the stored_fields attribute to specify the set of stored fields you want to retrieve. Any requested fields that are not stored are ignored. You can include the stored_fields query parameter in the request URI to specify the defaults to use when there are no per-document instructions.

Query parameters

  • Specifies the node or shard the operation should be performed on. Random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes relevant shards before retrieving documents.

  • routing string

    Custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    True or false to return the _source field or not, or a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • stored_fields string | array[string]

    If true, retrieves the document fields stored in the index rather than the document _source.

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • docs array[object] Required

      The response includes a docs array that contains the documents in the order specified in the request. The structure of the returned documents is similar to that returned by the get API. If there is a failure getting a particular document, the error is included in place of the document.

      One of:
      Hide attributes Show attributes
      • _index string Required
      • fields object

        If the stored_fields parameter is set to true and found is true, it contains the document fields stored in the index.

        Hide fields attribute Show fields attribute object
        • * object Additional properties
      • _ignored array[string]
      • found boolean Required

        Indicates whether the document exists.

      • _id string Required
      • The primary term assigned to the document for the indexing operation.

      • _routing string

        The explicit routing, if set.

      • _seq_no number
      • _source object

        If found is true, it contains the document data formatted in JSON. If the _source parameter is set to false or the stored_fields parameter is set to true, it is excluded.

      • _version number
POST /_mget
curl \
 --request POST 'http://api.example.com/_mget' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"docs\": [\n    {\n      \"_id\": \"1\"\n    },\n    {\n      \"_id\": \"2\"\n    }\n  ]\n}"'
Run `GET /my-index-000001/_mget`. When you specify an index in the request URI, only the document IDs are required in the request body.
{
  "docs": [
    {
      "_id": "1"
    },
    {
      "_id": "2"
    }
  ]
}
Run `GET /_mget`. This request sets `_source` to `false` for document 1 to exclude the source entirely. It retrieves `field3` and `field4` from document 2. It retrieves the `user` field from document 3 but filters out the `user.location` field.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "_source": false
    },
    {
      "_index": "test",
      "_id": "2",
      "_source": [ "field3", "field4" ]
    },
    {
      "_index": "test",
      "_id": "3",
      "_source": {
        "include": [ "user" ],
        "exclude": [ "user.location" ]
      }
    }
  ]
}
Run `GET /_mget`. This request retrieves `field1` and `field2` from document 1 and `field3` and `field4` from document 2.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "stored_fields": [ "field1", "field2" ]
    },
    {
      "_index": "test",
      "_id": "2",
      "stored_fields": [ "field3", "field4" ]
    }
  ]
}
Run `GET /_mget?routing=key1`. If routing is used during indexing, you need to specify the routing value to retrieve documents. This request fetches `test/_doc/2` from the shard corresponding to routing key `key1`. It fetches `test/_doc/1` from the shard corresponding to routing key `key2`.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "routing": "key2"
    },
    {
      "_index": "test",
      "_id": "2"
    }
  ]
}

Get multiple documents Added in 1.3.0

GET /{index}/_mget

Get multiple JSON documents by ID from one or more indices. If you specify an index in the request URI, you only need to specify the document IDs in the request body. To ensure fast responses, this multi get (mget) API responds with partial results if one or more shards fail.

Filter source fields

By default, the _source field is returned for every document (if stored). Use the _source and _source_include or source_exclude attributes to filter what fields are returned for a particular document. You can include the _source, _source_includes, and _source_excludes query parameters in the request URI to specify the defaults to use when there are no per-document instructions.

Get stored fields

Use the stored_fields attribute to specify the set of stored fields you want to retrieve. Any requested fields that are not stored are ignored. You can include the stored_fields query parameter in the request URI to specify the defaults to use when there are no per-document instructions.

Path parameters

  • index string Required

    Name of the index to retrieve documents from when ids are specified, or when a document in the docs array does not specify an index.

Query parameters

  • Specifies the node or shard the operation should be performed on. Random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes relevant shards before retrieving documents.

  • routing string

    Custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    True or false to return the _source field or not, or a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • stored_fields string | array[string]

    If true, retrieves the document fields stored in the index rather than the document _source.

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • docs array[object] Required

      The response includes a docs array that contains the documents in the order specified in the request. The structure of the returned documents is similar to that returned by the get API. If there is a failure getting a particular document, the error is included in place of the document.

      One of:
      Hide attributes Show attributes
      • _index string Required
      • fields object

        If the stored_fields parameter is set to true and found is true, it contains the document fields stored in the index.

        Hide fields attribute Show fields attribute object
        • * object Additional properties
      • _ignored array[string]
      • found boolean Required

        Indicates whether the document exists.

      • _id string Required
      • The primary term assigned to the document for the indexing operation.

      • _routing string

        The explicit routing, if set.

      • _seq_no number
      • _source object

        If found is true, it contains the document data formatted in JSON. If the _source parameter is set to false or the stored_fields parameter is set to true, it is excluded.

      • _version number
GET /{index}/_mget
curl \
 --request GET 'http://api.example.com/{index}/_mget' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"docs\": [\n    {\n      \"_id\": \"1\"\n    },\n    {\n      \"_id\": \"2\"\n    }\n  ]\n}"'
Run `GET /my-index-000001/_mget`. When you specify an index in the request URI, only the document IDs are required in the request body.
{
  "docs": [
    {
      "_id": "1"
    },
    {
      "_id": "2"
    }
  ]
}
Run `GET /_mget`. This request sets `_source` to `false` for document 1 to exclude the source entirely. It retrieves `field3` and `field4` from document 2. It retrieves the `user` field from document 3 but filters out the `user.location` field.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "_source": false
    },
    {
      "_index": "test",
      "_id": "2",
      "_source": [ "field3", "field4" ]
    },
    {
      "_index": "test",
      "_id": "3",
      "_source": {
        "include": [ "user" ],
        "exclude": [ "user.location" ]
      }
    }
  ]
}
Run `GET /_mget`. This request retrieves `field1` and `field2` from document 1 and `field3` and `field4` from document 2.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "stored_fields": [ "field1", "field2" ]
    },
    {
      "_index": "test",
      "_id": "2",
      "stored_fields": [ "field3", "field4" ]
    }
  ]
}
Run `GET /_mget?routing=key1`. If routing is used during indexing, you need to specify the routing value to retrieve documents. This request fetches `test/_doc/2` from the shard corresponding to routing key `key1`. It fetches `test/_doc/1` from the shard corresponding to routing key `key2`.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "routing": "key2"
    },
    {
      "_index": "test",
      "_id": "2"
    }
  ]
}

Get multiple documents Added in 1.3.0

POST /{index}/_mget

Get multiple JSON documents by ID from one or more indices. If you specify an index in the request URI, you only need to specify the document IDs in the request body. To ensure fast responses, this multi get (mget) API responds with partial results if one or more shards fail.

Filter source fields

By default, the _source field is returned for every document (if stored). Use the _source and _source_include or source_exclude attributes to filter what fields are returned for a particular document. You can include the _source, _source_includes, and _source_excludes query parameters in the request URI to specify the defaults to use when there are no per-document instructions.

Get stored fields

Use the stored_fields attribute to specify the set of stored fields you want to retrieve. Any requested fields that are not stored are ignored. You can include the stored_fields query parameter in the request URI to specify the defaults to use when there are no per-document instructions.

Path parameters

  • index string Required

    Name of the index to retrieve documents from when ids are specified, or when a document in the docs array does not specify an index.

Query parameters

  • Specifies the node or shard the operation should be performed on. Random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes relevant shards before retrieving documents.

  • routing string

    Custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    True or false to return the _source field or not, or a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • stored_fields string | array[string]

    If true, retrieves the document fields stored in the index rather than the document _source.

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • docs array[object] Required

      The response includes a docs array that contains the documents in the order specified in the request. The structure of the returned documents is similar to that returned by the get API. If there is a failure getting a particular document, the error is included in place of the document.

      One of:
      Hide attributes Show attributes
      • _index string Required
      • fields object

        If the stored_fields parameter is set to true and found is true, it contains the document fields stored in the index.

        Hide fields attribute Show fields attribute object
        • * object Additional properties
      • _ignored array[string]
      • found boolean Required

        Indicates whether the document exists.

      • _id string Required
      • The primary term assigned to the document for the indexing operation.

      • _routing string

        The explicit routing, if set.

      • _seq_no number
      • _source object

        If found is true, it contains the document data formatted in JSON. If the _source parameter is set to false or the stored_fields parameter is set to true, it is excluded.

      • _version number
POST /{index}/_mget
curl \
 --request POST 'http://api.example.com/{index}/_mget' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"docs\": [\n    {\n      \"_id\": \"1\"\n    },\n    {\n      \"_id\": \"2\"\n    }\n  ]\n}"'
Run `GET /my-index-000001/_mget`. When you specify an index in the request URI, only the document IDs are required in the request body.
{
  "docs": [
    {
      "_id": "1"
    },
    {
      "_id": "2"
    }
  ]
}
Run `GET /_mget`. This request sets `_source` to `false` for document 1 to exclude the source entirely. It retrieves `field3` and `field4` from document 2. It retrieves the `user` field from document 3 but filters out the `user.location` field.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "_source": false
    },
    {
      "_index": "test",
      "_id": "2",
      "_source": [ "field3", "field4" ]
    },
    {
      "_index": "test",
      "_id": "3",
      "_source": {
        "include": [ "user" ],
        "exclude": [ "user.location" ]
      }
    }
  ]
}
Run `GET /_mget`. This request retrieves `field1` and `field2` from document 1 and `field3` and `field4` from document 2.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "stored_fields": [ "field1", "field2" ]
    },
    {
      "_index": "test",
      "_id": "2",
      "stored_fields": [ "field3", "field4" ]
    }
  ]
}
Run `GET /_mget?routing=key1`. If routing is used during indexing, you need to specify the routing value to retrieve documents. This request fetches `test/_doc/2` from the shard corresponding to routing key `key1`. It fetches `test/_doc/1` from the shard corresponding to routing key `key2`.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "routing": "key2"
    },
    {
      "_index": "test",
      "_id": "2"
    }
  ]
}

Get multiple term vectors

GET /_mtermvectors

Get multiple term vectors with a single request. You can specify existing documents by index and ID or provide artificial documents in the body of the request. You can specify the index in the request body or request URI. The response contains a docs array with all the fetched termvectors. Each element has the structure provided by the termvectors API.

Artificial documents

You can also use mtermvectors to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified _index.

Query parameters

  • ids array[string]

    A comma-separated list of documents ids. You must define ids as parameter or set "ids" or "docs" in the request body

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.

  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value used to route operations to a specific shard.

  • If true, the response includes term frequency and document frequency.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • docs array[object]

    An array of existing or artificial documents.

    Hide docs attributes Show docs attributes object
    • _id string
    • _index string
    • doc object

      An artificial document (a document not present in the index) for which you want to retrieve term vectors.

    • fields string | array[string]
    • If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.

    • filter object
      Hide filter attributes Show filter attributes object
      • Ignore words which occur in more than this many docs. Defaults to unbounded.

      • The maximum number of terms that must be returned per field.

      • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

      • The maximum word length above which words will be ignored. Defaults to unbounded.

      • Ignore terms which do not occur in at least this many docs.

      • Ignore words with less than this frequency in the source doc.

      • The minimum word length below which words will be ignored.

    • offsets boolean

      If true, the response includes term offsets.

    • payloads boolean

      If true, the response includes term payloads.

    • positions boolean

      If true, the response includes term positions.

    • routing string
    • If true, the response includes term frequency and document frequency.

    • version number
    • Values are internal, external, external_gte, or force.

  • ids array[string]

    A simplified syntax to specify documents by their ID if they're in the same index.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
GET /_mtermvectors
curl \
 --request GET 'http://api.example.com/_mtermvectors' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"docs\": [\n      {\n        \"_id\": \"2\",\n        \"fields\": [\n            \"message\"\n        ],\n        \"term_statistics\": true\n      },\n      {\n        \"_id\": \"1\"\n      }\n  ]\n}"'
Run `POST /my-index-000001/_mtermvectors`. When you specify an index in the request URI, the index does not need to be specified for each documents in the request body.
{
  "docs": [
      {
        "_id": "2",
        "fields": [
            "message"
        ],
        "term_statistics": true
      },
      {
        "_id": "1"
      }
  ]
}
Run `POST /my-index-000001/_mtermvectors`. If all requested documents are in same index and the parameters are the same, you can use a simplified syntax.
{
  "ids": [ "1", "2" ],
  "fields": [
    "message"
  ],
  "term_statistics": true
}
Run `POST /_mtermvectors` to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified `_index`.
{
  "docs": [
      {
        "_index": "my-index-000001",
        "doc" : {
            "message" : "test test test"
        }
      },
      {
        "_index": "my-index-000001",
        "doc" : {
          "message" : "Another test ..."
        }
      }
  ]
}

Get multiple term vectors

POST /_mtermvectors

Get multiple term vectors with a single request. You can specify existing documents by index and ID or provide artificial documents in the body of the request. You can specify the index in the request body or request URI. The response contains a docs array with all the fetched termvectors. Each element has the structure provided by the termvectors API.

Artificial documents

You can also use mtermvectors to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified _index.

Query parameters

  • ids array[string]

    A comma-separated list of documents ids. You must define ids as parameter or set "ids" or "docs" in the request body

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.

  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value used to route operations to a specific shard.

  • If true, the response includes term frequency and document frequency.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • docs array[object]

    An array of existing or artificial documents.

    Hide docs attributes Show docs attributes object
    • _id string
    • _index string
    • doc object

      An artificial document (a document not present in the index) for which you want to retrieve term vectors.

    • fields string | array[string]
    • If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.

    • filter object
      Hide filter attributes Show filter attributes object
      • Ignore words which occur in more than this many docs. Defaults to unbounded.

      • The maximum number of terms that must be returned per field.

      • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

      • The maximum word length above which words will be ignored. Defaults to unbounded.

      • Ignore terms which do not occur in at least this many docs.

      • Ignore words with less than this frequency in the source doc.

      • The minimum word length below which words will be ignored.

    • offsets boolean

      If true, the response includes term offsets.

    • payloads boolean

      If true, the response includes term payloads.

    • positions boolean

      If true, the response includes term positions.

    • routing string
    • If true, the response includes term frequency and document frequency.

    • version number
    • Values are internal, external, external_gte, or force.

  • ids array[string]

    A simplified syntax to specify documents by their ID if they're in the same index.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
POST /_mtermvectors
curl \
 --request POST 'http://api.example.com/_mtermvectors' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"docs\": [\n      {\n        \"_id\": \"2\",\n        \"fields\": [\n            \"message\"\n        ],\n        \"term_statistics\": true\n      },\n      {\n        \"_id\": \"1\"\n      }\n  ]\n}"'
Run `POST /my-index-000001/_mtermvectors`. When you specify an index in the request URI, the index does not need to be specified for each documents in the request body.
{
  "docs": [
      {
        "_id": "2",
        "fields": [
            "message"
        ],
        "term_statistics": true
      },
      {
        "_id": "1"
      }
  ]
}
Run `POST /my-index-000001/_mtermvectors`. If all requested documents are in same index and the parameters are the same, you can use a simplified syntax.
{
  "ids": [ "1", "2" ],
  "fields": [
    "message"
  ],
  "term_statistics": true
}
Run `POST /_mtermvectors` to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified `_index`.
{
  "docs": [
      {
        "_index": "my-index-000001",
        "doc" : {
            "message" : "test test test"
        }
      },
      {
        "_index": "my-index-000001",
        "doc" : {
          "message" : "Another test ..."
        }
      }
  ]
}

Get multiple term vectors

GET /{index}/_mtermvectors

Get multiple term vectors with a single request. You can specify existing documents by index and ID or provide artificial documents in the body of the request. You can specify the index in the request body or request URI. The response contains a docs array with all the fetched termvectors. Each element has the structure provided by the termvectors API.

Artificial documents

You can also use mtermvectors to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified _index.

Path parameters

  • index string Required

    The name of the index that contains the documents.

Query parameters

  • ids array[string]

    A comma-separated list of documents ids. You must define ids as parameter or set "ids" or "docs" in the request body

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.

  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value used to route operations to a specific shard.

  • If true, the response includes term frequency and document frequency.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • docs array[object]

    An array of existing or artificial documents.

    Hide docs attributes Show docs attributes object
    • _id string
    • _index string
    • doc object

      An artificial document (a document not present in the index) for which you want to retrieve term vectors.

    • fields string | array[string]
    • If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.

    • filter object
      Hide filter attributes Show filter attributes object
      • Ignore words which occur in more than this many docs. Defaults to unbounded.

      • The maximum number of terms that must be returned per field.

      • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

      • The maximum word length above which words will be ignored. Defaults to unbounded.

      • Ignore terms which do not occur in at least this many docs.

      • Ignore words with less than this frequency in the source doc.

      • The minimum word length below which words will be ignored.

    • offsets boolean

      If true, the response includes term offsets.

    • payloads boolean

      If true, the response includes term payloads.

    • positions boolean

      If true, the response includes term positions.

    • routing string
    • If true, the response includes term frequency and document frequency.

    • version number
    • Values are internal, external, external_gte, or force.

  • ids array[string]

    A simplified syntax to specify documents by their ID if they're in the same index.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
GET /{index}/_mtermvectors
curl \
 --request GET 'http://api.example.com/{index}/_mtermvectors' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"docs\": [\n      {\n        \"_id\": \"2\",\n        \"fields\": [\n            \"message\"\n        ],\n        \"term_statistics\": true\n      },\n      {\n        \"_id\": \"1\"\n      }\n  ]\n}"'
Run `POST /my-index-000001/_mtermvectors`. When you specify an index in the request URI, the index does not need to be specified for each documents in the request body.
{
  "docs": [
      {
        "_id": "2",
        "fields": [
            "message"
        ],
        "term_statistics": true
      },
      {
        "_id": "1"
      }
  ]
}
Run `POST /my-index-000001/_mtermvectors`. If all requested documents are in same index and the parameters are the same, you can use a simplified syntax.
{
  "ids": [ "1", "2" ],
  "fields": [
    "message"
  ],
  "term_statistics": true
}
Run `POST /_mtermvectors` to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified `_index`.
{
  "docs": [
      {
        "_index": "my-index-000001",
        "doc" : {
            "message" : "test test test"
        }
      },
      {
        "_index": "my-index-000001",
        "doc" : {
          "message" : "Another test ..."
        }
      }
  ]
}

Get multiple term vectors

POST /{index}/_mtermvectors

Get multiple term vectors with a single request. You can specify existing documents by index and ID or provide artificial documents in the body of the request. You can specify the index in the request body or request URI. The response contains a docs array with all the fetched termvectors. Each element has the structure provided by the termvectors API.

Artificial documents

You can also use mtermvectors to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified _index.

Path parameters

  • index string Required

    The name of the index that contains the documents.

Query parameters

  • ids array[string]

    A comma-separated list of documents ids. You must define ids as parameter or set "ids" or "docs" in the request body

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.

  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value used to route operations to a specific shard.

  • If true, the response includes term frequency and document frequency.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • docs array[object]

    An array of existing or artificial documents.

    Hide docs attributes Show docs attributes object
    • _id string
    • _index string
    • doc object

      An artificial document (a document not present in the index) for which you want to retrieve term vectors.

    • fields string | array[string]
    • If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.

    • filter object
      Hide filter attributes Show filter attributes object
      • Ignore words which occur in more than this many docs. Defaults to unbounded.

      • The maximum number of terms that must be returned per field.

      • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

      • The maximum word length above which words will be ignored. Defaults to unbounded.

      • Ignore terms which do not occur in at least this many docs.

      • Ignore words with less than this frequency in the source doc.

      • The minimum word length below which words will be ignored.

    • offsets boolean

      If true, the response includes term offsets.

    • payloads boolean

      If true, the response includes term payloads.

    • positions boolean

      If true, the response includes term positions.

    • routing string
    • If true, the response includes term frequency and document frequency.

    • version number
    • Values are internal, external, external_gte, or force.

  • ids array[string]

    A simplified syntax to specify documents by their ID if they're in the same index.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
POST /{index}/_mtermvectors
curl \
 --request POST 'http://api.example.com/{index}/_mtermvectors' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"docs\": [\n      {\n        \"_id\": \"2\",\n        \"fields\": [\n            \"message\"\n        ],\n        \"term_statistics\": true\n      },\n      {\n        \"_id\": \"1\"\n      }\n  ]\n}"'
Run `POST /my-index-000001/_mtermvectors`. When you specify an index in the request URI, the index does not need to be specified for each documents in the request body.
{
  "docs": [
      {
        "_id": "2",
        "fields": [
            "message"
        ],
        "term_statistics": true
      },
      {
        "_id": "1"
      }
  ]
}
Run `POST /my-index-000001/_mtermvectors`. If all requested documents are in same index and the parameters are the same, you can use a simplified syntax.
{
  "ids": [ "1", "2" ],
  "fields": [
    "message"
  ],
  "term_statistics": true
}
Run `POST /_mtermvectors` to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified `_index`.
{
  "docs": [
      {
        "_index": "my-index-000001",
        "doc" : {
            "message" : "test test test"
        }
      },
      {
        "_index": "my-index-000001",
        "doc" : {
          "message" : "Another test ..."
        }
      }
  ]
}

Reindex documents Added in 2.3.0

POST /_reindex

Copy documents from a source to a destination. You can copy all documents to the destination index or reindex a subset of the documents. The source can be any existing index, alias, or data stream. The destination must differ from the source. For example, you cannot reindex a data stream into itself.

IMPORTANT: Reindex requires _source to be enabled for all documents in the source. The destination should be configured as wanted before calling the reindex API. Reindex does not copy the settings from the source or its associated template. Mappings, shard counts, and replicas, for example, must be configured ahead of time.

If the Elasticsearch security features are enabled, you must have the following security privileges:

  • The read index privilege for the source data stream, index, or alias.
  • The write index privilege for the destination data stream, index, or index alias.
  • To automatically create a data stream or index with a reindex API request, you must have the auto_configure, create_index, or manage index privilege for the destination data stream, index, or alias.
  • If reindexing from a remote cluster, the source.remote.user must have the monitor cluster privilege and the read index privilege for the source data stream, index, or alias.

If reindexing from a remote cluster, you must explicitly allow the remote host in the reindex.remote.whitelist setting. Automatic data stream creation requires a matching index template with data stream enabled.

The dest element can be configured like the index API to control optimistic concurrency control. Omitting version_type or setting it to internal causes Elasticsearch to blindly dump documents into the destination, overwriting any that happen to have the same ID.

Setting version_type to external causes Elasticsearch to preserve the version from the source, create any documents that are missing, and update any documents that have an older version in the destination than they do in the source.

Setting op_type to create causes the reindex API to create only missing documents in the destination. All existing documents will cause a version conflict.

IMPORTANT: Because data streams are append-only, any reindex request to a destination data stream must have an op_type of create. A reindex can only add new documents to a destination data stream. It cannot update existing documents in a destination data stream.

By default, version conflicts abort the reindex process. To continue reindexing if there are conflicts, set the conflicts request body property to proceed. In this case, the response includes a count of the version conflicts that were encountered. Note that the handling of other error types is unaffected by the conflicts property. Additionally, if you opt to count version conflicts, the operation could attempt to reindex more documents from the source than max_docs until it has successfully indexed max_docs documents into the target or it has gone through every document in the source query.

NOTE: The reindex API makes no effort to handle ID collisions. The last document written will "win" but the order isn't usually predictable so it is not a good idea to rely on this behavior. Instead, make sure that IDs are unique by using a script.

Running reindex asynchronously

If the request contains wait_for_completion=false, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at _tasks/<task_id>.

Reindex from multiple sources

If you have many sources to reindex it is generally better to reindex them one at a time rather than using a glob pattern to pick up multiple sources. That way you can resume the process if there are any errors by removing the partially completed source and starting over. It also makes parallelizing the process fairly simple: split the list of sources to reindex and run each list in parallel.

For example, you can use a bash script like this:

for index in i1 i2 i3 i4 i5; do
  curl -HContent-Type:application/json -XPOST localhost:9200/_reindex?pretty -d'{
    "source": {
      "index": "'$index'"
    },
    "dest": {
      "index": "'$index'-reindexed"
    }
  }'
done

Throttling

Set requests_per_second to any positive decimal number (1.4, 6, 1000, for example) to throttle the rate at which reindex issues batches of index operations. Requests are throttled by padding each batch with a wait time. To turn off throttling, set requests_per_second to -1.

The throttling is done by waiting between batches so that the scroll that reindex uses internally can be given a timeout that takes into account the padding. The padding time is the difference between the batch size divided by the requests_per_second and the time spent writing. By default the batch size is 1000, so if requests_per_second is set to 500:

target_time = 1000 / 500 per second = 2 seconds
wait_time = target_time - write_time = 2 seconds - .5 seconds = 1.5 seconds

Since the batch is issued as a single bulk request, large batch sizes cause Elasticsearch to create many requests and then wait for a while before starting the next set. This is "bursty" instead of "smooth".

Slicing

Reindex supports sliced scroll to parallelize the reindexing process. This parallelization can improve efficiency and provide a convenient way to break the request down into smaller parts.

NOTE: Reindexing from remote clusters does not support manual or automatic slicing.

You can slice a reindex request manually by providing a slice ID and total number of slices to each request. You can also let reindex automatically parallelize by using sliced scroll to slice on _id. The slices parameter specifies the number of slices to use.

Adding slices to the reindex request just automates the manual process, creating sub-requests which means it has some quirks:

  • You can see these requests in the tasks API. These sub-requests are "child" tasks of the task for the request with slices.
  • Fetching the status of the task for the request with slices only contains the status of completed slices.
  • These sub-requests are individually addressable for things like cancellation and rethrottling.
  • Rethrottling the request with slices will rethrottle the unfinished sub-request proportionally.
  • Canceling the request with slices will cancel each sub-request.
  • Due to the nature of slices, each sub-request won't get a perfectly even portion of the documents. All documents will be addressed, but some slices may be larger than others. Expect larger slices to have a more even distribution.
  • Parameters like requests_per_second and max_docs on a request with slices are distributed proportionally to each sub-request. Combine that with the previous point about distribution being uneven and you should conclude that using max_docs with slices might not result in exactly max_docs documents being reindexed.
  • Each sub-request gets a slightly different snapshot of the source, though these are all taken at approximately the same time.

If slicing automatically, setting slices to auto will choose a reasonable number for most indices. If slicing manually or otherwise tuning automatic slicing, use the following guidelines.

Query performance is most efficient when the number of slices is equal to the number of shards in the index. If that number is large (for example, 500), choose a lower number as too many slices will hurt performance. Setting slices higher than the number of shards generally does not improve efficiency and adds overhead.

Indexing performance scales linearly across available resources with the number of slices.

Whether query or indexing performance dominates the runtime depends on the documents being reindexed and cluster resources.

Modify documents during reindexing

Like _update_by_query, reindex operations support a script that modifies the document. Unlike _update_by_query, the script is allowed to modify the document's metadata.

Just as in _update_by_query, you can set ctx.op to change the operation that is run on the destination. For example, set ctx.op to noop if your script decides that the document doesn’t have to be indexed in the destination. This "no operation" will be reported in the noop counter in the response body. Set ctx.op to delete if your script decides that the document must be deleted from the destination. The deletion will be reported in the deleted counter in the response body. Setting ctx.op to anything else will return an error, as will setting any other field in ctx.

Think of the possibilities! Just be careful; you are able to change:

  • _id
  • _index
  • _version
  • _routing

Setting _version to null or clearing it from the ctx map is just like not sending the version in an indexing request. It will cause the document to be overwritten in the destination regardless of the version on the target or the version type you use in the reindex API.

Reindex from remote

Reindex supports reindexing from a remote Elasticsearch cluster. The host parameter must contain a scheme, host, port, and optional path. The username and password parameters are optional and when they are present the reindex operation will connect to the remote Elasticsearch node using basic authentication. Be sure to use HTTPS when using basic authentication or the password will be sent in plain text. There are a range of settings available to configure the behavior of the HTTPS connection.

When using Elastic Cloud, it is also possible to authenticate against the remote cluster through the use of a valid API key. Remote hosts must be explicitly allowed with the reindex.remote.whitelist setting. It can be set to a comma delimited list of allowed remote host and port combinations. Scheme is ignored; only the host and port are used. For example:

reindex.remote.whitelist: [otherhost:9200, another:9200, 127.0.10.*:9200, localhost:*"]

The list of allowed hosts must be configured on any nodes that will coordinate the reindex. This feature should work with remote clusters of any version of Elasticsearch. This should enable you to upgrade from any version of Elasticsearch to the current version by reindexing from a cluster of the old version.

WARNING: Elasticsearch does not support forward compatibility across major versions. For example, you cannot reindex from a 7.x cluster into a 6.x cluster.

To enable queries sent to older versions of Elasticsearch, the query parameter is sent directly to the remote host without validation or modification.

NOTE: Reindexing from remote clusters does not support manual or automatic slicing.

Reindexing from a remote server uses an on-heap buffer that defaults to a maximum size of 100mb. If the remote index includes very large documents you'll need to use a smaller batch size. It is also possible to set the socket read timeout on the remote connection with the socket_timeout field and the connection timeout with the connect_timeout field. Both default to 30 seconds.

Configuring SSL parameters

Reindex from remote supports configurable SSL settings. These must be specified in the elasticsearch.yml file, with the exception of the secure settings, which you add in the Elasticsearch keystore. It is not possible to configure SSL in the body of the reindex request.

Query parameters

  • refresh boolean

    If true, the request refreshes affected shards to make this operation visible to search.

  • The throttle for this request in sub-requests per second. By default, there is no throttle.

  • scroll string

    The period of time that a consistent view of the index should be maintained for scrolled search.

  • slices number | string

    The number of slices this task should be divided into. It defaults to one slice, which means the task isn't sliced into subtasks.

    Reindex supports sliced scroll to parallelize the reindexing process. This parallelization can improve efficiency and provide a convenient way to break the request down into smaller parts.

    NOTE: Reindexing from remote clusters does not support manual or automatic slicing.

    If set to auto, Elasticsearch chooses the number of slices to use. This setting will use one slice per shard, up to a certain limit. If there are multiple sources, it will choose the number of slices based on the index or backing index with the smallest number of shards.

  • timeout string

    The period each indexing waits for automatic index creation, dynamic mapping updates, and waiting for active shards. By default, Elasticsearch waits for at least one minute before failing. The actual wait time could be longer, particularly when multiple waits occur.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set it to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value is one, which means it waits for each primary shard to be active.

  • If true, the request blocks until the operation is complete.

  • If true, the destination must be an index alias.

application/json

Body Required

  • Values are abort or proceed.

  • dest object Required
    Hide dest attributes Show dest attributes object
  • max_docs number

    The maximum number of documents to reindex. By default, all documents are reindexed. If it is a value less then or equal to scroll_size, a scroll will not be used to retrieve the results for the operation.

    If conflicts is set to proceed, the reindex operation could attempt to reindex more documents from the source than max_docs until it has successfully indexed max_docs documents into the target or it has gone through every document in the source query.

  • script object
    Hide script attributes Show script attributes object
    • source string | object

      One of:
    • id string
    • params object

      Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

      Hide params attribute Show params attribute object
      • * object Additional properties
    • lang string

      Any of:

      Values are painless, expression, mustache, or java.

    • options object
      Hide options attribute Show options attribute object
      • * string Additional properties
  • size number
  • source object Required
    Hide source attributes Show source attributes object
    • index string | array[string] Required
    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • remote object
      Hide remote attributes Show remote attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • headers object

        An object containing the headers of the request.

        Hide headers attribute Show headers attribute object
        • * string Additional properties
      • host string Required
      • username string
      • password string
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • size number

      The number of documents to index per batch. Use it when you are indexing from remote to ensure that the batches fit within the on-heap buffer, which defaults to a maximum size of 100 MB.

    • slice object
      Hide slice attributes Show slice attributes object
      • field string

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • id string Required
      • max number Required
    • sort string | object | array[string | object]

      One of:

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • _source string | array[string]
    • Hide runtime_mappings attribute Show runtime_mappings attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • batches number

      The number of scroll responses that were pulled back by the reindex.

    • created number

      The number of documents that were successfully created.

    • deleted number

      The number of documents that were successfully deleted.

    • failures array[object]

      If there were any unrecoverable errors during the process, it is an array of those failures. If this array is not empty, the request ended because of those failures. Reindex is implemented using batches and any failure causes the entire process to end but all failures in the current batch are collected into the array. You can use the conflicts option to prevent the reindex from ending on version conflicts.

      Hide failures attributes Show failures attributes object
    • noops number

      The number of documents that were ignored because the script used for the reindex returned a noop value for ctx.op.

    • retries object
      Hide retries attributes Show retries attributes object
      • bulk number Required

        The number of bulk actions retried.

    • The number of requests per second effectively run during the reindex.

    • slice_id number
    • Time unit for milliseconds

    • Time unit for milliseconds

    • timed_out boolean

      If any of the requests that ran during the reindex timed out, it is true.

    • took number

      Time unit for milliseconds

    • total number

      The number of documents that were successfully processed.

    • updated number

      The number of documents that were successfully updated. That is to say, a document with the same ID already existed before the reindex updated it.

    • The number of version conflicts that occurred.

POST /_reindex
curl \
 --request POST 'http://api.example.com/_reindex' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"source\": {\n    \"index\": [\"my-index-000001\", \"my-index-000002\"]\n  },\n  \"dest\": {\n    \"index\": \"my-new-index-000002\"\n  }\n}"'
Run `POST _reindex` to reindex from multiple sources. The `index` attribute in source can be a list, which enables you to copy from lots of sources in one request. This example copies documents from the `my-index-000001` and `my-index-000002` indices.
{
  "source": {
    "index": ["my-index-000001", "my-index-000002"]
  },
  "dest": {
    "index": "my-new-index-000002"
  }
}
You can use Painless to reindex daily indices to apply a new template to the existing documents. The script extracts the date from the index name and creates a new index with `-1` appended. For example, all data from `metricbeat-2016.05.31` will be reindexed into `metricbeat-2016.05.31-1`.
{
  "source": {
    "index": "metricbeat-*"
  },
  "dest": {
    "index": "metricbeat"
  },
  "script": {
    "lang": "painless",
    "source": "ctx._index = 'metricbeat-' + (ctx._index.substring('metricbeat-'.length(), ctx._index.length())) + '-1'"
  }
}
Run `POST _reindex` to extract a random subset of the source for testing. You might need to adjust the `min_score` value depending on the relative amount of data extracted from source.
{
  "max_docs": 10,
  "source": {
    "index": "my-index-000001",
    "query": {
      "function_score" : {
        "random_score" : {},
        "min_score" : 0.9
      }
    }
  },
  "dest": {
    "index": "my-new-index-000001"
  }
}
Run `POST _reindex` to modify documents during reindexing. This example bumps the version of the source document.
{
  "source": {
    "index": "my-index-000001"
  },
  "dest": {
    "index": "my-new-index-000001",
    "version_type": "external"
  },
  "script": {
    "source": "if (ctx._source.foo == 'bar') {ctx._version++; ctx._source.remove('foo')}",
    "lang": "painless"
  }
}
When using Elastic Cloud, you can run `POST _reindex` and authenticate against a remote cluster with an API key.
{
  "source": {
    "remote": {
      "host": "http://otherhost:9200",
      "username": "user",
      "password": "pass"
    },
    "index": "my-index-000001",
    "query": {
      "match": {
        "test": "data"
      }
    }
  },
  "dest": {
    "index": "my-new-index-000001"
  }
}
Run `POST _reindex` to slice a reindex request manually. Provide a slice ID and total number of slices to each request.
{
  "source": {
    "index": "my-index-000001",
    "slice": {
      "id": 0,
      "max": 2
    }
  },
  "dest": {
    "index": "my-new-index-000001"
  }
}
Run `POST _reindex?slices=5&refresh` to automatically parallelize using sliced scroll to slice on `_id`. The `slices` parameter specifies the number of slices to use.
{
  "source": {
    "index": "my-index-000001"
  },
  "dest": {
    "index": "my-new-index-000001"
  }
}
By default if reindex sees a document with routing then the routing is preserved unless it's changed by the script. You can set `routing` on the `dest` request to change this behavior. In this example, run `POST _reindex` to copy all documents from the `source` with the company name `cat` into the `dest` with routing set to `cat`.
{
  "source": {
    "index": "source",
    "query": {
      "match": {
        "company": "cat"
      }
    }
  },
  "dest": {
    "index": "dest",
    "routing": "=cat"
  }
}
Run `POST _reindex` and use the ingest pipelines feature.
{
  "source": {
    "index": "source"
  },
  "dest": {
    "index": "dest",
    "pipeline": "some_ingest_pipeline"
  }
}
Run `POST _reindex` and add a query to the `source` to limit the documents to reindex. For example, this request copies documents into `my-new-index-000001` only if they have a `user.id` of `kimchy`.
{
  "source": {
    "index": "my-index-000001",
    "query": {
      "term": {
        "user.id": "kimchy"
      }
    }
  },
  "dest": {
    "index": "my-new-index-000001"
  }
}
You can limit the number of processed documents by setting `max_docs`. For example, run `POST _reindex` to copy a single document from `my-index-000001` to `my-new-index-000001`.
{
  "max_docs": 1,
  "source": {
    "index": "my-index-000001"
  },
  "dest": {
    "index": "my-new-index-000001"
  }
}
You can use source filtering to reindex a subset of the fields in the original documents. For example, run `POST _reindex` the reindex only the `user.id` and `_doc` fields of each document.
{
  "source": {
    "index": "my-index-000001",
    "_source": ["user.id", "_doc"]
  },
  "dest": {
    "index": "my-new-index-000001"
  }
}
A reindex operation can build a copy of an index with renamed fields. If your index has documents with `text` and `flag` fields, you can change the latter field name to `tag` during the reindex.
{
  "source": {
    "index": "my-index-000001"
  },
  "dest": {
    "index": "my-new-index-000001"
  },
  "script": {
    "source": "ctx._source.tag = ctx._source.remove(\"flag\")"
  }
}

Get term vector information

GET /{index}/_termvectors/{id}

Get information and statistics about terms in the fields of a particular document.

You can retrieve term vectors for documents stored in the index or for artificial documents passed in the body of the request. You can specify the fields you are interested in through the fields parameter or by adding the fields to the request body. For example:

GET /my-index-000001/_termvectors/1?fields=message

Fields can be specified using wildcards, similar to the multi match query.

Term vectors are real-time by default, not near real-time. This can be changed by setting realtime parameter to false.

You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields but term statistics are excluded.

Term information

  • term frequency in the field (always returned)
  • term positions (positions: true)
  • start and end offsets (offsets: true)
  • term payloads (payloads: true), as base64 encoded bytes

If the requested information wasn't stored in the index, it will be computed on the fly if possible. Additionally, term vectors could be computed for documents not even existing in the index, but instead provided by the user.


Start and end offsets assume UTF-16 encoding is being used. If you want to use these offsets in order to get the original text that produced this token, you should make sure that the string you are taking a sub-string of is also encoded using UTF-16.

Behaviour

The term and field statistics are not accurate. Deleted documents are not taken into account. The information is only retrieved for the shard the requested document resides in. The term and field statistics are therefore only useful as relative measures whereas the absolute numbers have no meaning in this context. By default, when requesting term vectors of artificial documents, a shard to get the statistics from is randomly selected. Use routing only to hit a particular shard.

Path parameters

  • index string Required

    The name of the index that contains the document.

  • id string Required

    A unique identifier for the document.

Query parameters

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • doc object

    An artificial document (a document not present in the index) for which you want to retrieve term vectors.

  • filter object
    Hide filter attributes Show filter attributes object
    • Ignore words which occur in more than this many docs. Defaults to unbounded.

    • The maximum number of terms that must be returned per field.

    • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

    • The maximum word length above which words will be ignored. Defaults to unbounded.

    • Ignore terms which do not occur in at least this many docs.

    • Ignore words with less than this frequency in the source doc.

    • The minimum word length below which words will be ignored.

  • Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.

    Hide per_field_analyzer attribute Show per_field_analyzer attribute object
    • * string Additional properties
  • fields string | array[string]
  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • routing string
  • version number
  • Values are internal, external, external_gte, or force.

Responses

GET /{index}/_termvectors/{id}
curl \
 --request GET 'http://api.example.com/{index}/_termvectors/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"fields\" : [\"text\"],\n  \"offsets\" : true,\n  \"payloads\" : true,\n  \"positions\" : true,\n  \"term_statistics\" : true,\n  \"field_statistics\" : true\n}"'
Run `GET /my-index-000001/_termvectors/1` to return all information and statistics for field `text` in document 1.
{
  "fields" : ["text"],
  "offsets" : true,
  "payloads" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors/1` to set per-field analyzers. A different analyzer than the one at the field may be provided by using the `per_field_analyzer` parameter.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  },
  "fields": ["fullname"],
  "per_field_analyzer" : {
    "fullname": "keyword"
  }
}
Run `GET /imdb/_termvectors` to filter the terms returned based on their tf-idf scores. It returns the three most "interesting" keywords from the artificial document having the given "plot" field value. Notice that the keyword "Tony" or any stop words are not part of the response, as their tf-idf must be too low.
{
  "doc": {
    "plot": "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil."
  },
  "term_statistics": true,
  "field_statistics": true,
  "positions": false,
  "offsets": false,
  "filter": {
    "max_num_terms": 3,
    "min_term_freq": 1,
    "min_doc_freq": 1
  }
}
Run `GET /my-index-000001/_termvectors/1`. Term vectors which are not explicitly stored in the index are automatically computed on the fly. This request returns all information and statistics for the fields in document 1, even though the terms haven't been explicitly stored in the index. Note that for the field text, the terms are not regenerated.
{
  "fields" : ["text", "some_field_without_term_vectors"],
  "offsets" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors`. Term vectors can be generated for artificial documents, that is for documents not present in the index. If dynamic mapping is turned on (default), the document fields not in the original mapping will be dynamically created.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  }
}
Response examples (200)
A successful response from `GET /my-index-000001/_termvectors/1`.
{
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "found": true,
  "took": 6,
  "term_vectors": {
    "text": {
      "field_statistics": {
        "sum_doc_freq": 4,
        "doc_count": 2,
        "sum_ttf": 6
      },
      "terms": {
        "test": {
          "doc_freq": 2,
          "ttf": 4,
          "term_freq": 3,
          "tokens": [
            {
              "position": 0,
              "start_offset": 0,
              "end_offset": 4,
              "payload": "d29yZA=="
            },
            {
              "position": 1,
              "start_offset": 5,
              "end_offset": 9,
              "payload": "d29yZA=="
            },
            {
              "position": 2,
              "start_offset": 10,
              "end_offset": 14,
              "payload": "d29yZA=="
            }
          ]
        }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with `per_field_analyzer` in the request body.
{
  "_index": "my-index-000001",
  "_version": 0,
  "found": true,
  "took": 6,
  "term_vectors": {
    "fullname": {
      "field_statistics": {
          "sum_doc_freq": 2,
          "doc_count": 4,
          "sum_ttf": 4
      },
      "terms": {
          "John Doe": {
            "term_freq": 1,
            "tokens": [
                {
                  "position": 0,
                  "start_offset": 0,
                  "end_offset": 8
                }
            ]
          }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with a `filter` in the request body.
{
  "_index": "imdb",
  "_version": 0,
  "found": true,
  "term_vectors": {
      "plot": {
        "field_statistics": {
            "sum_doc_freq": 3384269,
            "doc_count": 176214,
            "sum_ttf": 3753460
        },
        "terms": {
            "armored": {
              "doc_freq": 27,
              "ttf": 27,
              "term_freq": 1,
              "score": 9.74725
            },
            "industrialist": {
              "doc_freq": 88,
              "ttf": 88,
              "term_freq": 1,
              "score": 8.590818
            },
            "stark": {
              "doc_freq": 44,
              "ttf": 47,
              "term_freq": 1,
              "score": 9.272792
            }
        }
      }
  }
}

Get term vector information

POST /{index}/_termvectors/{id}

Get information and statistics about terms in the fields of a particular document.

You can retrieve term vectors for documents stored in the index or for artificial documents passed in the body of the request. You can specify the fields you are interested in through the fields parameter or by adding the fields to the request body. For example:

GET /my-index-000001/_termvectors/1?fields=message

Fields can be specified using wildcards, similar to the multi match query.

Term vectors are real-time by default, not near real-time. This can be changed by setting realtime parameter to false.

You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields but term statistics are excluded.

Term information

  • term frequency in the field (always returned)
  • term positions (positions: true)
  • start and end offsets (offsets: true)
  • term payloads (payloads: true), as base64 encoded bytes

If the requested information wasn't stored in the index, it will be computed on the fly if possible. Additionally, term vectors could be computed for documents not even existing in the index, but instead provided by the user.


Start and end offsets assume UTF-16 encoding is being used. If you want to use these offsets in order to get the original text that produced this token, you should make sure that the string you are taking a sub-string of is also encoded using UTF-16.

Behaviour

The term and field statistics are not accurate. Deleted documents are not taken into account. The information is only retrieved for the shard the requested document resides in. The term and field statistics are therefore only useful as relative measures whereas the absolute numbers have no meaning in this context. By default, when requesting term vectors of artificial documents, a shard to get the statistics from is randomly selected. Use routing only to hit a particular shard.

Path parameters

  • index string Required

    The name of the index that contains the document.

  • id string Required

    A unique identifier for the document.

Query parameters

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • doc object

    An artificial document (a document not present in the index) for which you want to retrieve term vectors.

  • filter object
    Hide filter attributes Show filter attributes object
    • Ignore words which occur in more than this many docs. Defaults to unbounded.

    • The maximum number of terms that must be returned per field.

    • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

    • The maximum word length above which words will be ignored. Defaults to unbounded.

    • Ignore terms which do not occur in at least this many docs.

    • Ignore words with less than this frequency in the source doc.

    • The minimum word length below which words will be ignored.

  • Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.

    Hide per_field_analyzer attribute Show per_field_analyzer attribute object
    • * string Additional properties
  • fields string | array[string]
  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • routing string
  • version number
  • Values are internal, external, external_gte, or force.

Responses

POST /{index}/_termvectors/{id}
curl \
 --request POST 'http://api.example.com/{index}/_termvectors/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"fields\" : [\"text\"],\n  \"offsets\" : true,\n  \"payloads\" : true,\n  \"positions\" : true,\n  \"term_statistics\" : true,\n  \"field_statistics\" : true\n}"'
Run `GET /my-index-000001/_termvectors/1` to return all information and statistics for field `text` in document 1.
{
  "fields" : ["text"],
  "offsets" : true,
  "payloads" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors/1` to set per-field analyzers. A different analyzer than the one at the field may be provided by using the `per_field_analyzer` parameter.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  },
  "fields": ["fullname"],
  "per_field_analyzer" : {
    "fullname": "keyword"
  }
}
Run `GET /imdb/_termvectors` to filter the terms returned based on their tf-idf scores. It returns the three most "interesting" keywords from the artificial document having the given "plot" field value. Notice that the keyword "Tony" or any stop words are not part of the response, as their tf-idf must be too low.
{
  "doc": {
    "plot": "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil."
  },
  "term_statistics": true,
  "field_statistics": true,
  "positions": false,
  "offsets": false,
  "filter": {
    "max_num_terms": 3,
    "min_term_freq": 1,
    "min_doc_freq": 1
  }
}
Run `GET /my-index-000001/_termvectors/1`. Term vectors which are not explicitly stored in the index are automatically computed on the fly. This request returns all information and statistics for the fields in document 1, even though the terms haven't been explicitly stored in the index. Note that for the field text, the terms are not regenerated.
{
  "fields" : ["text", "some_field_without_term_vectors"],
  "offsets" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors`. Term vectors can be generated for artificial documents, that is for documents not present in the index. If dynamic mapping is turned on (default), the document fields not in the original mapping will be dynamically created.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  }
}
Response examples (200)
A successful response from `GET /my-index-000001/_termvectors/1`.
{
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "found": true,
  "took": 6,
  "term_vectors": {
    "text": {
      "field_statistics": {
        "sum_doc_freq": 4,
        "doc_count": 2,
        "sum_ttf": 6
      },
      "terms": {
        "test": {
          "doc_freq": 2,
          "ttf": 4,
          "term_freq": 3,
          "tokens": [
            {
              "position": 0,
              "start_offset": 0,
              "end_offset": 4,
              "payload": "d29yZA=="
            },
            {
              "position": 1,
              "start_offset": 5,
              "end_offset": 9,
              "payload": "d29yZA=="
            },
            {
              "position": 2,
              "start_offset": 10,
              "end_offset": 14,
              "payload": "d29yZA=="
            }
          ]
        }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with `per_field_analyzer` in the request body.
{
  "_index": "my-index-000001",
  "_version": 0,
  "found": true,
  "took": 6,
  "term_vectors": {
    "fullname": {
      "field_statistics": {
          "sum_doc_freq": 2,
          "doc_count": 4,
          "sum_ttf": 4
      },
      "terms": {
          "John Doe": {
            "term_freq": 1,
            "tokens": [
                {
                  "position": 0,
                  "start_offset": 0,
                  "end_offset": 8
                }
            ]
          }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with a `filter` in the request body.
{
  "_index": "imdb",
  "_version": 0,
  "found": true,
  "term_vectors": {
      "plot": {
        "field_statistics": {
            "sum_doc_freq": 3384269,
            "doc_count": 176214,
            "sum_ttf": 3753460
        },
        "terms": {
            "armored": {
              "doc_freq": 27,
              "ttf": 27,
              "term_freq": 1,
              "score": 9.74725
            },
            "industrialist": {
              "doc_freq": 88,
              "ttf": 88,
              "term_freq": 1,
              "score": 8.590818
            },
            "stark": {
              "doc_freq": 44,
              "ttf": 47,
              "term_freq": 1,
              "score": 9.272792
            }
        }
      }
  }
}

Get term vector information

GET /{index}/_termvectors

Get information and statistics about terms in the fields of a particular document.

You can retrieve term vectors for documents stored in the index or for artificial documents passed in the body of the request. You can specify the fields you are interested in through the fields parameter or by adding the fields to the request body. For example:

GET /my-index-000001/_termvectors/1?fields=message

Fields can be specified using wildcards, similar to the multi match query.

Term vectors are real-time by default, not near real-time. This can be changed by setting realtime parameter to false.

You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields but term statistics are excluded.

Term information

  • term frequency in the field (always returned)
  • term positions (positions: true)
  • start and end offsets (offsets: true)
  • term payloads (payloads: true), as base64 encoded bytes

If the requested information wasn't stored in the index, it will be computed on the fly if possible. Additionally, term vectors could be computed for documents not even existing in the index, but instead provided by the user.


Start and end offsets assume UTF-16 encoding is being used. If you want to use these offsets in order to get the original text that produced this token, you should make sure that the string you are taking a sub-string of is also encoded using UTF-16.

Behaviour

The term and field statistics are not accurate. Deleted documents are not taken into account. The information is only retrieved for the shard the requested document resides in. The term and field statistics are therefore only useful as relative measures whereas the absolute numbers have no meaning in this context. By default, when requesting term vectors of artificial documents, a shard to get the statistics from is randomly selected. Use routing only to hit a particular shard.

Path parameters

  • index string Required

    The name of the index that contains the document.

Query parameters

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • doc object

    An artificial document (a document not present in the index) for which you want to retrieve term vectors.

  • filter object
    Hide filter attributes Show filter attributes object
    • Ignore words which occur in more than this many docs. Defaults to unbounded.

    • The maximum number of terms that must be returned per field.

    • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

    • The maximum word length above which words will be ignored. Defaults to unbounded.

    • Ignore terms which do not occur in at least this many docs.

    • Ignore words with less than this frequency in the source doc.

    • The minimum word length below which words will be ignored.

  • Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.

    Hide per_field_analyzer attribute Show per_field_analyzer attribute object
    • * string Additional properties
  • fields string | array[string]
  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • routing string
  • version number
  • Values are internal, external, external_gte, or force.

Responses

GET /{index}/_termvectors
curl \
 --request GET 'http://api.example.com/{index}/_termvectors' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"fields\" : [\"text\"],\n  \"offsets\" : true,\n  \"payloads\" : true,\n  \"positions\" : true,\n  \"term_statistics\" : true,\n  \"field_statistics\" : true\n}"'
Run `GET /my-index-000001/_termvectors/1` to return all information and statistics for field `text` in document 1.
{
  "fields" : ["text"],
  "offsets" : true,
  "payloads" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors/1` to set per-field analyzers. A different analyzer than the one at the field may be provided by using the `per_field_analyzer` parameter.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  },
  "fields": ["fullname"],
  "per_field_analyzer" : {
    "fullname": "keyword"
  }
}
Run `GET /imdb/_termvectors` to filter the terms returned based on their tf-idf scores. It returns the three most "interesting" keywords from the artificial document having the given "plot" field value. Notice that the keyword "Tony" or any stop words are not part of the response, as their tf-idf must be too low.
{
  "doc": {
    "plot": "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil."
  },
  "term_statistics": true,
  "field_statistics": true,
  "positions": false,
  "offsets": false,
  "filter": {
    "max_num_terms": 3,
    "min_term_freq": 1,
    "min_doc_freq": 1
  }
}
Run `GET /my-index-000001/_termvectors/1`. Term vectors which are not explicitly stored in the index are automatically computed on the fly. This request returns all information and statistics for the fields in document 1, even though the terms haven't been explicitly stored in the index. Note that for the field text, the terms are not regenerated.
{
  "fields" : ["text", "some_field_without_term_vectors"],
  "offsets" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors`. Term vectors can be generated for artificial documents, that is for documents not present in the index. If dynamic mapping is turned on (default), the document fields not in the original mapping will be dynamically created.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  }
}
Response examples (200)
A successful response from `GET /my-index-000001/_termvectors/1`.
{
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "found": true,
  "took": 6,
  "term_vectors": {
    "text": {
      "field_statistics": {
        "sum_doc_freq": 4,
        "doc_count": 2,
        "sum_ttf": 6
      },
      "terms": {
        "test": {
          "doc_freq": 2,
          "ttf": 4,
          "term_freq": 3,
          "tokens": [
            {
              "position": 0,
              "start_offset": 0,
              "end_offset": 4,
              "payload": "d29yZA=="
            },
            {
              "position": 1,
              "start_offset": 5,
              "end_offset": 9,
              "payload": "d29yZA=="
            },
            {
              "position": 2,
              "start_offset": 10,
              "end_offset": 14,
              "payload": "d29yZA=="
            }
          ]
        }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with `per_field_analyzer` in the request body.
{
  "_index": "my-index-000001",
  "_version": 0,
  "found": true,
  "took": 6,
  "term_vectors": {
    "fullname": {
      "field_statistics": {
          "sum_doc_freq": 2,
          "doc_count": 4,
          "sum_ttf": 4
      },
      "terms": {
          "John Doe": {
            "term_freq": 1,
            "tokens": [
                {
                  "position": 0,
                  "start_offset": 0,
                  "end_offset": 8
                }
            ]
          }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with a `filter` in the request body.
{
  "_index": "imdb",
  "_version": 0,
  "found": true,
  "term_vectors": {
      "plot": {
        "field_statistics": {
            "sum_doc_freq": 3384269,
            "doc_count": 176214,
            "sum_ttf": 3753460
        },
        "terms": {
            "armored": {
              "doc_freq": 27,
              "ttf": 27,
              "term_freq": 1,
              "score": 9.74725
            },
            "industrialist": {
              "doc_freq": 88,
              "ttf": 88,
              "term_freq": 1,
              "score": 8.590818
            },
            "stark": {
              "doc_freq": 44,
              "ttf": 47,
              "term_freq": 1,
              "score": 9.272792
            }
        }
      }
  }
}

Get term vector information

POST /{index}/_termvectors

Get information and statistics about terms in the fields of a particular document.

You can retrieve term vectors for documents stored in the index or for artificial documents passed in the body of the request. You can specify the fields you are interested in through the fields parameter or by adding the fields to the request body. For example:

GET /my-index-000001/_termvectors/1?fields=message

Fields can be specified using wildcards, similar to the multi match query.

Term vectors are real-time by default, not near real-time. This can be changed by setting realtime parameter to false.

You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields but term statistics are excluded.

Term information

  • term frequency in the field (always returned)
  • term positions (positions: true)
  • start and end offsets (offsets: true)
  • term payloads (payloads: true), as base64 encoded bytes

If the requested information wasn't stored in the index, it will be computed on the fly if possible. Additionally, term vectors could be computed for documents not even existing in the index, but instead provided by the user.


Start and end offsets assume UTF-16 encoding is being used. If you want to use these offsets in order to get the original text that produced this token, you should make sure that the string you are taking a sub-string of is also encoded using UTF-16.

Behaviour

The term and field statistics are not accurate. Deleted documents are not taken into account. The information is only retrieved for the shard the requested document resides in. The term and field statistics are therefore only useful as relative measures whereas the absolute numbers have no meaning in this context. By default, when requesting term vectors of artificial documents, a shard to get the statistics from is randomly selected. Use routing only to hit a particular shard.

Path parameters

  • index string Required

    The name of the index that contains the document.

Query parameters

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • doc object

    An artificial document (a document not present in the index) for which you want to retrieve term vectors.

  • filter object
    Hide filter attributes Show filter attributes object
    • Ignore words which occur in more than this many docs. Defaults to unbounded.

    • The maximum number of terms that must be returned per field.

    • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

    • The maximum word length above which words will be ignored. Defaults to unbounded.

    • Ignore terms which do not occur in at least this many docs.

    • Ignore words with less than this frequency in the source doc.

    • The minimum word length below which words will be ignored.

  • Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.

    Hide per_field_analyzer attribute Show per_field_analyzer attribute object
    • * string Additional properties
  • fields string | array[string]
  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • routing string
  • version number
  • Values are internal, external, external_gte, or force.

Responses

POST /{index}/_termvectors
curl \
 --request POST 'http://api.example.com/{index}/_termvectors' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"fields\" : [\"text\"],\n  \"offsets\" : true,\n  \"payloads\" : true,\n  \"positions\" : true,\n  \"term_statistics\" : true,\n  \"field_statistics\" : true\n}"'
Run `GET /my-index-000001/_termvectors/1` to return all information and statistics for field `text` in document 1.
{
  "fields" : ["text"],
  "offsets" : true,
  "payloads" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors/1` to set per-field analyzers. A different analyzer than the one at the field may be provided by using the `per_field_analyzer` parameter.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  },
  "fields": ["fullname"],
  "per_field_analyzer" : {
    "fullname": "keyword"
  }
}
Run `GET /imdb/_termvectors` to filter the terms returned based on their tf-idf scores. It returns the three most "interesting" keywords from the artificial document having the given "plot" field value. Notice that the keyword "Tony" or any stop words are not part of the response, as their tf-idf must be too low.
{
  "doc": {
    "plot": "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil."
  },
  "term_statistics": true,
  "field_statistics": true,
  "positions": false,
  "offsets": false,
  "filter": {
    "max_num_terms": 3,
    "min_term_freq": 1,
    "min_doc_freq": 1
  }
}
Run `GET /my-index-000001/_termvectors/1`. Term vectors which are not explicitly stored in the index are automatically computed on the fly. This request returns all information and statistics for the fields in document 1, even though the terms haven't been explicitly stored in the index. Note that for the field text, the terms are not regenerated.
{
  "fields" : ["text", "some_field_without_term_vectors"],
  "offsets" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors`. Term vectors can be generated for artificial documents, that is for documents not present in the index. If dynamic mapping is turned on (default), the document fields not in the original mapping will be dynamically created.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  }
}
Response examples (200)
A successful response from `GET /my-index-000001/_termvectors/1`.
{
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "found": true,
  "took": 6,
  "term_vectors": {
    "text": {
      "field_statistics": {
        "sum_doc_freq": 4,
        "doc_count": 2,
        "sum_ttf": 6
      },
      "terms": {
        "test": {
          "doc_freq": 2,
          "ttf": 4,
          "term_freq": 3,
          "tokens": [
            {
              "position": 0,
              "start_offset": 0,
              "end_offset": 4,
              "payload": "d29yZA=="
            },
            {
              "position": 1,
              "start_offset": 5,
              "end_offset": 9,
              "payload": "d29yZA=="
            },
            {
              "position": 2,
              "start_offset": 10,
              "end_offset": 14,
              "payload": "d29yZA=="
            }
          ]
        }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with `per_field_analyzer` in the request body.
{
  "_index": "my-index-000001",
  "_version": 0,
  "found": true,
  "took": 6,
  "term_vectors": {
    "fullname": {
      "field_statistics": {
          "sum_doc_freq": 2,
          "doc_count": 4,
          "sum_ttf": 4
      },
      "terms": {
          "John Doe": {
            "term_freq": 1,
            "tokens": [
                {
                  "position": 0,
                  "start_offset": 0,
                  "end_offset": 8
                }
            ]
          }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with a `filter` in the request body.
{
  "_index": "imdb",
  "_version": 0,
  "found": true,
  "term_vectors": {
      "plot": {
        "field_statistics": {
            "sum_doc_freq": 3384269,
            "doc_count": 176214,
            "sum_ttf": 3753460
        },
        "terms": {
            "armored": {
              "doc_freq": 27,
              "ttf": 27,
              "term_freq": 1,
              "score": 9.74725
            },
            "industrialist": {
              "doc_freq": 88,
              "ttf": 88,
              "term_freq": 1,
              "score": 8.590818
            },
            "stark": {
              "doc_freq": 44,
              "ttf": 47,
              "term_freq": 1,
              "score": 9.272792
            }
        }
      }
  }
}

Update a document

POST /{index}/_update/{id}

Update a document by running a script or passing a partial document.

If the Elasticsearch security features are enabled, you must have the index or write index privilege for the target index or index alias.

The script can update, delete, or skip modifying the document. The API also supports passing a partial document, which is merged into the existing document. To fully replace an existing document, use the index API. This operation:

  • Gets the document (collocated with the shard) from the index.
  • Runs the specified script.
  • Indexes the result.

The document must still be reindexed, but using this API removes some network roundtrips and reduces chances of version conflicts between the GET and the index operation.

The _source field must be enabled to use this API. In addition to _source, you can access the following variables through the ctx map: _index, _type, _id, _version, _routing, and _now (the current timestamp).

Path parameters

  • index string Required

    The name of the target index. By default, the index is created automatically if it doesn't exist.

  • id string Required

    A unique identifier for the document to be updated.

Query parameters

  • Only perform the operation if the document has this primary term.

  • Only perform the operation if the document has this sequence number.

  • True or false if to include the document source in the error message in case of parsing errors.

  • lang string

    The script language.

  • refresh string

    If 'true', Elasticsearch refreshes the affected shards to make this operation visible to search. If 'wait_for', it waits for a refresh to make this operation visible to search. If 'false', it does nothing with refreshes.

    Values are true, false, or wait_for.

  • If true, the destination must be an index alias.

  • The number of times the operation should be retried when a conflict occurs.

  • routing string

    A custom value used to route operations to a specific shard.

  • timeout string

    The period to wait for the following operations: dynamic mapping updates and waiting for active shards. Elasticsearch waits for at least the timeout period before failing. The actual wait time could be longer, particularly when multiple waits occur.

  • wait_for_active_shards number | string

    The number of copies of each shard that must be active before proceeding with the operation. Set to 'all' or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value of 1 means it waits for each primary shard to be active.

  • _source boolean | string | array[string]

    If false, source retrieval is turned off. You can also specify a comma-separated list of the fields you want to retrieve.

  • _source_excludes string | array[string]

    The source fields you want to exclude.

  • _source_includes string | array[string]

    The source fields you want to retrieve.

application/json

Body Required

  • If true, the result in the response is set to noop (no operation) when there are no changes to the document.

  • doc object

    A partial update to an existing document. If both doc and script are specified, doc is ignored.

  • If true, use the contents of 'doc' as the value of 'upsert'. NOTE: Using ingest pipelines with doc_as_upsert is not supported.

  • script object
    Hide script attributes Show script attributes object
    • source string | object

      One of:
    • id string
    • params object

      Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

      Hide params attribute Show params attribute object
      • * object Additional properties
    • lang string

      Any of:

      Values are painless, expression, mustache, or java.

    • options object
      Hide options attribute Show options attribute object
      • * string Additional properties
  • If true, run the script whether or not the document exists.

  • _source boolean | object

    Defines how to fetch a source. Fetching can be disabled entirely, or the source can be filtered.

    One of:
  • upsert object

    If the document does not already exist, the contents of 'upsert' are inserted as a new document. If the document exists, the 'script' is run.

Responses

POST /{index}/_update/{id}
curl \
 --request POST 'http://api.example.com/{index}/_update/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"script\" : {\n    \"source\": \"ctx._source.counter += params.count\",\n    \"lang\": \"painless\",\n    \"params\" : {\n      \"count\" : 4\n    }\n  }\n}"'
Run `POST test/_update/1` to increment a counter by using a script.
{
  "script" : {
    "source": "ctx._source.counter += params.count",
    "lang": "painless",
    "params" : {
      "count" : 4
    }
  }
}
Run `POST test/_update/1` to perform a scripted upsert. When `scripted_upsert` is `true`, the script runs whether or not the document exists.
{
  "scripted_upsert": true,
  "script": {
    "source": """
      if ( ctx.op == 'create' ) {
        ctx._source.counter = params.count
      } else {
        ctx._source.counter += params.count
      }
    """,
    "params": {
      "count": 4
    }
  },
  "upsert": {}
}
Run `POST test/_update/1` to perform a doc as upsert. Instead of sending a partial `doc` plus an `upsert` doc, you can set `doc_as_upsert` to `true` to use the contents of `doc` as the `upsert` value.
{
  "doc": {
    "name": "new_name"
  },
  "doc_as_upsert": true
}
Run `POST test/_update/1` to use a script to add a tag to a list of tags. In this example, it is just a list, so the tag is added even it exists.
{
  "script": {
    "source": "ctx._source.tags.add(params.tag)",
    "lang": "painless",
    "params": {
      "tag": "blue"
    }
  }
}
Run `POST test/_update/1` to use a script to remove a tag from a list of tags. The Painless function to remove a tag takes the array index of the element you want to remove. To avoid a possible runtime error, you first need to make sure the tag exists. If the list contains duplicates of the tag, this script just removes one occurrence.
{
  "script": {
    "source": "if (ctx._source.tags.contains(params.tag)) { ctx._source.tags.remove(ctx._source.tags.indexOf(params.tag)) }",
    "lang": "painless",
    "params": {
      "tag": "blue"
    }
  }
}
Run `POST test/_update/1` to use a script to add a field `new_field` to the document.
{
  "script" : "ctx._source.new_field = 'value_of_new_field'"
}
Run `POST test/_update/1` to use a script to remove a field `new_field` from the document.
{
  "script" : "ctx._source.remove('new_field')"
}
Run `POST test/_update/1` to use a script to remove a subfield from an object field.
{
  "script": "ctx._source['my-object'].remove('my-subfield')"
}
Run `POST test/_update/1` to change the operation that runs from within the script. For example, this request deletes the document if the `tags` field contains `green`, otherwise it does nothing (`noop`).
{
  "script": {
    "source": "if (ctx._source.tags.contains(params.tag)) { ctx.op = 'delete' } else { ctx.op = 'noop' }",
    "lang": "painless",
    "params": {
      "tag": "green"
    }
  }
}
Run `POST test/_update/1` to do a partial update that adds a new field to the existing document.
{
  "doc": {
    "name": "new_name"
  }
}
Run `POST test/_update/1` to perfom an upsert. If the document does not already exist, the contents of the upsert element are inserted as a new document. If the document exists, the script is run.
{
  "script": {
    "source": "ctx._source.counter += params.count",
    "lang": "painless",
    "params": {
      "count": 4
    }
  },
  "upsert": {
    "counter": 1
  }
}
Response examples (200)
By default updates that don't change anything detect that they don't change anything and return `"result": "noop"`.
{
   "_shards": {
        "total": 0,
        "successful": 0,
        "failed": 0
   },
   "_index": "test",
   "_id": "1",
   "_version": 2,
   "_primary_term": 1,
   "_seq_no": 1,
   "result": "noop"
}

Update documents Added in 2.4.0

POST /{index}/_update_by_query

Updates documents that match the specified query. If no query is specified, performs an update on every document in the data stream or index without modifying the source, which is useful for picking up mapping changes.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or alias:

  • read
  • index or write

You can specify the query criteria in the request URI or the request body using the same syntax as the search API.

When you submit an update by query request, Elasticsearch gets a snapshot of the data stream or index when it begins processing the request and updates matching documents using internal versioning. When the versions match, the document is updated and the version number is incremented. If a document changes between the time that the snapshot is taken and the update operation is processed, it results in a version conflict and the operation fails. You can opt to count version conflicts instead of halting and returning by setting conflicts to proceed. Note that if you opt to count version conflicts, the operation could attempt to update more documents from the source than max_docs until it has successfully updated max_docs documents or it has gone through every document in the source query.

NOTE: Documents with a version equal to 0 cannot be updated using update by query because internal versioning does not support 0 as a valid version number.

While processing an update by query request, Elasticsearch performs multiple search requests sequentially to find all of the matching documents. A bulk update request is performed for each batch of matching documents. Any query or update failures cause the update by query request to fail and the failures are shown in the response. Any update requests that completed successfully still stick, they are not rolled back.

Throttling update requests

To control the rate at which update by query issues batches of update operations, you can set requests_per_second to any positive decimal number. This pads each batch with a wait time to throttle the rate. Set requests_per_second to -1 to turn off throttling.

Throttling uses a wait time between batches so that the internal scroll requests can be given a timeout that takes the request padding into account. The padding time is the difference between the batch size divided by the requests_per_second and the time spent writing. By default the batch size is 1000, so if requests_per_second is set to 500:

target_time = 1000 / 500 per second = 2 seconds
wait_time = target_time - write_time = 2 seconds - .5 seconds = 1.5 seconds

Since the batch is issued as a single _bulk request, large batch sizes cause Elasticsearch to create many requests and wait before starting the next set. This is "bursty" instead of "smooth".

Slicing

Update by query supports sliced scroll to parallelize the update process. This can improve efficiency and provide a convenient way to break the request down into smaller parts.

Setting slices to auto chooses a reasonable number for most data streams and indices. This setting will use one slice per shard, up to a certain limit. If there are multiple source data streams or indices, it will choose the number of slices based on the index or backing index with the smallest number of shards.

Adding slices to _update_by_query just automates the manual process of creating sub-requests, which means it has some quirks:

  • You can see these requests in the tasks APIs. These sub-requests are "child" tasks of the task for the request with slices.
  • Fetching the status of the task for the request with slices only contains the status of completed slices.
  • These sub-requests are individually addressable for things like cancellation and rethrottling.
  • Rethrottling the request with slices will rethrottle the unfinished sub-request proportionally.
  • Canceling the request with slices will cancel each sub-request.
  • Due to the nature of slices each sub-request won't get a perfectly even portion of the documents. All documents will be addressed, but some slices may be larger than others. Expect larger slices to have a more even distribution.
  • Parameters like requests_per_second and max_docs on a request with slices are distributed proportionally to each sub-request. Combine that with the point above about distribution being uneven and you should conclude that using max_docs with slices might not result in exactly max_docs documents being updated.
  • Each sub-request gets a slightly different snapshot of the source data stream or index though these are all taken at approximately the same time.

If you're slicing manually or otherwise tuning automatic slicing, keep in mind that:

  • Query performance is most efficient when the number of slices is equal to the number of shards in the index or backing index. If that number is large (for example, 500), choose a lower number as too many slices hurts performance. Setting slices higher than the number of shards generally does not improve efficiency and adds overhead.
  • Update performance scales linearly across available resources with the number of slices.

Whether query or update performance dominates the runtime depends on the documents being reindexed and cluster resources.

Update the document source

Update by query supports scripts to update the document source. As with the update API, you can set ctx.op to change the operation that is performed.

Set ctx.op = "noop" if your script decides that it doesn't have to make any changes. The update by query operation skips updating the document and increments the noop counter.

Set ctx.op = "delete" if your script decides that the document should be deleted. The update by query operation deletes the document and increments the deleted counter.

Update by query supports only index, noop, and delete. Setting ctx.op to anything else is an error. Setting any other field in ctx is an error. This API enables you to only modify the source of matching documents; you cannot move them.

Path parameters

  • index string | array[string] Required

    A comma-separated list of data streams, indices, and aliases to search. It supports wildcards (*). To search all data streams or indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • analyzer string

    The analyzer to use for the query string. This parameter can be used only when the q query string parameter is specified.

  • If true, wildcard and prefix queries are analyzed. This parameter can be used only when the q query string parameter is specified.

  • The preferred behavior when update by query hits version conflicts: abort or proceed.

    Supported values include:

    • abort: Stop reindexing if there are conflicts.
    • proceed: Continue reindexing even if there are conflicts.

    Values are abort or proceed.

  • The default operator for query string query: AND or OR. This parameter can be used only when the q query string parameter is specified.

    Values are and, AND, or, or OR.

  • df string

    The field to use as default where no field prefix is given in the query string. This parameter can be used only when the q query string parameter is specified.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • from number

    Skips the specified number of documents.

  • If false, the request returns an error if it targets a missing or closed index.

  • lenient boolean

    If true, format-based query failures (such as providing text to a numeric field) in the query string will be ignored. This parameter can be used only when the q query string parameter is specified.

  • max_docs number

    The maximum number of documents to process. It defaults to all documents. When set to a value less then or equal to scroll_size then a scroll will not be used to retrieve the results for the operation.

  • pipeline string

    The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, then setting the value to _none disables the default ingest pipeline for this request. If a final pipeline is configured it will always run, regardless of the value of this parameter.

  • The node or shard the operation should be performed on. It is random by default.

  • q string

    A query in the Lucene query string syntax.

  • refresh boolean

    If true, Elasticsearch refreshes affected shards to make the operation visible to search after the request completes. This is different than the update API's refresh parameter, which causes just the shard that received the request to be refreshed.

  • If true, the request cache is used for this request. It defaults to the index-level setting.

  • The throttle for this request in sub-requests per second.

  • routing string

    A custom value used to route operations to a specific shard.

  • scroll string

    The period to retain the search context for scrolling.

  • The size of the scroll request that powers the operation.

  • An explicit timeout for each search request. By default, there is no timeout.

  • The type of the search operation. Available options include query_then_fetch and dfs_query_then_fetch.

    Supported values include:

    • query_then_fetch: Documents are scored using local term and document frequencies for the shard. This is usually faster but less accurate.
    • dfs_query_then_fetch: Documents are scored using global term and document frequencies across all shards. This is usually slower but more accurate.

    Values are query_then_fetch or dfs_query_then_fetch.

  • slices number | string

    The number of slices this task should be divided into.

  • sort array[string]

    A comma-separated list of : pairs.

  • stats array[string]

    The specific tag of the request for logging and statistical purposes.

  • The maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting.

    IMPORTANT: Use with caution. Elasticsearch applies this parameter to each shard handling the request. When possible, let Elasticsearch perform early termination automatically. Avoid specifying this parameter for requests that target data streams with backing indices across multiple data tiers.

  • timeout string

    The period each update request waits for the following operations: dynamic mapping updates, waiting for active shards. By default, it is one minute. This guarantees Elasticsearch waits for at least the timeout before failing. The actual wait time could be longer, particularly when multiple waits occur.

  • version boolean

    If true, returns the document version as part of a hit.

  • Should the document increment the version number (internal) on hit or not (reindex)

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The timeout parameter controls how long each write request waits for unavailable shards to become available. Both work exactly the way they work in the bulk API.

  • If true, the request blocks until the operation is complete. If false, Elasticsearch performs some preflight checks, launches the request, and returns a task ID that you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at .tasks/task/${taskId}.

application/json

Body

  • max_docs number

    The maximum number of documents to update.

  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • script object
    Hide script attributes Show script attributes object
    • source string | object

      One of:
    • id string
    • params object

      Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

      Hide params attribute Show params attribute object
      • * object Additional properties
    • lang string

      Any of:

      Values are painless, expression, mustache, or java.

    • options object
      Hide options attribute Show options attribute object
      • * string Additional properties
  • slice object
    Hide slice attributes Show slice attributes object
    • field string

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • id string Required
    • max number Required
  • Values are abort or proceed.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • batches number

      The number of scroll responses pulled back by the update by query.

    • failures array[object]

      Array of failures if there were any unrecoverable errors during the process. If this is non-empty then the request ended because of those failures. Update by query is implemented using batches. Any failure causes the entire process to end, but all failures in the current batch are collected into the array. You can use the conflicts option to prevent reindex from ending when version conflicts occur.

      Hide failures attributes Show failures attributes object
    • noops number

      The number of documents that were ignored because the script used for the update by query returned a noop value for ctx.op.

    • deleted number

      The number of documents that were successfully deleted.

    • The number of requests per second effectively run during the update by query.

    • retries object
      Hide retries attributes Show retries attributes object
      • bulk number Required

        The number of bulk actions retried.

    • timed_out boolean

      If true, some requests timed out during the update by query.

    • took number

      Time unit for milliseconds

    • total number

      The number of documents that were successfully processed.

    • updated number

      The number of documents that were successfully updated.

    • The number of version conflicts that the update by query hit.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Time unit for milliseconds

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Time unit for milliseconds

POST /{index}/_update_by_query
curl \
 --request POST 'http://api.example.com/{index}/_update_by_query' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"query\": { \n    \"term\": {\n      \"user.id\": \"kimchy\"\n    }\n  }\n}"'
Run `POST my-index-000001/_update_by_query?conflicts=proceed` to update documents that match a query.
{
  "query": { 
    "term": {
      "user.id": "kimchy"
    }
  }
}
Run `POST my-index-000001/_update_by_query` with a script to update the document source. It increments the `count` field for all documents with a `user.id` of `kimchy` in `my-index-000001`.
{
  "script": {
    "source": "ctx._source.count++",
    "lang": "painless"
  },
  "query": {
    "term": {
      "user.id": "kimchy"
    }
  }
}
Run `POST my-index-000001/_update_by_query` to slice an update by query manually. Provide a slice ID and total number of slices to each request.
{
  "slice": {
    "id": 0,
    "max": 2
  },
  "script": {
    "source": "ctx._source['extra'] = 'test'"
  }
}
Run `POST my-index-000001/_update_by_query?refresh&slices=5` to use automatic slicing. It automatically parallelizes using sliced scroll to slice on `_id`.
{
  "script": {
    "source": "ctx._source['extra'] = 'test'"
  }
}

Get an enrich policy Added in 7.5.0

GET /_enrich/policy/{name}

Returns information about an enrich policy.

Path parameters

  • name string | array[string] Required

    Comma-separated list of enrich policy names used to limit the request. To return information for all enrich policies, omit this parameter.

Query parameters

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • policies array[object] Required
      Hide policies attribute Show policies attribute object
GET /_enrich/policy/{name}
curl \
 --request GET 'http://api.example.com/_enrich/policy/{name}' \
 --header "Authorization: $API_KEY"

Create an enrich policy Added in 7.5.0

PUT /_enrich/policy/{name}

Creates an enrich policy.

Path parameters

  • name string Required

    Name of the enrich policy to create or update.

Query parameters

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_enrich/policy/{name}
curl \
 --request PUT 'http://api.example.com/_enrich/policy/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"additionalProperty1":{"enrich_fields":"string","indices":"string","match_field":"string","query":{},"name":"string","elasticsearch_version":"string"},"additionalProperty2":{"enrich_fields":"string","indices":"string","match_field":"string","query":{},"name":"string","elasticsearch_version":"string"}}'

Delete an enrich policy Added in 7.5.0

DELETE /_enrich/policy/{name}

Deletes an existing enrich policy and its enrich index.

Path parameters

  • name string Required

    Enrich policy to delete.

Query parameters

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_enrich/policy/{name}
curl \
 --request DELETE 'http://api.example.com/_enrich/policy/{name}' \
 --header "Authorization: $API_KEY"

Run an enrich policy Added in 7.5.0

PUT /_enrich/policy/{name}/_execute

Create the enrich index for an existing enrich policy.

Path parameters

  • name string Required

    Enrich policy to execute.

Query parameters

  • Period to wait for a connection to the master node.

  • If true, the request blocks other enrich policy execution requests until complete.

Responses

PUT /_enrich/policy/{name}/_execute
curl \
 --request PUT 'http://api.example.com/_enrich/policy/{name}/_execute' \
 --header "Authorization: $API_KEY"

Get an enrich policy Added in 7.5.0

GET /_enrich/policy

Returns information about an enrich policy.

Query parameters

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • policies array[object] Required
      Hide policies attribute Show policies attribute object
GET /_enrich/policy
curl \
 --request GET 'http://api.example.com/_enrich/policy' \
 --header "Authorization: $API_KEY"

EQL

Event Query Language (EQL) is a query language for event-based time series data, such as logs, metrics, and traces.

Learn more about EQL search

Get async EQL search results Added in 7.9.0

GET /_eql/search/{id}

Get the current status and available results for an async EQL search or a stored synchronous EQL search.

Path parameters

  • id string Required

    Identifier for the search.

Query parameters

  • Period for which the search and its results are stored on the cluster. Defaults to the keep_alive value set by the search’s EQL search API request.

  • Timeout duration to wait for the request to finish. Defaults to no timeout, meaning the request waits for complete search results.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string
    • is_partial boolean

      If true, the response does not contain complete search results.

    • is_running boolean

      If true, the search request is still executing.

    • took number

      Time unit for milliseconds

    • timed_out boolean

      If true, the request timed out before completion.

    • hits object Required
      Hide hits attributes Show hits attributes object
      • total object
        Hide total attributes Show total attributes object
      • events array[object]

        Contains events matching the query. Each object represents a matching event.

        Hide events attributes Show events attributes object
        • _index string Required
        • _id string Required
        • _source object Required

          Original JSON body passed for the event at index time.

        • missing boolean

          Set to true for events in a timespan-constrained sequence that do not meet a given condition.

        • fields object
          Hide fields attribute Show fields attribute object
          • * array[object] Additional properties
      • sequences array[object]

        Contains event sequences matching the query. Each object represents a matching sequence. This parameter is only returned for EQL queries containing a sequence.

        Hide sequences attributes Show sequences attributes object
        • events array[object] Required

          Contains events matching the query. Each object represents a matching event.

          Hide events attributes Show events attributes object
          • _index string Required
          • _id string Required
          • _source object Required

            Original JSON body passed for the event at index time.

          • missing boolean

            Set to true for events in a timespan-constrained sequence that do not meet a given condition.

          • fields object
        • join_keys array[object]

          Shared field values used to constrain matches in the sequence. These are defined using the by keyword in the EQL query syntax.

    • shard_failures array[object]

      Contains information about shard failures (if any), in case allow_partial_search_results=true

      Hide shard_failures attributes Show shard_failures attributes object
GET /_eql/search/{id}
curl \
 --request GET 'http://api.example.com/_eql/search/{id}' \
 --header "Authorization: $API_KEY"

Delete an async EQL search Added in 7.9.0

DELETE /_eql/search/{id}

Delete an async EQL search or a stored synchronous EQL search. The API also deletes results for the search.

Path parameters

  • id string Required

    Identifier for the search to delete. A search ID is provided in the EQL search API's response for an async search. A search ID is also provided if the request’s keep_on_completion parameter is true.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_eql/search/{id}
curl \
 --request DELETE 'http://api.example.com/_eql/search/{id}' \
 --header "Authorization: $API_KEY"

Get the async EQL status Added in 7.9.0

GET /_eql/search/status/{id}

Get the current status for an async EQL search or a stored synchronous EQL search without returning results.

Path parameters

  • id string Required

    Identifier for the search.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string Required
    • is_partial boolean Required

      If true, the search request is still executing. If false, the search is completed.

    • is_running boolean Required

      If true, the response does not contain complete search results. This could be because either the search is still running (is_running status is false), or because it is already completed (is_running status is true) and results are partial due to failures or timeouts.

    • Time unit for milliseconds

    • Time unit for milliseconds

    • For a completed search shows the http status code of the completed search.

GET /_eql/search/status/{id}
curl \
 --request GET 'http://api.example.com/_eql/search/status/{id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response for getting status information for an async EQL search.
{
  "id": "FmNJRUZ1YWZCU3dHY1BIOUhaenVSRkEaaXFlZ3h4c1RTWFNocDdnY2FSaERnUTozNDE=",
  "is_running" : true,
  "is_partial" : true,
  "start_time_in_millis" : 1611690235000,
  "expiration_time_in_millis" : 1611690295000
}

Get EQL search results Added in 7.9.0

GET /{index}/_eql/search

Returns search results for an Event Query Language (EQL) query. EQL assumes each document in a data stream or index corresponds to an event.

External documentation

Path parameters

  • index string | array[string] Required

    The name of the index to scope the operation

Query parameters

  • If true, returns partial results if there are shard failures. If false, returns an error with no partial results.

  • If true, sequence queries will return partial results in case of shard failures. If false, they will return no results at all. This flag has effect only if allow_partial_search_results is true.

  • expand_wildcards string | array[string]

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, missing or closed indices are not included in the response.

  • Period for which the search and its results are stored on the cluster.

  • If true, the search and its results are stored on the cluster.

  • Timeout duration to wait for the request to finish. Defaults to no timeout, meaning the request waits for complete search results.

application/json

Body Required

  • query string Required

    EQL query you wish to run.

  • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • filter object | array[object]

    Query, written in Query DSL, used to filter the events on which the EQL query runs.

    One of:

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Allow query execution also in case of shard failures. If true, the query will keep running and will return results based on the available shards. For sequences, the behavior can be further refined using allow_partial_sequence_results

  • This flag applies only to sequences and has effect only if allow_partial_search_results=true. If true, the sequence query will return results based on the available shards, ignoring the others. If false, the sequence query will return successfully, but will always have empty results.

  • size number
  • fields object | array[object]

    Array of wildcard (*) patterns. The response returns values for field names matching these patterns in the fields property of each hit.

    One of:
    Hide attributes Show attributes
    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • format string

      The format in which the values are returned.

  • Values are tail or head.

  • Hide runtime_mappings attribute Show runtime_mappings attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • fields object

        For type composite

        Hide fields attribute Show fields attribute object
        • * object Additional properties
          Hide * attribute Show * attribute object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • fetch_fields array[object]

        For type lookup

        Hide fetch_fields attributes Show fetch_fields attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • format string
      • format string

        A custom format for date type runtime fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • script object
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
      • type string Required

        Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

  • By default, the response of a sample query contains up to 10 samples, with one sample per unique set of join keys. Use the size parameter to get a smaller or larger set of samples. To retrieve more than one sample per set of join keys, use the max_samples_per_key parameter. Pipes are not supported for sample queries.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string
    • is_partial boolean

      If true, the response does not contain complete search results.

    • is_running boolean

      If true, the search request is still executing.

    • took number

      Time unit for milliseconds

    • timed_out boolean

      If true, the request timed out before completion.

    • hits object Required
      Hide hits attributes Show hits attributes object
      • total object
        Hide total attributes Show total attributes object
      • events array[object]

        Contains events matching the query. Each object represents a matching event.

        Hide events attributes Show events attributes object
        • _index string Required
        • _id string Required
        • _source object Required

          Original JSON body passed for the event at index time.

        • missing boolean

          Set to true for events in a timespan-constrained sequence that do not meet a given condition.

        • fields object
          Hide fields attribute Show fields attribute object
          • * array[object] Additional properties
      • sequences array[object]

        Contains event sequences matching the query. Each object represents a matching sequence. This parameter is only returned for EQL queries containing a sequence.

        Hide sequences attributes Show sequences attributes object
        • events array[object] Required

          Contains events matching the query. Each object represents a matching event.

          Hide events attributes Show events attributes object
          • _index string Required
          • _id string Required
          • _source object Required

            Original JSON body passed for the event at index time.

          • missing boolean

            Set to true for events in a timespan-constrained sequence that do not meet a given condition.

          • fields object
        • join_keys array[object]

          Shared field values used to constrain matches in the sequence. These are defined using the by keyword in the EQL query syntax.

    • shard_failures array[object]

      Contains information about shard failures (if any), in case allow_partial_search_results=true

      Hide shard_failures attributes Show shard_failures attributes object
GET /{index}/_eql/search
curl \
 --request GET 'http://api.example.com/{index}/_eql/search' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"query\": \"\"\"\n    process where (process.name == \"cmd.exe\" and process.pid != 2013)\n  \"\"\"\n}"'
Request examples
Run `GET /my-data-stream/_eql/search` to search for events that have a `process.name` of `cmd.exe` and a `process.pid` other than `2013`.
{
  "query": """
    process where (process.name == "cmd.exe" and process.pid != 2013)
  """
}
Run `GET /my-data-stream/_eql/search` to search for a sequence of events. The sequence starts with an event with an `event.category` of `file`, a `file.name` of `cmd.exe`, and a `process.pid` other than `2013`. It is followed by an event with an `event.category` of `process` and a `process.executable` that contains the substring `regsvr32`. These events must also share the same `process.pid` value.
{
  "query": """
    sequence by process.pid
      [ file where file.name == "cmd.exe" and process.pid != 2013 ]
      [ process where stringContains(process.executable, "regsvr32") ]
  """
}
Response examples (200)
{
  "is_partial": false,
  "is_running": false,
  "took": 6,
  "timed_out": false,
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "sequences": [
      {
        "join_keys": [
          2012
        ],
        "events": [
          {
            "_index": ".ds-my-data-stream-2099.12.07-000001",
            "_id": "AtOJ4UjUBAAx3XR5kcCM",
            "_source": {
              "@timestamp": "2099-12-06T11:04:07.000Z",
              "event": {
                "category": "file",
                "id": "dGCHwoeS",
                "sequence": 2
              },
              "file": {
                "accessed": "2099-12-07T11:07:08.000Z",
                "name": "cmd.exe",
                "path": "C:\\Windows\\System32\\cmd.exe",
                "type": "file",
                "size": 16384
              },
              "process": {
                "pid": 2012,
                "name": "cmd.exe",
                "executable": "C:\\Windows\\System32\\cmd.exe"
              }
            }
          },
          {
            "_index": ".ds-my-data-stream-2099.12.07-000001",
            "_id": "OQmfCaduce8zoHT93o4H",
            "_source": {
              "@timestamp": "2099-12-07T11:07:09.000Z",
              "event": {
                "category": "process",
                "id": "aR3NWVOs",
                "sequence": 4
              },
              "process": {
                "pid": 2012,
                "name": "regsvr32.exe",
                "command_line": "regsvr32.exe  /s /u /i:https://...RegSvr32.sct scrobj.dll",
                "executable": "C:\\Windows\\System32\\regsvr32.exe"
              }
            }
          }
        ]
      }
    ]
  }
}

Get EQL search results Added in 7.9.0

POST /{index}/_eql/search

Returns search results for an Event Query Language (EQL) query. EQL assumes each document in a data stream or index corresponds to an event.

External documentation

Path parameters

  • index string | array[string] Required

    The name of the index to scope the operation

Query parameters

  • If true, returns partial results if there are shard failures. If false, returns an error with no partial results.

  • If true, sequence queries will return partial results in case of shard failures. If false, they will return no results at all. This flag has effect only if allow_partial_search_results is true.

  • expand_wildcards string | array[string]

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, missing or closed indices are not included in the response.

  • Period for which the search and its results are stored on the cluster.

  • If true, the search and its results are stored on the cluster.

  • Timeout duration to wait for the request to finish. Defaults to no timeout, meaning the request waits for complete search results.

application/json

Body Required

  • query string Required

    EQL query you wish to run.

  • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • filter object | array[object]

    Query, written in Query DSL, used to filter the events on which the EQL query runs.

    One of:

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Allow query execution also in case of shard failures. If true, the query will keep running and will return results based on the available shards. For sequences, the behavior can be further refined using allow_partial_sequence_results

  • This flag applies only to sequences and has effect only if allow_partial_search_results=true. If true, the sequence query will return results based on the available shards, ignoring the others. If false, the sequence query will return successfully, but will always have empty results.

  • size number
  • fields object | array[object]

    Array of wildcard (*) patterns. The response returns values for field names matching these patterns in the fields property of each hit.

    One of:
    Hide attributes Show attributes
    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • format string

      The format in which the values are returned.

  • Values are tail or head.

  • Hide runtime_mappings attribute Show runtime_mappings attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • fields object

        For type composite

        Hide fields attribute Show fields attribute object
        • * object Additional properties
          Hide * attribute Show * attribute object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • fetch_fields array[object]

        For type lookup

        Hide fetch_fields attributes Show fetch_fields attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • format string
      • format string

        A custom format for date type runtime fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • script object
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
      • type string Required

        Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

  • By default, the response of a sample query contains up to 10 samples, with one sample per unique set of join keys. Use the size parameter to get a smaller or larger set of samples. To retrieve more than one sample per set of join keys, use the max_samples_per_key parameter. Pipes are not supported for sample queries.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string
    • is_partial boolean

      If true, the response does not contain complete search results.

    • is_running boolean

      If true, the search request is still executing.

    • took number

      Time unit for milliseconds

    • timed_out boolean

      If true, the request timed out before completion.

    • hits object Required
      Hide hits attributes Show hits attributes object
      • total object
        Hide total attributes Show total attributes object
      • events array[object]

        Contains events matching the query. Each object represents a matching event.

        Hide events attributes Show events attributes object
        • _index string Required
        • _id string Required
        • _source object Required

          Original JSON body passed for the event at index time.

        • missing boolean

          Set to true for events in a timespan-constrained sequence that do not meet a given condition.

        • fields object
          Hide fields attribute Show fields attribute object
          • * array[object] Additional properties
      • sequences array[object]

        Contains event sequences matching the query. Each object represents a matching sequence. This parameter is only returned for EQL queries containing a sequence.

        Hide sequences attributes Show sequences attributes object
        • events array[object] Required

          Contains events matching the query. Each object represents a matching event.

          Hide events attributes Show events attributes object
          • _index string Required
          • _id string Required
          • _source object Required

            Original JSON body passed for the event at index time.

          • missing boolean

            Set to true for events in a timespan-constrained sequence that do not meet a given condition.

          • fields object
        • join_keys array[object]

          Shared field values used to constrain matches in the sequence. These are defined using the by keyword in the EQL query syntax.

    • shard_failures array[object]

      Contains information about shard failures (if any), in case allow_partial_search_results=true

      Hide shard_failures attributes Show shard_failures attributes object
POST /{index}/_eql/search
curl \
 --request POST 'http://api.example.com/{index}/_eql/search' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"query\": \"\"\"\n    process where (process.name == \"cmd.exe\" and process.pid != 2013)\n  \"\"\"\n}"'
Request examples
Run `GET /my-data-stream/_eql/search` to search for events that have a `process.name` of `cmd.exe` and a `process.pid` other than `2013`.
{
  "query": """
    process where (process.name == "cmd.exe" and process.pid != 2013)
  """
}
Run `GET /my-data-stream/_eql/search` to search for a sequence of events. The sequence starts with an event with an `event.category` of `file`, a `file.name` of `cmd.exe`, and a `process.pid` other than `2013`. It is followed by an event with an `event.category` of `process` and a `process.executable` that contains the substring `regsvr32`. These events must also share the same `process.pid` value.
{
  "query": """
    sequence by process.pid
      [ file where file.name == "cmd.exe" and process.pid != 2013 ]
      [ process where stringContains(process.executable, "regsvr32") ]
  """
}
Response examples (200)
{
  "is_partial": false,
  "is_running": false,
  "took": 6,
  "timed_out": false,
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "sequences": [
      {
        "join_keys": [
          2012
        ],
        "events": [
          {
            "_index": ".ds-my-data-stream-2099.12.07-000001",
            "_id": "AtOJ4UjUBAAx3XR5kcCM",
            "_source": {
              "@timestamp": "2099-12-06T11:04:07.000Z",
              "event": {
                "category": "file",
                "id": "dGCHwoeS",
                "sequence": 2
              },
              "file": {
                "accessed": "2099-12-07T11:07:08.000Z",
                "name": "cmd.exe",
                "path": "C:\\Windows\\System32\\cmd.exe",
                "type": "file",
                "size": 16384
              },
              "process": {
                "pid": 2012,
                "name": "cmd.exe",
                "executable": "C:\\Windows\\System32\\cmd.exe"
              }
            }
          },
          {
            "_index": ".ds-my-data-stream-2099.12.07-000001",
            "_id": "OQmfCaduce8zoHT93o4H",
            "_source": {
              "@timestamp": "2099-12-07T11:07:09.000Z",
              "event": {
                "category": "process",
                "id": "aR3NWVOs",
                "sequence": 4
              },
              "process": {
                "pid": 2012,
                "name": "regsvr32.exe",
                "command_line": "regsvr32.exe  /s /u /i:https://...RegSvr32.sct scrobj.dll",
                "executable": "C:\\Windows\\System32\\regsvr32.exe"
              }
            }
          }
        ]
      }
    ]
  }
}

ES|QL

The Elasticsearch Query Language (ES|QL) provides a powerful way to filter, transform, and analyze data stored in Elasticsearch, and in the future in other runtimes.

Learn more about ES|QL

Get a specific running ES|QL query information Technical preview

GET /_query/queries/{id}

Returns an object extended information about a running ES|QL query.

Path parameters

  • id string Required

    The query ID

Responses

GET /_query/queries/{id}
curl \
 --request GET 'http://api.example.com/_query/queries/{id}' \
 --header "Authorization: $API_KEY"

Get running ES|QL queries information Technical preview

GET /_query/queries

Returns an object containing IDs and other information about the running ES|QL queries.

Responses

GET /_query/queries
curl \
 --request GET 'http://api.example.com/_query/queries' \
 --header "Authorization: $API_KEY"

Run an ES|QL query

POST /_query

Get search results for an ES|QL (Elasticsearch query language) query.

External documentation

Query parameters

  • format string

    A short version of the Accept header, e.g. json, yaml.

    Values are csv, json, tsv, txt, yaml, cbor, smile, or arrow.

  • The character to use between values within a CSV row. Only valid for the CSV format.

  • Should columns that are entirely null be removed from the columns and values portion of the results? Defaults to false. If true then the response will include an extra section under the name all_columns which has the name of all columns.

  • If true, partial results will be returned if there are shard failures, but the query can continue to execute on other clusters and shards. If false, the query will fail if there are any failures.

    To override the default behavior, you can set the esql.query.allow_partial_results cluster setting to false.

application/json

Body Required

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • took number

      Time unit for milliseconds

    • is_partial boolean
    • all_columns array[object]
      Hide all_columns attributes Show all_columns attributes object
    • columns array[object] Required
      Hide columns attributes Show columns attributes object
    • values array[array] Required

      A field value.

      A field value.

    • Hide _clusters attributes Show _clusters attributes object
    • profile object

      Profiling information. Present if profile was true in the request. The contents of this field are currently unstable.

POST /_query
curl \
 --request POST 'http://api.example.com/_query' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"query\": \"\"\"\n    FROM library,remote-*:library\n    | EVAL year = DATE_TRUNC(1 YEARS, release_date)\n    | STATS MAX(page_count) BY year\n    | SORT year\n    | LIMIT 5\n  \"\"\",\n  \"include_ccs_metadata\": true\n}"'
Request example
Run `POST /_query` to get results for an ES|QL query.
{
  "query": """
    FROM library,remote-*:library
    | EVAL year = DATE_TRUNC(1 YEARS, release_date)
    | STATS MAX(page_count) BY year
    | SORT year
    | LIMIT 5
  """,
  "include_ccs_metadata": true
}

Graph explore

The graph explore API enables you to extract and summarize information about the documents and terms in an Elasticsearch data stream or index.

Get started with Graph

Explore graph analytics

GET /{index}/_graph/explore

Extract and summarize information about the documents and terms in an Elasticsearch data stream or index. The easiest way to understand the behavior of this API is to use the Graph UI to explore connections. An initial request to the _explore API contains a seed query that identifies the documents of interest and specifies the fields that define the vertices and connections you want to include in the graph. Subsequent requests enable you to spider out from one more vertices of interest. You can exclude vertices that have already been returned.

External documentation

Path parameters

  • index string | array[string] Required

    Name of the index.

Query parameters

  • routing string

    Custom value used to route operations to a specific shard.

  • timeout string

    Specifies the period of time to wait for a response from each shard. If no response is received before the timeout expires, the request fails and returns an error. Defaults to no timeout.

application/json

Body

  • Hide connections attributes Show connections attributes object
    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • vertices array[object] Required

      Contains the fields you are interested in.

      Hide vertices attributes Show vertices attributes object
      • exclude array[string]

        Prevents the specified terms from being included in the results.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • include array[object]

        Identifies the terms of interest that form the starting points from which you want to spider out.

        Hide include attributes Show include attributes object
      • Specifies how many documents must contain a pair of terms before it is considered to be a useful connection. This setting acts as a certainty threshold.

      • Controls how many documents on a particular shard have to contain a pair of terms before the connection is returned for global consideration.

      • size number

        Specifies the maximum number of vertex terms returned for each field.

  • controls object
    Hide controls attributes Show controls attributes object
    • Hide sample_diversity attributes Show sample_diversity attributes object
      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • max_docs_per_value number Required
    • Each hop considers a sample of the best-matching documents on each shard. Using samples improves the speed of execution and keeps exploration focused on meaningfully-connected terms. Very small values (less than 50) might not provide sufficient weight-of-evidence to identify significant connections between terms. Very large sample sizes can dilute the quality of the results and increase execution times.

    • timeout string

      A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • use_significance boolean Required

      Filters associated terms so only those that are significantly associated with your query are included.

  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • vertices array[object]

    Specifies one or more fields that contain the terms you want to include in the graph as vertices.

    Hide vertices attributes Show vertices attributes object
    • exclude array[string]

      Prevents the specified terms from being included in the results.

    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • include array[object]

      Identifies the terms of interest that form the starting points from which you want to spider out.

      Hide include attributes Show include attributes object
    • Specifies how many documents must contain a pair of terms before it is considered to be a useful connection. This setting acts as a certainty threshold.

    • Controls how many documents on a particular shard have to contain a pair of terms before the connection is returned for global consideration.

    • size number

      Specifies the maximum number of vertex terms returned for each field.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
GET /{index}/_graph/explore
curl \
 --request GET 'http://api.example.com/{index}/_graph/explore' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"query\": {\n    \"match\": {\n      \"query.raw\": \"midi\"\n    }\n  },\n  \"vertices\": [\n    {\n      \"field\": \"product\"\n    }\n  ],\n  \"connections\": {\n    \"vertices\": [\n      {\n        \"field\": \"query.raw\"\n      }\n    ]\n  }\n}"'
Request example
Run `POST clicklogs/_graph/explore` for a basic exploration An initial graph explore query typically begins with a query to identify strongly related terms. Seed the exploration with a query. This example is searching `clicklogs` for people who searched for the term `midi`.Identify the vertices to include in the graph. This example is looking for product codes that are significantly associated with searches for `midi`. Find the connections. This example is looking for other search terms that led people to click on the products that are associated with searches for `midi`.
{
  "query": {
    "match": {
      "query.raw": "midi"
    }
  },
  "vertices": [
    {
      "field": "product"
    }
  ],
  "connections": {
    "vertices": [
      {
        "field": "query.raw"
      }
    ]
  }
}

Explore graph analytics

POST /{index}/_graph/explore

Extract and summarize information about the documents and terms in an Elasticsearch data stream or index. The easiest way to understand the behavior of this API is to use the Graph UI to explore connections. An initial request to the _explore API contains a seed query that identifies the documents of interest and specifies the fields that define the vertices and connections you want to include in the graph. Subsequent requests enable you to spider out from one more vertices of interest. You can exclude vertices that have already been returned.

External documentation

Path parameters

  • index string | array[string] Required

    Name of the index.

Query parameters

  • routing string

    Custom value used to route operations to a specific shard.

  • timeout string

    Specifies the period of time to wait for a response from each shard. If no response is received before the timeout expires, the request fails and returns an error. Defaults to no timeout.

application/json

Body

  • Hide connections attributes Show connections attributes object
    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • vertices array[object] Required

      Contains the fields you are interested in.

      Hide vertices attributes Show vertices attributes object
      • exclude array[string]

        Prevents the specified terms from being included in the results.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • include array[object]

        Identifies the terms of interest that form the starting points from which you want to spider out.

        Hide include attributes Show include attributes object
      • Specifies how many documents must contain a pair of terms before it is considered to be a useful connection. This setting acts as a certainty threshold.

      • Controls how many documents on a particular shard have to contain a pair of terms before the connection is returned for global consideration.

      • size number

        Specifies the maximum number of vertex terms returned for each field.

  • controls object
    Hide controls attributes Show controls attributes object
    • Hide sample_diversity attributes Show sample_diversity attributes object
      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • max_docs_per_value number Required
    • Each hop considers a sample of the best-matching documents on each shard. Using samples improves the speed of execution and keeps exploration focused on meaningfully-connected terms. Very small values (less than 50) might not provide sufficient weight-of-evidence to identify significant connections between terms. Very large sample sizes can dilute the quality of the results and increase execution times.

    • timeout string

      A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • use_significance boolean Required

      Filters associated terms so only those that are significantly associated with your query are included.

  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • vertices array[object]

    Specifies one or more fields that contain the terms you want to include in the graph as vertices.

    Hide vertices attributes Show vertices attributes object
    • exclude array[string]

      Prevents the specified terms from being included in the results.

    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • include array[object]

      Identifies the terms of interest that form the starting points from which you want to spider out.

      Hide include attributes Show include attributes object
    • Specifies how many documents must contain a pair of terms before it is considered to be a useful connection. This setting acts as a certainty threshold.

    • Controls how many documents on a particular shard have to contain a pair of terms before the connection is returned for global consideration.

    • size number

      Specifies the maximum number of vertex terms returned for each field.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
POST /{index}/_graph/explore
curl \
 --request POST 'http://api.example.com/{index}/_graph/explore' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"query\": {\n    \"match\": {\n      \"query.raw\": \"midi\"\n    }\n  },\n  \"vertices\": [\n    {\n      \"field\": \"product\"\n    }\n  ],\n  \"connections\": {\n    \"vertices\": [\n      {\n        \"field\": \"query.raw\"\n      }\n    ]\n  }\n}"'
Request example
Run `POST clicklogs/_graph/explore` for a basic exploration An initial graph explore query typically begins with a query to identify strongly related terms. Seed the exploration with a query. This example is searching `clicklogs` for people who searched for the term `midi`.Identify the vertices to include in the graph. This example is looking for product codes that are significantly associated with searches for `midi`. Find the connections. This example is looking for other search terms that led people to click on the products that are associated with searches for `midi`.
{
  "query": {
    "match": {
      "query.raw": "midi"
    }
  },
  "vertices": [
    {
      "field": "product"
    }
  ],
  "connections": {
    "vertices": [
      {
        "field": "query.raw"
      }
    ]
  }
}

Index

Index APIs enable you to manage individual indices, index settings, aliases, mappings, and index templates.

Get component templates Added in 7.8.0

GET /_component_template/{name}

Get information about component templates.

Path parameters

  • name string Required

    Comma-separated list of component template names used to limit the request. Wildcard (*) expressions are supported.

Query parameters

  • If true, returns settings in flat format.

  • Return all default configurations for the component template (default: false)

  • local boolean

    If true, the request retrieves information from the local node only. If false, information is retrieved from the master node.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

GET /_component_template/{name}
curl \
 --request GET 'http://api.example.com/_component_template/{name}' \
 --header "Authorization: $API_KEY"

Create or update a component template Added in 7.8.0

PUT /_component_template/{name}

Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.

An index template can be composed of multiple component templates. To use a component template, specify it in an index template’s composed_of list. Component templates are only applied to new data streams and indices as part of a matching index template.

Settings and mappings specified directly in the index template or the create index request override any settings or mappings specified in a component template.

Component templates are only used during index creation. For data streams, this includes data stream creation and the creation of a stream’s backing indices. Changes to component templates do not affect existing indices, including a stream’s backing indices.

You can use C-style /* *\/ block comments in component templates. You can include comments anywhere in the request body except before the opening curly bracket.

Applying component templates

You cannot directly apply a component template to a data stream or index. To be applied, a component template must be included in an index template's composed_of list.

Path parameters

  • name string Required

    Name of the component template to create. Elasticsearch includes the following built-in component templates: logs-mappings; logs-settings; metrics-mappings; metrics-settings;synthetics-mapping; synthetics-settings. Elastic Agent uses these templates to configure backing indices for its data streams. If you use Elastic Agent and want to overwrite one of these templates, set the version for your replacement template higher than the current version. If you don’t use Elastic Agent and want to disable all built-in component and index templates, set stack.templates.enabled to false using the cluster update settings API.

Query parameters

  • create boolean

    If true, this request cannot replace or update existing component templates.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body Required

  • template object Required
    Hide template attributes Show template attributes object
    • aliases object
      Hide aliases attribute Show aliases attribute object
    • mappings object
      Hide mappings attributes Show mappings attributes object
    • settings object Additional properties
      Hide settings attributes Show settings attributes object
      • index object Additional properties
      • mode string
      • Hide soft_deletes attributes Show soft_deletes attributes object
        • enabled boolean

          Indicates whether soft deletes are enabled on the index.

        • Hide retention_lease attribute Show retention_lease attribute object
          • period string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • sort object
        Hide sort attributes Show sort attributes object
      • Values are true, false, or checksum.

      • codec string
      • routing_partition_size number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • auto_expand_replicas string | null

        One of:
      • merge object
        Hide merge attribute Show merge attribute object
        • Hide scheduler attributes Show scheduler attributes object
          • max_thread_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • max_merge_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • blocks object
        Hide blocks attributes Show blocks attributes object
        • read_only boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read_only_allow_delete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • write boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • metadata boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analyze object
        Hide analyze attribute Show analyze attribute object
        • max_token_count number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • Hide highlight attribute Show highlight attribute object
      • routing object
        Hide routing attributes Show routing attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide lifecycle attributes Show lifecycle attributes object
        • name string
        • indexing_complete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

        • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

        • step object
          Hide step attribute Show step attribute object
          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

        • prefer_ilm boolean | string

          Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

      • creation_date number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • creation_date_string string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • uuid string
      • version object
        Hide version attributes Show version attributes object
      • translog object
        Hide translog attributes Show translog attributes object
      • Hide query_string attribute Show query_string attribute object
        • lenient boolean | string Required

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analysis object
        Hide analysis attributes Show analysis attributes object
      • settings object Additional properties
      • Hide time_series attributes Show time_series attributes object
        • end_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • start_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • queries object
        Hide queries attribute Show queries attribute object
        • cache object
          Hide cache attribute Show cache attribute object
      • Configure custom similarity settings to customize how search results are scored.

      • mapping object
        Hide mapping attributes Show mapping attributes object
        • coerce boolean
        • Hide total_fields attributes Show total_fields attributes object
          • limit number | string

            The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

          • ignore_dynamic_beyond_limit boolean | string

            This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

        • depth object
          Hide depth attribute Show depth attribute object
          • limit number

            The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

        • Hide nested_fields attribute Show nested_fields attribute object
          • limit number

            The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

        • Hide nested_objects attribute Show nested_objects attribute object
          • limit number

            The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

        • Hide field_name_length attribute Show field_name_length attribute object
          • limit number

            Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

        • Hide dimension_fields attribute Show dimension_fields attribute object
          • limit number

            [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

        • source object
          Hide source attribute Show source attribute object
          • mode string Required

            Values are disabled, stored, or synthetic.

      • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
        • level string
        • source number
        • reformat boolean
        • Hide threshold attribute Show threshold attribute object
          • index object
            Hide index attributes Show index attributes object
            • warn string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • info string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • debug string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • trace string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide indexing_pressure attribute Show indexing_pressure attribute object
        • memory object Required
          Hide memory attribute Show memory attribute object
          • limit number

            Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

      • store object
        Hide store attributes Show store attributes object
        • type string Required

          Any of:

          Values are fs, niofs, mmapfs, or hybridfs.

        • allow_mmap boolean

          You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

    • defaults object Additional properties
      Hide defaults attributes Show defaults attributes object
      • index object Additional properties
      • mode string
      • Hide soft_deletes attributes Show soft_deletes attributes object
        • enabled boolean

          Indicates whether soft deletes are enabled on the index.

        • Hide retention_lease attribute Show retention_lease attribute object
          • period string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • sort object
        Hide sort attributes Show sort attributes object
      • Values are true, false, or checksum.

      • codec string
      • routing_partition_size number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • auto_expand_replicas string | null

        One of:
      • merge object
        Hide merge attribute Show merge attribute object
        • Hide scheduler attributes Show scheduler attributes object
          • max_thread_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • max_merge_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • blocks object
        Hide blocks attributes Show blocks attributes object
        • read_only boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read_only_allow_delete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • write boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • metadata boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analyze object
        Hide analyze attribute Show analyze attribute object
        • max_token_count number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • Hide highlight attribute Show highlight attribute object
      • routing object
        Hide routing attributes Show routing attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide lifecycle attributes Show lifecycle attributes object
        • name string
        • indexing_complete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

        • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

        • step object
          Hide step attribute Show step attribute object
          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

        • prefer_ilm boolean | string

          Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

      • creation_date number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • creation_date_string string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • uuid string
      • version object
        Hide version attributes Show version attributes object
      • translog object
        Hide translog attributes Show translog attributes object
      • Hide query_string attribute Show query_string attribute object
        • lenient boolean | string Required

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analysis object
        Hide analysis attributes Show analysis attributes object
      • settings object Additional properties
      • Hide time_series attributes Show time_series attributes object
        • end_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • start_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • queries object
        Hide queries attribute Show queries attribute object
        • cache object
          Hide cache attribute Show cache attribute object
      • Configure custom similarity settings to customize how search results are scored.

      • mapping object
        Hide mapping attributes Show mapping attributes object
        • coerce boolean
        • Hide total_fields attributes Show total_fields attributes object
          • limit number | string

            The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

          • ignore_dynamic_beyond_limit boolean | string

            This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

        • depth object
          Hide depth attribute Show depth attribute object
          • limit number

            The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

        • Hide nested_fields attribute Show nested_fields attribute object
          • limit number

            The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

        • Hide nested_objects attribute Show nested_objects attribute object
          • limit number

            The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

        • Hide field_name_length attribute Show field_name_length attribute object
          • limit number

            Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

        • Hide dimension_fields attribute Show dimension_fields attribute object
          • limit number

            [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

        • source object
          Hide source attribute Show source attribute object
          • mode string Required

            Values are disabled, stored, or synthetic.

      • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
        • level string
        • source number
        • reformat boolean
        • Hide threshold attribute Show threshold attribute object
          • index object
            Hide index attributes Show index attributes object
            • warn string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • info string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • debug string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • trace string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide indexing_pressure attribute Show indexing_pressure attribute object
        • memory object Required
          Hide memory attribute Show memory attribute object
          • limit number

            Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

      • store object
        Hide store attributes Show store attributes object
        • type string Required

          Any of:

          Values are fs, niofs, mmapfs, or hybridfs.

        • allow_mmap boolean

          You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

    • Hide lifecycle attributes Show lifecycle attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide downsampling attribute Show downsampling attribute object
        • rounds array[object] Required

          The list of downsampling rounds to execute as part of this downsampling configuration

          Hide rounds attributes Show rounds attributes object
          • after string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • config object Required
            Hide config attribute Show config attribute object
            • fixed_interval string Required

              A date histogram interval. Similar to Duration with additional units: w (week), M (month), q (quarter) and y (year)

      • enabled boolean

        If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

  • version number
  • _meta object
    Hide _meta attribute Show _meta attribute object
    • * object Additional properties
  • deprecated boolean

    Marks this index template as deprecated. When creating or updating a non-deprecated index template that uses deprecated components, Elasticsearch will emit a deprecation warning.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_component_template/{name}
curl \
 --request PUT 'http://api.example.com/_component_template/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"template\": null,\n  \"settings\": {\n    \"number_of_shards\": 1\n  },\n  \"mappings\": {\n    \"_source\": {\n      \"enabled\": false\n    },\n    \"properties\": {\n      \"host_name\": {\n        \"type\": \"keyword\"\n      },\n      \"created_at\": {\n        \"type\": \"date\",\n        \"format\": \"EEE MMM dd HH:mm:ss Z yyyy\"\n      }\n    }\n  }\n}"'
Request examples
{
  "template": null,
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "_source": {
      "enabled": false
    },
    "properties": {
      "host_name": {
        "type": "keyword"
      },
      "created_at": {
        "type": "date",
        "format": "EEE MMM dd HH:mm:ss Z yyyy"
      }
    }
  }
}
You can include index aliases in a component template. During index creation, the `{index}` placeholder in the alias name will be replaced with the actual index name that the template gets applied to.
{
  "template": null,
  "settings": {
    "number_of_shards": 1
  },
  "aliases": {
    "alias1": {},
    "alias2": {
      "filter": {
        "term": {
          "user.id": "kimchy"
        }
      },
      "routing": "shard-1"
    },
    "{index}-alias": {}
  }
}

Create or update a component template Added in 7.8.0

POST /_component_template/{name}

Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.

An index template can be composed of multiple component templates. To use a component template, specify it in an index template’s composed_of list. Component templates are only applied to new data streams and indices as part of a matching index template.

Settings and mappings specified directly in the index template or the create index request override any settings or mappings specified in a component template.

Component templates are only used during index creation. For data streams, this includes data stream creation and the creation of a stream’s backing indices. Changes to component templates do not affect existing indices, including a stream’s backing indices.

You can use C-style /* *\/ block comments in component templates. You can include comments anywhere in the request body except before the opening curly bracket.

Applying component templates

You cannot directly apply a component template to a data stream or index. To be applied, a component template must be included in an index template's composed_of list.

Path parameters

  • name string Required

    Name of the component template to create. Elasticsearch includes the following built-in component templates: logs-mappings; logs-settings; metrics-mappings; metrics-settings;synthetics-mapping; synthetics-settings. Elastic Agent uses these templates to configure backing indices for its data streams. If you use Elastic Agent and want to overwrite one of these templates, set the version for your replacement template higher than the current version. If you don’t use Elastic Agent and want to disable all built-in component and index templates, set stack.templates.enabled to false using the cluster update settings API.

Query parameters

  • create boolean

    If true, this request cannot replace or update existing component templates.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body Required

  • template object Required
    Hide template attributes Show template attributes object
    • aliases object
      Hide aliases attribute Show aliases attribute object
    • mappings object
      Hide mappings attributes Show mappings attributes object
    • settings object Additional properties
      Hide settings attributes Show settings attributes object
      • index object Additional properties
      • mode string
      • Hide soft_deletes attributes Show soft_deletes attributes object
        • enabled boolean

          Indicates whether soft deletes are enabled on the index.

        • Hide retention_lease attribute Show retention_lease attribute object
          • period string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • sort object
        Hide sort attributes Show sort attributes object
      • Values are true, false, or checksum.

      • codec string
      • routing_partition_size number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • auto_expand_replicas string | null

        One of:
      • merge object
        Hide merge attribute Show merge attribute object
        • Hide scheduler attributes Show scheduler attributes object
          • max_thread_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • max_merge_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • blocks object
        Hide blocks attributes Show blocks attributes object
        • read_only boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read_only_allow_delete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • write boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • metadata boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analyze object
        Hide analyze attribute Show analyze attribute object
        • max_token_count number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • Hide highlight attribute Show highlight attribute object
      • routing object
        Hide routing attributes Show routing attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide lifecycle attributes Show lifecycle attributes object
        • name string
        • indexing_complete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

        • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

        • step object
          Hide step attribute Show step attribute object
          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

        • prefer_ilm boolean | string

          Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

      • creation_date number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • creation_date_string string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • uuid string
      • version object
        Hide version attributes Show version attributes object
      • translog object
        Hide translog attributes Show translog attributes object
      • Hide query_string attribute Show query_string attribute object
        • lenient boolean | string Required

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analysis object
        Hide analysis attributes Show analysis attributes object
      • settings object Additional properties
      • Hide time_series attributes Show time_series attributes object
        • end_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • start_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • queries object
        Hide queries attribute Show queries attribute object
        • cache object
          Hide cache attribute Show cache attribute object
      • Configure custom similarity settings to customize how search results are scored.

      • mapping object
        Hide mapping attributes Show mapping attributes object
        • coerce boolean
        • Hide total_fields attributes Show total_fields attributes object
          • limit number | string

            The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

          • ignore_dynamic_beyond_limit boolean | string

            This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

        • depth object
          Hide depth attribute Show depth attribute object
          • limit number

            The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

        • Hide nested_fields attribute Show nested_fields attribute object
          • limit number

            The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

        • Hide nested_objects attribute Show nested_objects attribute object
          • limit number

            The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

        • Hide field_name_length attribute Show field_name_length attribute object
          • limit number

            Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

        • Hide dimension_fields attribute Show dimension_fields attribute object
          • limit number

            [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

        • source object
          Hide source attribute Show source attribute object
          • mode string Required

            Values are disabled, stored, or synthetic.

      • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
        • level string
        • source number
        • reformat boolean
        • Hide threshold attribute Show threshold attribute object
          • index object
            Hide index attributes Show index attributes object
            • warn string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • info string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • debug string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • trace string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide indexing_pressure attribute Show indexing_pressure attribute object
        • memory object Required
          Hide memory attribute Show memory attribute object
          • limit number

            Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

      • store object
        Hide store attributes Show store attributes object
        • type string Required

          Any of:

          Values are fs, niofs, mmapfs, or hybridfs.

        • allow_mmap boolean

          You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

    • defaults object Additional properties
      Hide defaults attributes Show defaults attributes object
      • index object Additional properties
      • mode string
      • Hide soft_deletes attributes Show soft_deletes attributes object
        • enabled boolean

          Indicates whether soft deletes are enabled on the index.

        • Hide retention_lease attribute Show retention_lease attribute object
          • period string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • sort object
        Hide sort attributes Show sort attributes object
      • Values are true, false, or checksum.

      • codec string
      • routing_partition_size number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • auto_expand_replicas string | null

        One of:
      • merge object
        Hide merge attribute Show merge attribute object
        • Hide scheduler attributes Show scheduler attributes object
          • max_thread_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • max_merge_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • blocks object
        Hide blocks attributes Show blocks attributes object
        • read_only boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read_only_allow_delete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • write boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • metadata boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analyze object
        Hide analyze attribute Show analyze attribute object
        • max_token_count number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • Hide highlight attribute Show highlight attribute object
      • routing object
        Hide routing attributes Show routing attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide lifecycle attributes Show lifecycle attributes object
        • name string
        • indexing_complete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

        • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

        • step object
          Hide step attribute Show step attribute object
          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

        • prefer_ilm boolean | string

          Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

      • creation_date number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • creation_date_string string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • uuid string
      • version object
        Hide version attributes Show version attributes object
      • translog object
        Hide translog attributes Show translog attributes object
      • Hide query_string attribute Show query_string attribute object
        • lenient boolean | string Required

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analysis object
        Hide analysis attributes Show analysis attributes object
      • settings object Additional properties
      • Hide time_series attributes Show time_series attributes object
        • end_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • start_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • queries object
        Hide queries attribute Show queries attribute object
        • cache object
          Hide cache attribute Show cache attribute object
      • Configure custom similarity settings to customize how search results are scored.

      • mapping object
        Hide mapping attributes Show mapping attributes object
        • coerce boolean
        • Hide total_fields attributes Show total_fields attributes object
          • limit number | string

            The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

          • ignore_dynamic_beyond_limit boolean | string

            This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

        • depth object
          Hide depth attribute Show depth attribute object
          • limit number

            The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

        • Hide nested_fields attribute Show nested_fields attribute object
          • limit number

            The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

        • Hide nested_objects attribute Show nested_objects attribute object
          • limit number

            The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

        • Hide field_name_length attribute Show field_name_length attribute object
          • limit number

            Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

        • Hide dimension_fields attribute Show dimension_fields attribute object
          • limit number

            [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

        • source object
          Hide source attribute Show source attribute object
          • mode string Required

            Values are disabled, stored, or synthetic.

      • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
        • level string
        • source number
        • reformat boolean
        • Hide threshold attribute Show threshold attribute object
          • index object
            Hide index attributes Show index attributes object
            • warn string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • info string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • debug string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • trace string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide indexing_pressure attribute Show indexing_pressure attribute object
        • memory object Required
          Hide memory attribute Show memory attribute object
          • limit number

            Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

      • store object
        Hide store attributes Show store attributes object
        • type string Required

          Any of:

          Values are fs, niofs, mmapfs, or hybridfs.

        • allow_mmap boolean

          You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

    • Hide lifecycle attributes Show lifecycle attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide downsampling attribute Show downsampling attribute object
        • rounds array[object] Required

          The list of downsampling rounds to execute as part of this downsampling configuration

          Hide rounds attributes Show rounds attributes object
          • after string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • config object Required
            Hide config attribute Show config attribute object
            • fixed_interval string Required

              A date histogram interval. Similar to Duration with additional units: w (week), M (month), q (quarter) and y (year)

      • enabled boolean

        If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

  • version number
  • _meta object
    Hide _meta attribute Show _meta attribute object
    • * object Additional properties
  • deprecated boolean

    Marks this index template as deprecated. When creating or updating a non-deprecated index template that uses deprecated components, Elasticsearch will emit a deprecation warning.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_component_template/{name}
curl \
 --request POST 'http://api.example.com/_component_template/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"template\": null,\n  \"settings\": {\n    \"number_of_shards\": 1\n  },\n  \"mappings\": {\n    \"_source\": {\n      \"enabled\": false\n    },\n    \"properties\": {\n      \"host_name\": {\n        \"type\": \"keyword\"\n      },\n      \"created_at\": {\n        \"type\": \"date\",\n        \"format\": \"EEE MMM dd HH:mm:ss Z yyyy\"\n      }\n    }\n  }\n}"'
Request examples
{
  "template": null,
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "_source": {
      "enabled": false
    },
    "properties": {
      "host_name": {
        "type": "keyword"
      },
      "created_at": {
        "type": "date",
        "format": "EEE MMM dd HH:mm:ss Z yyyy"
      }
    }
  }
}
You can include index aliases in a component template. During index creation, the `{index}` placeholder in the alias name will be replaced with the actual index name that the template gets applied to.
{
  "template": null,
  "settings": {
    "number_of_shards": 1
  },
  "aliases": {
    "alias1": {},
    "alias2": {
      "filter": {
        "term": {
          "user.id": "kimchy"
        }
      },
      "routing": "shard-1"
    },
    "{index}-alias": {}
  }
}

Delete component templates Added in 7.8.0

DELETE /_component_template/{name}

Component templates are building blocks for constructing index templates that specify index mappings, settings, and aliases.

Path parameters

  • name string | array[string] Required

    Comma-separated list or wildcard expression of component template names used to limit the request.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_component_template/{name}
curl \
 --request DELETE 'http://api.example.com/_component_template/{name}' \
 --header "Authorization: $API_KEY"

Check component templates Added in 7.8.0

HEAD /_component_template/{name}

Returns information about whether a particular component template exists.

Path parameters

  • name string | array[string] Required

    Comma-separated list of component template names used to limit the request. Wildcard (*) expressions are supported.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • local boolean

    If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.

Responses

HEAD /_component_template/{name}
curl \
 --request HEAD 'http://api.example.com/_component_template/{name}' \
 --header "Authorization: $API_KEY"

Get component templates Added in 7.8.0

GET /_component_template

Get information about component templates.

Query parameters

  • If true, returns settings in flat format.

  • Return all default configurations for the component template (default: false)

  • local boolean

    If true, the request retrieves information from the local node only. If false, information is retrieved from the master node.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

GET /_component_template
curl \
 --request GET 'http://api.example.com/_component_template' \
 --header "Authorization: $API_KEY"

Add an index block Added in 7.9.0

PUT /{index}/_block/{block}

Add an index block to an index. Index blocks limit the operations allowed on an index by blocking specific operation types.

Path parameters

  • index string Required

    A comma-separated list or wildcard expression of index names used to limit the request. By default, you must explicitly name the indices you are adding blocks to. To allow the adding of blocks to indices with _all, *, or other wildcard expressions, change the action.destructive_requires_name setting to false. You can update this setting in the elasticsearch.yml file or by using the cluster update settings API.

  • block string Required

    The block type to add to the index.

    Supported values include:

    • metadata: Disable metadata changes, such as closing the index.
    • read: Disable read operations.
    • read_only: Disable write operations and metadata changes.
    • write: Disable write operations. However, metadata changes are still allowed.

    Values are metadata, read, read_only, or write.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • The period to wait for the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

  • timeout string

    The period to wait for a response from all relevant nodes in the cluster after updating the cluster metadata. If no response is received before the timeout expires, the cluster metadata update still applies but the response will indicate that it was not completely acknowledged. It can also be set to -1 to indicate that the request should never timeout.

Responses

PUT /{index}/_block/{block}
curl \
 --request PUT 'http://api.example.com/{index}/_block/{block}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `PUT /my-index-000001/_block/write`, which adds an index block to an index.'
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "indices" : [ {
    "name" : "my-index-000001",
    "blocked" : true
  } ]
}

Get tokens from text analysis

GET /_analyze

The analyze API performs analysis on a text string and returns the resulting tokens.

Generating excessive amount of tokens may cause a node to run out of memory. The index.analyze.max_token_count setting enables you to limit the number of tokens that can be produced. If more than this limit of tokens gets generated, an error occurs. The _analyze endpoint without a specified index will always use 10000 as its limit.

External documentation

Query parameters

  • index string

    Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.

application/json

Body

Responses

GET /_analyze
curl \
 --request GET 'http://api.example.com/_analyze' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"analyzer\": \"standard\",\n  \"text\": \"this is a test\"\n}"'
You can apply any of the built-in analyzers to the text string without specifying an index.
{
  "analyzer": "standard",
  "text": "this is a test"
}
If the text parameter is provided as array of strings, it is analyzed as a multi-value field.
{
  "analyzer": "standard",
  "text": [
    "this is a test",
    "the second text"
  ]
}
You can test a custom transient analyzer built from tokenizers, token filters, and char filters. Token filters use the filter parameter.
{
  "tokenizer": "keyword",
  "filter": [
    "lowercase"
  ],
  "char_filter": [
    "html_strip"
  ],
  "text": "this is a <b>test</b>"
}
Custom tokenizers, token filters, and character filters can be specified in the request body.
{
  "tokenizer": "whitespace",
  "filter": [
    "lowercase",
    {
      "type": "stop",
      "stopwords": [
        "a",
        "is",
        "this"
      ]
    }
  ],
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` to run an analysis on the text using the default index analyzer associated with the `analyze_sample` index. Alternatively, the analyzer can be derived based on a field mapping.
{
  "field": "obj1.field1",
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` and supply a normalizer for a keyword field if there is a normalizer associated with the specified index.
{
  "normalizer": "my_normalizer",
  "text": "BaR"
}
If you want to get more advanced details, set `explain` to `true`. It will output all token attributes for each token. You can filter token attributes you want to output by setting the `attributes` option. NOTE: The format of the additional detail information is labelled as experimental in Lucene and it may change in the future.
{
  "tokenizer": "standard",
  "filter": [
    "snowball"
  ],
  "text": "detailed output",
  "explain": true,
  "attributes": [
    "keyword"
  ]
}
Response examples (200)
A successful response for an analysis with `explain` set to `true`.
{
  "detail": {
    "custom_analyzer": true,
    "charfilters": [],
    "tokenizer": {
      "name": "standard",
      "tokens": [
        {
          "token": "detailed",
          "start_offset": 0,
          "end_offset": 8,
          "type": "<ALPHANUM>",
          "position": 0
        },
        {
          "token": "output",
          "start_offset": 9,
          "end_offset": 15,
          "type": "<ALPHANUM>",
          "position": 1
        }
      ]
    },
    "tokenfilters": [
      {
        "name": "snowball",
        "tokens": [
          {
            "token": "detail",
            "start_offset": 0,
            "end_offset": 8,
            "type": "<ALPHANUM>",
            "position": 0,
            "keyword": false
          },
          {
            "token": "output",
            "start_offset": 9,
            "end_offset": 15,
            "type": "<ALPHANUM>",
            "position": 1,
            "keyword": false
          }
        ]
      }
    ]
  }
}

Get tokens from text analysis

POST /_analyze

The analyze API performs analysis on a text string and returns the resulting tokens.

Generating excessive amount of tokens may cause a node to run out of memory. The index.analyze.max_token_count setting enables you to limit the number of tokens that can be produced. If more than this limit of tokens gets generated, an error occurs. The _analyze endpoint without a specified index will always use 10000 as its limit.

External documentation

Query parameters

  • index string

    Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.

application/json

Body

Responses

POST /_analyze
curl \
 --request POST 'http://api.example.com/_analyze' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"analyzer\": \"standard\",\n  \"text\": \"this is a test\"\n}"'
You can apply any of the built-in analyzers to the text string without specifying an index.
{
  "analyzer": "standard",
  "text": "this is a test"
}
If the text parameter is provided as array of strings, it is analyzed as a multi-value field.
{
  "analyzer": "standard",
  "text": [
    "this is a test",
    "the second text"
  ]
}
You can test a custom transient analyzer built from tokenizers, token filters, and char filters. Token filters use the filter parameter.
{
  "tokenizer": "keyword",
  "filter": [
    "lowercase"
  ],
  "char_filter": [
    "html_strip"
  ],
  "text": "this is a <b>test</b>"
}
Custom tokenizers, token filters, and character filters can be specified in the request body.
{
  "tokenizer": "whitespace",
  "filter": [
    "lowercase",
    {
      "type": "stop",
      "stopwords": [
        "a",
        "is",
        "this"
      ]
    }
  ],
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` to run an analysis on the text using the default index analyzer associated with the `analyze_sample` index. Alternatively, the analyzer can be derived based on a field mapping.
{
  "field": "obj1.field1",
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` and supply a normalizer for a keyword field if there is a normalizer associated with the specified index.
{
  "normalizer": "my_normalizer",
  "text": "BaR"
}
If you want to get more advanced details, set `explain` to `true`. It will output all token attributes for each token. You can filter token attributes you want to output by setting the `attributes` option. NOTE: The format of the additional detail information is labelled as experimental in Lucene and it may change in the future.
{
  "tokenizer": "standard",
  "filter": [
    "snowball"
  ],
  "text": "detailed output",
  "explain": true,
  "attributes": [
    "keyword"
  ]
}
Response examples (200)
A successful response for an analysis with `explain` set to `true`.
{
  "detail": {
    "custom_analyzer": true,
    "charfilters": [],
    "tokenizer": {
      "name": "standard",
      "tokens": [
        {
          "token": "detailed",
          "start_offset": 0,
          "end_offset": 8,
          "type": "<ALPHANUM>",
          "position": 0
        },
        {
          "token": "output",
          "start_offset": 9,
          "end_offset": 15,
          "type": "<ALPHANUM>",
          "position": 1
        }
      ]
    },
    "tokenfilters": [
      {
        "name": "snowball",
        "tokens": [
          {
            "token": "detail",
            "start_offset": 0,
            "end_offset": 8,
            "type": "<ALPHANUM>",
            "position": 0,
            "keyword": false
          },
          {
            "token": "output",
            "start_offset": 9,
            "end_offset": 15,
            "type": "<ALPHANUM>",
            "position": 1,
            "keyword": false
          }
        ]
      }
    ]
  }
}

Get tokens from text analysis

GET /{index}/_analyze

The analyze API performs analysis on a text string and returns the resulting tokens.

Generating excessive amount of tokens may cause a node to run out of memory. The index.analyze.max_token_count setting enables you to limit the number of tokens that can be produced. If more than this limit of tokens gets generated, an error occurs. The _analyze endpoint without a specified index will always use 10000 as its limit.

External documentation

Path parameters

  • index string Required

    Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.

Query parameters

  • index string

    Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.

application/json

Body

Responses

GET /{index}/_analyze
curl \
 --request GET 'http://api.example.com/{index}/_analyze' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"analyzer\": \"standard\",\n  \"text\": \"this is a test\"\n}"'
You can apply any of the built-in analyzers to the text string without specifying an index.
{
  "analyzer": "standard",
  "text": "this is a test"
}
If the text parameter is provided as array of strings, it is analyzed as a multi-value field.
{
  "analyzer": "standard",
  "text": [
    "this is a test",
    "the second text"
  ]
}
You can test a custom transient analyzer built from tokenizers, token filters, and char filters. Token filters use the filter parameter.
{
  "tokenizer": "keyword",
  "filter": [
    "lowercase"
  ],
  "char_filter": [
    "html_strip"
  ],
  "text": "this is a <b>test</b>"
}
Custom tokenizers, token filters, and character filters can be specified in the request body.
{
  "tokenizer": "whitespace",
  "filter": [
    "lowercase",
    {
      "type": "stop",
      "stopwords": [
        "a",
        "is",
        "this"
      ]
    }
  ],
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` to run an analysis on the text using the default index analyzer associated with the `analyze_sample` index. Alternatively, the analyzer can be derived based on a field mapping.
{
  "field": "obj1.field1",
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` and supply a normalizer for a keyword field if there is a normalizer associated with the specified index.
{
  "normalizer": "my_normalizer",
  "text": "BaR"
}
If you want to get more advanced details, set `explain` to `true`. It will output all token attributes for each token. You can filter token attributes you want to output by setting the `attributes` option. NOTE: The format of the additional detail information is labelled as experimental in Lucene and it may change in the future.
{
  "tokenizer": "standard",
  "filter": [
    "snowball"
  ],
  "text": "detailed output",
  "explain": true,
  "attributes": [
    "keyword"
  ]
}
Response examples (200)
A successful response for an analysis with `explain` set to `true`.
{
  "detail": {
    "custom_analyzer": true,
    "charfilters": [],
    "tokenizer": {
      "name": "standard",
      "tokens": [
        {
          "token": "detailed",
          "start_offset": 0,
          "end_offset": 8,
          "type": "<ALPHANUM>",
          "position": 0
        },
        {
          "token": "output",
          "start_offset": 9,
          "end_offset": 15,
          "type": "<ALPHANUM>",
          "position": 1
        }
      ]
    },
    "tokenfilters": [
      {
        "name": "snowball",
        "tokens": [
          {
            "token": "detail",
            "start_offset": 0,
            "end_offset": 8,
            "type": "<ALPHANUM>",
            "position": 0,
            "keyword": false
          },
          {
            "token": "output",
            "start_offset": 9,
            "end_offset": 15,
            "type": "<ALPHANUM>",
            "position": 1,
            "keyword": false
          }
        ]
      }
    ]
  }
}

Get tokens from text analysis

POST /{index}/_analyze

The analyze API performs analysis on a text string and returns the resulting tokens.

Generating excessive amount of tokens may cause a node to run out of memory. The index.analyze.max_token_count setting enables you to limit the number of tokens that can be produced. If more than this limit of tokens gets generated, an error occurs. The _analyze endpoint without a specified index will always use 10000 as its limit.

External documentation

Path parameters

  • index string Required

    Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.

Query parameters

  • index string

    Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.

application/json

Body

Responses

POST /{index}/_analyze
curl \
 --request POST 'http://api.example.com/{index}/_analyze' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"analyzer\": \"standard\",\n  \"text\": \"this is a test\"\n}"'
You can apply any of the built-in analyzers to the text string without specifying an index.
{
  "analyzer": "standard",
  "text": "this is a test"
}
If the text parameter is provided as array of strings, it is analyzed as a multi-value field.
{
  "analyzer": "standard",
  "text": [
    "this is a test",
    "the second text"
  ]
}
You can test a custom transient analyzer built from tokenizers, token filters, and char filters. Token filters use the filter parameter.
{
  "tokenizer": "keyword",
  "filter": [
    "lowercase"
  ],
  "char_filter": [
    "html_strip"
  ],
  "text": "this is a <b>test</b>"
}
Custom tokenizers, token filters, and character filters can be specified in the request body.
{
  "tokenizer": "whitespace",
  "filter": [
    "lowercase",
    {
      "type": "stop",
      "stopwords": [
        "a",
        "is",
        "this"
      ]
    }
  ],
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` to run an analysis on the text using the default index analyzer associated with the `analyze_sample` index. Alternatively, the analyzer can be derived based on a field mapping.
{
  "field": "obj1.field1",
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` and supply a normalizer for a keyword field if there is a normalizer associated with the specified index.
{
  "normalizer": "my_normalizer",
  "text": "BaR"
}
If you want to get more advanced details, set `explain` to `true`. It will output all token attributes for each token. You can filter token attributes you want to output by setting the `attributes` option. NOTE: The format of the additional detail information is labelled as experimental in Lucene and it may change in the future.
{
  "tokenizer": "standard",
  "filter": [
    "snowball"
  ],
  "text": "detailed output",
  "explain": true,
  "attributes": [
    "keyword"
  ]
}
Response examples (200)
A successful response for an analysis with `explain` set to `true`.
{
  "detail": {
    "custom_analyzer": true,
    "charfilters": [],
    "tokenizer": {
      "name": "standard",
      "tokens": [
        {
          "token": "detailed",
          "start_offset": 0,
          "end_offset": 8,
          "type": "<ALPHANUM>",
          "position": 0
        },
        {
          "token": "output",
          "start_offset": 9,
          "end_offset": 15,
          "type": "<ALPHANUM>",
          "position": 1
        }
      ]
    },
    "tokenfilters": [
      {
        "name": "snowball",
        "tokens": [
          {
            "token": "detail",
            "start_offset": 0,
            "end_offset": 8,
            "type": "<ALPHANUM>",
            "position": 0,
            "keyword": false
          },
          {
            "token": "output",
            "start_offset": 9,
            "end_offset": 15,
            "type": "<ALPHANUM>",
            "position": 1,
            "keyword": false
          }
        ]
      }
    ]
  }
}

Get index information

GET /{index}

Get information about one or more indices. For data streams, the API returns information about the stream’s backing indices.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and index aliases used to limit the request. Wildcard expressions (*) are supported.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    Type of index that wildcard expressions can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, returns settings in flat format.

  • If false, requests that target a missing index return an error.

  • If true, return all default settings in the response.

  • local boolean

    If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • features string | array[string]

    Return only information on specified index features

    Supported values include: aliases, mappings, settings

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object
      Hide * attributes Show * attributes object
      • aliases object
        Hide aliases attribute Show aliases attribute object
      • mappings object
        Hide mappings attributes Show mappings attributes object
      • settings object Additional properties
        Hide settings attributes Show settings attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

      • defaults object Additional properties
        Hide defaults attributes Show defaults attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

      • Hide lifecycle attributes Show lifecycle attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide downsampling attribute Show downsampling attribute object
          • rounds array[object] Required

            The list of downsampling rounds to execute as part of this downsampling configuration

            Hide rounds attributes Show rounds attributes object
            • after string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • config object Required
        • enabled boolean

          If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

GET /{index}
curl \
 --request GET 'http://api.example.com/{index}' \
 --header "Authorization: $API_KEY"

Create an index

PUT /{index}

You can use the create index API to add a new index to an Elasticsearch cluster. When creating an index, you can specify the following:

  • Settings for the index.
  • Mappings for fields in the index.
  • Index aliases

Wait for active shards

By default, index creation will only return a response to the client when the primary copies of each shard have been started, or the request times out. The index creation response will indicate what happened. For example, acknowledged indicates whether the index was successfully created in the cluster, while shards_acknowledged indicates whether the requisite number of shard copies were started for each shard in the index before timing out. Note that it is still possible for either acknowledged or shards_acknowledged to be false, but for the index creation to be successful. These values simply indicate whether the operation completed before the timeout. If acknowledged is false, the request timed out before the cluster state was updated with the newly created index, but it probably will be created sometime soon. If shards_acknowledged is false, then the request timed out before the requisite number of shards were started (by default just the primaries), even if the cluster state was successfully updated to reflect the newly created index (that is to say, acknowledged is true).

You can change the default of only waiting for the primary shards to start through the index setting index.write.wait_for_active_shards. Note that changing this setting will also affect the wait_for_active_shards value on all subsequent write operations.

Path parameters

  • index string Required

    Name of the index you wish to create. Index names must meet the following criteria:

    • Lowercase only
    • Cannot include \, /, *, ?, ", <, >, |, (space character), ,, or #
    • Indices prior to 7.0 could contain a colon (:), but that has been deprecated and will not be supported in later versions
    • Cannot start with -, _, or +
    • Cannot be . or ..
    • Cannot be longer than 255 bytes (note thtat it is bytes, so multi-byte characters will reach the limit faster)
    • Names starting with . are deprecated, except for hidden indices and internal indices managed by plugins

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1).

application/json

Body

  • aliases object

    Aliases for the index.

    Hide aliases attribute Show aliases attribute object
  • mappings object
    Hide mappings attributes Show mappings attributes object
    • Hide all_field attributes Show all_field attributes object
    • dynamic string

      Values are strict, runtime, true, or false.

    • dynamic_templates array[object]
    • Hide _field_names attribute Show _field_names attribute object
    • Hide index_field attribute Show index_field attribute object
    • _meta object
      Hide _meta attribute Show _meta attribute object
      • * object Additional properties
    • _routing object
      Hide _routing attribute Show _routing attribute object
    • _size object
      Hide _size attribute Show _size attribute object
    • _source object
      Hide _source attributes Show _source attributes object
    • runtime object
      Hide runtime attribute Show runtime attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • enabled boolean
    • Values are true or false.

    • Hide _data_stream_timestamp attribute Show _data_stream_timestamp attribute object
  • settings object Additional properties
    Hide settings attributes Show settings attributes object
    • index object Additional properties
    • mode string
    • Hide soft_deletes attributes Show soft_deletes attributes object
      • enabled boolean

        Indicates whether soft deletes are enabled on the index.

      • Hide retention_lease attribute Show retention_lease attribute object
        • period string Required

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • sort object
      Hide sort attributes Show sort attributes object
    • Values are true, false, or checksum.

    • codec string
    • routing_partition_size number | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • auto_expand_replicas string | null

      One of:
    • merge object
      Hide merge attribute Show merge attribute object
      • Hide scheduler attributes Show scheduler attributes object
        • max_thread_count number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • max_merge_count number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • blocks object
      Hide blocks attributes Show blocks attributes object
      • read_only boolean | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • read_only_allow_delete boolean | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • read boolean | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • write boolean | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • metadata boolean | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • analyze object
      Hide analyze attribute Show analyze attribute object
      • max_token_count number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • Hide highlight attribute Show highlight attribute object
    • routing object
      Hide routing attributes Show routing attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide lifecycle attributes Show lifecycle attributes object
      • name string
      • indexing_complete boolean | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

      • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

      • step object
        Hide step attribute Show step attribute object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

      • prefer_ilm boolean | string

        Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

    • creation_date number | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • creation_date_string string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • uuid string
    • version object
      Hide version attributes Show version attributes object
    • translog object
      Hide translog attributes Show translog attributes object
    • Hide query_string attribute Show query_string attribute object
      • lenient boolean | string Required

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • analysis object
      Hide analysis attributes Show analysis attributes object
    • settings object Additional properties
    • Hide time_series attributes Show time_series attributes object
      • end_time string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • start_time string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • queries object
      Hide queries attribute Show queries attribute object
      • cache object
        Hide cache attribute Show cache attribute object
    • Configure custom similarity settings to customize how search results are scored.

    • mapping object
      Hide mapping attributes Show mapping attributes object
      • coerce boolean
      • Hide total_fields attributes Show total_fields attributes object
        • limit number | string

          The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

        • ignore_dynamic_beyond_limit boolean | string

          This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

      • depth object
        Hide depth attribute Show depth attribute object
        • limit number

          The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

      • Hide nested_fields attribute Show nested_fields attribute object
        • limit number

          The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

      • Hide nested_objects attribute Show nested_objects attribute object
        • limit number

          The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

      • Hide field_name_length attribute Show field_name_length attribute object
        • limit number

          Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

      • Hide dimension_fields attribute Show dimension_fields attribute object
        • limit number

          [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

      • source object
        Hide source attribute Show source attribute object
        • mode string Required

          Values are disabled, stored, or synthetic.

    • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
      • level string
      • source number
      • reformat boolean
      • Hide threshold attribute Show threshold attribute object
        • index object
          Hide index attributes Show index attributes object
          • warn string

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • info string

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • debug string

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • trace string

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide indexing_pressure attribute Show indexing_pressure attribute object
      • memory object Required
        Hide memory attribute Show memory attribute object
        • limit number

          Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

    • store object
      Hide store attributes Show store attributes object
      • type string Required

        Any of:

        Values are fs, niofs, mmapfs, or hybridfs.

      • allow_mmap boolean

        You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

Responses

PUT /{index}
curl \
 --request PUT 'http://api.example.com/{index}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"settings\": {\n    \"number_of_shards\": 3,\n    \"number_of_replicas\": 2\n  }\n}"'
This request specifies the `number_of_shards` and `number_of_replicas`.
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 2
  }
}
You can provide mapping definitions in the create index API requests.
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "field1": { "type": "text" }
    }
  }
}
You can provide mapping definitions in the create index API requests. Index alias names also support date math.
{
  "aliases": {
    "alias_1": {},
    "alias_2": {
      "filter": {
        "term": {
          "user.id": "kimchy"
        }
      },
      "routing": "shard-1"
    }
  }
}

Delete indices

DELETE /{index}

Deleting an index deletes its documents, shards, and metadata. It does not delete related Kibana components, such as data views, visualizations, or dashboards.

You cannot delete the current write index of a data stream. To delete the index, you must roll over the data stream so a new write index is created. You can then use the delete index API to delete the previous write index.

Path parameters

  • index string | array[string] Required

    Comma-separated list of indices to delete. You cannot specify index aliases. By default, this parameter does not support wildcards (*) or _all. To use wildcards or _all, set the action.destructive_requires_name cluster setting to false.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
DELETE /{index}
curl \
 --request DELETE 'http://api.example.com/{index}' \
 --header "Authorization: $API_KEY"

Check indices

HEAD /{index}

Check if one or more indices, index aliases, or data streams exist.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases. Supports wildcards (*).

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, returns settings in flat format.

  • If false, the request returns an error if it targets a missing or closed index.

  • If true, return all default settings in the response.

  • local boolean

    If true, the request retrieves information from the local node only.

Responses

HEAD /{index}
curl \
 --request HEAD 'http://api.example.com/{index}' \
 --header "Authorization: $API_KEY"

Get aliases

GET /{index}/_alias/{name}

Retrieves information for one or more data stream or index aliases.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

  • name string | array[string] Required

    Comma-separated list of aliases to retrieve. Supports wildcards (*). To retrieve all aliases, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attribute Show * attribute object
      • aliases object Required
        Hide aliases attribute Show aliases attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • filter object

            An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

            External documentation
          • Value used to route indexing operations to a specific shard. If specified, this overwrites the routing value for indexing operations.

          • If true, the index is the write index for the alias.

          • routing string

            Value used to route indexing and search operations to a specific shard.

          • Value used to route search operations to a specific shard. If specified, this overwrites the routing value for search operations.

          • is_hidden boolean

            If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

GET /{index}/_alias/{name}
curl \
 --request GET 'http://api.example.com/{index}/_alias/{name}' \
 --header "Authorization: $API_KEY"

Create or update an alias

PUT /{index}/_alias/{name}

Adds a data stream or index to an alias.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices to add. Supports wildcards (*). Wildcard patterns that match both data streams and indices return an error.

  • name string Required

    Alias to update. If the alias doesn’t exist, the request creates it. Index alias names support date math.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body

  • filter object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • If true, sets the write index or data stream for the alias. If an alias points to multiple indices or data streams and is_write_index isn’t set, the alias rejects write requests. If an index alias points to one index and is_write_index isn’t set, the index automatically acts as the write index. Data stream aliases don’t automatically set a write data stream, even if the alias points to one data stream.

  • routing string

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /{index}/_alias/{name}
curl \
 --request PUT 'http://api.example.com/{index}/_alias/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"actions\": [\n    {\n      \"add\": {\n        \"index\": \"my-data-stream\",\n        \"alias\": \"my-alias\"\n      }\n    }\n  ]\n}"'
Request example
{
  "actions": [
    {
      "add": {
        "index": "my-data-stream",
        "alias": "my-alias"
      }
    }
  ]
}

Create or update an alias

POST /{index}/_alias/{name}

Adds a data stream or index to an alias.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices to add. Supports wildcards (*). Wildcard patterns that match both data streams and indices return an error.

  • name string Required

    Alias to update. If the alias doesn’t exist, the request creates it. Index alias names support date math.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body

  • filter object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • If true, sets the write index or data stream for the alias. If an alias points to multiple indices or data streams and is_write_index isn’t set, the alias rejects write requests. If an index alias points to one index and is_write_index isn’t set, the index automatically acts as the write index. Data stream aliases don’t automatically set a write data stream, even if the alias points to one data stream.

  • routing string

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /{index}/_alias/{name}
curl \
 --request POST 'http://api.example.com/{index}/_alias/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"actions\": [\n    {\n      \"add\": {\n        \"index\": \"my-data-stream\",\n        \"alias\": \"my-alias\"\n      }\n    }\n  ]\n}"'
Request example
{
  "actions": [
    {
      "add": {
        "index": "my-data-stream",
        "alias": "my-alias"
      }
    }
  ]
}

Delete an alias

DELETE /{index}/_alias/{name}

Removes a data stream or index from an alias.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices used to limit the request. Supports wildcards (*).

  • name string | array[string] Required

    Comma-separated list of aliases to remove. Supports wildcards (*). To remove all aliases, use * or _all.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /{index}/_alias/{name}
curl \
 --request DELETE 'http://api.example.com/{index}/_alias/{name}' \
 --header "Authorization: $API_KEY"

Check aliases

HEAD /{index}/_alias/{name}

Check if one or more data stream or index aliases exist.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

  • name string | array[string] Required

    Comma-separated list of aliases to check. Supports wildcards (*).

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, requests that include a missing data stream or index in the target indices or data streams return an error.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

HEAD /{index}/_alias/{name}
curl \
 --request HEAD 'http://api.example.com/{index}/_alias/{name}' \
 --header "Authorization: $API_KEY"

Create or update an alias

PUT /{index}/_aliases/{name}

Adds a data stream or index to an alias.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices to add. Supports wildcards (*). Wildcard patterns that match both data streams and indices return an error.

  • name string Required

    Alias to update. If the alias doesn’t exist, the request creates it. Index alias names support date math.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body

  • filter object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • If true, sets the write index or data stream for the alias. If an alias points to multiple indices or data streams and is_write_index isn’t set, the alias rejects write requests. If an index alias points to one index and is_write_index isn’t set, the index automatically acts as the write index. Data stream aliases don’t automatically set a write data stream, even if the alias points to one data stream.

  • routing string

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /{index}/_aliases/{name}
curl \
 --request PUT 'http://api.example.com/{index}/_aliases/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"actions\": [\n    {\n      \"add\": {\n        \"index\": \"my-data-stream\",\n        \"alias\": \"my-alias\"\n      }\n    }\n  ]\n}"'
Request example
{
  "actions": [
    {
      "add": {
        "index": "my-data-stream",
        "alias": "my-alias"
      }
    }
  ]
}

Create or update an alias

POST /{index}/_aliases/{name}

Adds a data stream or index to an alias.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices to add. Supports wildcards (*). Wildcard patterns that match both data streams and indices return an error.

  • name string Required

    Alias to update. If the alias doesn’t exist, the request creates it. Index alias names support date math.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body

  • filter object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • If true, sets the write index or data stream for the alias. If an alias points to multiple indices or data streams and is_write_index isn’t set, the alias rejects write requests. If an index alias points to one index and is_write_index isn’t set, the index automatically acts as the write index. Data stream aliases don’t automatically set a write data stream, even if the alias points to one data stream.

  • routing string

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /{index}/_aliases/{name}
curl \
 --request POST 'http://api.example.com/{index}/_aliases/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"actions\": [\n    {\n      \"add\": {\n        \"index\": \"my-data-stream\",\n        \"alias\": \"my-alias\"\n      }\n    }\n  ]\n}"'
Request example
{
  "actions": [
    {
      "add": {
        "index": "my-data-stream",
        "alias": "my-alias"
      }
    }
  ]
}

Delete an alias

DELETE /{index}/_aliases/{name}

Removes a data stream or index from an alias.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices used to limit the request. Supports wildcards (*).

  • name string | array[string] Required

    Comma-separated list of aliases to remove. Supports wildcards (*). To remove all aliases, use * or _all.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /{index}/_aliases/{name}
curl \
 --request DELETE 'http://api.example.com/{index}/_aliases/{name}' \
 --header "Authorization: $API_KEY"

Get index templates Added in 7.9.0

GET /_index_template/{name}

Get information about one or more index templates.

Path parameters

  • name string Required

    Comma-separated list of index template names used to limit the request. Wildcard (*) expressions are supported.

Query parameters

  • local boolean

    If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.

  • If true, returns settings in flat format.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • If true, returns all relevant default configurations for the index template.

Responses

GET /_index_template/{name}
curl \
 --request GET 'http://api.example.com/_index_template/{name}' \
 --header "Authorization: $API_KEY"

Create or update an index template Added in 7.9.0

PUT /_index_template/{name}

Index templates define settings, mappings, and aliases that can be applied automatically to new indices.

Elasticsearch applies templates to new indices based on an wildcard pattern that matches the index name. Index templates are applied during data stream or index creation. For data streams, these settings and mappings are applied when the stream's backing indices are created. Settings and mappings specified in a create index API request override any settings or mappings specified in an index template. Changes to index templates do not affect existing indices, including the existing backing indices of a data stream.

You can use C-style /* *\/ block comments in index templates. You can include comments anywhere in the request body, except before the opening curly bracket.

Multiple matching templates

If multiple index templates match the name of a new index or data stream, the template with the highest priority is used.

Multiple templates with overlapping index patterns at the same priority are not allowed and an error will be thrown when attempting to create a template matching an existing index template at identical priorities.

Composing aliases, mappings, and settings

When multiple component templates are specified in the composed_of field for an index template, they are merged in the order specified, meaning that later component templates override earlier component templates. Any mappings, settings, or aliases from the parent index template are merged in next. Finally, any configuration on the index request itself is merged. Mapping definitions are merged recursively, which means that later mapping components can introduce new field mappings and update the mapping configuration. If a field mapping is already contained in an earlier component, its definition will be completely overwritten by the later one. This recursive merging strategy applies not only to field mappings, but also root options like dynamic_templates and meta. If an earlier component contains a dynamic_templates block, then by default new dynamic_templates entries are appended onto the end. If an entry already exists with the same key, then it is overwritten by the new definition.

Path parameters

  • name string Required

    Index or template name

Query parameters

  • create boolean

    If true, this request cannot replace or update existing index templates.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • cause string

    User defined reason for creating/updating the index template

application/json

Body Required

  • index_patterns string | array[string]
  • composed_of array[string]

    An ordered list of component template names. Component templates are merged in the order specified, meaning that the last component template specified has the highest precedence.

  • template object
    Hide template attributes Show template attributes object
    • aliases object

      Aliases to add. If the index template includes a data_stream object, these are data stream aliases. Otherwise, these are index aliases. Data stream aliases ignore the index_routing, routing, and search_routing options.

      Hide aliases attribute Show aliases attribute object
    • mappings object
      Hide mappings attributes Show mappings attributes object
    • settings object Additional properties
      Hide settings attributes Show settings attributes object
      • index object Additional properties
      • mode string
      • Hide soft_deletes attributes Show soft_deletes attributes object
        • enabled boolean

          Indicates whether soft deletes are enabled on the index.

        • Hide retention_lease attribute Show retention_lease attribute object
          • period string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • sort object
        Hide sort attributes Show sort attributes object
      • Values are true, false, or checksum.

      • codec string
      • routing_partition_size number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • auto_expand_replicas string | null

        One of:
      • merge object
        Hide merge attribute Show merge attribute object
        • Hide scheduler attributes Show scheduler attributes object
          • max_thread_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • max_merge_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • blocks object
        Hide blocks attributes Show blocks attributes object
        • read_only boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read_only_allow_delete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • write boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • metadata boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analyze object
        Hide analyze attribute Show analyze attribute object
        • max_token_count number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • Hide highlight attribute Show highlight attribute object
      • routing object
        Hide routing attributes Show routing attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide lifecycle attributes Show lifecycle attributes object
        • name string
        • indexing_complete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

        • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

        • step object
          Hide step attribute Show step attribute object
          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

        • prefer_ilm boolean | string

          Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

      • creation_date number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • creation_date_string string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • uuid string
      • version object
        Hide version attributes Show version attributes object
      • translog object
        Hide translog attributes Show translog attributes object
      • Hide query_string attribute Show query_string attribute object
        • lenient boolean | string Required

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analysis object
        Hide analysis attributes Show analysis attributes object
      • settings object Additional properties
      • Hide time_series attributes Show time_series attributes object
        • end_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • start_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • queries object
        Hide queries attribute Show queries attribute object
        • cache object
          Hide cache attribute Show cache attribute object
      • Configure custom similarity settings to customize how search results are scored.

      • mapping object
        Hide mapping attributes Show mapping attributes object
        • coerce boolean
        • Hide total_fields attributes Show total_fields attributes object
          • limit number | string

            The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

          • ignore_dynamic_beyond_limit boolean | string

            This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

        • depth object
          Hide depth attribute Show depth attribute object
          • limit number

            The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

        • Hide nested_fields attribute Show nested_fields attribute object
          • limit number

            The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

        • Hide nested_objects attribute Show nested_objects attribute object
          • limit number

            The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

        • Hide field_name_length attribute Show field_name_length attribute object
          • limit number

            Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

        • Hide dimension_fields attribute Show dimension_fields attribute object
          • limit number

            [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

        • source object
          Hide source attribute Show source attribute object
          • mode string Required

            Values are disabled, stored, or synthetic.

      • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
        • level string
        • source number
        • reformat boolean
        • Hide threshold attribute Show threshold attribute object
          • index object
            Hide index attributes Show index attributes object
            • warn string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • info string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • debug string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • trace string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide indexing_pressure attribute Show indexing_pressure attribute object
        • memory object Required
          Hide memory attribute Show memory attribute object
          • limit number

            Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

      • store object
        Hide store attributes Show store attributes object
        • type string Required

          Any of:

          Values are fs, niofs, mmapfs, or hybridfs.

        • allow_mmap boolean

          You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

    • Hide lifecycle attributes Show lifecycle attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide downsampling attribute Show downsampling attribute object
        • rounds array[object] Required

          The list of downsampling rounds to execute as part of this downsampling configuration

          Hide rounds attributes Show rounds attributes object
          • after string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • config object Required
            Hide config attribute Show config attribute object
            • fixed_interval string Required

              A date histogram interval. Similar to Duration with additional units: w (week), M (month), q (quarter) and y (year)

      • enabled boolean

        If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

  • Hide data_stream attributes Show data_stream attributes object
  • priority number

    Priority to determine index template precedence when a new data stream or index is created. The index template with the highest priority is chosen. If no priority is specified the template is treated as though it is of priority 0 (lowest priority). This number is not automatically generated by Elasticsearch.

  • version number
  • _meta object
    Hide _meta attribute Show _meta attribute object
    • * object Additional properties
  • This setting overrides the value of the action.auto_create_index cluster setting. If set to true in a template, then indices can be automatically created using that template even if auto-creation of indices is disabled via actions.auto_create_index. If set to false, then indices or data streams matching the template must always be explicitly created, and may never be automatically created.

  • The configuration option ignore_missing_component_templates can be used when an index template references a component template that might not exist

  • deprecated boolean

    Marks this index template as deprecated. When creating or updating a non-deprecated index template that uses deprecated components, Elasticsearch will emit a deprecation warning.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_index_template/{name}
curl \
 --request PUT 'http://api.example.com/_index_template/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"index_patterns\" : [\"template*\"],\n  \"priority\" : 1,\n  \"template\": {\n    \"settings\" : {\n      \"number_of_shards\" : 2\n    }\n  }\n}"'
Request examples
{
  "index_patterns" : ["template*"],
  "priority" : 1,
  "template": {
    "settings" : {
      "number_of_shards" : 2
    }
  }
}
You can include index aliases in an index template. During index creation, the `{index}` placeholder in the alias name will be replaced with the actual index name that the template gets applied to.
{
  "index_patterns": [
    "template*"
  ],
  "template": {
    "settings": {
      "number_of_shards": 1
    },
    "aliases": {
      "alias1": {},
      "alias2": {
        "filter": {
          "term": {
            "user.id": "kimchy"
          }
        },
        "routing": "shard-1"
      },
      "{index}-alias": {}
    }
  }
}

Create or update an index template Added in 7.9.0

POST /_index_template/{name}

Index templates define settings, mappings, and aliases that can be applied automatically to new indices.

Elasticsearch applies templates to new indices based on an wildcard pattern that matches the index name. Index templates are applied during data stream or index creation. For data streams, these settings and mappings are applied when the stream's backing indices are created. Settings and mappings specified in a create index API request override any settings or mappings specified in an index template. Changes to index templates do not affect existing indices, including the existing backing indices of a data stream.

You can use C-style /* *\/ block comments in index templates. You can include comments anywhere in the request body, except before the opening curly bracket.

Multiple matching templates

If multiple index templates match the name of a new index or data stream, the template with the highest priority is used.

Multiple templates with overlapping index patterns at the same priority are not allowed and an error will be thrown when attempting to create a template matching an existing index template at identical priorities.

Composing aliases, mappings, and settings

When multiple component templates are specified in the composed_of field for an index template, they are merged in the order specified, meaning that later component templates override earlier component templates. Any mappings, settings, or aliases from the parent index template are merged in next. Finally, any configuration on the index request itself is merged. Mapping definitions are merged recursively, which means that later mapping components can introduce new field mappings and update the mapping configuration. If a field mapping is already contained in an earlier component, its definition will be completely overwritten by the later one. This recursive merging strategy applies not only to field mappings, but also root options like dynamic_templates and meta. If an earlier component contains a dynamic_templates block, then by default new dynamic_templates entries are appended onto the end. If an entry already exists with the same key, then it is overwritten by the new definition.

Path parameters

  • name string Required

    Index or template name

Query parameters

  • create boolean

    If true, this request cannot replace or update existing index templates.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • cause string

    User defined reason for creating/updating the index template

application/json

Body Required

  • index_patterns string | array[string]
  • composed_of array[string]

    An ordered list of component template names. Component templates are merged in the order specified, meaning that the last component template specified has the highest precedence.

  • template object
    Hide template attributes Show template attributes object
    • aliases object

      Aliases to add. If the index template includes a data_stream object, these are data stream aliases. Otherwise, these are index aliases. Data stream aliases ignore the index_routing, routing, and search_routing options.

      Hide aliases attribute Show aliases attribute object
    • mappings object
      Hide mappings attributes Show mappings attributes object
    • settings object Additional properties
      Hide settings attributes Show settings attributes object
      • index object Additional properties
      • mode string
      • Hide soft_deletes attributes Show soft_deletes attributes object
        • enabled boolean

          Indicates whether soft deletes are enabled on the index.

        • Hide retention_lease attribute Show retention_lease attribute object
          • period string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • sort object
        Hide sort attributes Show sort attributes object
      • Values are true, false, or checksum.

      • codec string
      • routing_partition_size number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • auto_expand_replicas string | null

        One of:
      • merge object
        Hide merge attribute Show merge attribute object
        • Hide scheduler attributes Show scheduler attributes object
          • max_thread_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • max_merge_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • blocks object
        Hide blocks attributes Show blocks attributes object
        • read_only boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read_only_allow_delete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • write boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • metadata boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analyze object
        Hide analyze attribute Show analyze attribute object
        • max_token_count number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • Hide highlight attribute Show highlight attribute object
      • routing object
        Hide routing attributes Show routing attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide lifecycle attributes Show lifecycle attributes object
        • name string
        • indexing_complete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

        • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

        • step object
          Hide step attribute Show step attribute object
          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

        • prefer_ilm boolean | string

          Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

      • creation_date number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • creation_date_string string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • uuid string
      • version object
        Hide version attributes Show version attributes object
      • translog object
        Hide translog attributes Show translog attributes object
      • Hide query_string attribute Show query_string attribute object
        • lenient boolean | string Required

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analysis object
        Hide analysis attributes Show analysis attributes object
      • settings object Additional properties
      • Hide time_series attributes Show time_series attributes object
        • end_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • start_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • queries object
        Hide queries attribute Show queries attribute object
        • cache object
          Hide cache attribute Show cache attribute object
      • Configure custom similarity settings to customize how search results are scored.

      • mapping object
        Hide mapping attributes Show mapping attributes object
        • coerce boolean
        • Hide total_fields attributes Show total_fields attributes object
          • limit number | string

            The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

          • ignore_dynamic_beyond_limit boolean | string

            This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

        • depth object
          Hide depth attribute Show depth attribute object
          • limit number

            The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

        • Hide nested_fields attribute Show nested_fields attribute object
          • limit number

            The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

        • Hide nested_objects attribute Show nested_objects attribute object
          • limit number

            The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

        • Hide field_name_length attribute Show field_name_length attribute object
          • limit number

            Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

        • Hide dimension_fields attribute Show dimension_fields attribute object
          • limit number

            [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

        • source object
          Hide source attribute Show source attribute object
          • mode string Required

            Values are disabled, stored, or synthetic.

      • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
        • level string
        • source number
        • reformat boolean
        • Hide threshold attribute Show threshold attribute object
          • index object
            Hide index attributes Show index attributes object
            • warn string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • info string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • debug string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • trace string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide indexing_pressure attribute Show indexing_pressure attribute object
        • memory object Required
          Hide memory attribute Show memory attribute object
          • limit number

            Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

      • store object
        Hide store attributes Show store attributes object
        • type string Required

          Any of:

          Values are fs, niofs, mmapfs, or hybridfs.

        • allow_mmap boolean

          You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

    • Hide lifecycle attributes Show lifecycle attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide downsampling attribute Show downsampling attribute object
        • rounds array[object] Required

          The list of downsampling rounds to execute as part of this downsampling configuration

          Hide rounds attributes Show rounds attributes object
          • after string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • config object Required
            Hide config attribute Show config attribute object
            • fixed_interval string Required

              A date histogram interval. Similar to Duration with additional units: w (week), M (month), q (quarter) and y (year)

      • enabled boolean

        If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

  • Hide data_stream attributes Show data_stream attributes object
  • priority number

    Priority to determine index template precedence when a new data stream or index is created. The index template with the highest priority is chosen. If no priority is specified the template is treated as though it is of priority 0 (lowest priority). This number is not automatically generated by Elasticsearch.

  • version number
  • _meta object
    Hide _meta attribute Show _meta attribute object
    • * object Additional properties
  • This setting overrides the value of the action.auto_create_index cluster setting. If set to true in a template, then indices can be automatically created using that template even if auto-creation of indices is disabled via actions.auto_create_index. If set to false, then indices or data streams matching the template must always be explicitly created, and may never be automatically created.

  • The configuration option ignore_missing_component_templates can be used when an index template references a component template that might not exist

  • deprecated boolean

    Marks this index template as deprecated. When creating or updating a non-deprecated index template that uses deprecated components, Elasticsearch will emit a deprecation warning.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_index_template/{name}
curl \
 --request POST 'http://api.example.com/_index_template/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"index_patterns\" : [\"template*\"],\n  \"priority\" : 1,\n  \"template\": {\n    \"settings\" : {\n      \"number_of_shards\" : 2\n    }\n  }\n}"'
Request examples
{
  "index_patterns" : ["template*"],
  "priority" : 1,
  "template": {
    "settings" : {
      "number_of_shards" : 2
    }
  }
}
You can include index aliases in an index template. During index creation, the `{index}` placeholder in the alias name will be replaced with the actual index name that the template gets applied to.
{
  "index_patterns": [
    "template*"
  ],
  "template": {
    "settings": {
      "number_of_shards": 1
    },
    "aliases": {
      "alias1": {},
      "alias2": {
        "filter": {
          "term": {
            "user.id": "kimchy"
          }
        },
        "routing": "shard-1"
      },
      "{index}-alias": {}
    }
  }
}

Delete an index template Added in 7.8.0

DELETE /_index_template/{name}

The provided may contain multiple template names separated by a comma. If multiple template names are specified then there is no wildcard support and the provided names should match completely with existing templates.

Path parameters

  • name string | array[string] Required

    Comma-separated list of index template names used to limit the request. Wildcard (*) expressions are supported.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_index_template/{name}
curl \
 --request DELETE 'http://api.example.com/_index_template/{name}' \
 --header "Authorization: $API_KEY"

Check index templates

HEAD /_index_template/{name}

Check whether index templates exist.

Path parameters

  • name string Required

    Comma-separated list of index template names used to limit the request. Wildcard (*) expressions are supported.

Query parameters

  • local boolean

    If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.

  • If true, returns settings in flat format.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

HEAD /_index_template/{name}
curl \
 --request HEAD 'http://api.example.com/_index_template/{name}' \
 --header "Authorization: $API_KEY"

Get aliases

GET /_alias/{name}

Retrieves information for one or more data stream or index aliases.

Path parameters

  • name string | array[string] Required

    Comma-separated list of aliases to retrieve. Supports wildcards (*). To retrieve all aliases, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attribute Show * attribute object
      • aliases object Required
        Hide aliases attribute Show aliases attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • filter object

            An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

            External documentation
          • Value used to route indexing operations to a specific shard. If specified, this overwrites the routing value for indexing operations.

          • If true, the index is the write index for the alias.

          • routing string

            Value used to route indexing and search operations to a specific shard.

          • Value used to route search operations to a specific shard. If specified, this overwrites the routing value for search operations.

          • is_hidden boolean

            If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

GET /_alias/{name}
curl \
 --request GET 'http://api.example.com/_alias/{name}' \
 --header "Authorization: $API_KEY"

Check aliases

HEAD /_alias/{name}

Check if one or more data stream or index aliases exist.

Path parameters

  • name string | array[string] Required

    Comma-separated list of aliases to check. Supports wildcards (*).

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, requests that include a missing data stream or index in the target indices or data streams return an error.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

HEAD /_alias/{name}
curl \
 --request HEAD 'http://api.example.com/_alias/{name}' \
 --header "Authorization: $API_KEY"

Get aliases

GET /_alias

Retrieves information for one or more data stream or index aliases.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attribute Show * attribute object
      • aliases object Required
        Hide aliases attribute Show aliases attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • filter object

            An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

            External documentation
          • Value used to route indexing operations to a specific shard. If specified, this overwrites the routing value for indexing operations.

          • If true, the index is the write index for the alias.

          • routing string

            Value used to route indexing and search operations to a specific shard.

          • Value used to route search operations to a specific shard. If specified, this overwrites the routing value for search operations.

          • is_hidden boolean

            If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

GET /_alias
curl \
 --request GET 'http://api.example.com/_alias' \
 --header "Authorization: $API_KEY"

Get aliases

GET /{index}/_alias

Retrieves information for one or more data stream or index aliases.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attribute Show * attribute object
      • aliases object Required
        Hide aliases attribute Show aliases attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • filter object

            An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

            External documentation
          • Value used to route indexing operations to a specific shard. If specified, this overwrites the routing value for indexing operations.

          • If true, the index is the write index for the alias.

          • routing string

            Value used to route indexing and search operations to a specific shard.

          • Value used to route search operations to a specific shard. If specified, this overwrites the routing value for search operations.

          • is_hidden boolean

            If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

GET /{index}/_alias
curl \
 --request GET 'http://api.example.com/{index}/_alias' \
 --header "Authorization: $API_KEY"

Get index templates Added in 7.9.0

GET /_index_template

Get information about one or more index templates.

Query parameters

  • local boolean

    If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.

  • If true, returns settings in flat format.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • If true, returns all relevant default configurations for the index template.

Responses

GET /_index_template
curl \
 --request GET 'http://api.example.com/_index_template' \
 --header "Authorization: $API_KEY"

Get mapping definitions

GET /_mapping

For data streams, the API retrieves mappings for the stream’s backing indices.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • local boolean Deprecated

    If true, the request retrieves information from the local node only.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

GET /_mapping
curl \
 --request GET 'http://api.example.com/_mapping' \
 --header "Authorization: $API_KEY"

Get mapping definitions

GET /{index}/_mapping

For data streams, the API retrieves mappings for the stream’s backing indices.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • local boolean Deprecated

    If true, the request retrieves information from the local node only.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

GET /{index}/_mapping
curl \
 --request GET 'http://api.example.com/{index}/_mapping' \
 --header "Authorization: $API_KEY"

Update field mappings

PUT /{index}/_mapping

Add new fields to an existing data stream or index. You can also use this API to change the search settings of existing fields and add new properties to existing object fields. For data streams, these changes are applied to all backing indices by default.

Add multi-fields to an existing field

Multi-fields let you index the same field in different ways. You can use this API to update the fields mapping parameter and enable multi-fields for an existing field. WARNING: If an index (or data stream) contains documents when you add a multi-field, those documents will not have values for the new multi-field. You can populate the new multi-field with the update by query API.

Change supported mapping parameters for an existing field

The documentation for each mapping parameter indicates whether you can update it for an existing field using this API. For example, you can use the update mapping API to update the ignore_above parameter.

Change the mapping of an existing field

Except for supported mapping parameters, you can't change the mapping or field type of an existing field. Changing an existing field could invalidate data that's already indexed.

If you need to change the mapping of a field in a data stream's backing indices, refer to documentation about modifying data streams. If you need to change the mapping of a field in other indices, create a new index with the correct mapping and reindex your data into that index.

Rename a field

Renaming a field would invalidate data already indexed under the old field name. Instead, add an alias field to create an alternate field name.

External documentation

Path parameters

  • index string | array[string] Required

    A comma-separated list of index names the mapping should be added to (supports wildcards); use _all or omit to add the mapping on all indices.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

  • If true, the mappings are applied only to the current write index for the target.

application/json

Body Required

  • Controls whether dynamic date detection is enabled.

  • dynamic string

    Values are strict, runtime, true, or false.

  • If date detection is enabled then new string fields are checked against 'dynamic_date_formats' and if the value matches then a new date field is added instead of string.

  • dynamic_templates array[object]

    Specify dynamic templates for the mapping.

  • Hide _field_names attribute Show _field_names attribute object
  • _meta object
    Hide _meta attribute Show _meta attribute object
    • * object Additional properties
  • Automatically map strings into numeric data types for all fields.

  • Mapping for a field. For new fields, this mapping can include:

    • Field name
    • Field data type
    • Mapping parameters
  • _routing object
    Hide _routing attribute Show _routing attribute object
  • _source object
    Hide _source attributes Show _source attributes object
  • runtime object
    Hide runtime attribute Show runtime attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • fields object

        For type composite

        Hide fields attribute Show fields attribute object
        • * object Additional properties
          Hide * attribute Show * attribute object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • fetch_fields array[object]

        For type lookup

        Hide fetch_fields attributes Show fetch_fields attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • format string
      • format string

        A custom format for date type runtime fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • script object
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
      • type string Required

        Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
PUT /{index}/_mapping
curl \
 --request PUT 'http://api.example.com/{index}/_mapping' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"properties\": {\n    \"user\": {\n      \"properties\": {\n        \"name\": {\n          \"type\": \"keyword\"\n        }\n      }\n    }\n  }\n}"'
Request example
The update mapping API can be applied to multiple data streams or indices with a single request. For example, run `PUT /my-index-000001,my-index-000002/_mapping` to update mappings for the `my-index-000001` and `my-index-000002` indices at the same time.
{
  "properties": {
    "user": {
      "properties": {
        "name": {
          "type": "keyword"
        }
      }
    }
  }
}

Update field mappings

POST /{index}/_mapping

Add new fields to an existing data stream or index. You can also use this API to change the search settings of existing fields and add new properties to existing object fields. For data streams, these changes are applied to all backing indices by default.

Add multi-fields to an existing field

Multi-fields let you index the same field in different ways. You can use this API to update the fields mapping parameter and enable multi-fields for an existing field. WARNING: If an index (or data stream) contains documents when you add a multi-field, those documents will not have values for the new multi-field. You can populate the new multi-field with the update by query API.

Change supported mapping parameters for an existing field

The documentation for each mapping parameter indicates whether you can update it for an existing field using this API. For example, you can use the update mapping API to update the ignore_above parameter.

Change the mapping of an existing field

Except for supported mapping parameters, you can't change the mapping or field type of an existing field. Changing an existing field could invalidate data that's already indexed.

If you need to change the mapping of a field in a data stream's backing indices, refer to documentation about modifying data streams. If you need to change the mapping of a field in other indices, create a new index with the correct mapping and reindex your data into that index.

Rename a field

Renaming a field would invalidate data already indexed under the old field name. Instead, add an alias field to create an alternate field name.

External documentation

Path parameters

  • index string | array[string] Required

    A comma-separated list of index names the mapping should be added to (supports wildcards); use _all or omit to add the mapping on all indices.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

  • If true, the mappings are applied only to the current write index for the target.

application/json

Body Required

  • Controls whether dynamic date detection is enabled.

  • dynamic string

    Values are strict, runtime, true, or false.

  • If date detection is enabled then new string fields are checked against 'dynamic_date_formats' and if the value matches then a new date field is added instead of string.

  • dynamic_templates array[object]

    Specify dynamic templates for the mapping.

  • Hide _field_names attribute Show _field_names attribute object
  • _meta object
    Hide _meta attribute Show _meta attribute object
    • * object Additional properties
  • Automatically map strings into numeric data types for all fields.

  • Mapping for a field. For new fields, this mapping can include:

    • Field name
    • Field data type
    • Mapping parameters
  • _routing object
    Hide _routing attribute Show _routing attribute object
  • _source object
    Hide _source attributes Show _source attributes object
  • runtime object
    Hide runtime attribute Show runtime attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • fields object

        For type composite

        Hide fields attribute Show fields attribute object
        • * object Additional properties
          Hide * attribute Show * attribute object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • fetch_fields array[object]

        For type lookup

        Hide fetch_fields attributes Show fetch_fields attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • format string
      • format string

        A custom format for date type runtime fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • script object
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
      • type string Required

        Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
POST /{index}/_mapping
curl \
 --request POST 'http://api.example.com/{index}/_mapping' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"properties\": {\n    \"user\": {\n      \"properties\": {\n        \"name\": {\n          \"type\": \"keyword\"\n        }\n      }\n    }\n  }\n}"'
Request example
The update mapping API can be applied to multiple data streams or indices with a single request. For example, run `PUT /my-index-000001,my-index-000002/_mapping` to update mappings for the `my-index-000001` and `my-index-000002` indices at the same time.
{
  "properties": {
    "user": {
      "properties": {
        "name": {
          "type": "keyword"
        }
      }
    }
  }
}

Get index settings

GET /_settings

Get setting information for one or more indices. For data streams, it returns setting information for the stream's backing indices.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, returns settings in flat format.

  • If false, the request returns an error if it targets a missing or closed index.

  • If true, return all default settings in the response.

  • local boolean

    If true, the request retrieves information from the local node only. If false, information is retrieved from the master node.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object
      Hide * attributes Show * attributes object
      • aliases object
        Hide aliases attribute Show aliases attribute object
      • mappings object
        Hide mappings attributes Show mappings attributes object
      • settings object Additional properties
        Hide settings attributes Show settings attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

      • defaults object Additional properties
        Hide defaults attributes Show defaults attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

      • Hide lifecycle attributes Show lifecycle attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide downsampling attribute Show downsampling attribute object
          • rounds array[object] Required

            The list of downsampling rounds to execute as part of this downsampling configuration

            Hide rounds attributes Show rounds attributes object
            • after string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • config object Required
        • enabled boolean

          If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

GET /_settings
curl \
 --request GET 'http://api.example.com/_settings' \
 --header "Authorization: $API_KEY"

Update index settings

PUT /_settings

Changes dynamic index settings in real time. For data streams, index setting changes are applied to all backing indices by default.

To revert a setting to the default value, use a null value. The list of per-index settings that can be updated dynamically on live indices can be found in index module documentation. To preserve existing settings from being updated, set the preserve_existing parameter to true.

NOTE: You can only define new analyzers on closed indices. To add an analyzer, you must close the index, define the analyzer, and reopen the index. You cannot close the write index of a data stream. To update the analyzer for a data stream's write index and future backing indices, update the analyzer in the index template used by the stream. Then roll over the data stream to apply the new analyzer to the stream's write index and future backing indices. This affects searches and any new data added to the stream after the rollover. However, it does not affect the data stream's backing indices or their existing data. To change the analyzer for existing backing indices, you must create a new data stream and reindex your data into it.

External documentation

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, returns settings in flat format.

  • If true, returns settings in flat format.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • If true, existing index settings remain unchanged.

  • reopen boolean

    Whether to close and reopen the index to apply non-dynamic settings. If set to true the indices to which the settings are being applied will be closed temporarily and then reopened in order to apply the changes.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body Required

  • index object Additional properties
  • mode string
  • Hide soft_deletes attributes Show soft_deletes attributes object
    • enabled boolean

      Indicates whether soft deletes are enabled on the index.

    • Hide retention_lease attribute Show retention_lease attribute object
      • period string Required

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • sort object
    Hide sort attributes Show sort attributes object
  • Values are true, false, or checksum.

  • codec string
  • routing_partition_size number | string

    Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

    Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • auto_expand_replicas string | null

    One of:
  • merge object
    Hide merge attribute Show merge attribute object
    • Hide scheduler attributes Show scheduler attributes object
      • max_thread_count number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • max_merge_count number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • blocks object
    Hide blocks attributes Show blocks attributes object
    • read_only boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • read_only_allow_delete boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • read boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • write boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • metadata boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • analyze object
    Hide analyze attribute Show analyze attribute object
    • max_token_count number | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • Hide highlight attribute Show highlight attribute object
  • routing object
    Hide routing attributes Show routing attributes object
  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Hide lifecycle attributes Show lifecycle attributes object
    • name string
    • indexing_complete boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

    • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

    • step object
      Hide step attribute Show step attribute object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

    • prefer_ilm boolean | string

      Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

  • creation_date number | string

    Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

    Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • creation_date_string string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • uuid string
  • version object
    Hide version attributes Show version attributes object
  • translog object
    Hide translog attributes Show translog attributes object
  • Hide query_string attribute Show query_string attribute object
    • lenient boolean | string Required

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • analysis object
    Hide analysis attributes Show analysis attributes object
  • settings object Additional properties
  • Hide time_series attributes Show time_series attributes object
    • end_time string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • start_time string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • queries object
    Hide queries attribute Show queries attribute object
    • cache object
      Hide cache attribute Show cache attribute object
  • Configure custom similarity settings to customize how search results are scored.

  • mapping object
    Hide mapping attributes Show mapping attributes object
    • coerce boolean
    • Hide total_fields attributes Show total_fields attributes object
      • limit number | string

        The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

      • ignore_dynamic_beyond_limit boolean | string

        This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

    • depth object
      Hide depth attribute Show depth attribute object
      • limit number

        The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

    • Hide nested_fields attribute Show nested_fields attribute object
      • limit number

        The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

    • Hide nested_objects attribute Show nested_objects attribute object
      • limit number

        The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

    • Hide field_name_length attribute Show field_name_length attribute object
      • limit number

        Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

    • Hide dimension_fields attribute Show dimension_fields attribute object
      • limit number

        [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

    • source object
      Hide source attribute Show source attribute object
      • mode string Required

        Values are disabled, stored, or synthetic.

  • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
    • level string
    • source number
    • reformat boolean
    • Hide threshold attribute Show threshold attribute object
      • index object
        Hide index attributes Show index attributes object
        • warn string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • info string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • debug string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • trace string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Hide indexing_pressure attribute Show indexing_pressure attribute object
    • memory object Required
      Hide memory attribute Show memory attribute object
      • limit number

        Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

  • store object
    Hide store attributes Show store attributes object
    • type string Required

      Any of:

      Values are fs, niofs, mmapfs, or hybridfs.

    • allow_mmap boolean

      You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_settings
curl \
 --request PUT 'http://api.example.com/_settings' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"index\" : {\n    \"number_of_replicas\" : 2\n  }\n}"'
{
  "index" : {
    "number_of_replicas" : 2
  }
}
To revert a setting to the default value, use `null`.
{
  "index" : {
    "refresh_interval" : null
  }
}
To add an analyzer, you must close the index, define the analyzer, then reopen the index.
{
  "analysis" : {
    "analyzer":{
      "content":{
        "type":"custom",
        "tokenizer":"whitespace"
      }
    }
  }
}

POST /my-index-000001/_open

Get index settings

GET /{index}/_settings

Get setting information for one or more indices. For data streams, it returns setting information for the stream's backing indices.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, returns settings in flat format.

  • If false, the request returns an error if it targets a missing or closed index.

  • If true, return all default settings in the response.

  • local boolean

    If true, the request retrieves information from the local node only. If false, information is retrieved from the master node.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object
      Hide * attributes Show * attributes object
      • aliases object
        Hide aliases attribute Show aliases attribute object
      • mappings object
        Hide mappings attributes Show mappings attributes object
      • settings object Additional properties
        Hide settings attributes Show settings attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

      • defaults object Additional properties
        Hide defaults attributes Show defaults attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

      • Hide lifecycle attributes Show lifecycle attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide downsampling attribute Show downsampling attribute object
          • rounds array[object] Required

            The list of downsampling rounds to execute as part of this downsampling configuration

            Hide rounds attributes Show rounds attributes object
            • after string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • config object Required
        • enabled boolean

          If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

GET /{index}/_settings
curl \
 --request GET 'http://api.example.com/{index}/_settings' \
 --header "Authorization: $API_KEY"

Update index settings

PUT /{index}/_settings

Changes dynamic index settings in real time. For data streams, index setting changes are applied to all backing indices by default.

To revert a setting to the default value, use a null value. The list of per-index settings that can be updated dynamically on live indices can be found in index module documentation. To preserve existing settings from being updated, set the preserve_existing parameter to true.

NOTE: You can only define new analyzers on closed indices. To add an analyzer, you must close the index, define the analyzer, and reopen the index. You cannot close the write index of a data stream. To update the analyzer for a data stream's write index and future backing indices, update the analyzer in the index template used by the stream. Then roll over the data stream to apply the new analyzer to the stream's write index and future backing indices. This affects searches and any new data added to the stream after the rollover. However, it does not affect the data stream's backing indices or their existing data. To change the analyzer for existing backing indices, you must create a new data stream and reindex your data into it.

External documentation

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, returns settings in flat format.

  • If true, returns settings in flat format.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • If true, existing index settings remain unchanged.

  • reopen boolean

    Whether to close and reopen the index to apply non-dynamic settings. If set to true the indices to which the settings are being applied will be closed temporarily and then reopened in order to apply the changes.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body Required

  • index object Additional properties
  • mode string
  • Hide soft_deletes attributes Show soft_deletes attributes object
    • enabled boolean

      Indicates whether soft deletes are enabled on the index.

    • Hide retention_lease attribute Show retention_lease attribute object
      • period string Required

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • sort object
    Hide sort attributes Show sort attributes object
  • Values are true, false, or checksum.

  • codec string
  • routing_partition_size number | string

    Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

    Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • auto_expand_replicas string | null

    One of:
  • merge object
    Hide merge attribute Show merge attribute object
    • Hide scheduler attributes Show scheduler attributes object
      • max_thread_count number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • max_merge_count number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • blocks object
    Hide blocks attributes Show blocks attributes object
    • read_only boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • read_only_allow_delete boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • read boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • write boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • metadata boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • analyze object
    Hide analyze attribute Show analyze attribute object
    • max_token_count number | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • Hide highlight attribute Show highlight attribute object
  • routing object
    Hide routing attributes Show routing attributes object
  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Hide lifecycle attributes Show lifecycle attributes object
    • name string
    • indexing_complete boolean | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

    • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

    • step object
      Hide step attribute Show step attribute object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

    • prefer_ilm boolean | string

      Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

  • creation_date number | string

    Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

    Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • creation_date_string string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • uuid string
  • version object
    Hide version attributes Show version attributes object
  • translog object
    Hide translog attributes Show translog attributes object
  • Hide query_string attribute Show query_string attribute object
    • lenient boolean | string Required

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

  • analysis object
    Hide analysis attributes Show analysis attributes object
  • settings object Additional properties
  • Hide time_series attributes Show time_series attributes object
    • end_time string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • start_time string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • queries object
    Hide queries attribute Show queries attribute object
    • cache object
      Hide cache attribute Show cache attribute object
  • Configure custom similarity settings to customize how search results are scored.

  • mapping object
    Hide mapping attributes Show mapping attributes object
    • coerce boolean
    • Hide total_fields attributes Show total_fields attributes object
      • limit number | string

        The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

      • ignore_dynamic_beyond_limit boolean | string

        This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

    • depth object
      Hide depth attribute Show depth attribute object
      • limit number

        The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

    • Hide nested_fields attribute Show nested_fields attribute object
      • limit number

        The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

    • Hide nested_objects attribute Show nested_objects attribute object
      • limit number

        The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

    • Hide field_name_length attribute Show field_name_length attribute object
      • limit number

        Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

    • Hide dimension_fields attribute Show dimension_fields attribute object
      • limit number

        [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

    • source object
      Hide source attribute Show source attribute object
      • mode string Required

        Values are disabled, stored, or synthetic.

  • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
    • level string
    • source number
    • reformat boolean
    • Hide threshold attribute Show threshold attribute object
      • index object
        Hide index attributes Show index attributes object
        • warn string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • info string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • debug string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • trace string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Hide indexing_pressure attribute Show indexing_pressure attribute object
    • memory object Required
      Hide memory attribute Show memory attribute object
      • limit number

        Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

  • store object
    Hide store attributes Show store attributes object
    • type string Required

      Any of:

      Values are fs, niofs, mmapfs, or hybridfs.

    • allow_mmap boolean

      You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /{index}/_settings
curl \
 --request PUT 'http://api.example.com/{index}/_settings' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"index\" : {\n    \"number_of_replicas\" : 2\n  }\n}"'
{
  "index" : {
    "number_of_replicas" : 2
  }
}
To revert a setting to the default value, use `null`.
{
  "index" : {
    "refresh_interval" : null
  }
}
To add an analyzer, you must close the index, define the analyzer, then reopen the index.
{
  "analysis" : {
    "analyzer":{
      "content":{
        "type":"custom",
        "tokenizer":"whitespace"
      }
    }
  }
}

POST /my-index-000001/_open

Get index settings

GET /{index}/_settings/{name}

Get setting information for one or more indices. For data streams, it returns setting information for the stream's backing indices.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

  • name string | array[string] Required

    Comma-separated list or wildcard expression of settings to retrieve.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, returns settings in flat format.

  • If false, the request returns an error if it targets a missing or closed index.

  • If true, return all default settings in the response.

  • local boolean

    If true, the request retrieves information from the local node only. If false, information is retrieved from the master node.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object
      Hide * attributes Show * attributes object
      • aliases object
        Hide aliases attribute Show aliases attribute object
      • mappings object
        Hide mappings attributes Show mappings attributes object
      • settings object Additional properties
        Hide settings attributes Show settings attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

      • defaults object Additional properties
        Hide defaults attributes Show defaults attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

      • Hide lifecycle attributes Show lifecycle attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide downsampling attribute Show downsampling attribute object
          • rounds array[object] Required

            The list of downsampling rounds to execute as part of this downsampling configuration

            Hide rounds attributes Show rounds attributes object
            • after string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • config object Required
        • enabled boolean

          If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

GET /{index}/_settings/{name}
curl \
 --request GET 'http://api.example.com/{index}/_settings/{name}' \
 --header "Authorization: $API_KEY"

Get index settings

GET /_settings/{name}

Get setting information for one or more indices. For data streams, it returns setting information for the stream's backing indices.

Path parameters

  • name string | array[string] Required

    Comma-separated list or wildcard expression of settings to retrieve.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, returns settings in flat format.

  • If false, the request returns an error if it targets a missing or closed index.

  • If true, return all default settings in the response.

  • local boolean

    If true, the request retrieves information from the local node only. If false, information is retrieved from the master node.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object
      Hide * attributes Show * attributes object
      • aliases object
        Hide aliases attribute Show aliases attribute object
      • mappings object
        Hide mappings attributes Show mappings attributes object
      • settings object Additional properties
        Hide settings attributes Show settings attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

      • defaults object Additional properties
        Hide defaults attributes Show defaults attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

      • Hide lifecycle attributes Show lifecycle attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide downsampling attribute Show downsampling attribute object
          • rounds array[object] Required

            The list of downsampling rounds to execute as part of this downsampling configuration

            Hide rounds attributes Show rounds attributes object
            • after string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • config object Required
        • enabled boolean

          If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

GET /_settings/{name}
curl \
 --request GET 'http://api.example.com/_settings/{name}' \
 --header "Authorization: $API_KEY"

Refresh an index

GET /_refresh

A refresh makes recent operations performed on one or more indices available for search. For data streams, the API runs the refresh operation on the stream’s backing indices.

By default, Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds. You can change this default interval with the index.refresh_interval setting.

Refresh requests are synchronous and do not return a response until the refresh operation completes.

Refreshes are resource-intensive. To ensure good cluster performance, it's recommended to wait for Elasticsearch's periodic refresh rather than performing an explicit refresh when possible.

If your application workflow indexes documents and then runs a search to retrieve the indexed document, it's recommended to use the index API's refresh=wait_for query parameter option. This option ensures the indexing operation waits for a periodic refresh before running the search.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

Responses

GET /_refresh
curl \
 --request GET 'http://api.example.com/_refresh' \
 --header "Authorization: $API_KEY"

Refresh an index

POST /_refresh

A refresh makes recent operations performed on one or more indices available for search. For data streams, the API runs the refresh operation on the stream’s backing indices.

By default, Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds. You can change this default interval with the index.refresh_interval setting.

Refresh requests are synchronous and do not return a response until the refresh operation completes.

Refreshes are resource-intensive. To ensure good cluster performance, it's recommended to wait for Elasticsearch's periodic refresh rather than performing an explicit refresh when possible.

If your application workflow indexes documents and then runs a search to retrieve the indexed document, it's recommended to use the index API's refresh=wait_for query parameter option. This option ensures the indexing operation waits for a periodic refresh before running the search.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

Responses

POST /_refresh
curl \
 --request POST 'http://api.example.com/_refresh' \
 --header "Authorization: $API_KEY"

Refresh an index

GET /{index}/_refresh

A refresh makes recent operations performed on one or more indices available for search. For data streams, the API runs the refresh operation on the stream’s backing indices.

By default, Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds. You can change this default interval with the index.refresh_interval setting.

Refresh requests are synchronous and do not return a response until the refresh operation completes.

Refreshes are resource-intensive. To ensure good cluster performance, it's recommended to wait for Elasticsearch's periodic refresh rather than performing an explicit refresh when possible.

If your application workflow indexes documents and then runs a search to retrieve the indexed document, it's recommended to use the index API's refresh=wait_for query parameter option. This option ensures the indexing operation waits for a periodic refresh before running the search.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

Responses

GET /{index}/_refresh
curl \
 --request GET 'http://api.example.com/{index}/_refresh' \
 --header "Authorization: $API_KEY"

Refresh an index

POST /{index}/_refresh

A refresh makes recent operations performed on one or more indices available for search. For data streams, the API runs the refresh operation on the stream’s backing indices.

By default, Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds. You can change this default interval with the index.refresh_interval setting.

Refresh requests are synchronous and do not return a response until the refresh operation completes.

Refreshes are resource-intensive. To ensure good cluster performance, it's recommended to wait for Elasticsearch's periodic refresh rather than performing an explicit refresh when possible.

If your application workflow indexes documents and then runs a search to retrieve the indexed document, it's recommended to use the index API's refresh=wait_for query parameter option. This option ensures the indexing operation waits for a periodic refresh before running the search.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

Responses

POST /{index}/_refresh
curl \
 --request POST 'http://api.example.com/{index}/_refresh' \
 --header "Authorization: $API_KEY"

Resolve indices Added in 7.9.0

GET /_resolve/index/{name}

Resolve the names and/or index patterns for indices, aliases, and data streams. Multiple patterns and remote clusters are supported.

Path parameters

  • name string | array[string] Required

    Comma-separated name(s) or index pattern(s) of the indices, aliases, and data streams to resolve. Resources on remote clusters can be specified using the <cluster>:<name> syntax.

Query parameters

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
GET /_resolve/index/{name}
curl \
 --request GET 'http://api.example.com/_resolve/index/{name}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_resolve/index/f*,remoteCluster1:bar*?expand_wildcards=all`.
{
  "indices": [
    {
      "name": "foo_closed",
      "attributes": [
        "closed"
      ]
    },
    {
      "name": "freeze-index",
      "aliases": [
        "f-alias"
      ],
      "attributes": [
        "open"
      ]
    },
    {
      "name": "remoteCluster1:bar-01",
      "attributes": [
        "open"
      ]
    }
  ],
  "aliases": [
    {
      "name": "f-alias",
      "indices": [
        "freeze-index",
        "my-index-000001"
      ]
    }
  ],
  "data_streams": [
    {
      "name": "foo",
      "backing_indices": [
        ".ds-foo-2099.03.07-000001"
      ],
      "timestamp_field": "@timestamp"
    }
  ]
}

Roll over to a new index Added in 5.0.0

POST /{alias}/_rollover

TIP: It is recommended to use the index lifecycle rollover action to automate rollovers.

The rollover API creates a new index for a data stream or index alias. The API behavior depends on the rollover target.

Roll over a data stream

If you roll over a data stream, the API creates a new write index for the stream. The stream's previous write index becomes a regular backing index. A rollover also increments the data stream's generation.

Roll over an index alias with a write index

TIP: Prior to Elasticsearch 7.9, you'd typically use an index alias with a write index to manage time series data. Data streams replace this functionality, require less maintenance, and automatically integrate with data tiers.

If an index alias points to multiple indices, one of the indices must be a write index. The rollover API creates a new write index for the alias with is_write_index set to true. The API also sets is_write_index to false for the previous write index.

Roll over an index alias with one index

If you roll over an index alias that points to only one index, the API creates a new index for the alias and removes the original index from the alias.

NOTE: A rollover creates a new index and is subject to the wait_for_active_shards setting.

Increment index names for an alias

When you roll over an index alias, you can specify a name for the new index. If you don't specify a name and the current index ends with - and a number, such as my-index-000001 or my-index-3, the new index name increments that number. For example, if you roll over an alias with a current index of my-index-000001, the rollover creates a new index named my-index-000002. This number is always six characters and zero-padded, regardless of the previous index's name.

If you use an index alias for time series data, you can use date math in the index name to track the rollover date. For example, you can create an alias that points to an index named <my-index-{now/d}-000001>. If you create the index on May 6, 2099, the index's name is my-index-2099.05.06-000001. If you roll over the alias on May 7, 2099, the new index's name is my-index-2099.05.07-000002.

Path parameters

  • alias string Required

    Name of the data stream or index alias to roll over.

Query parameters

  • dry_run boolean

    If true, checks whether the current index satisfies the specified conditions but does not perform a rollover.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1).

  • lazy boolean

    If set to true, the rollover action will only mark a data stream to signal that it needs to be rolled over at the next write. Only allowed on data streams.

application/json

Body

  • aliases object

    Aliases for the target index. Data streams do not support this parameter.

    Hide aliases attribute Show aliases attribute object
  • Hide conditions attributes Show conditions attributes object
  • mappings object
    Hide mappings attributes Show mappings attributes object
    • Hide all_field attributes Show all_field attributes object
    • dynamic string

      Values are strict, runtime, true, or false.

    • dynamic_templates array[object]
    • Hide _field_names attribute Show _field_names attribute object
    • Hide index_field attribute Show index_field attribute object
    • _meta object
      Hide _meta attribute Show _meta attribute object
      • * object Additional properties
    • _routing object
      Hide _routing attribute Show _routing attribute object
    • _size object
      Hide _size attribute Show _size attribute object
    • _source object
      Hide _source attributes Show _source attributes object
    • runtime object
      Hide runtime attribute Show runtime attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • enabled boolean
    • Values are true or false.

    • Hide _data_stream_timestamp attribute Show _data_stream_timestamp attribute object
  • settings object

    Configuration options for the index. Data streams do not support this parameter.

    Hide settings attribute Show settings attribute object
    • * object Additional properties

Responses

POST /{alias}/_rollover
curl \
 --request POST 'http://api.example.com/{alias}/_rollover' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"conditions\": {\n    \"max_age\": \"7d\",\n    \"max_docs\": 1000,\n    \"max_primary_shard_size\": \"50gb\",\n    \"max_primary_shard_docs\": \"2000\"\n  }\n}"'
Request example
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 1000,
    "max_primary_shard_size": "50gb",
    "max_primary_shard_docs": "2000"
  }
}
Response examples (200)
An abbreviated response from `GET /_segments`.
{
  "_shards": {},
  "indices": {
    "test": {
      "shards": {
        "0": [
          {
            "routing": {
              "state": "STARTED",
              "primary": true,
              "node": "zDC_RorJQCao9xf9pg3Fvw"
            },
            "num_committed_segments": 0,
            "num_search_segments": 1,
            "segments": {
              "_0": {
                "generation": 0,
                "num_docs": 1,
                "deleted_docs": 0,
                "size_in_bytes": 3800,
                "committed": false,
                "search": true,
                "version": "7.0.0",
                "compound": true,
                "attributes": {}
              }
            }
          }
        ]
      }
    }
  }
}

Roll over to a new index Added in 5.0.0

POST /{alias}/_rollover/{new_index}

TIP: It is recommended to use the index lifecycle rollover action to automate rollovers.

The rollover API creates a new index for a data stream or index alias. The API behavior depends on the rollover target.

Roll over a data stream

If you roll over a data stream, the API creates a new write index for the stream. The stream's previous write index becomes a regular backing index. A rollover also increments the data stream's generation.

Roll over an index alias with a write index

TIP: Prior to Elasticsearch 7.9, you'd typically use an index alias with a write index to manage time series data. Data streams replace this functionality, require less maintenance, and automatically integrate with data tiers.

If an index alias points to multiple indices, one of the indices must be a write index. The rollover API creates a new write index for the alias with is_write_index set to true. The API also sets is_write_index to false for the previous write index.

Roll over an index alias with one index

If you roll over an index alias that points to only one index, the API creates a new index for the alias and removes the original index from the alias.

NOTE: A rollover creates a new index and is subject to the wait_for_active_shards setting.

Increment index names for an alias

When you roll over an index alias, you can specify a name for the new index. If you don't specify a name and the current index ends with - and a number, such as my-index-000001 or my-index-3, the new index name increments that number. For example, if you roll over an alias with a current index of my-index-000001, the rollover creates a new index named my-index-000002. This number is always six characters and zero-padded, regardless of the previous index's name.

If you use an index alias for time series data, you can use date math in the index name to track the rollover date. For example, you can create an alias that points to an index named <my-index-{now/d}-000001>. If you create the index on May 6, 2099, the index's name is my-index-2099.05.06-000001. If you roll over the alias on May 7, 2099, the new index's name is my-index-2099.05.07-000002.

Path parameters

  • alias string Required

    Name of the data stream or index alias to roll over.

  • new_index string Required

    Name of the index to create. Supports date math. Data streams do not support this parameter.

Query parameters

  • dry_run boolean

    If true, checks whether the current index satisfies the specified conditions but does not perform a rollover.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1).

  • lazy boolean

    If set to true, the rollover action will only mark a data stream to signal that it needs to be rolled over at the next write. Only allowed on data streams.

application/json

Body

  • aliases object

    Aliases for the target index. Data streams do not support this parameter.

    Hide aliases attribute Show aliases attribute object
  • Hide conditions attributes Show conditions attributes object
  • mappings object
    Hide mappings attributes Show mappings attributes object
    • Hide all_field attributes Show all_field attributes object
    • dynamic string

      Values are strict, runtime, true, or false.

    • dynamic_templates array[object]
    • Hide _field_names attribute Show _field_names attribute object
    • Hide index_field attribute Show index_field attribute object
    • _meta object
      Hide _meta attribute Show _meta attribute object
      • * object Additional properties
    • _routing object
      Hide _routing attribute Show _routing attribute object
    • _size object
      Hide _size attribute Show _size attribute object
    • _source object
      Hide _source attributes Show _source attributes object
    • runtime object
      Hide runtime attribute Show runtime attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • enabled boolean
    • Values are true or false.

    • Hide _data_stream_timestamp attribute Show _data_stream_timestamp attribute object
  • settings object

    Configuration options for the index. Data streams do not support this parameter.

    Hide settings attribute Show settings attribute object
    • * object Additional properties

Responses

POST /{alias}/_rollover/{new_index}
curl \
 --request POST 'http://api.example.com/{alias}/_rollover/{new_index}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"conditions\": {\n    \"max_age\": \"7d\",\n    \"max_docs\": 1000,\n    \"max_primary_shard_size\": \"50gb\",\n    \"max_primary_shard_docs\": \"2000\"\n  }\n}"'
Request example
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 1000,
    "max_primary_shard_size": "50gb",
    "max_primary_shard_docs": "2000"
  }
}
Response examples (200)
An abbreviated response from `GET /_segments`.
{
  "_shards": {},
  "indices": {
    "test": {
      "shards": {
        "0": [
          {
            "routing": {
              "state": "STARTED",
              "primary": true,
              "node": "zDC_RorJQCao9xf9pg3Fvw"
            },
            "num_committed_segments": 0,
            "num_search_segments": 1,
            "segments": {
              "_0": {
                "generation": 0,
                "num_docs": 1,
                "deleted_docs": 0,
                "size_in_bytes": 3800,
                "committed": false,
                "search": true,
                "version": "7.0.0",
                "compound": true,
                "attributes": {}
              }
            }
          }
        ]
      }
    }
  }
}

Simulate an index Added in 7.9.0

POST /_index_template/_simulate_index/{name}

Get the index configuration that would be applied to the specified index from an existing index template.

Path parameters

  • name string Required

    Name of the index to simulate

Query parameters

  • create boolean

    Whether the index template we optionally defined in the body should only be dry-run added if new or can also replace an existing one

  • cause string

    User defined reason for dry-run creating the new template for simulation purposes

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • If true, returns all relevant default configurations for the index template.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • overlapping array[object]
      Hide overlapping attributes Show overlapping attributes object
    • template object Required
      Hide template attributes Show template attributes object
      • aliases object Required
        Hide aliases attribute Show aliases attribute object
      • mappings object Required
        Hide mappings attributes Show mappings attributes object
      • settings object Required Additional properties
        Hide settings attributes Show settings attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

POST /_index_template/_simulate_index/{name}
curl \
 --request POST 'http://api.example.com/_index_template/_simulate_index/{name}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `POST /_index_template/_simulate_index/my-index-000001`.
{
  "template" : {
    "settings" : {
      "index" : {
        "number_of_shards" : "2",
        "number_of_replicas" : "0",
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        }
      }
    },
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        }
      }
    },
    "aliases" : { }
  },
  "overlapping" : [
    {
      "name" : "template_1",
      "index_patterns" : [
        "my-index-*"
      ]
    }
  ]
}

Simulate an index template

POST /_index_template/_simulate

Get the index configuration that would be applied by a particular index template.

Query parameters

  • create boolean

    If true, the template passed in the body is only used if no existing templates match the same index patterns. If false, the simulation uses the template with the highest priority. Note that the template is not permanently added or updated in either case; it is only used for the simulation.

  • cause string

    User defined reason for dry-run creating the new template for simulation purposes

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • If true, returns all relevant default configurations for the index template.

application/json

Body

  • This setting overrides the value of the action.auto_create_index cluster setting. If set to true in a template, then indices can be automatically created using that template even if auto-creation of indices is disabled via actions.auto_create_index. If set to false, then indices or data streams matching the template must always be explicitly created, and may never be automatically created.

  • index_patterns string | array[string]
  • composed_of array[string]

    An ordered list of component template names. Component templates are merged in the order specified, meaning that the last component template specified has the highest precedence.

  • template object
    Hide template attributes Show template attributes object
    • aliases object

      Aliases to add. If the index template includes a data_stream object, these are data stream aliases. Otherwise, these are index aliases. Data stream aliases ignore the index_routing, routing, and search_routing options.

      Hide aliases attribute Show aliases attribute object
    • mappings object
      Hide mappings attributes Show mappings attributes object
    • settings object Additional properties
      Hide settings attributes Show settings attributes object
      • index object Additional properties
      • mode string
      • Hide soft_deletes attributes Show soft_deletes attributes object
        • enabled boolean

          Indicates whether soft deletes are enabled on the index.

        • Hide retention_lease attribute Show retention_lease attribute object
          • period string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • sort object
        Hide sort attributes Show sort attributes object
      • Values are true, false, or checksum.

      • codec string
      • routing_partition_size number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • auto_expand_replicas string | null

        One of:
      • merge object
        Hide merge attribute Show merge attribute object
        • Hide scheduler attributes Show scheduler attributes object
          • max_thread_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • max_merge_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • blocks object
        Hide blocks attributes Show blocks attributes object
        • read_only boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read_only_allow_delete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • write boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • metadata boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analyze object
        Hide analyze attribute Show analyze attribute object
        • max_token_count number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • Hide highlight attribute Show highlight attribute object
      • routing object
        Hide routing attributes Show routing attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide lifecycle attributes Show lifecycle attributes object
        • name string
        • indexing_complete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

        • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

        • step object
          Hide step attribute Show step attribute object
          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

        • prefer_ilm boolean | string

          Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

      • creation_date number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • creation_date_string string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • uuid string
      • version object
        Hide version attributes Show version attributes object
      • translog object
        Hide translog attributes Show translog attributes object
      • Hide query_string attribute Show query_string attribute object
        • lenient boolean | string Required

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analysis object
        Hide analysis attributes Show analysis attributes object
      • settings object Additional properties
      • Hide time_series attributes Show time_series attributes object
        • end_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • start_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • queries object
        Hide queries attribute Show queries attribute object
        • cache object
          Hide cache attribute Show cache attribute object
      • Configure custom similarity settings to customize how search results are scored.

      • mapping object
        Hide mapping attributes Show mapping attributes object
        • coerce boolean
        • Hide total_fields attributes Show total_fields attributes object
          • limit number | string

            The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

          • ignore_dynamic_beyond_limit boolean | string

            This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

        • depth object
          Hide depth attribute Show depth attribute object
          • limit number

            The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

        • Hide nested_fields attribute Show nested_fields attribute object
          • limit number

            The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

        • Hide nested_objects attribute Show nested_objects attribute object
          • limit number

            The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

        • Hide field_name_length attribute Show field_name_length attribute object
          • limit number

            Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

        • Hide dimension_fields attribute Show dimension_fields attribute object
          • limit number

            [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

        • source object
          Hide source attribute Show source attribute object
          • mode string Required

            Values are disabled, stored, or synthetic.

      • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
        • level string
        • source number
        • reformat boolean
        • Hide threshold attribute Show threshold attribute object
          • index object
            Hide index attributes Show index attributes object
            • warn string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • info string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • debug string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • trace string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide indexing_pressure attribute Show indexing_pressure attribute object
        • memory object Required
          Hide memory attribute Show memory attribute object
          • limit number

            Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

      • store object
        Hide store attributes Show store attributes object
        • type string Required

          Any of:

          Values are fs, niofs, mmapfs, or hybridfs.

        • allow_mmap boolean

          You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

    • Hide lifecycle attributes Show lifecycle attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide downsampling attribute Show downsampling attribute object
        • rounds array[object] Required

          The list of downsampling rounds to execute as part of this downsampling configuration

          Hide rounds attributes Show rounds attributes object
          • after string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • config object Required
            Hide config attribute Show config attribute object
            • fixed_interval string Required

              A date histogram interval. Similar to Duration with additional units: w (week), M (month), q (quarter) and y (year)

      • enabled boolean

        If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

  • Hide data_stream attributes Show data_stream attributes object
  • priority number

    Priority to determine index template precedence when a new data stream or index is created. The index template with the highest priority is chosen. If no priority is specified the template is treated as though it is of priority 0 (lowest priority). This number is not automatically generated by Elasticsearch.

  • version number
  • _meta object
    Hide _meta attribute Show _meta attribute object
    • * object Additional properties
  • The configuration option ignore_missing_component_templates can be used when an index template references a component template that might not exist

  • deprecated boolean

    Marks this index template as deprecated. When creating or updating a non-deprecated index template that uses deprecated components, Elasticsearch will emit a deprecation warning.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • overlapping array[object]
      Hide overlapping attributes Show overlapping attributes object
    • template object Required
      Hide template attributes Show template attributes object
      • aliases object Required
        Hide aliases attribute Show aliases attribute object
      • mappings object Required
        Hide mappings attributes Show mappings attributes object
      • settings object Required Additional properties
        Hide settings attributes Show settings attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

POST /_index_template/_simulate
curl \
 --request POST 'http://api.example.com/_index_template/_simulate' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"index_patterns\": [\"my-index-*\"],\n  \"composed_of\": [\"ct2\"],\n  \"priority\": 10,\n  \"template\": {\n    \"settings\": {\n      \"index.number_of_replicas\": 1\n    }\n  }\n}"'
Request example
To see what settings will be applied by a template before you add it to the cluster, you can pass a template configuration in the request body. The specified template is used for the simulation if it has a higher priority than existing templates.
{
  "index_patterns": ["my-index-*"],
  "composed_of": ["ct2"],
  "priority": 10,
  "template": {
    "settings": {
      "index.number_of_replicas": 1
    }
  }
}
Response examples (200)
A successful response from `POST /_index_template/_simulate` with a template configuration in the request body. The response shows any overlapping templates with a lower priority.
{
  "template" : {
    "settings" : {
      "index" : {
        "number_of_replicas" : "1",
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        }
      }
    },
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        }
      }
    },
    "aliases" : { }
  },
  "overlapping" : [
    {
      "name" : "final-template",
      "index_patterns" : [
        "my-index-*"
      ]
    }
  ]
}

Simulate an index template

POST /_index_template/_simulate/{name}

Get the index configuration that would be applied by a particular index template.

Path parameters

  • name string Required

    Name of the index template to simulate. To test a template configuration before you add it to the cluster, omit this parameter and specify the template configuration in the request body.

Query parameters

  • create boolean

    If true, the template passed in the body is only used if no existing templates match the same index patterns. If false, the simulation uses the template with the highest priority. Note that the template is not permanently added or updated in either case; it is only used for the simulation.

  • cause string

    User defined reason for dry-run creating the new template for simulation purposes

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • If true, returns all relevant default configurations for the index template.

application/json

Body

  • This setting overrides the value of the action.auto_create_index cluster setting. If set to true in a template, then indices can be automatically created using that template even if auto-creation of indices is disabled via actions.auto_create_index. If set to false, then indices or data streams matching the template must always be explicitly created, and may never be automatically created.

  • index_patterns string | array[string]
  • composed_of array[string]

    An ordered list of component template names. Component templates are merged in the order specified, meaning that the last component template specified has the highest precedence.

  • template object
    Hide template attributes Show template attributes object
    • aliases object

      Aliases to add. If the index template includes a data_stream object, these are data stream aliases. Otherwise, these are index aliases. Data stream aliases ignore the index_routing, routing, and search_routing options.

      Hide aliases attribute Show aliases attribute object
    • mappings object
      Hide mappings attributes Show mappings attributes object
    • settings object Additional properties
      Hide settings attributes Show settings attributes object
      • index object Additional properties
      • mode string
      • Hide soft_deletes attributes Show soft_deletes attributes object
        • enabled boolean

          Indicates whether soft deletes are enabled on the index.

        • Hide retention_lease attribute Show retention_lease attribute object
          • period string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • sort object
        Hide sort attributes Show sort attributes object
      • Values are true, false, or checksum.

      • codec string
      • routing_partition_size number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • auto_expand_replicas string | null

        One of:
      • merge object
        Hide merge attribute Show merge attribute object
        • Hide scheduler attributes Show scheduler attributes object
          • max_thread_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • max_merge_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • blocks object
        Hide blocks attributes Show blocks attributes object
        • read_only boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read_only_allow_delete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • read boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • write boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • metadata boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analyze object
        Hide analyze attribute Show analyze attribute object
        • max_token_count number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • Hide highlight attribute Show highlight attribute object
      • routing object
        Hide routing attributes Show routing attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide lifecycle attributes Show lifecycle attributes object
        • name string
        • indexing_complete boolean | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

        • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

        • step object
          Hide step attribute Show step attribute object
          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

        • prefer_ilm boolean | string

          Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

      • creation_date number | string

        Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

        Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • creation_date_string string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • uuid string
      • version object
        Hide version attributes Show version attributes object
      • translog object
        Hide translog attributes Show translog attributes object
      • Hide query_string attribute Show query_string attribute object
        • lenient boolean | string Required

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

      • analysis object
        Hide analysis attributes Show analysis attributes object
      • settings object Additional properties
      • Hide time_series attributes Show time_series attributes object
        • end_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • start_time string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • queries object
        Hide queries attribute Show queries attribute object
        • cache object
          Hide cache attribute Show cache attribute object
      • Configure custom similarity settings to customize how search results are scored.

      • mapping object
        Hide mapping attributes Show mapping attributes object
        • coerce boolean
        • Hide total_fields attributes Show total_fields attributes object
          • limit number | string

            The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

          • ignore_dynamic_beyond_limit boolean | string

            This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

        • depth object
          Hide depth attribute Show depth attribute object
          • limit number

            The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

        • Hide nested_fields attribute Show nested_fields attribute object
          • limit number

            The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

        • Hide nested_objects attribute Show nested_objects attribute object
          • limit number

            The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

        • Hide field_name_length attribute Show field_name_length attribute object
          • limit number

            Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

        • Hide dimension_fields attribute Show dimension_fields attribute object
          • limit number

            [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

        • source object
          Hide source attribute Show source attribute object
          • mode string Required

            Values are disabled, stored, or synthetic.

      • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
        • level string
        • source number
        • reformat boolean
        • Hide threshold attribute Show threshold attribute object
          • index object
            Hide index attributes Show index attributes object
            • warn string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • info string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • debug string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

            • trace string

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide indexing_pressure attribute Show indexing_pressure attribute object
        • memory object Required
          Hide memory attribute Show memory attribute object
          • limit number

            Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

      • store object
        Hide store attributes Show store attributes object
        • type string Required

          Any of:

          Values are fs, niofs, mmapfs, or hybridfs.

        • allow_mmap boolean

          You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

    • Hide lifecycle attributes Show lifecycle attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide downsampling attribute Show downsampling attribute object
        • rounds array[object] Required

          The list of downsampling rounds to execute as part of this downsampling configuration

          Hide rounds attributes Show rounds attributes object
          • after string Required

            A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • config object Required
            Hide config attribute Show config attribute object
            • fixed_interval string Required

              A date histogram interval. Similar to Duration with additional units: w (week), M (month), q (quarter) and y (year)

      • enabled boolean

        If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

  • Hide data_stream attributes Show data_stream attributes object
  • priority number

    Priority to determine index template precedence when a new data stream or index is created. The index template with the highest priority is chosen. If no priority is specified the template is treated as though it is of priority 0 (lowest priority). This number is not automatically generated by Elasticsearch.

  • version number
  • _meta object
    Hide _meta attribute Show _meta attribute object
    • * object Additional properties
  • The configuration option ignore_missing_component_templates can be used when an index template references a component template that might not exist

  • deprecated boolean

    Marks this index template as deprecated. When creating or updating a non-deprecated index template that uses deprecated components, Elasticsearch will emit a deprecation warning.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • overlapping array[object]
      Hide overlapping attributes Show overlapping attributes object
    • template object Required
      Hide template attributes Show template attributes object
      • aliases object Required
        Hide aliases attribute Show aliases attribute object
      • mappings object Required
        Hide mappings attributes Show mappings attributes object
      • settings object Required Additional properties
        Hide settings attributes Show settings attributes object
        • index object Additional properties
        • mode string
        • Hide soft_deletes attributes Show soft_deletes attributes object
          • enabled boolean

            Indicates whether soft deletes are enabled on the index.

          • Hide retention_lease attribute Show retention_lease attribute object
            • period string Required

              A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • sort object
          Hide sort attributes Show sort attributes object
        • Values are true, false, or checksum.

        • codec string
        • routing_partition_size number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • auto_expand_replicas string | null

          One of:
        • merge object
          Hide merge attribute Show merge attribute object
          • Hide scheduler attributes Show scheduler attributes object
            • max_thread_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

            • max_merge_count number | string

              Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

              Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • blocks object
          Hide blocks attributes Show blocks attributes object
          • read_only boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read_only_allow_delete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • read boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • write boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • metadata boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analyze object
          Hide analyze attribute Show analyze attribute object
          • max_token_count number | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Hide highlight attribute Show highlight attribute object
        • routing object
          Hide routing attributes Show routing attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide lifecycle attributes Show lifecycle attributes object
          • name string
          • indexing_complete boolean | string

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

          • If specified, this is the timestamp used to calculate the index age for its phase transitions. Use this setting if you create a new index that contains old data and want to use the original creation date to calculate the index age. Specified as a Unix epoch value in milliseconds.

          • Set to true to parse the origination date from the index name. This origination date is used to calculate the index age for its phase transitions. The index name must match the pattern .*-{date_format}-\d+, where the date_format is yyyy.MM.dd and the trailing digits are optional. An index that was rolled over would normally match the full format, for example logs-2016.10.31-000002). If the index name doesn’t match the pattern, index creation fails.

          • step object
            Hide step attribute Show step attribute object
            • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • The index alias to update when the index rolls over. Specify when using a policy that contains a rollover action. When the index rolls over, the alias is updated to reflect that the index is no longer the write index. For more information about rolling indices, see Rollover.

          • prefer_ilm boolean | string

            Preference for the system that manages a data stream backing index (preferring ILM when both ILM and DLM are applicable for an index).

        • creation_date number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • creation_date_string string | number

          A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • uuid string
        • version object
          Hide version attributes Show version attributes object
        • translog object
          Hide translog attributes Show translog attributes object
        • Hide query_string attribute Show query_string attribute object
          • lenient boolean | string Required

            Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

            Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • analysis object
          Hide analysis attributes Show analysis attributes object
        • settings object Additional properties
        • Hide time_series attributes Show time_series attributes object
          • end_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

          • start_time string | number

            A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

        • queries object
          Hide queries attribute Show queries attribute object
          • cache object
            Hide cache attribute Show cache attribute object
        • Configure custom similarity settings to customize how search results are scored.

        • mapping object
          Hide mapping attributes Show mapping attributes object
          • coerce boolean
          • Hide total_fields attributes Show total_fields attributes object
            • limit number | string

              The maximum number of fields in an index. Field and object mappings, as well as field aliases count towards this limit. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradations and memory issues, especially in clusters with a high load or few resources.

            • ignore_dynamic_beyond_limit boolean | string

              This setting determines what happens when a dynamically mapped field would exceed the total fields limit. When set to false (the default), the index request of the document that tries to add a dynamic field to the mapping will fail with the message Limit of total fields [X] has been exceeded. When set to true, the index request will not fail. Instead, fields that would exceed the limit are not added to the mapping, similar to dynamic: false. The fields that were not added to the mapping will be added to the _ignored field.

          • depth object
            Hide depth attribute Show depth attribute object
            • limit number

              The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc.

          • Hide nested_fields attribute Show nested_fields attribute object
            • limit number

              The maximum number of distinct nested mappings in an index. The nested type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting limits the number of unique nested types per index.

          • Hide nested_objects attribute Show nested_objects attribute object
            • limit number

              The maximum number of nested JSON objects that a single document can contain across all nested types. This limit helps to prevent out of memory errors when a document contains too many nested objects.

          • Hide field_name_length attribute Show field_name_length attribute object
            • limit number

              Setting for the maximum length of a field name. This setting isn’t really something that addresses mappings explosion but might still be useful if you want to limit the field length. It usually shouldn’t be necessary to set this setting. The default is okay unless a user starts to add a huge number of fields with really long names. Default is Long.MAX_VALUE (no limit).

          • Hide dimension_fields attribute Show dimension_fields attribute object
            • limit number

              [preview] This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

          • source object
            Hide source attribute Show source attribute object
            • mode string Required

              Values are disabled, stored, or synthetic.

        • Hide indexing.slowlog attributes Show indexing.slowlog attributes object
          • level string
          • source number
          • reformat boolean
          • Hide threshold attribute Show threshold attribute object
            • index object
              Hide index attributes Show index attributes object
              • warn string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • info string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • debug string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

              • trace string

                A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide indexing_pressure attribute Show indexing_pressure attribute object
          • memory object Required
            Hide memory attribute Show memory attribute object
            • limit number

              Number of outstanding bytes that may be consumed by indexing requests. When this limit is reached or exceeded, the node will reject new coordinating and primary operations. When replica operations consume 1.5x this limit, the node will reject new replica operations. Defaults to 10% of the heap.

        • store object
          Hide store attributes Show store attributes object
          • type string Required

            Any of:

            Values are fs, niofs, mmapfs, or hybridfs.

          • allow_mmap boolean

            You can restrict the use of the mmapfs and the related hybridfs store type via the setting node.store.allow_mmap. This is a boolean setting indicating whether or not memory-mapping is allowed. The default is to allow it. This setting is useful, for example, if you are in an environment where you can not control the ability to create a lot of memory maps so you need disable the ability to use memory-mapping.

POST /_index_template/_simulate/{name}
curl \
 --request POST 'http://api.example.com/_index_template/_simulate/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"index_patterns\": [\"my-index-*\"],\n  \"composed_of\": [\"ct2\"],\n  \"priority\": 10,\n  \"template\": {\n    \"settings\": {\n      \"index.number_of_replicas\": 1\n    }\n  }\n}"'
Request example
To see what settings will be applied by a template before you add it to the cluster, you can pass a template configuration in the request body. The specified template is used for the simulation if it has a higher priority than existing templates.
{
  "index_patterns": ["my-index-*"],
  "composed_of": ["ct2"],
  "priority": 10,
  "template": {
    "settings": {
      "index.number_of_replicas": 1
    }
  }
}
Response examples (200)
A successful response from `POST /_index_template/_simulate` with a template configuration in the request body. The response shows any overlapping templates with a lower priority.
{
  "template" : {
    "settings" : {
      "index" : {
        "number_of_replicas" : "1",
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        }
      }
    },
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        }
      }
    },
    "aliases" : { }
  },
  "overlapping" : [
    {
      "name" : "final-template",
      "index_patterns" : [
        "my-index-*"
      ]
    }
  ]
}

Create or update an alias Added in 1.3.0

POST /_aliases

Adds a data stream or index to an alias.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_aliases
curl \
 --request POST 'http://api.example.com/_aliases' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"actions":[{"add":{"alias":"string","aliases":"string","filter":{},"index":"string","indices":"string","index_routing":"string","is_hidden":true,"is_write_index":true,"routing":"string","search_routing":"string","must_exist":true},"remove":{"alias":"string","aliases":"string","index":"string","indices":"string","must_exist":true},"remove_index":{"index":"string","indices":"string","must_exist":true}}]}'

Validate a query Added in 1.3.0

GET /_validate/query

Validates a query without running it.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • all_shards boolean

    If true, the validation is executed on all shards instead of one random shard per index.

  • analyzer string

    Analyzer to use for the query string. This parameter can only be used when the q query string parameter is specified.

  • If true, wildcard and prefix queries are analyzed.

  • The default operator for query string query: AND or OR.

    Values are and, AND, or, or OR.

  • df string

    Field to use as default where no field prefix is given in the query string. This parameter can only be used when the q query string parameter is specified.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • explain boolean

    If true, the response returns detailed information if an error has occurred.

  • If false, the request returns an error if it targets a missing or closed index.

  • lenient boolean

    If true, format-based query failures (such as providing text to a numeric field) in the query string will be ignored.

  • rewrite boolean

    If true, returns a more detailed explanation showing the actual Lucene query that will be executed.

  • q string

    Query in the Lucene query string syntax.

application/json

Body

Responses

GET /_validate/query
curl \
 --request GET 'http://api.example.com/_validate/query' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"query":{}}'

Validate a query Added in 1.3.0

POST /_validate/query

Validates a query without running it.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • all_shards boolean

    If true, the validation is executed on all shards instead of one random shard per index.

  • analyzer string

    Analyzer to use for the query string. This parameter can only be used when the q query string parameter is specified.

  • If true, wildcard and prefix queries are analyzed.

  • The default operator for query string query: AND or OR.

    Values are and, AND, or, or OR.

  • df string

    Field to use as default where no field prefix is given in the query string. This parameter can only be used when the q query string parameter is specified.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • explain boolean

    If true, the response returns detailed information if an error has occurred.

  • If false, the request returns an error if it targets a missing or closed index.

  • lenient boolean

    If true, format-based query failures (such as providing text to a numeric field) in the query string will be ignored.

  • rewrite boolean

    If true, returns a more detailed explanation showing the actual Lucene query that will be executed.

  • q string

    Query in the Lucene query string syntax.

application/json

Body

Responses

POST /_validate/query
curl \
 --request POST 'http://api.example.com/_validate/query' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"query":{}}'

Validate a query Added in 1.3.0

GET /{index}/_validate/query

Validates a query without running it.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases to search. Supports wildcards (*). To search all data streams or indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • all_shards boolean

    If true, the validation is executed on all shards instead of one random shard per index.

  • analyzer string

    Analyzer to use for the query string. This parameter can only be used when the q query string parameter is specified.

  • If true, wildcard and prefix queries are analyzed.

  • The default operator for query string query: AND or OR.

    Values are and, AND, or, or OR.

  • df string

    Field to use as default where no field prefix is given in the query string. This parameter can only be used when the q query string parameter is specified.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • explain boolean

    If true, the response returns detailed information if an error has occurred.

  • If false, the request returns an error if it targets a missing or closed index.

  • lenient boolean

    If true, format-based query failures (such as providing text to a numeric field) in the query string will be ignored.

  • rewrite boolean

    If true, returns a more detailed explanation showing the actual Lucene query that will be executed.

  • q string

    Query in the Lucene query string syntax.

application/json

Body

Responses

GET /{index}/_validate/query
curl \
 --request GET 'http://api.example.com/{index}/_validate/query' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"query":{}}'

Validate a query Added in 1.3.0

POST /{index}/_validate/query

Validates a query without running it.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases to search. Supports wildcards (*). To search all data streams or indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • all_shards boolean

    If true, the validation is executed on all shards instead of one random shard per index.

  • analyzer string

    Analyzer to use for the query string. This parameter can only be used when the q query string parameter is specified.

  • If true, wildcard and prefix queries are analyzed.

  • The default operator for query string query: AND or OR.

    Values are and, AND, or, or OR.

  • df string

    Field to use as default where no field prefix is given in the query string. This parameter can only be used when the q query string parameter is specified.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • explain boolean

    If true, the response returns detailed information if an error has occurred.

  • If false, the request returns an error if it targets a missing or closed index.

  • lenient boolean

    If true, format-based query failures (such as providing text to a numeric field) in the query string will be ignored.

  • rewrite boolean

    If true, returns a more detailed explanation showing the actual Lucene query that will be executed.

  • q string

    Query in the Lucene query string syntax.

application/json

Body

Responses

POST /{index}/_validate/query
curl \
 --request POST 'http://api.example.com/{index}/_validate/query' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"query":{}}'

Inference

Inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

Perform chat completion inference Added in 8.18.0

POST /_inference/chat_completion/{inference_id}/_stream

The chat completion inference API enables real-time responses for chat completion tasks by delivering answers incrementally, reducing response times during computation. It only works with the chat_completion task type for openai and elastic inference services.

IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

NOTE: The chat_completion task type is only available within the _stream API and only supports streaming. The Chat completion inference API and the Stream inference API differ in their response structure and capabilities. The Chat completion inference API provides more comprehensive customization options through more fields and function calling support. If you use the openai service or the elastic service, use the Chat completion inference API.

Path parameters

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference request to complete.

application/json

Body Required

  • messages array[object] Required

    A list of objects representing the conversation. Requests should generally only add new messages from the user (role user). The other message roles (assistant, system, or tool) should generally only be copied from the response to a previous completion request, such that the messages array is built up throughout a conversation.

    Hide messages attributes Show messages attributes object
  • model string

    The ID of the model to use.

  • The upper bound limit for the number of tokens that can be generated for a completion request.

  • stop array[string]

    A sequence of strings to control when the model should stop generating additional tokens.

  • The sampling temperature to use.

  • tool_choice string | object

    One of:
  • tools array[object]

    A list of tools that the model can call.

    Hide tools attributes Show tools attributes object
    • type string Required

      The type of tool.

    • function object Required
      Hide function attributes Show function attributes object
      • A description of what the function does. This is used by the model to choose when and how to call the function.

      • name string Required

        The name of the function.

      • The parameters the functional accepts. This should be formatted as a JSON object.

      • strict boolean

        Whether to enable schema adherence when generating the function call.

  • top_p number

    Nucleus sampling, an alternative to sampling with temperature.

Responses

POST /_inference/chat_completion/{inference_id}/_stream
curl \
 --request POST 'http://api.example.com/_inference/chat_completion/{inference_id}/_stream' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"model\": \"gpt-4o\",\n  \"messages\": [\n      {\n          \"role\": \"user\",\n          \"content\": \"What is Elastic?\"\n      }\n  ]\n}"'
Run `POST _inference/chat_completion/openai-completion/_stream` to perform a chat completion on the example question with streaming.
{
  "model": "gpt-4o",
  "messages": [
      {
          "role": "user",
          "content": "What is Elastic?"
      }
  ]
}
Run `POST POST _inference/chat_completion/openai-completion/_stream` to perform a chat completion using an Assistant message with `tool_calls`.
{
  "messages": [
      {
          "role": "assistant",
          "content": "Let's find out what the weather is",
          "tool_calls": [ 
              {
                  "id": "call_KcAjWtAww20AihPHphUh46Gd",
                  "type": "function",
                  "function": {
                      "name": "get_current_weather",
                      "arguments": "{\"location\":\"Boston, MA\"}"
                  }
              }
          ]
      },
      { 
          "role": "tool",
          "content": "The weather is cold",
          "tool_call_id": "call_KcAjWtAww20AihPHphUh46Gd"
      }
  ]
}
Run `POST POST _inference/chat_completion/openai-completion/_stream` to perform a chat completion using a User message with `tools` and `tool_choice`.
{
  "messages": [
      {
          "role": "user",
          "content": [
              {
                  "type": "text",
                  "text": "What's the price of a scarf?"
              }
          ]
      }
  ],
  "tools": [
      {
          "type": "function",
          "function": {
              "name": "get_current_price",
              "description": "Get the current price of a item",
              "parameters": {
                  "type": "object",
                  "properties": {
                      "item": {
                          "id": "123"
                      }
                  }
              }
          }
      }
  ],
  "tool_choice": {
      "type": "function",
      "function": {
          "name": "get_current_price"
      }
  }
}
Response examples (200)
A successful response when performing a chat completion task using a User message with `tools` and `tool_choice`.
event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[{"delta":{"content":"","role":"assistant"},"index":0}],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk"}}

event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[{"delta":{"content":Elastic"},"index":0}],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk"}}

event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[{"delta":{"content":" is"},"index":0}],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk"}}

(...)

event: message
data: {"chat_completion":{"id":"chatcmpl-Ae0TWsy2VPnSfBbv5UztnSdYUMFP3","choices":[],"model":"gpt-4o-2024-08-06","object":"chat.completion.chunk","usage":{"completion_tokens":28,"prompt_tokens":16,"total_tokens":44}}} 

event: message
data: [DONE]

Perform completion inference on the service Added in 8.11.0

POST /_inference/completion/{inference_id}

Path parameters

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference request to complete.

application/json

Body

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • completion array[object] Required
      Hide completion attribute Show completion attribute object
POST /_inference/completion/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/completion/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"input\": \"What is Elastic?\"\n}"'
Request example
Run `POST _inference/completion/openai_chat_completions` to perform a completion on the example question.
{
  "input": "What is Elastic?"
}
Response examples (200)
A successful response from `POST _inference/completion/openai_chat_completions`.
{
  "completion": [
    {
      "result": "Elastic is a company that provides a range of software solutions for search, logging, security, and analytics. Their flagship product is Elasticsearch, an open-source, distributed search engine that allows users to search, analyze, and visualize large volumes of data in real-time. Elastic also offers products such as Kibana, a data visualization tool, and Logstash, a log management and pipeline tool, as well as various other tools and solutions for data analysis and management."
    }
  ]
}

Get an inference endpoint Added in 8.11.0

GET /_inference/{inference_id}

Path parameters

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • endpoints array[object] Required
      Hide endpoints attributes Show endpoints attributes object
      • Hide chunking_settings attributes Show chunking_settings attributes object
        • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

        • overlap number

          The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

        • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

        • strategy string

          The chunking strategy: sentence or word.

      • service string Required

        The service type

      • service_settings object Required
      • inference_id string Required

        The inference Id

      • task_type string Required

        Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

GET /_inference/{inference_id}
curl \
 --request GET 'http://api.example.com/_inference/{inference_id}' \
 --header "Authorization: $API_KEY"

Create an inference endpoint Added in 8.11.0

PUT /_inference/{inference_id}

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

Path parameters

application/json

Body Required

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    The service type

  • service_settings object Required

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"chunking_settings":{"max_chunk_size":42.0,"overlap":42.0,"sentence_overlap":42.0,"strategy":"string"},"service":"string","service_settings":{},"task_settings":{}}'

Perform inference on the service Added in 8.11.0

POST /_inference/{inference_id}

This API enables you to use machine learning models to perform specific tasks on data that you provide as an input. It returns a response with the results of the tasks. The inference endpoint you use can perform one specific task that has been defined when the endpoint was created with the create inference API.

For details about using this API with a service, such as Amazon Bedrock, Anthropic, or HuggingFace, refer to the service-specific documentation.


The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

Path parameters

  • inference_id string Required

    The unique identifier for the inference endpoint.

Query parameters

  • timeout string

    The amount of time to wait for the inference request to complete.

application/json

Body

  • query string

    The query input, which is required only for the rerank task. It is not required for other tasks.

  • input string | array[string] Required

    The text on which you want to perform the inference task. It can be a single string or an array.


    Inference endpoints for the completion task type currently only support a single string as input.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide text_embedding_bytes attribute Show text_embedding_bytes attribute object
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding_bits array[object]
      Hide text_embedding_bits attribute Show text_embedding_bits attribute object
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding array[object]
      Hide text_embedding attribute Show text_embedding attribute object
      • embedding array[number] Required

        Text Embedding results are represented as Dense Vectors of floats.

    • sparse_embedding array[object]
      Hide sparse_embedding attribute Show sparse_embedding attribute object
      • embedding object Required

        Sparse Embedding tokens are represented as a dictionary of string to double.

        Hide embedding attribute Show embedding attribute object
        • * number Additional properties
    • completion array[object]
      Hide completion attribute Show completion attribute object
    • rerank array[object]
      Hide rerank attributes Show rerank attributes object
POST /_inference/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"query":"string","input":"string","task_settings":{}}'

Delete an inference endpoint Added in 8.11.0

DELETE /_inference/{inference_id}

Path parameters

Query parameters

  • dry_run boolean

    When true, the endpoint is not deleted and a list of ingest processors which reference this endpoint is returned.

  • force boolean

    When true, the inference endpoint is forcefully deleted even if it is still being used by ingest processors or semantic text fields.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

    • pipelines array[string] Required
DELETE /_inference/{inference_id}
curl \
 --request DELETE 'http://api.example.com/_inference/{inference_id}' \
 --header "Authorization: $API_KEY"

Get an inference endpoint Added in 8.11.0

GET /_inference/{task_type}/{inference_id}

Path parameters

  • task_type string Required

    The task type

    Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

  • inference_id string Required

    The inference Id

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • endpoints array[object] Required
      Hide endpoints attributes Show endpoints attributes object
      • Hide chunking_settings attributes Show chunking_settings attributes object
        • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

        • overlap number

          The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

        • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

        • strategy string

          The chunking strategy: sentence or word.

      • service string Required

        The service type

      • service_settings object Required
      • inference_id string Required

        The inference Id

      • task_type string Required

        Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

GET /_inference/{task_type}/{inference_id}
curl \
 --request GET 'http://api.example.com/_inference/{task_type}/{inference_id}' \
 --header "Authorization: $API_KEY"

Create an inference endpoint Added in 8.11.0

PUT /_inference/{task_type}/{inference_id}

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

Path parameters

  • task_type string Required

    The task type

    Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

  • inference_id string Required

    The inference Id

application/json

Body Required

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    The service type

  • service_settings object Required

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"chunking_settings":{"max_chunk_size":42.0,"overlap":42.0,"sentence_overlap":42.0,"strategy":"string"},"service":"string","service_settings":{},"task_settings":{}}'

Perform inference on the service Added in 8.11.0

POST /_inference/{task_type}/{inference_id}

This API enables you to use machine learning models to perform specific tasks on data that you provide as an input. It returns a response with the results of the tasks. The inference endpoint you use can perform one specific task that has been defined when the endpoint was created with the create inference API.

For details about using this API with a service, such as Amazon Bedrock, Anthropic, or HuggingFace, refer to the service-specific documentation.


The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

Path parameters

  • task_type string Required

    The type of inference task that the model performs.

    Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

  • inference_id string Required

    The unique identifier for the inference endpoint.

Query parameters

  • timeout string

    The amount of time to wait for the inference request to complete.

application/json

Body

  • query string

    The query input, which is required only for the rerank task. It is not required for other tasks.

  • input string | array[string] Required

    The text on which you want to perform the inference task. It can be a single string or an array.


    Inference endpoints for the completion task type currently only support a single string as input.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide text_embedding_bytes attribute Show text_embedding_bytes attribute object
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding_bits array[object]
      Hide text_embedding_bits attribute Show text_embedding_bits attribute object
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding array[object]
      Hide text_embedding attribute Show text_embedding attribute object
      • embedding array[number] Required

        Text Embedding results are represented as Dense Vectors of floats.

    • sparse_embedding array[object]
      Hide sparse_embedding attribute Show sparse_embedding attribute object
      • embedding object Required

        Sparse Embedding tokens are represented as a dictionary of string to double.

        Hide embedding attribute Show embedding attribute object
        • * number Additional properties
    • completion array[object]
      Hide completion attribute Show completion attribute object
    • rerank array[object]
      Hide rerank attributes Show rerank attributes object
POST /_inference/{task_type}/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/{task_type}/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"query":"string","input":"string","task_settings":{}}'

Delete an inference endpoint Added in 8.11.0

DELETE /_inference/{task_type}/{inference_id}

Path parameters

  • task_type string Required

    The task type

    Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

  • inference_id string Required

    The inference identifier.

Query parameters

  • dry_run boolean

    When true, the endpoint is not deleted and a list of ingest processors which reference this endpoint is returned.

  • force boolean

    When true, the inference endpoint is forcefully deleted even if it is still being used by ingest processors or semantic text fields.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

    • pipelines array[string] Required
DELETE /_inference/{task_type}/{inference_id}
curl \
 --request DELETE 'http://api.example.com/_inference/{task_type}/{inference_id}' \
 --header "Authorization: $API_KEY"

Get an inference endpoint Added in 8.11.0

GET /_inference

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • endpoints array[object] Required
      Hide endpoints attributes Show endpoints attributes object
      • Hide chunking_settings attributes Show chunking_settings attributes object
        • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

        • overlap number

          The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

        • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

        • strategy string

          The chunking strategy: sentence or word.

      • service string Required

        The service type

      • service_settings object Required
      • inference_id string Required

        The inference Id

      • task_type string Required

        Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

GET /_inference
curl \
 --request GET 'http://api.example.com/_inference' \
 --header "Authorization: $API_KEY"

Create an AlibabaCloud AI Search inference endpoint Added in 8.16.0

PUT /_inference/{task_type}/{alibabacloud_inference_id}

Create an inference endpoint to perform an inference task with the alibabacloud-ai-search service.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Values are completion, rerank, space_embedding, or text_embedding.

  • The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is alibabacloud-ai-search.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key for the AlibabaCloud AI Search API.

    • host string Required

      The name of the host address used for the inference task. You can find the host address in the API keys section of the documentation.

      External documentation
    • Hide rate_limit attribute Show rate_limit attribute object
    • service_id string Required

      The name of the model service to use for the inference task. The following service IDs are available for the completion task:

      • ops-qwen-turbo
      • qwen-turbo
      • qwen-plus
      • qwen-max ÷ qwen-max-longcontext

      The following service ID is available for the rerank task:

      • ops-bge-reranker-larger

      The following service ID is available for the sparse_embedding task:

      • ops-text-sparse-embedding-001

      The following service IDs are available for the text_embedding task:

      ops-text-embedding-001 ops-text-embedding-zh-001 ops-text-embedding-en-001 ops-text-embedding-002

    • workspace string Required

      The name of the workspace used for the inference task.

  • Hide task_settings attributes Show task_settings attributes object
    • For a sparse_embedding or text_embedding task, specify the type of input passed to the model. Valid values are:

      • ingest for storing document embeddings in a vector database.
      • search for storing embeddings of search queries run against a vector database to find relevant documents.
    • For a sparse_embedding task, it affects whether the token name will be returned in the response. It defaults to false, which means only the token ID will be returned in the response.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{alibabacloud_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{alibabacloud_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"alibabacloud-ai-search\",\n    \"service_settings\": {\n        \"host\" : \"default-j01.platform-cn-shanghai.opensearch.aliyuncs.com\",\n        \"api_key\": \"AlibabaCloud-API-Key\",\n        \"service_id\": \"ops-qwen-turbo\",\n        \"workspace\" : \"default\"\n    }\n}"'
Run `PUT _inference/completion/alibabacloud_ai_search_completion` to create an inference endpoint that performs a completion task.
{
    "service": "alibabacloud-ai-search",
    "service_settings": {
        "host" : "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
        "api_key": "AlibabaCloud-API-Key",
        "service_id": "ops-qwen-turbo",
        "workspace" : "default"
    }
}
Run `PUT _inference/rerank/alibabacloud_ai_search_rerank` to create an inference endpoint that performs a rerank task.
{
    "service": "alibabacloud-ai-search",
    "service_settings": {
        "api_key": "AlibabaCloud-API-Key",
        "service_id": "ops-bge-reranker-larger",
        "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
        "workspace": "default"
    }
}
Run `PUT _inference/sparse_embedding/alibabacloud_ai_search_sparse` to create an inference endpoint that performs perform a sparse embedding task.
{
    "service": "alibabacloud-ai-search",
    "service_settings": {
        "api_key": "AlibabaCloud-API-Key",
        "service_id": "ops-text-sparse-embedding-001",
        "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
        "workspace": "default"
    }
}
Run `PUT _inference/text_embedding/alibabacloud_ai_search_embeddings` to create an inference endpoint that performs a text embedding task.
{
    "service": "alibabacloud-ai-search",
    "service_settings": {
        "api_key": "AlibabaCloud-API-Key",
        "service_id": "ops-text-embedding-001",
        "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
        "workspace": "default"
    }
}

Create an Amazon Bedrock inference endpoint Added in 8.12.0

PUT /_inference/{task_type}/{amazonbedrock_inference_id}

Creates an inference endpoint to perform an inference task with the amazonbedrock service.


You need to provide the access and secret keys only once, during the inference model creation. The get inference API does not retrieve your access or secret keys. After creating the inference model, you cannot change the associated key pairs. If you want to use a different access and secret key pair, delete the inference model and recreate it with the same name and the updated keys.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Values are completion or text_embedding.

  • The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is amazonbedrock.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • access_key string Required

      A valid AWS access key that has permissions to use Amazon Bedrock and access to models for inference requests.

    • model string Required

      The base model ID or an ARN to a custom model based on a foundational model. The base model IDs can be found in the Amazon Bedrock documentation. Note that the model ID must be available for the provider chosen and your IAM user must have access to the model.

      External documentation
    • provider string

      The model provider for your deployment. Note that some providers may support only certain task types. Supported providers include:

      • amazontitan - available for text_embedding and completion task types
      • anthropic - available for completion task type only
      • ai21labs - available for completion task type only
      • cohere - available for text_embedding and completion task types
      • meta - available for completion task type only
      • mistral - available for completion task type only
    • region string Required

      The region that your model or ARN is deployed in. The list of available regions per model can be found in the Amazon Bedrock documentation.

      External documentation
    • Hide rate_limit attribute Show rate_limit attribute object
    • secret_key string Required

      A valid AWS secret key that is paired with the access_key. For informationg about creating and managing access and secret keys, refer to the AWS documentation.

      External documentation
  • Hide task_settings attributes Show task_settings attributes object
    • For a completion task, it sets the maximum number for the output tokens to be generated.

    • For a completion task, it is a number between 0.0 and 1.0 that controls the apparent creativity of the results. At temperature 0.0 the model is most deterministic, at temperature 1.0 most random. It should not be used if top_p or top_k is specified.

    • top_k number

      For a completion task, it limits samples to the top-K most likely words, balancing coherence and variability. It is only available for anthropic, cohere, and mistral providers. It is an alternative to temperature; it should not be used if temperature is specified.

    • top_p number

      For a completion task, it is a number in the range of 0.0 to 1.0, to eliminate low-probability tokens. Top-p uses nucleus sampling to select top tokens whose sum of likelihoods does not exceed a certain value, ensuring both variety and coherence. It is an alternative to temperature; it should not be used if temperature is specified.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{amazonbedrock_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{amazonbedrock_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"amazonbedrock\",\n    \"service_settings\": {\n        \"access_key\": \"AWS-access-key\",\n        \"secret_key\": \"AWS-secret-key\",\n        \"region\": \"us-east-1\",\n        \"provider\": \"amazontitan\",\n        \"model\": \"amazon.titan-embed-text-v2:0\"\n    }\n}"'
Request examples
Run `PUT _inference/text_embedding/amazon_bedrock_embeddings` to create an inference endpoint that performs a text embedding task.
{
    "service": "amazonbedrock",
    "service_settings": {
        "access_key": "AWS-access-key",
        "secret_key": "AWS-secret-key",
        "region": "us-east-1",
        "provider": "amazontitan",
        "model": "amazon.titan-embed-text-v2:0"
    }
}
Run `PUT _inference/completion/openai-completion` to create an inference endpoint to perform a completion task type.
{
    "service": "openai",
    "service_settings": {
        "api_key": "OpenAI-API-Key",
        "model_id": "gpt-3.5-turbo"
    }
}

Create an Anthropic inference endpoint Added in 8.16.0

PUT /_inference/{task_type}/{anthropic_inference_id}

Create an inference endpoint to perform an inference task with the anthropic service.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The task type. The only valid task type for the model to perform is completion.

    Value is completion.

  • anthropic_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is anthropic.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key for the Anthropic API.

    • model_id string Required

      The name of the model to use for the inference task. Refer to the Anthropic documentation for the list of supported models.

    • Hide rate_limit attribute Show rate_limit attribute object
  • Hide task_settings attributes Show task_settings attributes object
    • max_tokens number Required

      For a completion task, it is the maximum number of tokens to generate before stopping.

    • For a completion task, it is the amount of randomness injected into the response. For more details about the supported range, refer to Anthropic documentation.

      External documentation
    • top_k number

      For a completion task, it specifies to only sample from the top K options for each subsequent token. It is recommended for advanced use cases only. You usually only need to use temperature.

    • top_p number

      For a completion task, it specifies to use Anthropic's nucleus sampling. In nucleus sampling, Anthropic computes the cumulative distribution over all the options for each subsequent token in decreasing probability order and cuts it off once it reaches the specified probability. You should either alter temperature or top_p, but not both. It is recommended for advanced use cases only. You usually only need to use temperature.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{anthropic_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{anthropic_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"anthropic\",\n    \"service_settings\": {\n        \"api_key\": \"Anthropic-Api-Key\",\n        \"model_id\": \"Model-ID\"\n    },\n    \"task_settings\": {\n        \"max_tokens\": 1024\n    }\n}"'
Request example
Run `PUT _inference/completion/anthropic_completion` to create an inference endpoint that performs a completion task.
{
    "service": "anthropic",
    "service_settings": {
        "api_key": "Anthropic-Api-Key",
        "model_id": "Model-ID"
    },
    "task_settings": {
        "max_tokens": 1024
    }
}

Create an Azure AI studio inference endpoint Added in 8.14.0

PUT /_inference/{task_type}/{azureaistudio_inference_id}

Create an inference endpoint to perform an inference task with the azureaistudio service.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Values are completion or text_embedding.

  • The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is azureaistudio.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key of your Azure AI Studio model deployment. This key can be found on the overview page for your deployment in the management section of your Azure AI Studio account.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • endpoint_type string Required

      The type of endpoint that is available for deployment through Azure AI Studio: token or realtime. The token endpoint type is for "pay as you go" endpoints that are billed per token. The realtime endpoint type is for "real-time" endpoints that are billed per hour of usage.

      External documentation
    • target string Required

      The target URL of your Azure AI Studio model deployment. This can be found on the overview page for your deployment in the management section of your Azure AI Studio account.

    • provider string Required

      The model provider for your deployment. Note that some providers may support only certain task types. Supported providers include:

      • cohere - available for text_embedding and completion task types
      • databricks - available for completion task type only
      • meta - available for completion task type only
      • microsoft_phi - available for completion task type only
      • mistral - available for completion task type only
      • openai - available for text_embedding and completion task types
    • Hide rate_limit attribute Show rate_limit attribute object
  • Hide task_settings attributes Show task_settings attributes object
    • For a completion task, instruct the inference process to perform sampling. It has no effect unless temperature or top_p is specified.

    • For a completion task, provide a hint for the maximum number of output tokens to be generated.

    • For a completion task, control the apparent creativity of generated completions with a sampling temperature. It must be a number in the range of 0.0 to 2.0. It should not be used if top_p is specified.

    • top_p number

      For a completion task, make the model consider the results of the tokens with nucleus sampling probability. It is an alternative value to temperature and must be a number in the range of 0.0 to 2.0. It should not be used if temperature is specified.

    • user string

      For a text_embedding task, specify the user issuing the request. This information can be used for abuse detection.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{azureaistudio_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{azureaistudio_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"azureaistudio\",\n    \"service_settings\": {\n        \"api_key\": \"Azure-AI-Studio-API-key\",\n        \"target\": \"Target-Uri\",\n        \"provider\": \"openai\",\n        \"endpoint_type\": \"token\"\n    }\n}"'
Request examples
Run `PUT _inference/text_embedding/azure_ai_studio_embeddings` to create an inference endpoint that performs a text_embedding task. Note that you do not specify a model here, as it is defined already in the Azure AI Studio deployment.
{
    "service": "azureaistudio",
    "service_settings": {
        "api_key": "Azure-AI-Studio-API-key",
        "target": "Target-Uri",
        "provider": "openai",
        "endpoint_type": "token"
    }
}
Run `PUT _inference/completion/azure_ai_studio_completion` to create an inference endpoint that performs a completion task.
{
    "service": "azureaistudio",
    "service_settings": {
        "api_key": "Azure-AI-Studio-API-key",
        "target": "Target-URI",
        "provider": "databricks",
        "endpoint_type": "realtime"
    }
}

Create an Azure OpenAI inference endpoint Added in 8.14.0

PUT /_inference/{task_type}/{azureopenai_inference_id}

Create an inference endpoint to perform an inference task with the azureopenai service.

The list of chat completion models that you can choose from in your Azure OpenAI deployment include:

The list of embeddings models that you can choose from in your deployment can be found in the Azure models documentation.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform. NOTE: The chat_completion task type only supports streaming and only through the _stream API.

    Values are completion or text_embedding.

  • The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is azureopenai.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string

      A valid API key for your Azure OpenAI account. You must specify either api_key or entra_id. If you do not provide either or you provide both, you will receive an error when you try to create your model.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • api_version string Required

      The Azure API version ID to use. It is recommended to use the latest supported non-preview version.

    • deployment_id string Required

      The deployment name of your deployed models. Your Azure OpenAI deployments can be found though the Azure OpenAI Studio portal that is linked to your subscription.

      External documentation
    • entra_id string

      A valid Microsoft Entra token. You must specify either api_key or entra_id. If you do not provide either or you provide both, you will receive an error when you try to create your model.

      External documentation
    • Hide rate_limit attribute Show rate_limit attribute object
    • resource_name string Required

      The name of your Azure OpenAI resource. You can find this from the list of resources in the Azure Portal for your subscription.

      External documentation
  • Hide task_settings attribute Show task_settings attribute object
    • user string

      For a completion or text_embedding task, specify the user issuing the request. This information can be used for abuse detection.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{azureopenai_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{azureopenai_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"azureopenai\",\n    \"service_settings\": {\n        \"api_key\": \"Api-Key\",\n        \"resource_name\": \"Resource-name\",\n        \"deployment_id\": \"Deployment-id\",\n        \"api_version\": \"2024-02-01\"\n    }\n}"'
Request examples
Run `PUT _inference/text_embedding/azure_openai_embeddings` to create an inference endpoint that performs a `text_embedding` task. You do not specify a model, as it is defined already in the Azure OpenAI deployment.
{
    "service": "azureopenai",
    "service_settings": {
        "api_key": "Api-Key",
        "resource_name": "Resource-name",
        "deployment_id": "Deployment-id",
        "api_version": "2024-02-01"
    }
}
Run `PUT _inference/completion/azure_openai_completion` to create an inference endpoint that performs a `completion` task.
{
    "service": "azureopenai",
    "service_settings": {
        "api_key": "Api-Key",
        "resource_name": "Resource-name",
        "deployment_id": "Deployment-id",
        "api_version": "2024-02-01"
    }
}

Create a Cohere inference endpoint Added in 8.13.0

PUT /_inference/{task_type}/{cohere_inference_id}

Create an inference endpoint to perform an inference task with the cohere service.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Values are completion, rerank, or text_embedding.

  • cohere_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is cohere.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key for your Cohere account. You can find or create your Cohere API keys on the Cohere API key settings page.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • Values are byte, float, or int8.

    • model_id string

      For a completion, rerank, or text_embedding task, the name of the model to use for the inference task.

      The default value for a text embedding task is embed-english-v2.0.

    • Hide rate_limit attribute Show rate_limit attribute object
    • Values are cosine, dot_product, or l2_norm.

  • Hide task_settings attributes Show task_settings attributes object
    • Values are classification, clustering, ingest, or search.

    • For a rerank task, return doc text within the results.

    • top_n number

      For a rerank task, the number of most relevant documents to return. It defaults to the number of the documents. If this inference endpoint is used in a text_similarity_reranker retriever query and top_n is set, it must be greater than or equal to rank_window_size in the query.

    • truncate string

      Values are END, NONE, or START.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{cohere_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{cohere_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"cohere\",\n    \"service_settings\": {\n        \"api_key\": \"Cohere-Api-key\",\n        \"model_id\": \"embed-english-light-v3.0\",\n        \"embedding_type\": \"byte\"\n    }\n}"'
Request examples
Run `PUT _inference/text_embedding/cohere-embeddings` to create an inference endpoint that performs a text embedding task.
{
    "service": "cohere",
    "service_settings": {
        "api_key": "Cohere-Api-key",
        "model_id": "embed-english-light-v3.0",
        "embedding_type": "byte"
    }
}
Run `PUT _inference/rerank/cohere-rerank` to create an inference endpoint that performs a rerank task.
{
    "service": "cohere",
    "service_settings": {
        "api_key": "Cohere-API-key",
        "model_id": "rerank-english-v3.0"
    },
    "task_settings": {
        "top_n": 10,
        "return_documents": true
    }
}

Create an Elasticsearch inference endpoint Added in 8.13.0

PUT /_inference/{task_type}/{elasticsearch_inference_id}

Create an inference endpoint to perform an inference task with the elasticsearch service.


Your Elasticsearch deployment contains preconfigured ELSER and E5 inference endpoints, you only need to create the enpoints using the API if you want to customize the settings.

If you use the ELSER or the E5 model through the elasticsearch service, the API request will automatically download and deploy the model if it isn't downloaded yet.


You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.

After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Values are rerank, sparse_embedding, or text_embedding.

  • The unique identifier of the inference endpoint. The must not match the model_id.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is elasticsearch.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • Hide adaptive_allocations attributes Show adaptive_allocations attributes object
      • enabled boolean

        Turn on adaptive_allocations.

      • The maximum number of allocations to scale to. If set, it must be greater than or equal to min_number_of_allocations.

      • The minimum number of allocations to scale to. If set, it must be greater than or equal to 0. If not defined, the deployment scales to 0.

    • The deployment identifier for a trained model deployment. When deployment_id is used the model_id is optional.

    • model_id string Required

      The name of the model to use for the inference task. It can be the ID of a built-in model (for example, .multilingual-e5-small for E5) or a text embedding model that was uploaded by using the Eland client.

      External documentation
    • The total number of allocations that are assigned to the model across machine learning nodes. Increasing this value generally increases the throughput. If adaptive allocations are enabled, do not set this value because it's automatically set.

    • num_threads number Required

      The number of threads used by each model allocation during inference. This setting generally increases the speed per inference request. The inference process is a compute-bound process; threads_per_allocations must not exceed the number of available allocated processors per node. The value must be a power of 2. The maximum value is 32.

  • Hide task_settings attribute Show task_settings attribute object
    • For a rerank task, return the document instead of only the index.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{elasticsearch_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{elasticsearch_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"elasticsearch\",\n    \"service_settings\": {\n        \"adaptive_allocations\": { \n        \"enabled\": true,\n        \"min_number_of_allocations\": 1,\n        \"max_number_of_allocations\": 4\n        },\n        \"num_threads\": 1,\n        \"model_id\": \".elser_model_2\" \n    }\n}"'
Run `PUT _inference/sparse_embedding/my-elser-model` to create an inference endpoint that performs a `sparse_embedding` task. The `model_id` must be the ID of one of the built-in ELSER models. The API will automatically download the ELSER model if it isn't already downloaded and then deploy the model.
{
    "service": "elasticsearch",
    "service_settings": {
        "adaptive_allocations": { 
        "enabled": true,
        "min_number_of_allocations": 1,
        "max_number_of_allocations": 4
        },
        "num_threads": 1,
        "model_id": ".elser_model_2" 
    }
}
Run `PUT _inference/rerank/my-elastic-rerank` to create an inference endpoint that performs a rerank task using the built-in Elastic Rerank cross-encoder model. The `model_id` must be `.rerank-v1`, which is the ID of the built-in Elastic Rerank model. The API will automatically download the Elastic Rerank model if it isn't already downloaded and then deploy the model. Once deployed, the model can be used for semantic re-ranking with a `text_similarity_reranker` retriever.
{
    "service": "elasticsearch",
    "service_settings": {
        "model_id": ".rerank-v1", 
        "num_threads": 1,
        "adaptive_allocations": { 
        "enabled": true,
        "min_number_of_allocations": 1,
        "max_number_of_allocations": 4
        }
    }
}
Run `PUT _inference/text_embedding/my-e5-model` to create an inference endpoint that performs a `text_embedding` task. The `model_id` must be the ID of one of the built-in E5 models. The API will automatically download the E5 model if it isn't already downloaded and then deploy the model.
{
    "service": "elasticsearch",
    "service_settings": {
        "num_allocations": 1,
        "num_threads": 1,
        "model_id": ".multilingual-e5-small" 
    }
}
Run `PUT _inference/text_embedding/my-msmarco-minilm-model` to create an inference endpoint that performs a `text_embedding` task with a model that was uploaded by Eland.
{
    "service": "elasticsearch",
    "service_settings": {
        "num_allocations": 1,
        "num_threads": 1,
        "model_id": "msmarco-MiniLM-L12-cos-v5" 
    }
}
Run `PUT _inference/text_embedding/my-e5-model` to create an inference endpoint that performs a `text_embedding` task and to configure adaptive allocations. The API request will automatically download the E5 model if it isn't already downloaded and then deploy the model.
{
    "service": "elasticsearch",
    "service_settings": {
        "adaptive_allocations": {
        "enabled": true,
        "min_number_of_allocations": 3,
        "max_number_of_allocations": 10
        },
        "num_threads": 1,
        "model_id": ".multilingual-e5-small"
    }
}
Run `PUT _inference/sparse_embedding/use_existing_deployment` to use an already existing model deployment when creating an inference endpoint.
{
    "service": "elasticsearch",
    "service_settings": {
        "deployment_id": ".elser_model_2"
    }
}
Response examples (200)
A successful response from `PUT _inference/sparse_embedding/use_existing_deployment`. It contains the model ID and the threads and allocations settings from the model deployment.
{
  "inference_id": "use_existing_deployment",
  "task_type": "sparse_embedding",
  "service": "elasticsearch",
  "service_settings": {
    "num_allocations": 2,
    "num_threads": 1,
    "model_id": ".elser_model_2",
    "deployment_id": ".elser_model_2"
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 250,
    "sentence_overlap": 1
  }
}

Create an ELSER inference endpoint Deprecated Added in 8.11.0

PUT /_inference/{task_type}/{elser_inference_id}

Create an inference endpoint to perform an inference task with the elser service. You can also deploy ELSER by using the Elasticsearch inference integration.


Your Elasticsearch deployment contains a preconfigured ELSER inference endpoint, you only need to create the enpoint using the API if you want to customize the settings.

The API request will automatically download and deploy the ELSER model if it isn't already downloaded.


You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.

After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Value is sparse_embedding.

  • elser_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is elser.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • Hide adaptive_allocations attributes Show adaptive_allocations attributes object
      • enabled boolean

        Turn on adaptive_allocations.

      • The maximum number of allocations to scale to. If set, it must be greater than or equal to min_number_of_allocations.

      • The minimum number of allocations to scale to. If set, it must be greater than or equal to 0. If not defined, the deployment scales to 0.

    • num_allocations number Required

      The total number of allocations this model is assigned across machine learning nodes. Increasing this value generally increases the throughput. If adaptive allocations is enabled, do not set this value because it's automatically set.

    • num_threads number Required

      The number of threads used by each model allocation during inference. Increasing this value generally increases the speed per inference request. The inference process is a compute-bound process; threads_per_allocations must not exceed the number of available allocated processors per node. The value must be a power of 2. The maximum value is 32.


      If you want to optimize your ELSER endpoint for ingest, set the number of threads to 1. If you want to optimize your ELSER endpoint for search, set the number of threads to greater than 1.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{elser_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{elser_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"elser\",\n    \"service_settings\": {\n        \"num_allocations\": 1,\n        \"num_threads\": 1\n    }\n}"'
Request examples
Run `PUT _inference/sparse_embedding/my-elser-model` to create an inference endpoint that performs a `sparse_embedding` task. The request will automatically download the ELSER model if it isn't already downloaded and then deploy the model.
{
    "service": "elser",
    "service_settings": {
        "num_allocations": 1,
        "num_threads": 1
    }
}
Run `PUT _inference/sparse_embedding/my-elser-model` to create an inference endpoint that performs a `sparse_embedding` task with adaptive allocations. When adaptive allocations are enabled, the number of allocations of the model is set automatically based on the current load.
{
    "service": "elser",
    "service_settings": {
        "adaptive_allocations": {
            "enabled": true,
            "min_number_of_allocations": 3,
            "max_number_of_allocations": 10
        },
        "num_threads": 1
    }
}
Response examples (200)
A successful response when creating an ELSER inference endpoint.
{
  "inference_id": "my-elser-model",
  "task_type": "sparse_embedding",
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}

Create an Google AI Studio inference endpoint Added in 8.15.0

PUT /_inference/{task_type}/{googleaistudio_inference_id}

Create an inference endpoint to perform an inference task with the googleaistudio service.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Values are completion or text_embedding.

  • The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is googleaistudio.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key of your Google Gemini account.

    • model_id string Required

      The name of the model to use for the inference task. Refer to the Google documentation for the list of supported models.

      External documentation
    • Hide rate_limit attribute Show rate_limit attribute object

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{googleaistudio_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{googleaistudio_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"googleaistudio\",\n    \"service_settings\": {\n        \"api_key\": \"api-key\",\n        \"model_id\": \"model-id\"\n    }\n}"'
Request example
Run `PUT _inference/completion/google_ai_studio_completion` to create an inference endpoint to perform a `completion` task type.
{
    "service": "googleaistudio",
    "service_settings": {
        "api_key": "api-key",
        "model_id": "model-id"
    }
}

Create a Google Vertex AI inference endpoint Added in 8.15.0

PUT /_inference/{task_type}/{googlevertexai_inference_id}

Create an inference endpoint to perform an inference task with the googlevertexai service.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Values are rerank or text_embedding.

  • The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is googlevertexai.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • location string Required

      The name of the location to use for the inference task. Refer to the Google documentation for the list of supported locations.

      External documentation
    • model_id string Required

      The name of the model to use for the inference task. Refer to the Google documentation for the list of supported models.

      External documentation
    • project_id string Required

      The name of the project to use for the inference task.

    • Hide rate_limit attribute Show rate_limit attribute object
    • service_account_json string Required

      A valid service account in JSON format for the Google Vertex AI API.

  • Hide task_settings attributes Show task_settings attributes object
    • For a text_embedding task, truncate inputs longer than the maximum token length automatically.

    • top_n number

      For a rerank task, the number of the top N documents that should be returned.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{googlevertexai_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{googlevertexai_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"googlevertexai\",\n    \"service_settings\": {\n        \"service_account_json\": \"service-account-json\",\n        \"model_id\": \"model-id\",\n        \"location\": \"location\",\n        \"project_id\": \"project-id\"\n    }\n}"'
Request examples
Run `PUT _inference/text_embedding/google_vertex_ai_embeddings` to create an inference endpoint to perform a `text_embedding` task type.
{
    "service": "googlevertexai",
    "service_settings": {
        "service_account_json": "service-account-json",
        "model_id": "model-id",
        "location": "location",
        "project_id": "project-id"
    }
}
Run `PUT _inference/rerank/google_vertex_ai_rerank` to create an inference endpoint to perform a `rerank` task type.
{
    "service": "googlevertexai",
    "service_settings": {
        "service_account_json": "service-account-json",
        "project_id": "project-id"
    }
}

Create a Hugging Face inference endpoint Added in 8.12.0

PUT /_inference/{task_type}/{huggingface_inference_id}

Create an inference endpoint to perform an inference task with the hugging_face service.

You must first create an inference endpoint on the Hugging Face endpoint page to get an endpoint URL. Select the model you want to use on the new endpoint creation page (for example intfloat/e5-small-v2), then select the sentence embeddings task under the advanced configuration section. Create the endpoint and copy the URL after the endpoint initialization has been finished.

The following models are recommended for the Hugging Face service:

  • all-MiniLM-L6-v2
  • all-MiniLM-L12-v2
  • all-mpnet-base-v2
  • e5-base-v2
  • e5-small-v2
  • multilingual-e5-base
  • multilingual-e5-small

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Value is text_embedding.

  • The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is hugging_face.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid access token for your HuggingFace account. You can create or find your access tokens on the HuggingFace settings page.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • Hide rate_limit attribute Show rate_limit attribute object
    • url string Required

      The URL endpoint to use for the requests.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{huggingface_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{huggingface_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"hugging_face\",\n    \"service_settings\": {\n        \"api_key\": \"hugging-face-access-token\", \n        \"url\": \"url-endpoint\" \n    }\n}"'
Request example
Run `PUT _inference/text_embedding/hugging-face-embeddings` to create an inference endpoint that performs a `text_embedding` task type.
{
    "service": "hugging_face",
    "service_settings": {
        "api_key": "hugging-face-access-token", 
        "url": "url-endpoint" 
    }
}

Create an JinaAI inference endpoint Added in 8.18.0

PUT /_inference/{task_type}/{jinaai_inference_id}

Create an inference endpoint to perform an inference task with the jinaai service.

To review the available rerank models, refer to https://jina.ai/reranker. To review the available text_embedding models, refer to the https://jina.ai/embeddings/.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Values are rerank or text_embedding.

  • jinaai_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is jinaai.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key of your JinaAI account.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • model_id string

      The name of the model to use for the inference task. For a rerank task, it is required. For a text_embedding task, it is optional.

    • Hide rate_limit attribute Show rate_limit attribute object
    • Values are cosine, dot_product, or l2_norm.

  • Hide task_settings attributes Show task_settings attributes object
    • For a rerank task, return the doc text within the results.

    • task string

      Values are classification, clustering, ingest, or search.

    • top_n number

      For a rerank task, the number of most relevant documents to return. It defaults to the number of the documents. If this inference endpoint is used in a text_similarity_reranker retriever query and top_n is set, it must be greater than or equal to rank_window_size in the query.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{jinaai_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{jinaai_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"jinaai\",\n    \"service_settings\": {\n        \"model_id\": \"jina-embeddings-v3\",\n        \"api_key\": \"JinaAi-Api-key\"\n    }\n}"'
Request examples
Run `PUT _inference/text_embedding/jinaai-embeddings` to create an inference endpoint for text embedding tasks using the JinaAI service.
{
    "service": "jinaai",
    "service_settings": {
        "model_id": "jina-embeddings-v3",
        "api_key": "JinaAi-Api-key"
    }
}
Run `PUT _inference/rerank/jinaai-rerank` to create an inference endpoint for rerank tasks using the JinaAI service.
{
    "service": "jinaai",
    "service_settings": {
        "api_key": "JinaAI-Api-key",
        "model_id": "jina-reranker-v2-base-multilingual"
    },
    "task_settings": {
        "top_n": 10,
        "return_documents": true
    }
}

Create a Mistral inference endpoint Added in 8.15.0

PUT /_inference/{task_type}/{mistral_inference_id}

Creates an inference endpoint to perform an inference task with the mistral service.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The task type. The only valid task type for the model to perform is text_embedding.

    Value is text_embedding.

  • mistral_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is mistral.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key of your Mistral account. You can find your Mistral API keys or you can create a new one on the API Keys page.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • The maximum number of tokens per input before chunking occurs.

    • model string Required

      The name of the model to use for the inference task. Refer to the Mistral models documentation for the list of available text embedding models.

      External documentation
    • Hide rate_limit attribute Show rate_limit attribute object

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{mistral_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{mistral_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"service\": \"mistral\",\n  \"service_settings\": {\n    \"api_key\": \"Mistral-API-Key\",\n    \"model\": \"mistral-embed\" \n  }\n}"'
Request example
Run `PUT _inference/text_embedding/mistral-embeddings-test` to create a Mistral inference endpoint that performs a text embedding task.
{
  "service": "mistral",
  "service_settings": {
    "api_key": "Mistral-API-Key",
    "model": "mistral-embed" 
  }
}

Create an OpenAI inference endpoint Added in 8.12.0

PUT /_inference/{task_type}/{openai_inference_id}

Create an inference endpoint to perform an inference task with the openai service or openai compatible APIs.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform. NOTE: The chat_completion task type only supports streaming and only through the _stream API.

    Values are chat_completion, completion, or text_embedding.

  • openai_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is openai.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key of your OpenAI account. You can find your OpenAI API keys in your OpenAI account under the API keys section.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • The number of dimensions the resulting output embeddings should have. It is supported only in text-embedding-3 and later models. If it is not set, the OpenAI defined default for the model is used.

    • model_id string Required

      The name of the model to use for the inference task. Refer to the OpenAI documentation for the list of available text embedding models.

      External documentation
    • The unique identifier for your organization. You can find the Organization ID in your OpenAI account under Settings > Organizations.

    • Hide rate_limit attribute Show rate_limit attribute object
    • url string

      The URL endpoint to use for the requests. It can be changed for testing purposes.

  • Hide task_settings attribute Show task_settings attribute object
    • user string

      For a completion or text_embedding task, specify the user issuing the request. This information can be used for abuse detection.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{openai_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{openai_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"openai\",\n    \"service_settings\": {\n        \"api_key\": \"OpenAI-API-Key\",\n        \"model_id\": \"text-embedding-3-small\",\n        \"dimensions\": 128\n    }\n}"'
Request examples
Run `PUT _inference/text_embedding/openai-embeddings` to create an inference endpoint that performs a `text_embedding` task. The embeddings created by requests to this endpoint will have 128 dimensions.
{
    "service": "openai",
    "service_settings": {
        "api_key": "OpenAI-API-Key",
        "model_id": "text-embedding-3-small",
        "dimensions": 128
    }
}
Run `PUT _inference/completion/amazon_bedrock_completion` to create an inference endpoint to perform a completion task.
{
    "service": "amazonbedrock",
    "service_settings": {
        "access_key": "AWS-access-key",
        "secret_key": "AWS-secret-key",
        "region": "us-east-1",
        "provider": "amazontitan",
        "model": "amazon.titan-text-premier-v1:0"
    }
}

Create a VoyageAI inference endpoint Added in 8.19.0

PUT /_inference/{task_type}/{voyageai_inference_id}

Create an inference endpoint to perform an inference task with the voyageai service.

Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Values are text_embedding or rerank.

  • voyageai_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is voyageai.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • The number of dimensions for resulting output embeddings. This setting maps to output_dimension in the VoyageAI documentation. Only for the text_embedding task type.

      External documentation
    • model_id string Required

      The name of the model to use for the inference task. Refer to the VoyageAI documentation for the list of available text embedding and rerank models.

      External documentation
    • Hide rate_limit attribute Show rate_limit attribute object
    • The data type for the embeddings to be returned. This setting maps to output_dtype in the VoyageAI documentation. Permitted values: float, int8, bit. int8 is a synonym of byte in the VoyageAI documentation. bit is a synonym of binary in the VoyageAI documentation. Only for the text_embedding task type.

      External documentation
  • Hide task_settings attributes Show task_settings attributes object
    • Type of the input text. Permitted values: ingest (maps to document in the VoyageAI documentation), search (maps to query in the VoyageAI documentation). Only for the text_embedding task type.

    • Whether to return the source documents in the response. Only for the rerank task type.

    • top_k number

      The number of most relevant documents to return. If not specified, the reranking results of all documents will be returned. Only for the rerank task type.

    • truncation boolean

      Whether to truncate the input texts to fit within the context length.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{voyageai_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{voyageai_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"voyageai\",\n    \"service_settings\": {\n        \"model_id\": \"voyage-3-large\",\n        \"dimensions\": 512\n    }\n}"'
Request examples
Run `PUT _inference/text_embedding/voyageai-embeddings` to create an inference endpoint that performs a `text_embedding` task. The embeddings created by requests to this endpoint will have 512 dimensions.
{
    "service": "voyageai",
    "service_settings": {
        "model_id": "voyage-3-large",
        "dimensions": 512
    }
}
Run `PUT _inference/rerank/voyageai-rerank` to create an inference endpoint that performs a `rerank` task.
{
    "service": "voyageai",
    "service_settings": {
        "model_id": "rerank-2"
    }
}

Create a Watsonx inference endpoint Added in 8.16.0

PUT /_inference/{task_type}/{watsonx_inference_id}

Create an inference endpoint to perform an inference task with the watsonxai service. You need an IBM Cloud Databases for Elasticsearch deployment to use the watsonxai inference service. You can provision one through the IBM catalog, the Cloud Databases CLI plug-in, the Cloud Databases API, or Terraform.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The task type. The only valid task type for the model to perform is text_embedding.

    Value is text_embedding.

  • watsonx_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • service string Required

    Value is watsonxai.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key of your Watsonx account. You can find your Watsonx API keys or you can create a new one on the API keys page.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • api_version string Required

      A version parameter that takes a version date in the format of YYYY-MM-DD. For the active version data parameters, refer to the Wastonx documentation.

      External documentation
    • model_id string Required

      The name of the model to use for the inference task. Refer to the IBM Embedding Models section in the Watsonx documentation for the list of available text embedding models.

      External documentation
    • project_id string Required

      The identifier of the IBM Cloud project to use for the inference task.

    • Hide rate_limit attribute Show rate_limit attribute object
    • url string Required

      The URL of the inference endpoint that you created on Watsonx.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{watsonx_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{watsonx_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"service\": \"watsonxai\",\n  \"service_settings\": {\n      \"api_key\": \"Watsonx-API-Key\", \n      \"url\": \"Wastonx-URL\", \n      \"model_id\": \"ibm/slate-30m-english-rtrvr\",\n      \"project_id\": \"IBM-Cloud-ID\", \n      \"api_version\": \"2024-03-14\"\n  }\n}"'
Request example
Run `PUT _inference/text_embedding/watsonx-embeddings` to create an Watonsx inference endpoint that performs a text embedding task.
{
  "service": "watsonxai",
  "service_settings": {
      "api_key": "Watsonx-API-Key", 
      "url": "Wastonx-URL", 
      "model_id": "ibm/slate-30m-english-rtrvr",
      "project_id": "IBM-Cloud-ID", 
      "api_version": "2024-03-14"
  }
}

Perform rereanking inference on the service Added in 8.11.0

POST /_inference/rerank/{inference_id}

Path parameters

  • inference_id string Required

    The unique identifier for the inference endpoint.

Query parameters

  • timeout string

    The amount of time to wait for the inference request to complete.

application/json

Body

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
POST /_inference/rerank/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/rerank/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"input\": [\"luke\", \"like\", \"leia\", \"chewy\",\"r2d2\", \"star\", \"wars\"],\n  \"query\": \"star wars main character\"\n}"'
Request example
Run `POST _inference/rerank/cohere_rerank` to perform reranking on the example input.
{
  "input": ["luke", "like", "leia", "chewy","r2d2", "star", "wars"],
  "query": "star wars main character"
}
Response examples (200)
A successful response from `POST _inference/rerank/cohere_rerank`.
{
  "rerank": [
    {
      "index": "2",
      "relevance_score": "0.011597361",
      "text": "leia"
    },
    {
      "index": "0",
      "relevance_score": "0.006338922",
      "text": "luke"
    },
    {
      "index": "5",
      "relevance_score": "0.0016166499",
      "text": "star"
    },
    {
      "index": "4",
      "relevance_score": "0.0011695103",
      "text": "r2d2"
    },
    {
      "index": "1",
      "relevance_score": "5.614787E-4",
      "text": "like"
    },
    {
      "index": "6",
      "relevance_score": "3.7850367E-4",
      "text": "wars"
    },
    {
      "index": "3",
      "relevance_score": "1.2508839E-5",
      "text": "chewy"
    }
  ]
}

Perform sparse embedding inference on the service Added in 8.11.0

POST /_inference/sparse_embedding/{inference_id}

Path parameters

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference request to complete.

application/json

Body

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • sparse_embedding array[object] Required
      Hide sparse_embedding attribute Show sparse_embedding attribute object
      • embedding object Required

        Sparse Embedding tokens are represented as a dictionary of string to double.

        Hide embedding attribute Show embedding attribute object
        • * number Additional properties
POST /_inference/sparse_embedding/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/sparse_embedding/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"input\": \"The sky above the port was the color of television tuned to a dead channel.\"\n}"'
Request example
Run `POST _inference/sparse_embedding/my-elser-model` to perform sparse embedding on the example sentence.
{
  "input": "The sky above the port was the color of television tuned to a dead channel."
}
Response examples (200)
An abbreviated response from `POST _inference/sparse_embedding/my-elser-model`.
{
  "sparse_embedding": [
    {
      "port": 2.1259406,
      "sky": 1.7073475,
      "color": 1.6922266,
      "dead": 1.6247464,
      "television": 1.3525393,
      "above": 1.2425821,
      "tuned": 1.1440028,
      "colors": 1.1218185,
      "tv": 1.0111054,
      "ports": 1.0067928,
      "poem": 1.0042328,
      "channel": 0.99471164,
      "tune": 0.96235967,
      "scene": 0.9020516
    }
  ]
}

Perform text embedding inference on the service Added in 8.11.0

POST /_inference/text_embedding/{inference_id}

Path parameters

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference request to complete.

application/json

Body

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide text_embedding_bytes attribute Show text_embedding_bytes attribute object
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding_bits array[object]
      Hide text_embedding_bits attribute Show text_embedding_bits attribute object
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding array[object]
      Hide text_embedding attribute Show text_embedding attribute object
      • embedding array[number] Required

        Text Embedding results are represented as Dense Vectors of floats.

POST /_inference/text_embedding/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/text_embedding/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"input\": \"The sky above the port was the color of television tuned to a dead channel.\",\n  \"task_settings\": {\n    \"input_type\": \"ingest\"\n  }\n}"'
Request example
Run `POST _inference/text_embedding/my-cohere-endpoint` to perform text embedding on the example sentence using the Cohere integration,
{
  "input": "The sky above the port was the color of television tuned to a dead channel.",
  "task_settings": {
    "input_type": "ingest"
  }
}
Response examples (200)
An abbreviated response from `POST _inference/text_embedding/my-cohere-endpoint`.
{
  "text_embedding": [
    {
      "embedding": [
        {
          0.018569946,
          -0.036895752,
          0.01486969,
          -0.0045204163,
          -0.04385376,
          0.0075950623,
          0.04260254,
          -0.004005432,
          0.007865906,
          0.030792236,
          -0.050476074,
          0.011795044,
          -0.011642456,
          -0.010070801
        }
      ]
    }
  ]
}

Get cluster info

GET /

Get basic build, version, and cluster information.

Responses

GET /
curl \
 --request GET 'http://api.example.com/' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /`s.
{
  "name": "instance-0000000000",
  "cluster_name": "my_test_cluster",
  "cluster_uuid": "5QaxoN0pRZuOmWSxstBBwQ",
  "version": {
    "build_date": "2024-02-01T13:07:13.727175297Z",
    "minimum_wire_compatibility_version": "7.17.0",
    "build_hash": "6185ba65d27469afabc9bc951cded6c17c21e3f3",
    "number": "8.12.1",
    "lucene_version": "9.9.2",
    "minimum_index_compatibility_version": "7.0.0",
    "build_flavor": "default",
    "build_snapshot": false,
    "build_type": "docker"
  },
  "tagline": "You Know, for Search"
}

Ingest

Ingest APIs enable you to manage tasks and resources related to ingest pipelines and processors.

Get pipelines Added in 5.0.0

GET /_ingest/pipeline/{id}

Get information about one or more ingest pipelines. This API returns a local reference of the pipeline.

External documentation

Path parameters

  • id string Required

    Comma-separated list of pipeline IDs to retrieve. Wildcard (*) expressions are supported. To get all ingest pipelines, omit this parameter or use *.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • summary boolean

    Return pipelines without their definitions (default: false)

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • Description of the ingest pipeline.

      • on_failure array[object]

        Processors to run immediately after a processor failure.

        Hide on_failure attributes Show on_failure attributes object
        • append object
          Hide append attributes Show append attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If false, the processor does not append values already present in the field.

        • Hide attachment attributes Show attachment attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true, the binary field will be removed from the document

          • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

        • bytes object
          Hide bytes attributes Show bytes attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • circle object
          Hide circle attributes Show circle attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • error_distance number Required

            The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • shape_type string Required

            Values are geo_shape or shape.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide community_id attributes Show community_id attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • seed number

            Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • convert object
          Hide convert attributes Show convert attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • type string Required

            Values are integer, long, double, float, boolean, ip, string, or auto.

        • csv object
          Hide csv attributes Show csv attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • quote string

            Quote used in CSV, has to be single character string.

          • Separator used in CSV, has to be single character string.

          • target_fields string | array[string] Required
          • trim boolean

            Trim whitespaces in unquoted fields.

        • date object
          Hide date attributes Show date attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • formats array[string] Required

            An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

          • locale string

            The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • timezone string

            The timezone to use when parsing the date. Supports template snippets.

          • The format to use when writing the date to target_field. Must be a valid java time pattern.

        • Hide date_index_name attributes Show date_index_name attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • date_formats array[string]

            An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

          • date_rounding string Required

            How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

          • A prefix of the index name to be prepended before the printed date. Supports template snippets.

          • locale string

            The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

          • timezone string

            The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

        • dissect object
          Hide dissect attributes Show dissect attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The character(s) that separate the appended fields.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • pattern string Required

            The pattern to apply to the field.

        • Hide dot_expander attributes Show dot_expander attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • override boolean

            Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

          • path string

            The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

        • drop object
          Hide drop attributes Show drop attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

        • enrich object
          Hide enrich attributes Show enrich attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

          • override boolean

            If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

          • policy_name string Required

            The name of the enrich policy to use.

          • Values are intersects, disjoint, within, or contains.

          • target_field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • fail object
          Hide fail attributes Show fail attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • message string Required

            The error message thrown by the processor. Supports template snippets.

        • Hide fingerprint attributes Show fingerprint attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • fields string | array[string] Required
          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • salt string

            Salt value for the hash function.

          • method string

            Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

          • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

        • foreach object
          Hide foreach attributes Show foreach attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true, the processor silently exits without changing the document if the field is null or missing.

          • processor object Required
        • Hide ip_location attributes Show ip_location attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • first_only boolean

            If true, only the first found IP location data will be returned, even if the field contains an array.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • properties array[string]

            Controls what properties are added to the target_field based on the IP location lookup.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

        • geo_grid object
          Hide geo_grid attributes Show geo_grid attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            The field to interpret as a geo-tile.= The field format is determined by the tile_type.

          • tile_type string Required

            Values are geotile, geohex, or geohash.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Values are geojson or wkt.

        • geoip object
          Hide geoip attributes Show geoip attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • first_only boolean

            If true, only the first found geoip data will be returned, even if the field contains an array.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • properties array[string]

            Controls what properties are added to the target_field based on the geoip lookup.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

        • grok object
          Hide grok attributes Show grok attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          • patterns array[string] Required

            An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

          • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

        • gsub object
          Hide gsub attributes Show gsub attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • pattern string Required

            The pattern to be replaced.

          • replacement string Required

            The string to replace the matching patterns with.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide html_strip attributes Show html_strip attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document,

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide inference attributes Show inference attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • model_id string Required
          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

        • join object
          Hide join attributes Show join attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • separator string Required

            The separator character.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • json object
          Hide json attributes Show json attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

          • Values are replace or merge.

          • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • kv object
          Hide kv attributes Show kv attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • exclude_keys array[string]

            List of keys to exclude from document.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • field_split string Required

            Regex pattern to use for splitting key-value pairs.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • include_keys array[string]

            List of keys to filter and insert into document. Defaults to including all keys.

          • prefix string

            Prefix to be added to extracted keys.

          • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • trim_key string

            String of characters to trim from extracted keys.

          • String of characters to trim from extracted values.

          • value_split string Required

            Regex pattern to use for splitting the key from the value within a key-value pair.

        • Hide lowercase attributes Show lowercase attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide network_direction attributes Show network_direction attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • internal_networks array[string]

            List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • pipeline object
          Hide pipeline attributes Show pipeline attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • name string Required
          • Whether to ignore missing pipelines instead of failing.

        • redact object
          Hide redact attributes Show redact attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • patterns array[string] Required

            A list of grok expressions to match and redact named captures with

          • prefix string

            Start a redacted section with this token

          • suffix string

            End a redacted section with this token

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

          • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

        • Hide registered_domain attributes Show registered_domain attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • remove object
          Hide remove attributes Show remove attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string | array[string] Required
          • keep string | array[string]
          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • rename object
          Hide rename attributes Show rename attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • target_field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • reroute object
          Hide reroute attributes Show reroute attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • script object
          Hide script attributes Show script attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • id string
          • params object

            Object containing parameters for the script.

        • set object
          Hide set attributes Show set attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

          • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

          • override boolean

            If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

          • value object

            The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

        • Hide set_security_user attributes Show set_security_user attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Controls what user related properties are added to the field.

        • sort object
          Hide sort attributes Show sort attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • order string

            Values are asc or desc.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • split object
          Hide split attributes Show split attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Preserves empty trailing fields, if any.

          • separator string Required

            A regex which matches the separator, for example, , or \s+.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide terminate attributes Show terminate attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

        • trim object
          Hide trim attributes Show trim attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide uppercase attributes Show uppercase attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide urldecode attributes Show urldecode attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide uri_parts attributes Show uri_parts attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • If true, the processor copies the unparsed URI to <target_field>.original.

          • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide user_agent attributes Show user_agent attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Controls what properties are added to target_field.

            Values are name, os, device, original, or version.

          • Extracts device type from the user agent string on a best-effort basis.

      • processors array[object]

        Processors used to perform transformations on documents before indexing. Processors run sequentially in the order specified.

        Hide processors attributes Show processors attributes object
        • append object
          Hide append attributes Show append attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If false, the processor does not append values already present in the field.

        • Hide attachment attributes Show attachment attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true, the binary field will be removed from the document

          • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

        • bytes object
          Hide bytes attributes Show bytes attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • circle object
          Hide circle attributes Show circle attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • error_distance number Required

            The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • shape_type string Required

            Values are geo_shape or shape.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide community_id attributes Show community_id attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • seed number

            Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • convert object
          Hide convert attributes Show convert attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • type string Required

            Values are integer, long, double, float, boolean, ip, string, or auto.

        • csv object
          Hide csv attributes Show csv attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • quote string

            Quote used in CSV, has to be single character string.

          • Separator used in CSV, has to be single character string.

          • target_fields string | array[string] Required
          • trim boolean

            Trim whitespaces in unquoted fields.

        • date object
          Hide date attributes Show date attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • formats array[string] Required

            An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

          • locale string

            The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • timezone string

            The timezone to use when parsing the date. Supports template snippets.

          • The format to use when writing the date to target_field. Must be a valid java time pattern.

        • Hide date_index_name attributes Show date_index_name attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • date_formats array[string]

            An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

          • date_rounding string Required

            How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

          • A prefix of the index name to be prepended before the printed date. Supports template snippets.

          • locale string

            The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

          • timezone string

            The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

        • dissect object
          Hide dissect attributes Show dissect attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The character(s) that separate the appended fields.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • pattern string Required

            The pattern to apply to the field.

        • Hide dot_expander attributes Show dot_expander attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • override boolean

            Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

          • path string

            The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

        • drop object
          Hide drop attributes Show drop attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

        • enrich object
          Hide enrich attributes Show enrich attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

          • override boolean

            If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

          • policy_name string Required

            The name of the enrich policy to use.

          • Values are intersects, disjoint, within, or contains.

          • target_field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • fail object
          Hide fail attributes Show fail attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • message string Required

            The error message thrown by the processor. Supports template snippets.

        • Hide fingerprint attributes Show fingerprint attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • fields string | array[string] Required
          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • salt string

            Salt value for the hash function.

          • method string

            Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

          • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

        • foreach object
          Hide foreach attributes Show foreach attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true, the processor silently exits without changing the document if the field is null or missing.

          • processor object Required
        • Hide ip_location attributes Show ip_location attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • first_only boolean

            If true, only the first found IP location data will be returned, even if the field contains an array.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • properties array[string]

            Controls what properties are added to the target_field based on the IP location lookup.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

        • geo_grid object
          Hide geo_grid attributes Show geo_grid attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            The field to interpret as a geo-tile.= The field format is determined by the tile_type.

          • tile_type string Required

            Values are geotile, geohex, or geohash.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Values are geojson or wkt.

        • geoip object
          Hide geoip attributes Show geoip attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • first_only boolean

            If true, only the first found geoip data will be returned, even if the field contains an array.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • properties array[string]

            Controls what properties are added to the target_field based on the geoip lookup.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

        • grok object
          Hide grok attributes Show grok attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          • patterns array[string] Required

            An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

          • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

        • gsub object
          Hide gsub attributes Show gsub attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • pattern string Required

            The pattern to be replaced.

          • replacement string Required

            The string to replace the matching patterns with.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide html_strip attributes Show html_strip attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document,

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide inference attributes Show inference attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • model_id string Required
          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

        • join object
          Hide join attributes Show join attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • separator string Required

            The separator character.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • json object
          Hide json attributes Show json attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

          • Values are replace or merge.

          • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • kv object
          Hide kv attributes Show kv attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • exclude_keys array[string]

            List of keys to exclude from document.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • field_split string Required

            Regex pattern to use for splitting key-value pairs.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • include_keys array[string]

            List of keys to filter and insert into document. Defaults to including all keys.

          • prefix string

            Prefix to be added to extracted keys.

          • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • trim_key string

            String of characters to trim from extracted keys.

          • String of characters to trim from extracted values.

          • value_split string Required

            Regex pattern to use for splitting the key from the value within a key-value pair.

        • Hide lowercase attributes Show lowercase attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide network_direction attributes Show network_direction attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • internal_networks array[string]

            List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • pipeline object
          Hide pipeline attributes Show pipeline attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • name string Required
          • Whether to ignore missing pipelines instead of failing.

        • redact object
          Hide redact attributes Show redact attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • patterns array[string] Required

            A list of grok expressions to match and redact named captures with

          • prefix string

            Start a redacted section with this token

          • suffix string

            End a redacted section with this token

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

          • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

        • Hide registered_domain attributes Show registered_domain attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • remove object
          Hide remove attributes Show remove attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string | array[string] Required
          • keep string | array[string]
          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • rename object
          Hide rename attributes Show rename attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • target_field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • reroute object
          Hide reroute attributes Show reroute attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • script object
          Hide script attributes Show script attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • id string
          • params object

            Object containing parameters for the script.

        • set object
          Hide set attributes Show set attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

          • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

          • override boolean

            If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

          • value object

            The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

        • Hide set_security_user attributes Show set_security_user attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Controls what user related properties are added to the field.

        • sort object
          Hide sort attributes Show sort attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • order string

            Values are asc or desc.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • split object
          Hide split attributes Show split attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Preserves empty trailing fields, if any.

          • separator string Required

            A regex which matches the separator, for example, , or \s+.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide terminate attributes Show terminate attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

        • trim object
          Hide trim attributes Show trim attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide uppercase attributes Show uppercase attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide urldecode attributes Show urldecode attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide uri_parts attributes Show uri_parts attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • If true, the processor copies the unparsed URI to <target_field>.original.

          • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide user_agent attributes Show user_agent attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Controls what properties are added to target_field.

            Values are name, os, device, original, or version.

          • Extracts device type from the user agent string on a best-effort basis.

      • version number
      • deprecated boolean

        Marks this ingest pipeline as deprecated. When a deprecated ingest pipeline is referenced as the default or final pipeline when creating or updating a non-deprecated index template, Elasticsearch will emit a deprecation warning.

      • _meta object
        Hide _meta attribute Show _meta attribute object
        • * object Additional properties
GET /_ingest/pipeline/{id}
curl \
 --request GET 'http://api.example.com/_ingest/pipeline/{id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response for retrieving information about an ingest pipeline.
{
  "my-pipeline-id" : {
    "description" : "describe pipeline",
    "version" : 123,
    "processors" : [
      {
        "set" : {
          "field" : "foo",
          "value" : "bar"
        }
      }
    ]
  }
}

Create or update a pipeline Added in 5.0.0

PUT /_ingest/pipeline/{id}

Changes made using this API take effect immediately.

External documentation

Path parameters

  • id string Required

    ID of the ingest pipeline to create or update.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

  • Required version for optimistic concurrency control for pipeline updates

application/json

Body Required

  • _meta object
    Hide _meta attribute Show _meta attribute object
    • * object Additional properties
  • Description of the ingest pipeline.

  • on_failure array[object]

    Processors to run immediately after a processor failure. Each processor supports a processor-level on_failure value. If a processor without an on_failure value fails, Elasticsearch uses this pipeline-level parameter as a fallback. The processors in this parameter run sequentially in the order specified. Elasticsearch will not attempt to run the pipeline's remaining processors.

    Hide on_failure attributes Show on_failure attributes object
    • append object
      Hide append attributes Show append attributes object
    • Hide attachment attributes Show attachment attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • properties array[string]

        Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true, the binary field will be removed from the document

      • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

    • bytes object
      Hide bytes attributes Show bytes attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • circle object
      Hide circle attributes Show circle attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • error_distance number Required

        The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • shape_type string Required

        Values are geo_shape or shape.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide community_id attributes Show community_id attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • seed number

        Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

      • If true and any required fields are missing, the processor quietly exits without modifying the document.

    • convert object
      Hide convert attributes Show convert attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • type string Required

        Values are integer, long, double, float, boolean, ip, string, or auto.

    • csv object
      Hide csv attributes Show csv attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • quote string

        Quote used in CSV, has to be single character string.

      • Separator used in CSV, has to be single character string.

      • target_fields string | array[string] Required
      • trim boolean

        Trim whitespaces in unquoted fields.

    • date object
      Hide date attributes Show date attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • formats array[string] Required

        An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

      • locale string

        The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • timezone string

        The timezone to use when parsing the date. Supports template snippets.

      • The format to use when writing the date to target_field. Must be a valid java time pattern.

    • Hide date_index_name attributes Show date_index_name attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • date_formats array[string]

        An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

      • date_rounding string Required

        How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

      • A prefix of the index name to be prepended before the printed date. Supports template snippets.

      • locale string

        The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

      • timezone string

        The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

    • dissect object
      Hide dissect attributes Show dissect attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • The character(s) that separate the appended fields.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • pattern string Required

        The pattern to apply to the field.

    • Hide dot_expander attributes Show dot_expander attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • override boolean

        Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

      • path string

        The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

    • drop object
      Hide drop attributes Show drop attributes object
    • enrich object
      Hide enrich attributes Show enrich attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

      • override boolean

        If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

      • policy_name string Required

        The name of the enrich policy to use.

      • Values are intersects, disjoint, within, or contains.

      • target_field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • fail object
      Hide fail attributes Show fail attributes object
    • Hide fingerprint attributes Show fingerprint attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • fields string | array[string] Required
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • salt string

        Salt value for the hash function.

      • method string

        Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

      • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

    • foreach object
      Hide foreach attributes Show foreach attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true, the processor silently exits without changing the document if the field is null or missing.

      • processor object Required
    • Hide ip_location attributes Show ip_location attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • first_only boolean

        If true, only the first found IP location data will be returned, even if the field contains an array.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • properties array[string]

        Controls what properties are added to the target_field based on the IP location lookup.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

    • geo_grid object
      Hide geo_grid attributes Show geo_grid attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        The field to interpret as a geo-tile.= The field format is determined by the tile_type.

      • tile_type string Required

        Values are geotile, geohex, or geohash.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • Values are geojson or wkt.

    • geoip object
      Hide geoip attributes Show geoip attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • first_only boolean

        If true, only the first found geoip data will be returned, even if the field contains an array.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • properties array[string]

        Controls what properties are added to the target_field based on the geoip lookup.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

    • grok object
      Hide grok attributes Show grok attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

        Hide pattern_definitions attribute Show pattern_definitions attribute object
        • * string Additional properties
      • patterns array[string] Required

        An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

      • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

    • gsub object
      Hide gsub attributes Show gsub attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • pattern string Required

        The pattern to be replaced.

      • replacement string Required

        The string to replace the matching patterns with.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide html_strip attributes Show html_strip attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document,

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide inference attributes Show inference attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • model_id string Required
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

        Hide field_map attribute Show field_map attribute object
        • * object Additional properties
      • Hide inference_config attributes Show inference_config attributes object
        • Hide regression attributes Show regression attributes object
        • Hide classification attributes Show classification attributes object
      • input_output object | array[object]

        Input fields for inference and output (destination) fields for the inference results. This option is incompatible with the target_field and field_map options.

        One of:
        Hide attributes Show attributes
      • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

    • join object
      Hide join attributes Show join attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • separator string Required

        The separator character.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • json object
      Hide json attributes Show json attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

      • Values are replace or merge.

      • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • kv object
      Hide kv attributes Show kv attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • exclude_keys array[string]

        List of keys to exclude from document.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • field_split string Required

        Regex pattern to use for splitting key-value pairs.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • include_keys array[string]

        List of keys to filter and insert into document. Defaults to including all keys.

      • prefix string

        Prefix to be added to extracted keys.

      • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • trim_key string

        String of characters to trim from extracted keys.

      • String of characters to trim from extracted values.

      • value_split string Required

        Regex pattern to use for splitting the key from the value within a key-value pair.

    • Hide lowercase attributes Show lowercase attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide network_direction attributes Show network_direction attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • internal_networks array[string]

        List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and any required fields are missing, the processor quietly exits without modifying the document.

    • pipeline object
      Hide pipeline attributes Show pipeline attributes object
    • redact object
      Hide redact attributes Show redact attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • patterns array[string] Required

        A list of grok expressions to match and redact named captures with

      • Hide pattern_definitions attribute Show pattern_definitions attribute object
        • * string Additional properties
      • prefix string

        Start a redacted section with this token

      • suffix string

        End a redacted section with this token

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

      • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

    • Hide registered_domain attributes Show registered_domain attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and any required fields are missing, the processor quietly exits without modifying the document.

    • remove object
      Hide remove attributes Show remove attributes object
    • rename object
      Hide rename attributes Show rename attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • target_field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • reroute object
      Hide reroute attributes Show reroute attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • A static value for the target. Can’t be set when the dataset or namespace option is set.

      • dataset string | array[string]

        Field references or a static value for the dataset part of the data stream name. In addition to the criteria for index names, cannot contain - and must be no longer than 100 characters. Example values are nginx.access and nginx.error.

        Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

        default {{data_stream.dataset}}

      • namespace string | array[string]

        Field references or a static value for the namespace part of the data stream name. See the criteria for index names for allowed characters. Must be no longer than 100 characters.

        Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

        default {{data_stream.namespace}}

    • script object
      Hide script attributes Show script attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • id string
      • lang string

        Any of:

        Values are painless, expression, mustache, or java.

      • params object

        Object containing parameters for the script.

        Hide params attribute Show params attribute object
        • * object Additional properties
      • source string | object

        One of:
    • set object
      Hide set attributes Show set attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

      • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

      • override boolean

        If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

      • value object

        The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

    • Hide set_security_user attributes Show set_security_user attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • properties array[string]

        Controls what user related properties are added to the field.

    • sort object
      Hide sort attributes Show sort attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • order string

        Values are asc or desc.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • split object
      Hide split attributes Show split attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • Preserves empty trailing fields, if any.

      • separator string Required

        A regex which matches the separator, for example, , or \s+.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide terminate attributes Show terminate attributes object
    • trim object
      Hide trim attributes Show trim attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide uppercase attributes Show uppercase attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide urldecode attributes Show urldecode attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide uri_parts attributes Show uri_parts attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • If true, the processor copies the unparsed URI to <target_field>.original.

      • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide user_agent attributes Show user_agent attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • properties array[string]

        Controls what properties are added to target_field.

        Values are name, os, device, original, or version.

      • Extracts device type from the user agent string on a best-effort basis.

  • processors array[object]

    Processors used to perform transformations on documents before indexing. Processors run sequentially in the order specified.

    Hide processors attributes Show processors attributes object
    • append object
      Hide append attributes Show append attributes object
    • Hide attachment attributes Show attachment attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • properties array[string]

        Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true, the binary field will be removed from the document

      • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

    • bytes object
      Hide bytes attributes Show bytes attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • circle object
      Hide circle attributes Show circle attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • error_distance number Required

        The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • shape_type string Required

        Values are geo_shape or shape.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide community_id attributes Show community_id attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • seed number

        Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

      • If true and any required fields are missing, the processor quietly exits without modifying the document.

    • convert object
      Hide convert attributes Show convert attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • type string Required

        Values are integer, long, double, float, boolean, ip, string, or auto.

    • csv object
      Hide csv attributes Show csv attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • quote string

        Quote used in CSV, has to be single character string.

      • Separator used in CSV, has to be single character string.

      • target_fields string | array[string] Required
      • trim boolean

        Trim whitespaces in unquoted fields.

    • date object
      Hide date attributes Show date attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • formats array[string] Required

        An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

      • locale string

        The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • timezone string

        The timezone to use when parsing the date. Supports template snippets.

      • The format to use when writing the date to target_field. Must be a valid java time pattern.

    • Hide date_index_name attributes Show date_index_name attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • date_formats array[string]

        An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

      • date_rounding string Required

        How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

      • A prefix of the index name to be prepended before the printed date. Supports template snippets.

      • locale string

        The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

      • timezone string

        The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

    • dissect object
      Hide dissect attributes Show dissect attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • The character(s) that separate the appended fields.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • pattern string Required

        The pattern to apply to the field.

    • Hide dot_expander attributes Show dot_expander attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • override boolean

        Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

      • path string

        The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

    • drop object
      Hide drop attributes Show drop attributes object
    • enrich object
      Hide enrich attributes Show enrich attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

      • override boolean

        If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

      • policy_name string Required

        The name of the enrich policy to use.

      • Values are intersects, disjoint, within, or contains.

      • target_field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • fail object
      Hide fail attributes Show fail attributes object
    • Hide fingerprint attributes Show fingerprint attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • fields string | array[string] Required
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • salt string

        Salt value for the hash function.

      • method string

        Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

      • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

    • foreach object
      Hide foreach attributes Show foreach attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true, the processor silently exits without changing the document if the field is null or missing.

      • processor object Required
    • Hide ip_location attributes Show ip_location attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • first_only boolean

        If true, only the first found IP location data will be returned, even if the field contains an array.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • properties array[string]

        Controls what properties are added to the target_field based on the IP location lookup.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

    • geo_grid object
      Hide geo_grid attributes Show geo_grid attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        The field to interpret as a geo-tile.= The field format is determined by the tile_type.

      • tile_type string Required

        Values are geotile, geohex, or geohash.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • Values are geojson or wkt.

    • geoip object
      Hide geoip attributes Show geoip attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • first_only boolean

        If true, only the first found geoip data will be returned, even if the field contains an array.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • properties array[string]

        Controls what properties are added to the target_field based on the geoip lookup.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

    • grok object
      Hide grok attributes Show grok attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

        Hide pattern_definitions attribute Show pattern_definitions attribute object
        • * string Additional properties
      • patterns array[string] Required

        An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

      • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

    • gsub object
      Hide gsub attributes Show gsub attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • pattern string Required

        The pattern to be replaced.

      • replacement string Required

        The string to replace the matching patterns with.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide html_strip attributes Show html_strip attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document,

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide inference attributes Show inference attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • model_id string Required
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

        Hide field_map attribute Show field_map attribute object
        • * object Additional properties
      • Hide inference_config attributes Show inference_config attributes object
        • Hide regression attributes Show regression attributes object
        • Hide classification attributes Show classification attributes object
      • input_output object | array[object]

        Input fields for inference and output (destination) fields for the inference results. This option is incompatible with the target_field and field_map options.

        One of:
        Hide attributes Show attributes
      • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

    • join object
      Hide join attributes Show join attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • separator string Required

        The separator character.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • json object
      Hide json attributes Show json attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

      • Values are replace or merge.

      • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • kv object
      Hide kv attributes Show kv attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • exclude_keys array[string]

        List of keys to exclude from document.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • field_split string Required

        Regex pattern to use for splitting key-value pairs.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • include_keys array[string]

        List of keys to filter and insert into document. Defaults to including all keys.

      • prefix string

        Prefix to be added to extracted keys.

      • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • trim_key string

        String of characters to trim from extracted keys.

      • String of characters to trim from extracted values.

      • value_split string Required

        Regex pattern to use for splitting the key from the value within a key-value pair.

    • Hide lowercase attributes Show lowercase attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide network_direction attributes Show network_direction attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • internal_networks array[string]

        List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and any required fields are missing, the processor quietly exits without modifying the document.

    • pipeline object
      Hide pipeline attributes Show pipeline attributes object
    • redact object
      Hide redact attributes Show redact attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • patterns array[string] Required

        A list of grok expressions to match and redact named captures with

      • Hide pattern_definitions attribute Show pattern_definitions attribute object
        • * string Additional properties
      • prefix string

        Start a redacted section with this token

      • suffix string

        End a redacted section with this token

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

      • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

    • Hide registered_domain attributes Show registered_domain attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and any required fields are missing, the processor quietly exits without modifying the document.

    • remove object
      Hide remove attributes Show remove attributes object
    • rename object
      Hide rename attributes Show rename attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • target_field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • reroute object
      Hide reroute attributes Show reroute attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • A static value for the target. Can’t be set when the dataset or namespace option is set.

      • dataset string | array[string]

        Field references or a static value for the dataset part of the data stream name. In addition to the criteria for index names, cannot contain - and must be no longer than 100 characters. Example values are nginx.access and nginx.error.

        Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

        default {{data_stream.dataset}}

      • namespace string | array[string]

        Field references or a static value for the namespace part of the data stream name. See the criteria for index names for allowed characters. Must be no longer than 100 characters.

        Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

        default {{data_stream.namespace}}

    • script object
      Hide script attributes Show script attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • id string
      • lang string

        Any of:

        Values are painless, expression, mustache, or java.

      • params object

        Object containing parameters for the script.

        Hide params attribute Show params attribute object
        • * object Additional properties
      • source string | object

        One of:
    • set object
      Hide set attributes Show set attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

      • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

      • override boolean

        If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

      • value object

        The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

    • Hide set_security_user attributes Show set_security_user attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • properties array[string]

        Controls what user related properties are added to the field.

    • sort object
      Hide sort attributes Show sort attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • order string

        Values are asc or desc.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • split object
      Hide split attributes Show split attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • Preserves empty trailing fields, if any.

      • separator string Required

        A regex which matches the separator, for example, , or \s+.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide terminate attributes Show terminate attributes object
    • trim object
      Hide trim attributes Show trim attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide uppercase attributes Show uppercase attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide urldecode attributes Show urldecode attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide uri_parts attributes Show uri_parts attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • If true, the processor copies the unparsed URI to <target_field>.original.

      • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide user_agent attributes Show user_agent attributes object
      • Description of the processor. Useful for describing the purpose of the processor or its configuration.

      • if object
        Hide if attributes Show if attributes object
      • Ignore failures for the processor.

      • on_failure array[object]

        Handle failures for the processor.

      • tag string

        Identifier for the processor. Useful for debugging and metrics.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If true and field does not exist, the processor quietly exits without modifying the document.

      • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • properties array[string]

        Controls what properties are added to target_field.

        Values are name, os, device, original, or version.

      • Extracts device type from the user agent string on a best-effort basis.

  • version number
  • deprecated boolean

    Marks this ingest pipeline as deprecated. When a deprecated ingest pipeline is referenced as the default or final pipeline when creating or updating a non-deprecated index template, Elasticsearch will emit a deprecation warning.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_ingest/pipeline/{id}
curl \
 --request PUT 'http://api.example.com/_ingest/pipeline/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"description\" : \"My optional pipeline description\",\n  \"processors\" : [\n    {\n      \"set\" : {\n        \"description\" : \"My optional processor description\",\n        \"field\": \"my-keyword-field\",\n        \"value\": \"foo\"\n      }\n    }\n  ]\n}"'
{
  "description" : "My optional pipeline description",
  "processors" : [
    {
      "set" : {
        "description" : "My optional processor description",
        "field": "my-keyword-field",
        "value": "foo"
      }
    }
  ]
}
You can use the `_meta` parameter to add arbitrary metadata to a pipeline.
{
  "description" : "My optional pipeline description",
  "processors" : [
    {
      "set" : {
        "description" : "My optional processor description",
        "field": "my-keyword-field",
        "value": "foo"
      }
    }
  ],
  "_meta": {
    "reason": "set my-keyword-field to foo",
    "serialization": {
      "class": "MyPipeline",
      "id": 10
    }
  }
}

Delete pipelines Added in 5.0.0

DELETE /_ingest/pipeline/{id}

Delete one or more ingest pipelines.

External documentation

Path parameters

  • id string Required

    Pipeline ID or wildcard expression of pipeline IDs used to limit the request. To delete all ingest pipelines in a cluster, use a value of *.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ingest/pipeline/{id}
curl \
 --request DELETE 'http://api.example.com/_ingest/pipeline/{id}' \
 --header "Authorization: $API_KEY"

Get pipelines Added in 5.0.0

GET /_ingest/pipeline

Get information about one or more ingest pipelines. This API returns a local reference of the pipeline.

External documentation

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • summary boolean

    Return pipelines without their definitions (default: false)

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • Description of the ingest pipeline.

      • on_failure array[object]

        Processors to run immediately after a processor failure.

        Hide on_failure attributes Show on_failure attributes object
        • append object
          Hide append attributes Show append attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If false, the processor does not append values already present in the field.

        • Hide attachment attributes Show attachment attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true, the binary field will be removed from the document

          • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

        • bytes object
          Hide bytes attributes Show bytes attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • circle object
          Hide circle attributes Show circle attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • error_distance number Required

            The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • shape_type string Required

            Values are geo_shape or shape.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide community_id attributes Show community_id attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • seed number

            Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • convert object
          Hide convert attributes Show convert attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • type string Required

            Values are integer, long, double, float, boolean, ip, string, or auto.

        • csv object
          Hide csv attributes Show csv attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • quote string

            Quote used in CSV, has to be single character string.

          • Separator used in CSV, has to be single character string.

          • target_fields string | array[string] Required
          • trim boolean

            Trim whitespaces in unquoted fields.

        • date object
          Hide date attributes Show date attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • formats array[string] Required

            An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

          • locale string

            The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • timezone string

            The timezone to use when parsing the date. Supports template snippets.

          • The format to use when writing the date to target_field. Must be a valid java time pattern.

        • Hide date_index_name attributes Show date_index_name attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • date_formats array[string]

            An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

          • date_rounding string Required

            How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

          • A prefix of the index name to be prepended before the printed date. Supports template snippets.

          • locale string

            The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

          • timezone string

            The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

        • dissect object
          Hide dissect attributes Show dissect attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The character(s) that separate the appended fields.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • pattern string Required

            The pattern to apply to the field.

        • Hide dot_expander attributes Show dot_expander attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • override boolean

            Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

          • path string

            The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

        • drop object
          Hide drop attributes Show drop attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

        • enrich object
          Hide enrich attributes Show enrich attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

          • override boolean

            If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

          • policy_name string Required

            The name of the enrich policy to use.

          • Values are intersects, disjoint, within, or contains.

          • target_field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • fail object
          Hide fail attributes Show fail attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • message string Required

            The error message thrown by the processor. Supports template snippets.

        • Hide fingerprint attributes Show fingerprint attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • fields string | array[string] Required
          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • salt string

            Salt value for the hash function.

          • method string

            Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

          • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

        • foreach object
          Hide foreach attributes Show foreach attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true, the processor silently exits without changing the document if the field is null or missing.

          • processor object Required
        • Hide ip_location attributes Show ip_location attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • first_only boolean

            If true, only the first found IP location data will be returned, even if the field contains an array.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • properties array[string]

            Controls what properties are added to the target_field based on the IP location lookup.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

        • geo_grid object
          Hide geo_grid attributes Show geo_grid attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            The field to interpret as a geo-tile.= The field format is determined by the tile_type.

          • tile_type string Required

            Values are geotile, geohex, or geohash.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Values are geojson or wkt.

        • geoip object
          Hide geoip attributes Show geoip attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • first_only boolean

            If true, only the first found geoip data will be returned, even if the field contains an array.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • properties array[string]

            Controls what properties are added to the target_field based on the geoip lookup.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

        • grok object
          Hide grok attributes Show grok attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          • patterns array[string] Required

            An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

          • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

        • gsub object
          Hide gsub attributes Show gsub attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • pattern string Required

            The pattern to be replaced.

          • replacement string Required

            The string to replace the matching patterns with.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide html_strip attributes Show html_strip attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document,

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide inference attributes Show inference attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • model_id string Required
          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

        • join object
          Hide join attributes Show join attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • separator string Required

            The separator character.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • json object
          Hide json attributes Show json attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

          • Values are replace or merge.

          • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • kv object
          Hide kv attributes Show kv attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • exclude_keys array[string]

            List of keys to exclude from document.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • field_split string Required

            Regex pattern to use for splitting key-value pairs.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • include_keys array[string]

            List of keys to filter and insert into document. Defaults to including all keys.

          • prefix string

            Prefix to be added to extracted keys.

          • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • trim_key string

            String of characters to trim from extracted keys.

          • String of characters to trim from extracted values.

          • value_split string Required

            Regex pattern to use for splitting the key from the value within a key-value pair.

        • Hide lowercase attributes Show lowercase attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide network_direction attributes Show network_direction attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • internal_networks array[string]

            List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • pipeline object
          Hide pipeline attributes Show pipeline attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • name string Required
          • Whether to ignore missing pipelines instead of failing.

        • redact object
          Hide redact attributes Show redact attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • patterns array[string] Required

            A list of grok expressions to match and redact named captures with

          • prefix string

            Start a redacted section with this token

          • suffix string

            End a redacted section with this token

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

          • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

        • Hide registered_domain attributes Show registered_domain attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • remove object
          Hide remove attributes Show remove attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string | array[string] Required
          • keep string | array[string]
          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • rename object
          Hide rename attributes Show rename attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • target_field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • reroute object
          Hide reroute attributes Show reroute attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • script object
          Hide script attributes Show script attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • id string
          • params object

            Object containing parameters for the script.

        • set object
          Hide set attributes Show set attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

          • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

          • override boolean

            If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

          • value object

            The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

        • Hide set_security_user attributes Show set_security_user attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Controls what user related properties are added to the field.

        • sort object
          Hide sort attributes Show sort attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • order string

            Values are asc or desc.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • split object
          Hide split attributes Show split attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Preserves empty trailing fields, if any.

          • separator string Required

            A regex which matches the separator, for example, , or \s+.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide terminate attributes Show terminate attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

        • trim object
          Hide trim attributes Show trim attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide uppercase attributes Show uppercase attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide urldecode attributes Show urldecode attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide uri_parts attributes Show uri_parts attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • If true, the processor copies the unparsed URI to <target_field>.original.

          • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide user_agent attributes Show user_agent attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Controls what properties are added to target_field.

            Values are name, os, device, original, or version.

          • Extracts device type from the user agent string on a best-effort basis.

      • processors array[object]

        Processors used to perform transformations on documents before indexing. Processors run sequentially in the order specified.

        Hide processors attributes Show processors attributes object
        • append object
          Hide append attributes Show append attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If false, the processor does not append values already present in the field.

        • Hide attachment attributes Show attachment attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true, the binary field will be removed from the document

          • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

        • bytes object
          Hide bytes attributes Show bytes attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • circle object
          Hide circle attributes Show circle attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • error_distance number Required

            The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • shape_type string Required

            Values are geo_shape or shape.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide community_id attributes Show community_id attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • seed number

            Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • convert object
          Hide convert attributes Show convert attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • type string Required

            Values are integer, long, double, float, boolean, ip, string, or auto.

        • csv object
          Hide csv attributes Show csv attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • quote string

            Quote used in CSV, has to be single character string.

          • Separator used in CSV, has to be single character string.

          • target_fields string | array[string] Required
          • trim boolean

            Trim whitespaces in unquoted fields.

        • date object
          Hide date attributes Show date attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • formats array[string] Required

            An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

          • locale string

            The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • timezone string

            The timezone to use when parsing the date. Supports template snippets.

          • The format to use when writing the date to target_field. Must be a valid java time pattern.

        • Hide date_index_name attributes Show date_index_name attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • date_formats array[string]

            An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

          • date_rounding string Required

            How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

          • A prefix of the index name to be prepended before the printed date. Supports template snippets.

          • locale string

            The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

          • timezone string

            The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

        • dissect object
          Hide dissect attributes Show dissect attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The character(s) that separate the appended fields.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • pattern string Required

            The pattern to apply to the field.

        • Hide dot_expander attributes Show dot_expander attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • override boolean

            Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

          • path string

            The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

        • drop object
          Hide drop attributes Show drop attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

        • enrich object
          Hide enrich attributes Show enrich attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

          • override boolean

            If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

          • policy_name string Required

            The name of the enrich policy to use.

          • Values are intersects, disjoint, within, or contains.

          • target_field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • fail object
          Hide fail attributes Show fail attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • message string Required

            The error message thrown by the processor. Supports template snippets.

        • Hide fingerprint attributes Show fingerprint attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • fields string | array[string] Required
          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • salt string

            Salt value for the hash function.

          • method string

            Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

          • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

        • foreach object
          Hide foreach attributes Show foreach attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true, the processor silently exits without changing the document if the field is null or missing.

          • processor object Required
        • Hide ip_location attributes Show ip_location attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • first_only boolean

            If true, only the first found IP location data will be returned, even if the field contains an array.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • properties array[string]

            Controls what properties are added to the target_field based on the IP location lookup.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

        • geo_grid object
          Hide geo_grid attributes Show geo_grid attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            The field to interpret as a geo-tile.= The field format is determined by the tile_type.

          • tile_type string Required

            Values are geotile, geohex, or geohash.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Values are geojson or wkt.

        • geoip object
          Hide geoip attributes Show geoip attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • first_only boolean

            If true, only the first found geoip data will be returned, even if the field contains an array.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • properties array[string]

            Controls what properties are added to the target_field based on the geoip lookup.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

        • grok object
          Hide grok attributes Show grok attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          • patterns array[string] Required

            An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

          • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

        • gsub object
          Hide gsub attributes Show gsub attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • pattern string Required

            The pattern to be replaced.

          • replacement string Required

            The string to replace the matching patterns with.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide html_strip attributes Show html_strip attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document,

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide inference attributes Show inference attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • model_id string Required
          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

        • join object
          Hide join attributes Show join attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • separator string Required

            The separator character.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • json object
          Hide json attributes Show json attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

          • Values are replace or merge.

          • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • kv object
          Hide kv attributes Show kv attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • exclude_keys array[string]

            List of keys to exclude from document.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • field_split string Required

            Regex pattern to use for splitting key-value pairs.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • include_keys array[string]

            List of keys to filter and insert into document. Defaults to including all keys.

          • prefix string

            Prefix to be added to extracted keys.

          • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • trim_key string

            String of characters to trim from extracted keys.

          • String of characters to trim from extracted values.

          • value_split string Required

            Regex pattern to use for splitting the key from the value within a key-value pair.

        • Hide lowercase attributes Show lowercase attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide network_direction attributes Show network_direction attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • internal_networks array[string]

            List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • pipeline object
          Hide pipeline attributes Show pipeline attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • name string Required
          • Whether to ignore missing pipelines instead of failing.

        • redact object
          Hide redact attributes Show redact attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • patterns array[string] Required

            A list of grok expressions to match and redact named captures with

          • prefix string

            Start a redacted section with this token

          • suffix string

            End a redacted section with this token

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

          • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

        • Hide registered_domain attributes Show registered_domain attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and any required fields are missing, the processor quietly exits without modifying the document.

        • remove object
          Hide remove attributes Show remove attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string | array[string] Required
          • keep string | array[string]
          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • rename object
          Hide rename attributes Show rename attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • target_field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • reroute object
          Hide reroute attributes Show reroute attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • script object
          Hide script attributes Show script attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • id string
          • params object

            Object containing parameters for the script.

        • set object
          Hide set attributes Show set attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

          • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

          • override boolean

            If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

          • value object

            The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

        • Hide set_security_user attributes Show set_security_user attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Controls what user related properties are added to the field.

        • sort object
          Hide sort attributes Show sort attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • order string

            Values are asc or desc.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • split object
          Hide split attributes Show split attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Preserves empty trailing fields, if any.

          • separator string Required

            A regex which matches the separator, for example, , or \s+.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide terminate attributes Show terminate attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

        • trim object
          Hide trim attributes Show trim attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide uppercase attributes Show uppercase attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide urldecode attributes Show urldecode attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist or is null, the processor quietly exits without modifying the document.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide uri_parts attributes Show uri_parts attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • If true, the processor copies the unparsed URI to <target_field>.original.

          • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Hide user_agent attributes Show user_agent attributes object
          • Description of the processor. Useful for describing the purpose of the processor or its configuration.

          • if object
          • Ignore failures for the processor.

          • on_failure array[object]

            Handle failures for the processor.

          • tag string

            Identifier for the processor. Useful for debugging and metrics.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • If true and field does not exist, the processor quietly exits without modifying the document.

          • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • properties array[string]

            Controls what properties are added to target_field.

            Values are name, os, device, original, or version.

          • Extracts device type from the user agent string on a best-effort basis.

      • version number
      • deprecated boolean

        Marks this ingest pipeline as deprecated. When a deprecated ingest pipeline is referenced as the default or final pipeline when creating or updating a non-deprecated index template, Elasticsearch will emit a deprecation warning.

      • _meta object
        Hide _meta attribute Show _meta attribute object
        • * object Additional properties
GET /_ingest/pipeline
curl \
 --request GET 'http://api.example.com/_ingest/pipeline' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response for retrieving information about an ingest pipeline.
{
  "my-pipeline-id" : {
    "description" : "describe pipeline",
    "version" : 123,
    "processors" : [
      {
        "set" : {
          "field" : "foo",
          "value" : "bar"
        }
      }
    ]
  }
}

Run a grok processor Added in 6.1.0

GET /_ingest/processor/grok

Extract structured fields out of a single text field within a document. You must choose which field to extract matched fields from, as well as the grok pattern you expect will match. A grok pattern is like a regular expression that supports aliased expressions that can be reused.

External documentation

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • patterns object Required
      Hide patterns attribute Show patterns attribute object
      • * string Additional properties
GET /_ingest/processor/grok
curl \
 --request GET 'http://api.example.com/_ingest/processor/grok' \
 --header "Authorization: $API_KEY"

Simulate a pipeline Added in 5.0.0

GET /_ingest/pipeline/_simulate

Run an ingest pipeline against a set of provided documents. You can either specify an existing pipeline to use with the provided documents or supply a pipeline definition in the body of the request.

Query parameters

  • verbose boolean

    If true, the response includes output data for each processor in the executed pipeline.

application/json

Body Required

  • docs array[object] Required

    Sample documents to test in the pipeline.

    Hide docs attributes Show docs attributes object
  • pipeline object Additional properties
    Hide pipeline attributes Show pipeline attributes object
    • Description of the ingest pipeline.

    • on_failure array[object]

      Processors to run immediately after a processor failure.

      Hide on_failure attributes Show on_failure attributes object
      • append object
        Hide append attributes Show append attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • value object | array[object] Required

          The value to be appended. Supports template snippets.

        • If false, the processor does not append values already present in the field.

      • Hide attachment attributes Show attachment attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the binary field will be removed from the document

        • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

      • bytes object
        Hide bytes attributes Show bytes attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • circle object
        Hide circle attributes Show circle attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • error_distance number Required

          The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • shape_type string Required

          Values are geo_shape or shape.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide community_id attributes Show community_id attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • seed number

          Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • convert object
        Hide convert attributes Show convert attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • type string Required

          Values are integer, long, double, float, boolean, ip, string, or auto.

      • csv object
        Hide csv attributes Show csv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • quote string

          Quote used in CSV, has to be single character string.

        • Separator used in CSV, has to be single character string.

        • target_fields string | array[string] Required
        • trim boolean

          Trim whitespaces in unquoted fields.

      • date object
        Hide date attributes Show date attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • formats array[string] Required

          An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • locale string

          The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • timezone string

          The timezone to use when parsing the date. Supports template snippets.

        • The format to use when writing the date to target_field. Must be a valid java time pattern.

      • Hide date_index_name attributes Show date_index_name attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • date_formats array[string]

          An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • date_rounding string Required

          How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

        • A prefix of the index name to be prepended before the printed date. Supports template snippets.

        • locale string

          The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

        • timezone string

          The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

      • dissect object
        Hide dissect attributes Show dissect attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The character(s) that separate the appended fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to apply to the field.

      • Hide dot_expander attributes Show dot_expander attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • override boolean

          Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

        • path string

          The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

      • drop object
        Hide drop attributes Show drop attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • enrich object
        Hide enrich attributes Show enrich attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

        • override boolean

          If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • policy_name string Required

          The name of the enrich policy to use.

        • Values are intersects, disjoint, within, or contains.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • fail object
        Hide fail attributes Show fail attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • message string Required

          The error message thrown by the processor. Supports template snippets.

      • Hide fingerprint attributes Show fingerprint attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • fields string | array[string] Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • salt string

          Salt value for the hash function.

        • method string

          Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

        • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

      • foreach object
        Hide foreach attributes Show foreach attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the processor silently exits without changing the document if the field is null or missing.

        • processor object Required
      • Hide ip_location attributes Show ip_location attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found IP location data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the IP location lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • geo_grid object
        Hide geo_grid attributes Show geo_grid attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          The field to interpret as a geo-tile.= The field format is determined by the tile_type.

        • tile_type string Required

          Values are geotile, geohex, or geohash.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Values are geojson or wkt.

      • geoip object
        Hide geoip attributes Show geoip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found geoip data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the geoip lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • grok object
        Hide grok attributes Show grok attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • patterns array[string] Required

          An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

        • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

      • gsub object
        Hide gsub attributes Show gsub attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to be replaced.

        • replacement string Required

          The string to replace the matching patterns with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide html_strip attributes Show html_strip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document,

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide inference attributes Show inference attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • model_id string Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          Hide field_map attribute Show field_map attribute object
          • * object Additional properties
        • Hide inference_config attributes Show inference_config attributes object
        • input_output object | array[object]

          Input fields for inference and output (destination) fields for the inference results. This option is incompatible with the target_field and field_map options.

        • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

      • join object
        Hide join attributes Show join attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • separator string Required

          The separator character.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • json object
        Hide json attributes Show json attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

        • Values are replace or merge.

        • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • kv object
        Hide kv attributes Show kv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • exclude_keys array[string]

          List of keys to exclude from document.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field_split string Required

          Regex pattern to use for splitting key-value pairs.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • include_keys array[string]

          List of keys to filter and insert into document. Defaults to including all keys.

        • prefix string

          Prefix to be added to extracted keys.

        • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • trim_key string

          String of characters to trim from extracted keys.

        • String of characters to trim from extracted values.

        • value_split string Required

          Regex pattern to use for splitting the key from the value within a key-value pair.

      • Hide lowercase attributes Show lowercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide network_direction attributes Show network_direction attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • internal_networks array[string]

          List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • pipeline object
        Hide pipeline attributes Show pipeline attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • name string Required
        • Whether to ignore missing pipelines instead of failing.

      • redact object
        Hide redact attributes Show redact attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • patterns array[string] Required

          A list of grok expressions to match and redact named captures with

        • Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • prefix string

          Start a redacted section with this token

        • suffix string

          End a redacted section with this token

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

        • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

      • Hide registered_domain attributes Show registered_domain attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • remove object
        Hide remove attributes Show remove attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string | array[string] Required
        • keep string | array[string]
        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • rename object
        Hide rename attributes Show rename attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • reroute object
        Hide reroute attributes Show reroute attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • dataset string | array[string]

          Field references or a static value for the dataset part of the data stream name. In addition to the criteria for index names, cannot contain - and must be no longer than 100 characters. Example values are nginx.access and nginx.error.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.dataset}}

        • namespace string | array[string]

          Field references or a static value for the namespace part of the data stream name. See the criteria for index names for allowed characters. Must be no longer than 100 characters.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.namespace}}

      • script object
        Hide script attributes Show script attributes object
      • set object
        Hide set attributes Show set attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

        • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

        • override boolean

          If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • value object

          The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

      • Hide set_security_user attributes Show set_security_user attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what user related properties are added to the field.

      • sort object
        Hide sort attributes Show sort attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • order string

          Values are asc or desc.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • split object
        Hide split attributes Show split attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Preserves empty trailing fields, if any.

        • separator string Required

          A regex which matches the separator, for example, , or \s+.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide terminate attributes Show terminate attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • trim object
        Hide trim attributes Show trim attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uppercase attributes Show uppercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide urldecode attributes Show urldecode attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uri_parts attributes Show uri_parts attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • If true, the processor copies the unparsed URI to <target_field>.original.

        • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide user_agent attributes Show user_agent attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what properties are added to target_field.

          Values are name, os, device, original, or version.

        • Extracts device type from the user agent string on a best-effort basis.

    • processors array[object]

      Processors used to perform transformations on documents before indexing. Processors run sequentially in the order specified.

      Hide processors attributes Show processors attributes object
      • append object
        Hide append attributes Show append attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • value object | array[object] Required

          The value to be appended. Supports template snippets.

        • If false, the processor does not append values already present in the field.

      • Hide attachment attributes Show attachment attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the binary field will be removed from the document

        • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

      • bytes object
        Hide bytes attributes Show bytes attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • circle object
        Hide circle attributes Show circle attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • error_distance number Required

          The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • shape_type string Required

          Values are geo_shape or shape.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide community_id attributes Show community_id attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • seed number

          Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • convert object
        Hide convert attributes Show convert attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • type string Required

          Values are integer, long, double, float, boolean, ip, string, or auto.

      • csv object
        Hide csv attributes Show csv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • quote string

          Quote used in CSV, has to be single character string.

        • Separator used in CSV, has to be single character string.

        • target_fields string | array[string] Required
        • trim boolean

          Trim whitespaces in unquoted fields.

      • date object
        Hide date attributes Show date attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • formats array[string] Required

          An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • locale string

          The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • timezone string

          The timezone to use when parsing the date. Supports template snippets.

        • The format to use when writing the date to target_field. Must be a valid java time pattern.

      • Hide date_index_name attributes Show date_index_name attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • date_formats array[string]

          An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • date_rounding string Required

          How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

        • A prefix of the index name to be prepended before the printed date. Supports template snippets.

        • locale string

          The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

        • timezone string

          The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

      • dissect object
        Hide dissect attributes Show dissect attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The character(s) that separate the appended fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to apply to the field.

      • Hide dot_expander attributes Show dot_expander attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • override boolean

          Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

        • path string

          The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

      • drop object
        Hide drop attributes Show drop attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • enrich object
        Hide enrich attributes Show enrich attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

        • override boolean

          If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • policy_name string Required

          The name of the enrich policy to use.

        • Values are intersects, disjoint, within, or contains.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • fail object
        Hide fail attributes Show fail attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • message string Required

          The error message thrown by the processor. Supports template snippets.

      • Hide fingerprint attributes Show fingerprint attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • fields string | array[string] Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • salt string

          Salt value for the hash function.

        • method string

          Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

        • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

      • foreach object
        Hide foreach attributes Show foreach attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the processor silently exits without changing the document if the field is null or missing.

        • processor object Required
      • Hide ip_location attributes Show ip_location attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found IP location data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the IP location lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • geo_grid object
        Hide geo_grid attributes Show geo_grid attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          The field to interpret as a geo-tile.= The field format is determined by the tile_type.

        • tile_type string Required

          Values are geotile, geohex, or geohash.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Values are geojson or wkt.

      • geoip object
        Hide geoip attributes Show geoip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found geoip data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the geoip lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • grok object
        Hide grok attributes Show grok attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • patterns array[string] Required

          An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

        • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

      • gsub object
        Hide gsub attributes Show gsub attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to be replaced.

        • replacement string Required

          The string to replace the matching patterns with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide html_strip attributes Show html_strip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document,

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide inference attributes Show inference attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • model_id string Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          Hide field_map attribute Show field_map attribute object
          • * object Additional properties
        • Hide inference_config attributes Show inference_config attributes object
        • input_output object | array[object]

          Input fields for inference and output (destination) fields for the inference results. This option is incompatible with the target_field and field_map options.

        • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

      • join object
        Hide join attributes Show join attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • separator string Required

          The separator character.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • json object
        Hide json attributes Show json attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

        • Values are replace or merge.

        • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • kv object
        Hide kv attributes Show kv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • exclude_keys array[string]

          List of keys to exclude from document.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field_split string Required

          Regex pattern to use for splitting key-value pairs.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • include_keys array[string]

          List of keys to filter and insert into document. Defaults to including all keys.

        • prefix string

          Prefix to be added to extracted keys.

        • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • trim_key string

          String of characters to trim from extracted keys.

        • String of characters to trim from extracted values.

        • value_split string Required

          Regex pattern to use for splitting the key from the value within a key-value pair.

      • Hide lowercase attributes Show lowercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide network_direction attributes Show network_direction attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • internal_networks array[string]

          List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • pipeline object
        Hide pipeline attributes Show pipeline attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • name string Required
        • Whether to ignore missing pipelines instead of failing.

      • redact object
        Hide redact attributes Show redact attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • patterns array[string] Required

          A list of grok expressions to match and redact named captures with

        • Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • prefix string

          Start a redacted section with this token

        • suffix string

          End a redacted section with this token

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

        • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

      • Hide registered_domain attributes Show registered_domain attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • remove object
        Hide remove attributes Show remove attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string | array[string] Required
        • keep string | array[string]
        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • rename object
        Hide rename attributes Show rename attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • reroute object
        Hide reroute attributes Show reroute attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • dataset string | array[string]

          Field references or a static value for the dataset part of the data stream name. In addition to the criteria for index names, cannot contain - and must be no longer than 100 characters. Example values are nginx.access and nginx.error.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.dataset}}

        • namespace string | array[string]

          Field references or a static value for the namespace part of the data stream name. See the criteria for index names for allowed characters. Must be no longer than 100 characters.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.namespace}}

      • script object
        Hide script attributes Show script attributes object
      • set object
        Hide set attributes Show set attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

        • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

        • override boolean

          If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • value object

          The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

      • Hide set_security_user attributes Show set_security_user attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what user related properties are added to the field.

      • sort object
        Hide sort attributes Show sort attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • order string

          Values are asc or desc.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • split object
        Hide split attributes Show split attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Preserves empty trailing fields, if any.

        • separator string Required

          A regex which matches the separator, for example, , or \s+.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide terminate attributes Show terminate attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • trim object
        Hide trim attributes Show trim attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uppercase attributes Show uppercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide urldecode attributes Show urldecode attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uri_parts attributes Show uri_parts attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • If true, the processor copies the unparsed URI to <target_field>.original.

        • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide user_agent attributes Show user_agent attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what properties are added to target_field.

          Values are name, os, device, original, or version.

        • Extracts device type from the user agent string on a best-effort basis.

    • version number
    • deprecated boolean

      Marks this ingest pipeline as deprecated. When a deprecated ingest pipeline is referenced as the default or final pipeline when creating or updating a non-deprecated index template, Elasticsearch will emit a deprecation warning.

    • _meta object
      Hide _meta attribute Show _meta attribute object
      • * object Additional properties

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • docs array[object] Required
      Hide docs attributes Show docs attributes object
      • doc object
        Hide doc attributes Show doc attributes object
        • _id string Required
        • _index string Required
        • _ingest object Required
          Hide _ingest attributes Show _ingest attributes object
        • _routing string

          Value used to send the document to a specific primary shard.

        • _source object Required

          JSON body for the document.

          Hide _source attribute Show _source attribute object
          • * object Additional properties
        • _version number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Values are internal, external, external_gte, or force.

      • error object
        Hide error attributes Show error attributes object
      • processor_results array[object]
        Hide processor_results attributes Show processor_results attributes object
GET /_ingest/pipeline/_simulate
curl \
 --request GET 'http://api.example.com/_ingest/pipeline/_simulate' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"pipeline\" :\n  {\n    \"description\": \"_description\",\n    \"processors\": [\n      {\n        \"set\" : {\n          \"field\" : \"field2\",\n          \"value\" : \"_value\"\n        }\n      }\n    ]\n  },\n  \"docs\": [\n    {\n      \"_index\": \"index\",\n      \"_id\": \"id\",\n      \"_source\": {\n        \"foo\": \"bar\"\n      }\n    },\n    {\n      \"_index\": \"index\",\n      \"_id\": \"id\",\n      \"_source\": {\n        \"foo\": \"rab\"\n      }\n    }\n  ]\n}"'
Request example
You can specify the used pipeline either in the request body or as a path parameter.
{
  "pipeline" :
  {
    "description": "_description",
    "processors": [
      {
        "set" : {
          "field" : "field2",
          "value" : "_value"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "foo": "rab"
      }
    }
  ]
}
Response examples (200)
A successful response for running an ingest pipeline against a set of provided documents.
{
   "docs": [
      {
         "doc": {
            "_id": "id",
            "_index": "index",
            "_version": "-3",
            "_source": {
               "field2": "_value",
               "foo": "bar"
            },
            "_ingest": {
               "timestamp": "2017-05-04T22:30:03.187Z"
            }
         }
      },
      {
         "doc": {
            "_id": "id",
            "_index": "index",
            "_version": "-3",
            "_source": {
               "field2": "_value",
               "foo": "rab"
            },
            "_ingest": {
               "timestamp": "2017-05-04T22:30:03.188Z"
            }
         }
      }
   ]
}

Simulate a pipeline Added in 5.0.0

POST /_ingest/pipeline/_simulate

Run an ingest pipeline against a set of provided documents. You can either specify an existing pipeline to use with the provided documents or supply a pipeline definition in the body of the request.

Query parameters

  • verbose boolean

    If true, the response includes output data for each processor in the executed pipeline.

application/json

Body Required

  • docs array[object] Required

    Sample documents to test in the pipeline.

    Hide docs attributes Show docs attributes object
  • pipeline object Additional properties
    Hide pipeline attributes Show pipeline attributes object
    • Description of the ingest pipeline.

    • on_failure array[object]

      Processors to run immediately after a processor failure.

      Hide on_failure attributes Show on_failure attributes object
      • append object
        Hide append attributes Show append attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • value object | array[object] Required

          The value to be appended. Supports template snippets.

        • If false, the processor does not append values already present in the field.

      • Hide attachment attributes Show attachment attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the binary field will be removed from the document

        • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

      • bytes object
        Hide bytes attributes Show bytes attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • circle object
        Hide circle attributes Show circle attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • error_distance number Required

          The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • shape_type string Required

          Values are geo_shape or shape.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide community_id attributes Show community_id attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • seed number

          Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • convert object
        Hide convert attributes Show convert attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • type string Required

          Values are integer, long, double, float, boolean, ip, string, or auto.

      • csv object
        Hide csv attributes Show csv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • quote string

          Quote used in CSV, has to be single character string.

        • Separator used in CSV, has to be single character string.

        • target_fields string | array[string] Required
        • trim boolean

          Trim whitespaces in unquoted fields.

      • date object
        Hide date attributes Show date attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • formats array[string] Required

          An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • locale string

          The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • timezone string

          The timezone to use when parsing the date. Supports template snippets.

        • The format to use when writing the date to target_field. Must be a valid java time pattern.

      • Hide date_index_name attributes Show date_index_name attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • date_formats array[string]

          An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • date_rounding string Required

          How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

        • A prefix of the index name to be prepended before the printed date. Supports template snippets.

        • locale string

          The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

        • timezone string

          The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

      • dissect object
        Hide dissect attributes Show dissect attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The character(s) that separate the appended fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to apply to the field.

      • Hide dot_expander attributes Show dot_expander attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • override boolean

          Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

        • path string

          The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

      • drop object
        Hide drop attributes Show drop attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • enrich object
        Hide enrich attributes Show enrich attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

        • override boolean

          If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • policy_name string Required

          The name of the enrich policy to use.

        • Values are intersects, disjoint, within, or contains.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • fail object
        Hide fail attributes Show fail attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • message string Required

          The error message thrown by the processor. Supports template snippets.

      • Hide fingerprint attributes Show fingerprint attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • fields string | array[string] Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • salt string

          Salt value for the hash function.

        • method string

          Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

        • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

      • foreach object
        Hide foreach attributes Show foreach attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the processor silently exits without changing the document if the field is null or missing.

        • processor object Required
      • Hide ip_location attributes Show ip_location attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found IP location data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the IP location lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • geo_grid object
        Hide geo_grid attributes Show geo_grid attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          The field to interpret as a geo-tile.= The field format is determined by the tile_type.

        • tile_type string Required

          Values are geotile, geohex, or geohash.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Values are geojson or wkt.

      • geoip object
        Hide geoip attributes Show geoip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found geoip data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the geoip lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • grok object
        Hide grok attributes Show grok attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • patterns array[string] Required

          An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

        • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

      • gsub object
        Hide gsub attributes Show gsub attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to be replaced.

        • replacement string Required

          The string to replace the matching patterns with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide html_strip attributes Show html_strip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document,

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide inference attributes Show inference attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • model_id string Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          Hide field_map attribute Show field_map attribute object
          • * object Additional properties
        • Hide inference_config attributes Show inference_config attributes object
        • input_output object | array[object]

          Input fields for inference and output (destination) fields for the inference results. This option is incompatible with the target_field and field_map options.

        • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

      • join object
        Hide join attributes Show join attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • separator string Required

          The separator character.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • json object
        Hide json attributes Show json attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

        • Values are replace or merge.

        • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • kv object
        Hide kv attributes Show kv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • exclude_keys array[string]

          List of keys to exclude from document.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field_split string Required

          Regex pattern to use for splitting key-value pairs.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • include_keys array[string]

          List of keys to filter and insert into document. Defaults to including all keys.

        • prefix string

          Prefix to be added to extracted keys.

        • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • trim_key string

          String of characters to trim from extracted keys.

        • String of characters to trim from extracted values.

        • value_split string Required

          Regex pattern to use for splitting the key from the value within a key-value pair.

      • Hide lowercase attributes Show lowercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide network_direction attributes Show network_direction attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • internal_networks array[string]

          List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • pipeline object
        Hide pipeline attributes Show pipeline attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • name string Required
        • Whether to ignore missing pipelines instead of failing.

      • redact object
        Hide redact attributes Show redact attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • patterns array[string] Required

          A list of grok expressions to match and redact named captures with

        • Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • prefix string

          Start a redacted section with this token

        • suffix string

          End a redacted section with this token

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

        • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

      • Hide registered_domain attributes Show registered_domain attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • remove object
        Hide remove attributes Show remove attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string | array[string] Required
        • keep string | array[string]
        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • rename object
        Hide rename attributes Show rename attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • reroute object
        Hide reroute attributes Show reroute attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • dataset string | array[string]

          Field references or a static value for the dataset part of the data stream name. In addition to the criteria for index names, cannot contain - and must be no longer than 100 characters. Example values are nginx.access and nginx.error.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.dataset}}

        • namespace string | array[string]

          Field references or a static value for the namespace part of the data stream name. See the criteria for index names for allowed characters. Must be no longer than 100 characters.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.namespace}}

      • script object
        Hide script attributes Show script attributes object
      • set object
        Hide set attributes Show set attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

        • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

        • override boolean

          If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • value object

          The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

      • Hide set_security_user attributes Show set_security_user attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what user related properties are added to the field.

      • sort object
        Hide sort attributes Show sort attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • order string

          Values are asc or desc.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • split object
        Hide split attributes Show split attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Preserves empty trailing fields, if any.

        • separator string Required

          A regex which matches the separator, for example, , or \s+.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide terminate attributes Show terminate attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • trim object
        Hide trim attributes Show trim attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uppercase attributes Show uppercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide urldecode attributes Show urldecode attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uri_parts attributes Show uri_parts attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • If true, the processor copies the unparsed URI to <target_field>.original.

        • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide user_agent attributes Show user_agent attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what properties are added to target_field.

          Values are name, os, device, original, or version.

        • Extracts device type from the user agent string on a best-effort basis.

    • processors array[object]

      Processors used to perform transformations on documents before indexing. Processors run sequentially in the order specified.

      Hide processors attributes Show processors attributes object
      • append object
        Hide append attributes Show append attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • value object | array[object] Required

          The value to be appended. Supports template snippets.

        • If false, the processor does not append values already present in the field.

      • Hide attachment attributes Show attachment attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the binary field will be removed from the document

        • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

      • bytes object
        Hide bytes attributes Show bytes attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • circle object
        Hide circle attributes Show circle attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • error_distance number Required

          The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • shape_type string Required

          Values are geo_shape or shape.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide community_id attributes Show community_id attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • seed number

          Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • convert object
        Hide convert attributes Show convert attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • type string Required

          Values are integer, long, double, float, boolean, ip, string, or auto.

      • csv object
        Hide csv attributes Show csv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • quote string

          Quote used in CSV, has to be single character string.

        • Separator used in CSV, has to be single character string.

        • target_fields string | array[string] Required
        • trim boolean

          Trim whitespaces in unquoted fields.

      • date object
        Hide date attributes Show date attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • formats array[string] Required

          An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • locale string

          The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • timezone string

          The timezone to use when parsing the date. Supports template snippets.

        • The format to use when writing the date to target_field. Must be a valid java time pattern.

      • Hide date_index_name attributes Show date_index_name attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • date_formats array[string]

          An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • date_rounding string Required

          How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

        • A prefix of the index name to be prepended before the printed date. Supports template snippets.

        • locale string

          The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

        • timezone string

          The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

      • dissect object
        Hide dissect attributes Show dissect attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The character(s) that separate the appended fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to apply to the field.

      • Hide dot_expander attributes Show dot_expander attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • override boolean

          Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

        • path string

          The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

      • drop object
        Hide drop attributes Show drop attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • enrich object
        Hide enrich attributes Show enrich attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

        • override boolean

          If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • policy_name string Required

          The name of the enrich policy to use.

        • Values are intersects, disjoint, within, or contains.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • fail object
        Hide fail attributes Show fail attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • message string Required

          The error message thrown by the processor. Supports template snippets.

      • Hide fingerprint attributes Show fingerprint attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • fields string | array[string] Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • salt string

          Salt value for the hash function.

        • method string

          Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

        • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

      • foreach object
        Hide foreach attributes Show foreach attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the processor silently exits without changing the document if the field is null or missing.

        • processor object Required
      • Hide ip_location attributes Show ip_location attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found IP location data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the IP location lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • geo_grid object
        Hide geo_grid attributes Show geo_grid attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          The field to interpret as a geo-tile.= The field format is determined by the tile_type.

        • tile_type string Required

          Values are geotile, geohex, or geohash.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Values are geojson or wkt.

      • geoip object
        Hide geoip attributes Show geoip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found geoip data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the geoip lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • grok object
        Hide grok attributes Show grok attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • patterns array[string] Required

          An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

        • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

      • gsub object
        Hide gsub attributes Show gsub attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to be replaced.

        • replacement string Required

          The string to replace the matching patterns with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide html_strip attributes Show html_strip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document,

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide inference attributes Show inference attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • model_id string Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          Hide field_map attribute Show field_map attribute object
          • * object Additional properties
        • Hide inference_config attributes Show inference_config attributes object
        • input_output object | array[object]

          Input fields for inference and output (destination) fields for the inference results. This option is incompatible with the target_field and field_map options.

        • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

      • join object
        Hide join attributes Show join attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • separator string Required

          The separator character.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • json object
        Hide json attributes Show json attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

        • Values are replace or merge.

        • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • kv object
        Hide kv attributes Show kv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • exclude_keys array[string]

          List of keys to exclude from document.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field_split string Required

          Regex pattern to use for splitting key-value pairs.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • include_keys array[string]

          List of keys to filter and insert into document. Defaults to including all keys.

        • prefix string

          Prefix to be added to extracted keys.

        • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • trim_key string

          String of characters to trim from extracted keys.

        • String of characters to trim from extracted values.

        • value_split string Required

          Regex pattern to use for splitting the key from the value within a key-value pair.

      • Hide lowercase attributes Show lowercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide network_direction attributes Show network_direction attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • internal_networks array[string]

          List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • pipeline object
        Hide pipeline attributes Show pipeline attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • name string Required
        • Whether to ignore missing pipelines instead of failing.

      • redact object
        Hide redact attributes Show redact attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • patterns array[string] Required

          A list of grok expressions to match and redact named captures with

        • Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • prefix string

          Start a redacted section with this token

        • suffix string

          End a redacted section with this token

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

        • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

      • Hide registered_domain attributes Show registered_domain attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • remove object
        Hide remove attributes Show remove attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string | array[string] Required
        • keep string | array[string]
        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • rename object
        Hide rename attributes Show rename attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • reroute object
        Hide reroute attributes Show reroute attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • dataset string | array[string]

          Field references or a static value for the dataset part of the data stream name. In addition to the criteria for index names, cannot contain - and must be no longer than 100 characters. Example values are nginx.access and nginx.error.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.dataset}}

        • namespace string | array[string]

          Field references or a static value for the namespace part of the data stream name. See the criteria for index names for allowed characters. Must be no longer than 100 characters.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.namespace}}

      • script object
        Hide script attributes Show script attributes object
      • set object
        Hide set attributes Show set attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

        • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

        • override boolean

          If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • value object

          The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

      • Hide set_security_user attributes Show set_security_user attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what user related properties are added to the field.

      • sort object
        Hide sort attributes Show sort attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • order string

          Values are asc or desc.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • split object
        Hide split attributes Show split attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Preserves empty trailing fields, if any.

        • separator string Required

          A regex which matches the separator, for example, , or \s+.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide terminate attributes Show terminate attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • trim object
        Hide trim attributes Show trim attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uppercase attributes Show uppercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide urldecode attributes Show urldecode attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uri_parts attributes Show uri_parts attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • If true, the processor copies the unparsed URI to <target_field>.original.

        • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide user_agent attributes Show user_agent attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what properties are added to target_field.

          Values are name, os, device, original, or version.

        • Extracts device type from the user agent string on a best-effort basis.

    • version number
    • deprecated boolean

      Marks this ingest pipeline as deprecated. When a deprecated ingest pipeline is referenced as the default or final pipeline when creating or updating a non-deprecated index template, Elasticsearch will emit a deprecation warning.

    • _meta object
      Hide _meta attribute Show _meta attribute object
      • * object Additional properties

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • docs array[object] Required
      Hide docs attributes Show docs attributes object
      • doc object
        Hide doc attributes Show doc attributes object
        • _id string Required
        • _index string Required
        • _ingest object Required
          Hide _ingest attributes Show _ingest attributes object
        • _routing string

          Value used to send the document to a specific primary shard.

        • _source object Required

          JSON body for the document.

          Hide _source attribute Show _source attribute object
          • * object Additional properties
        • _version number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Values are internal, external, external_gte, or force.

      • error object
        Hide error attributes Show error attributes object
      • processor_results array[object]
        Hide processor_results attributes Show processor_results attributes object
POST /_ingest/pipeline/_simulate
curl \
 --request POST 'http://api.example.com/_ingest/pipeline/_simulate' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"pipeline\" :\n  {\n    \"description\": \"_description\",\n    \"processors\": [\n      {\n        \"set\" : {\n          \"field\" : \"field2\",\n          \"value\" : \"_value\"\n        }\n      }\n    ]\n  },\n  \"docs\": [\n    {\n      \"_index\": \"index\",\n      \"_id\": \"id\",\n      \"_source\": {\n        \"foo\": \"bar\"\n      }\n    },\n    {\n      \"_index\": \"index\",\n      \"_id\": \"id\",\n      \"_source\": {\n        \"foo\": \"rab\"\n      }\n    }\n  ]\n}"'
Request example
You can specify the used pipeline either in the request body or as a path parameter.
{
  "pipeline" :
  {
    "description": "_description",
    "processors": [
      {
        "set" : {
          "field" : "field2",
          "value" : "_value"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "foo": "rab"
      }
    }
  ]
}
Response examples (200)
A successful response for running an ingest pipeline against a set of provided documents.
{
   "docs": [
      {
         "doc": {
            "_id": "id",
            "_index": "index",
            "_version": "-3",
            "_source": {
               "field2": "_value",
               "foo": "bar"
            },
            "_ingest": {
               "timestamp": "2017-05-04T22:30:03.187Z"
            }
         }
      },
      {
         "doc": {
            "_id": "id",
            "_index": "index",
            "_version": "-3",
            "_source": {
               "field2": "_value",
               "foo": "rab"
            },
            "_ingest": {
               "timestamp": "2017-05-04T22:30:03.188Z"
            }
         }
      }
   ]
}

Simulate a pipeline Added in 5.0.0

GET /_ingest/pipeline/{id}/_simulate

Run an ingest pipeline against a set of provided documents. You can either specify an existing pipeline to use with the provided documents or supply a pipeline definition in the body of the request.

Path parameters

  • id string Required

    The pipeline to test. If you don't specify a pipeline in the request body, this parameter is required.

Query parameters

  • verbose boolean

    If true, the response includes output data for each processor in the executed pipeline.

application/json

Body Required

  • docs array[object] Required

    Sample documents to test in the pipeline.

    Hide docs attributes Show docs attributes object
  • pipeline object Additional properties
    Hide pipeline attributes Show pipeline attributes object
    • Description of the ingest pipeline.

    • on_failure array[object]

      Processors to run immediately after a processor failure.

      Hide on_failure attributes Show on_failure attributes object
      • append object
        Hide append attributes Show append attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • value object | array[object] Required

          The value to be appended. Supports template snippets.

        • If false, the processor does not append values already present in the field.

      • Hide attachment attributes Show attachment attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the binary field will be removed from the document

        • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

      • bytes object
        Hide bytes attributes Show bytes attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • circle object
        Hide circle attributes Show circle attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • error_distance number Required

          The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • shape_type string Required

          Values are geo_shape or shape.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide community_id attributes Show community_id attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • seed number

          Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • convert object
        Hide convert attributes Show convert attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • type string Required

          Values are integer, long, double, float, boolean, ip, string, or auto.

      • csv object
        Hide csv attributes Show csv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • quote string

          Quote used in CSV, has to be single character string.

        • Separator used in CSV, has to be single character string.

        • target_fields string | array[string] Required
        • trim boolean

          Trim whitespaces in unquoted fields.

      • date object
        Hide date attributes Show date attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • formats array[string] Required

          An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • locale string

          The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • timezone string

          The timezone to use when parsing the date. Supports template snippets.

        • The format to use when writing the date to target_field. Must be a valid java time pattern.

      • Hide date_index_name attributes Show date_index_name attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • date_formats array[string]

          An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • date_rounding string Required

          How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

        • A prefix of the index name to be prepended before the printed date. Supports template snippets.

        • locale string

          The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

        • timezone string

          The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

      • dissect object
        Hide dissect attributes Show dissect attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The character(s) that separate the appended fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to apply to the field.

      • Hide dot_expander attributes Show dot_expander attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • override boolean

          Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

        • path string

          The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

      • drop object
        Hide drop attributes Show drop attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • enrich object
        Hide enrich attributes Show enrich attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

        • override boolean

          If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • policy_name string Required

          The name of the enrich policy to use.

        • Values are intersects, disjoint, within, or contains.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • fail object
        Hide fail attributes Show fail attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • message string Required

          The error message thrown by the processor. Supports template snippets.

      • Hide fingerprint attributes Show fingerprint attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • fields string | array[string] Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • salt string

          Salt value for the hash function.

        • method string

          Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

        • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

      • foreach object
        Hide foreach attributes Show foreach attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the processor silently exits without changing the document if the field is null or missing.

        • processor object Required
      • Hide ip_location attributes Show ip_location attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found IP location data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the IP location lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • geo_grid object
        Hide geo_grid attributes Show geo_grid attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          The field to interpret as a geo-tile.= The field format is determined by the tile_type.

        • tile_type string Required

          Values are geotile, geohex, or geohash.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Values are geojson or wkt.

      • geoip object
        Hide geoip attributes Show geoip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found geoip data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the geoip lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • grok object
        Hide grok attributes Show grok attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • patterns array[string] Required

          An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

        • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

      • gsub object
        Hide gsub attributes Show gsub attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to be replaced.

        • replacement string Required

          The string to replace the matching patterns with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide html_strip attributes Show html_strip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document,

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide inference attributes Show inference attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • model_id string Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          Hide field_map attribute Show field_map attribute object
          • * object Additional properties
        • Hide inference_config attributes Show inference_config attributes object
        • input_output object | array[object]

          Input fields for inference and output (destination) fields for the inference results. This option is incompatible with the target_field and field_map options.

        • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

      • join object
        Hide join attributes Show join attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • separator string Required

          The separator character.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • json object
        Hide json attributes Show json attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

        • Values are replace or merge.

        • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • kv object
        Hide kv attributes Show kv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • exclude_keys array[string]

          List of keys to exclude from document.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field_split string Required

          Regex pattern to use for splitting key-value pairs.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • include_keys array[string]

          List of keys to filter and insert into document. Defaults to including all keys.

        • prefix string

          Prefix to be added to extracted keys.

        • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • trim_key string

          String of characters to trim from extracted keys.

        • String of characters to trim from extracted values.

        • value_split string Required

          Regex pattern to use for splitting the key from the value within a key-value pair.

      • Hide lowercase attributes Show lowercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide network_direction attributes Show network_direction attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • internal_networks array[string]

          List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • pipeline object
        Hide pipeline attributes Show pipeline attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • name string Required
        • Whether to ignore missing pipelines instead of failing.

      • redact object
        Hide redact attributes Show redact attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • patterns array[string] Required

          A list of grok expressions to match and redact named captures with

        • Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • prefix string

          Start a redacted section with this token

        • suffix string

          End a redacted section with this token

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

        • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

      • Hide registered_domain attributes Show registered_domain attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • remove object
        Hide remove attributes Show remove attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string | array[string] Required
        • keep string | array[string]
        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • rename object
        Hide rename attributes Show rename attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • reroute object
        Hide reroute attributes Show reroute attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • dataset string | array[string]

          Field references or a static value for the dataset part of the data stream name. In addition to the criteria for index names, cannot contain - and must be no longer than 100 characters. Example values are nginx.access and nginx.error.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.dataset}}

        • namespace string | array[string]

          Field references or a static value for the namespace part of the data stream name. See the criteria for index names for allowed characters. Must be no longer than 100 characters.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.namespace}}

      • script object
        Hide script attributes Show script attributes object
      • set object
        Hide set attributes Show set attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

        • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

        • override boolean

          If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • value object

          The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

      • Hide set_security_user attributes Show set_security_user attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what user related properties are added to the field.

      • sort object
        Hide sort attributes Show sort attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • order string

          Values are asc or desc.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • split object
        Hide split attributes Show split attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Preserves empty trailing fields, if any.

        • separator string Required

          A regex which matches the separator, for example, , or \s+.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide terminate attributes Show terminate attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • trim object
        Hide trim attributes Show trim attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uppercase attributes Show uppercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide urldecode attributes Show urldecode attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uri_parts attributes Show uri_parts attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • If true, the processor copies the unparsed URI to <target_field>.original.

        • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide user_agent attributes Show user_agent attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what properties are added to target_field.

          Values are name, os, device, original, or version.

        • Extracts device type from the user agent string on a best-effort basis.

    • processors array[object]

      Processors used to perform transformations on documents before indexing. Processors run sequentially in the order specified.

      Hide processors attributes Show processors attributes object
      • append object
        Hide append attributes Show append attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • value object | array[object] Required

          The value to be appended. Supports template snippets.

        • If false, the processor does not append values already present in the field.

      • Hide attachment attributes Show attachment attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the binary field will be removed from the document

        • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

      • bytes object
        Hide bytes attributes Show bytes attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • circle object
        Hide circle attributes Show circle attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • error_distance number Required

          The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • shape_type string Required

          Values are geo_shape or shape.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide community_id attributes Show community_id attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • seed number

          Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • convert object
        Hide convert attributes Show convert attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • type string Required

          Values are integer, long, double, float, boolean, ip, string, or auto.

      • csv object
        Hide csv attributes Show csv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • quote string

          Quote used in CSV, has to be single character string.

        • Separator used in CSV, has to be single character string.

        • target_fields string | array[string] Required
        • trim boolean

          Trim whitespaces in unquoted fields.

      • date object
        Hide date attributes Show date attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • formats array[string] Required

          An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • locale string

          The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • timezone string

          The timezone to use when parsing the date. Supports template snippets.

        • The format to use when writing the date to target_field. Must be a valid java time pattern.

      • Hide date_index_name attributes Show date_index_name attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • date_formats array[string]

          An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • date_rounding string Required

          How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

        • A prefix of the index name to be prepended before the printed date. Supports template snippets.

        • locale string

          The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

        • timezone string

          The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

      • dissect object
        Hide dissect attributes Show dissect attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The character(s) that separate the appended fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to apply to the field.

      • Hide dot_expander attributes Show dot_expander attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • override boolean

          Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

        • path string

          The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

      • drop object
        Hide drop attributes Show drop attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • enrich object
        Hide enrich attributes Show enrich attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

        • override boolean

          If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • policy_name string Required

          The name of the enrich policy to use.

        • Values are intersects, disjoint, within, or contains.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • fail object
        Hide fail attributes Show fail attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • message string Required

          The error message thrown by the processor. Supports template snippets.

      • Hide fingerprint attributes Show fingerprint attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • fields string | array[string] Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • salt string

          Salt value for the hash function.

        • method string

          Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

        • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

      • foreach object
        Hide foreach attributes Show foreach attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the processor silently exits without changing the document if the field is null or missing.

        • processor object Required
      • Hide ip_location attributes Show ip_location attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found IP location data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the IP location lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • geo_grid object
        Hide geo_grid attributes Show geo_grid attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          The field to interpret as a geo-tile.= The field format is determined by the tile_type.

        • tile_type string Required

          Values are geotile, geohex, or geohash.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Values are geojson or wkt.

      • geoip object
        Hide geoip attributes Show geoip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found geoip data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the geoip lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • grok object
        Hide grok attributes Show grok attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • patterns array[string] Required

          An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

        • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

      • gsub object
        Hide gsub attributes Show gsub attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to be replaced.

        • replacement string Required

          The string to replace the matching patterns with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide html_strip attributes Show html_strip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document,

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide inference attributes Show inference attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • model_id string Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          Hide field_map attribute Show field_map attribute object
          • * object Additional properties
        • Hide inference_config attributes Show inference_config attributes object
        • input_output object | array[object]

          Input fields for inference and output (destination) fields for the inference results. This option is incompatible with the target_field and field_map options.

        • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

      • join object
        Hide join attributes Show join attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • separator string Required

          The separator character.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • json object
        Hide json attributes Show json attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

        • Values are replace or merge.

        • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • kv object
        Hide kv attributes Show kv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • exclude_keys array[string]

          List of keys to exclude from document.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field_split string Required

          Regex pattern to use for splitting key-value pairs.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • include_keys array[string]

          List of keys to filter and insert into document. Defaults to including all keys.

        • prefix string

          Prefix to be added to extracted keys.

        • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • trim_key string

          String of characters to trim from extracted keys.

        • String of characters to trim from extracted values.

        • value_split string Required

          Regex pattern to use for splitting the key from the value within a key-value pair.

      • Hide lowercase attributes Show lowercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide network_direction attributes Show network_direction attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • internal_networks array[string]

          List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • pipeline object
        Hide pipeline attributes Show pipeline attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • name string Required
        • Whether to ignore missing pipelines instead of failing.

      • redact object
        Hide redact attributes Show redact attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • patterns array[string] Required

          A list of grok expressions to match and redact named captures with

        • Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • prefix string

          Start a redacted section with this token

        • suffix string

          End a redacted section with this token

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

        • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

      • Hide registered_domain attributes Show registered_domain attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • remove object
        Hide remove attributes Show remove attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string | array[string] Required
        • keep string | array[string]
        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • rename object
        Hide rename attributes Show rename attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • reroute object
        Hide reroute attributes Show reroute attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • dataset string | array[string]

          Field references or a static value for the dataset part of the data stream name. In addition to the criteria for index names, cannot contain - and must be no longer than 100 characters. Example values are nginx.access and nginx.error.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.dataset}}

        • namespace string | array[string]

          Field references or a static value for the namespace part of the data stream name. See the criteria for index names for allowed characters. Must be no longer than 100 characters.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.namespace}}

      • script object
        Hide script attributes Show script attributes object
      • set object
        Hide set attributes Show set attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

        • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

        • override boolean

          If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • value object

          The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

      • Hide set_security_user attributes Show set_security_user attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what user related properties are added to the field.

      • sort object
        Hide sort attributes Show sort attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • order string

          Values are asc or desc.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • split object
        Hide split attributes Show split attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Preserves empty trailing fields, if any.

        • separator string Required

          A regex which matches the separator, for example, , or \s+.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide terminate attributes Show terminate attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • trim object
        Hide trim attributes Show trim attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uppercase attributes Show uppercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide urldecode attributes Show urldecode attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uri_parts attributes Show uri_parts attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • If true, the processor copies the unparsed URI to <target_field>.original.

        • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide user_agent attributes Show user_agent attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what properties are added to target_field.

          Values are name, os, device, original, or version.

        • Extracts device type from the user agent string on a best-effort basis.

    • version number
    • deprecated boolean

      Marks this ingest pipeline as deprecated. When a deprecated ingest pipeline is referenced as the default or final pipeline when creating or updating a non-deprecated index template, Elasticsearch will emit a deprecation warning.

    • _meta object
      Hide _meta attribute Show _meta attribute object
      • * object Additional properties

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • docs array[object] Required
      Hide docs attributes Show docs attributes object
      • doc object
        Hide doc attributes Show doc attributes object
        • _id string Required
        • _index string Required
        • _ingest object Required
          Hide _ingest attributes Show _ingest attributes object
        • _routing string

          Value used to send the document to a specific primary shard.

        • _source object Required

          JSON body for the document.

          Hide _source attribute Show _source attribute object
          • * object Additional properties
        • _version number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Values are internal, external, external_gte, or force.

      • error object
        Hide error attributes Show error attributes object
      • processor_results array[object]
        Hide processor_results attributes Show processor_results attributes object
GET /_ingest/pipeline/{id}/_simulate
curl \
 --request GET 'http://api.example.com/_ingest/pipeline/{id}/_simulate' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"pipeline\" :\n  {\n    \"description\": \"_description\",\n    \"processors\": [\n      {\n        \"set\" : {\n          \"field\" : \"field2\",\n          \"value\" : \"_value\"\n        }\n      }\n    ]\n  },\n  \"docs\": [\n    {\n      \"_index\": \"index\",\n      \"_id\": \"id\",\n      \"_source\": {\n        \"foo\": \"bar\"\n      }\n    },\n    {\n      \"_index\": \"index\",\n      \"_id\": \"id\",\n      \"_source\": {\n        \"foo\": \"rab\"\n      }\n    }\n  ]\n}"'
Request example
You can specify the used pipeline either in the request body or as a path parameter.
{
  "pipeline" :
  {
    "description": "_description",
    "processors": [
      {
        "set" : {
          "field" : "field2",
          "value" : "_value"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "foo": "rab"
      }
    }
  ]
}
Response examples (200)
A successful response for running an ingest pipeline against a set of provided documents.
{
   "docs": [
      {
         "doc": {
            "_id": "id",
            "_index": "index",
            "_version": "-3",
            "_source": {
               "field2": "_value",
               "foo": "bar"
            },
            "_ingest": {
               "timestamp": "2017-05-04T22:30:03.187Z"
            }
         }
      },
      {
         "doc": {
            "_id": "id",
            "_index": "index",
            "_version": "-3",
            "_source": {
               "field2": "_value",
               "foo": "rab"
            },
            "_ingest": {
               "timestamp": "2017-05-04T22:30:03.188Z"
            }
         }
      }
   ]
}

Simulate a pipeline Added in 5.0.0

POST /_ingest/pipeline/{id}/_simulate

Run an ingest pipeline against a set of provided documents. You can either specify an existing pipeline to use with the provided documents or supply a pipeline definition in the body of the request.

Path parameters

  • id string Required

    The pipeline to test. If you don't specify a pipeline in the request body, this parameter is required.

Query parameters

  • verbose boolean

    If true, the response includes output data for each processor in the executed pipeline.

application/json

Body Required

  • docs array[object] Required

    Sample documents to test in the pipeline.

    Hide docs attributes Show docs attributes object
  • pipeline object Additional properties
    Hide pipeline attributes Show pipeline attributes object
    • Description of the ingest pipeline.

    • on_failure array[object]

      Processors to run immediately after a processor failure.

      Hide on_failure attributes Show on_failure attributes object
      • append object
        Hide append attributes Show append attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • value object | array[object] Required

          The value to be appended. Supports template snippets.

        • If false, the processor does not append values already present in the field.

      • Hide attachment attributes Show attachment attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the binary field will be removed from the document

        • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

      • bytes object
        Hide bytes attributes Show bytes attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • circle object
        Hide circle attributes Show circle attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • error_distance number Required

          The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • shape_type string Required

          Values are geo_shape or shape.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide community_id attributes Show community_id attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • seed number

          Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • convert object
        Hide convert attributes Show convert attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • type string Required

          Values are integer, long, double, float, boolean, ip, string, or auto.

      • csv object
        Hide csv attributes Show csv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • quote string

          Quote used in CSV, has to be single character string.

        • Separator used in CSV, has to be single character string.

        • target_fields string | array[string] Required
        • trim boolean

          Trim whitespaces in unquoted fields.

      • date object
        Hide date attributes Show date attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • formats array[string] Required

          An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • locale string

          The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • timezone string

          The timezone to use when parsing the date. Supports template snippets.

        • The format to use when writing the date to target_field. Must be a valid java time pattern.

      • Hide date_index_name attributes Show date_index_name attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • date_formats array[string]

          An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • date_rounding string Required

          How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

        • A prefix of the index name to be prepended before the printed date. Supports template snippets.

        • locale string

          The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

        • timezone string

          The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

      • dissect object
        Hide dissect attributes Show dissect attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The character(s) that separate the appended fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to apply to the field.

      • Hide dot_expander attributes Show dot_expander attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • override boolean

          Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

        • path string

          The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

      • drop object
        Hide drop attributes Show drop attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • enrich object
        Hide enrich attributes Show enrich attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

        • override boolean

          If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • policy_name string Required

          The name of the enrich policy to use.

        • Values are intersects, disjoint, within, or contains.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • fail object
        Hide fail attributes Show fail attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • message string Required

          The error message thrown by the processor. Supports template snippets.

      • Hide fingerprint attributes Show fingerprint attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • fields string | array[string] Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • salt string

          Salt value for the hash function.

        • method string

          Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

        • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

      • foreach object
        Hide foreach attributes Show foreach attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the processor silently exits without changing the document if the field is null or missing.

        • processor object Required
      • Hide ip_location attributes Show ip_location attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found IP location data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the IP location lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • geo_grid object
        Hide geo_grid attributes Show geo_grid attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          The field to interpret as a geo-tile.= The field format is determined by the tile_type.

        • tile_type string Required

          Values are geotile, geohex, or geohash.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Values are geojson or wkt.

      • geoip object
        Hide geoip attributes Show geoip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found geoip data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the geoip lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • grok object
        Hide grok attributes Show grok attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • patterns array[string] Required

          An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

        • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

      • gsub object
        Hide gsub attributes Show gsub attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to be replaced.

        • replacement string Required

          The string to replace the matching patterns with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide html_strip attributes Show html_strip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document,

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide inference attributes Show inference attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • model_id string Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          Hide field_map attribute Show field_map attribute object
          • * object Additional properties
        • Hide inference_config attributes Show inference_config attributes object
        • input_output object | array[object]

          Input fields for inference and output (destination) fields for the inference results. This option is incompatible with the target_field and field_map options.

        • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

      • join object
        Hide join attributes Show join attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • separator string Required

          The separator character.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • json object
        Hide json attributes Show json attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

        • Values are replace or merge.

        • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • kv object
        Hide kv attributes Show kv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • exclude_keys array[string]

          List of keys to exclude from document.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field_split string Required

          Regex pattern to use for splitting key-value pairs.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • include_keys array[string]

          List of keys to filter and insert into document. Defaults to including all keys.

        • prefix string

          Prefix to be added to extracted keys.

        • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • trim_key string

          String of characters to trim from extracted keys.

        • String of characters to trim from extracted values.

        • value_split string Required

          Regex pattern to use for splitting the key from the value within a key-value pair.

      • Hide lowercase attributes Show lowercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide network_direction attributes Show network_direction attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • internal_networks array[string]

          List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • pipeline object
        Hide pipeline attributes Show pipeline attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • name string Required
        • Whether to ignore missing pipelines instead of failing.

      • redact object
        Hide redact attributes Show redact attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • patterns array[string] Required

          A list of grok expressions to match and redact named captures with

        • Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • prefix string

          Start a redacted section with this token

        • suffix string

          End a redacted section with this token

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

        • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

      • Hide registered_domain attributes Show registered_domain attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • remove object
        Hide remove attributes Show remove attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string | array[string] Required
        • keep string | array[string]
        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • rename object
        Hide rename attributes Show rename attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • reroute object
        Hide reroute attributes Show reroute attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • dataset string | array[string]

          Field references or a static value for the dataset part of the data stream name. In addition to the criteria for index names, cannot contain - and must be no longer than 100 characters. Example values are nginx.access and nginx.error.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.dataset}}

        • namespace string | array[string]

          Field references or a static value for the namespace part of the data stream name. See the criteria for index names for allowed characters. Must be no longer than 100 characters.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.namespace}}

      • script object
        Hide script attributes Show script attributes object
      • set object
        Hide set attributes Show set attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

        • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

        • override boolean

          If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • value object

          The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

      • Hide set_security_user attributes Show set_security_user attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what user related properties are added to the field.

      • sort object
        Hide sort attributes Show sort attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • order string

          Values are asc or desc.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • split object
        Hide split attributes Show split attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Preserves empty trailing fields, if any.

        • separator string Required

          A regex which matches the separator, for example, , or \s+.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide terminate attributes Show terminate attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • trim object
        Hide trim attributes Show trim attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uppercase attributes Show uppercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide urldecode attributes Show urldecode attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uri_parts attributes Show uri_parts attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • If true, the processor copies the unparsed URI to <target_field>.original.

        • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide user_agent attributes Show user_agent attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what properties are added to target_field.

          Values are name, os, device, original, or version.

        • Extracts device type from the user agent string on a best-effort basis.

    • processors array[object]

      Processors used to perform transformations on documents before indexing. Processors run sequentially in the order specified.

      Hide processors attributes Show processors attributes object
      • append object
        Hide append attributes Show append attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • value object | array[object] Required

          The value to be appended. Supports template snippets.

        • If false, the processor does not append values already present in the field.

      • Hide attachment attributes Show attachment attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The number of chars being used for extraction to prevent huge fields. Use -1 for no limit.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Array of properties to select to be stored. Can be content, title, name, author, keywords, date, content_type, content_length, language.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the binary field will be removed from the document

        • Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.

      • bytes object
        Hide bytes attributes Show bytes attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • circle object
        Hide circle attributes Show circle attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • error_distance number Required

          The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • shape_type string Required

          Values are geo_shape or shape.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide community_id attributes Show community_id attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • seed number

          Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • convert object
        Hide convert attributes Show convert attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • type string Required

          Values are integer, long, double, float, boolean, ip, string, or auto.

      • csv object
        Hide csv attributes Show csv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Value used to fill empty fields. Empty fields are skipped if this is not provided. An empty field is one with no value (2 consecutive separators) or empty quotes ("").

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • quote string

          Quote used in CSV, has to be single character string.

        • Separator used in CSV, has to be single character string.

        • target_fields string | array[string] Required
        • trim boolean

          Trim whitespaces in unquoted fields.

      • date object
        Hide date attributes Show date attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • formats array[string] Required

          An array of the expected date formats. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • locale string

          The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • timezone string

          The timezone to use when parsing the date. Supports template snippets.

        • The format to use when writing the date to target_field. Must be a valid java time pattern.

      • Hide date_index_name attributes Show date_index_name attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • date_formats array[string]

          An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.

        • date_rounding string Required

          How to round the date when formatting the date into the index name. Valid values are: y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second). Supports template snippets.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • The format to be used when printing the parsed date into the index name. A valid java time pattern is expected here. Supports template snippets.

        • A prefix of the index name to be prepended before the printed date. Supports template snippets.

        • locale string

          The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.

        • timezone string

          The timezone to use when parsing the date and when date math index supports resolves expressions into concrete index names.

      • dissect object
        Hide dissect attributes Show dissect attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The character(s) that separate the appended fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to apply to the field.

      • Hide dot_expander attributes Show dot_expander attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • override boolean

          Controls the behavior when there is already an existing nested object that conflicts with the expanded field. When false, the processor will merge conflicts by combining the old and the new values into an array. When true, the value from the expanded field will overwrite the existing value.

        • path string

          The field that contains the field to expand. Only required if the field to expand is part another object field, because the field option can only understand leaf fields.

      • drop object
        Hide drop attributes Show drop attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • enrich object
        Hide enrich attributes Show enrich attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The maximum number of matched documents to include under the configured target field. The target_field will be turned into a json array if max_matches is higher than 1, otherwise target_field will become a json object. In order to avoid documents getting too large, the maximum allowed value is 128.

        • override boolean

          If processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • policy_name string Required

          The name of the enrich policy to use.

        • Values are intersects, disjoint, within, or contains.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • fail object
        Hide fail attributes Show fail attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • message string Required

          The error message thrown by the processor. Supports template snippets.

      • Hide fingerprint attributes Show fingerprint attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • fields string | array[string] Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • salt string

          Salt value for the hash function.

        • method string

          Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.

        • If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.

      • foreach object
        Hide foreach attributes Show foreach attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true, the processor silently exits without changing the document if the field is null or missing.

        • processor object Required
      • Hide ip_location attributes Show ip_location attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found IP location data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the IP location lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • geo_grid object
        Hide geo_grid attributes Show geo_grid attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          The field to interpret as a geo-tile.= The field format is determined by the tile_type.

        • tile_type string Required

          Values are geotile, geohex, or geohash.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Values are geojson or wkt.

      • geoip object
        Hide geoip attributes Show geoip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • first_only boolean

          If true, only the first found geoip data will be returned, even if the field contains an array.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • properties array[string]

          Controls what properties are added to the target_field based on the geoip lookup.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created. Else, the download is triggered by when the pipeline is used as the default_pipeline or final_pipeline in an index.

      • grok object
        Hide grok attributes Show grok attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.

          Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • patterns array[string] Required

          An ordered list of grok expression to match and extract named captures with. Returns on the first expression in the list that matches.

        • When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.

      • gsub object
        Hide gsub attributes Show gsub attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • pattern string Required

          The pattern to be replaced.

        • replacement string Required

          The string to replace the matching patterns with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide html_strip attributes Show html_strip attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document,

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide inference attributes Show inference attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • model_id string Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.

          Hide field_map attribute Show field_map attribute object
          • * object Additional properties
        • Hide inference_config attributes Show inference_config attributes object
        • input_output object | array[object]

          Input fields for inference and output (destination) fields for the inference results. This option is incompatible with the target_field and field_map options.

        • If true and any of the input fields defined in input_ouput are missing then those missing fields are quietly ignored, otherwise a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.

      • join object
        Hide join attributes Show join attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • separator string Required

          The separator character.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • json object
        Hide json attributes Show json attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Flag that forces the parsed JSON to be added at the top level of the document. target_field must not be set when this option is chosen.

        • Values are replace or merge.

        • When set to true, the JSON parser will not fail if the JSON contains duplicate keys. Instead, the last encountered value for any duplicate key wins.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • kv object
        Hide kv attributes Show kv attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • exclude_keys array[string]

          List of keys to exclude from document.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field_split string Required

          Regex pattern to use for splitting key-value pairs.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • include_keys array[string]

          List of keys to filter and insert into document. Defaults to including all keys.

        • prefix string

          Prefix to be added to extracted keys.

        • If true. strip brackets (), <>, [] as well as quotes ' and " from extracted values.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • trim_key string

          String of characters to trim from extracted keys.

        • String of characters to trim from extracted values.

        • value_split string Required

          Regex pattern to use for splitting the key from the value within a key-value pair.

      • Hide lowercase attributes Show lowercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide network_direction attributes Show network_direction attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • internal_networks array[string]

          List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • pipeline object
        Hide pipeline attributes Show pipeline attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • name string Required
        • Whether to ignore missing pipelines instead of failing.

      • redact object
        Hide redact attributes Show redact attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • patterns array[string] Required

          A list of grok expressions to match and redact named captures with

        • Hide pattern_definitions attribute Show pattern_definitions attribute object
          • * string Additional properties
        • prefix string

          Start a redacted section with this token

        • suffix string

          End a redacted section with this token

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document

        • If true then ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted

      • Hide registered_domain attributes Show registered_domain attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and any required fields are missing, the processor quietly exits without modifying the document.

      • remove object
        Hide remove attributes Show remove attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string | array[string] Required
        • keep string | array[string]
        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

      • rename object
        Hide rename attributes Show rename attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • target_field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • reroute object
        Hide reroute attributes Show reroute attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • A static value for the target. Can’t be set when the dataset or namespace option is set.

        • dataset string | array[string]

          Field references or a static value for the dataset part of the data stream name. In addition to the criteria for index names, cannot contain - and must be no longer than 100 characters. Example values are nginx.access and nginx.error.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.dataset}}

        • namespace string | array[string]

          Field references or a static value for the namespace part of the data stream name. See the criteria for index names for allowed characters. Must be no longer than 100 characters.

          Supports field references with a mustache-like syntax (denoted as {{double}} or {{{triple}}} curly braces). When resolving field references, the processor replaces invalid characters with _. Uses the part of the index name as a fallback if all field references resolve to a null, missing, or non-string value.

          default {{data_stream.namespace}}

      • script object
        Hide script attributes Show script attributes object
      • set object
        Hide set attributes Show set attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.

        • The media type for encoding value. Applies only when value is a template snippet. Must be one of application/json, text/plain, or application/x-www-form-urlencoded.

        • override boolean

          If true processor will update fields with pre-existing non-null-valued field. When set to false, such fields will not be touched.

        • value object

          The value to be set for the field. Supports template snippets. May specify only one of value or copy_from.

      • Hide set_security_user attributes Show set_security_user attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what user related properties are added to the field.

      • sort object
        Hide sort attributes Show sort attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • order string

          Values are asc or desc.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • split object
        Hide split attributes Show split attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Preserves empty trailing fields, if any.

        • separator string Required

          A regex which matches the separator, for example, , or \s+.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide terminate attributes Show terminate attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

      • trim object
        Hide trim attributes Show trim attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uppercase attributes Show uppercase attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide urldecode attributes Show urldecode attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist or is null, the processor quietly exits without modifying the document.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide uri_parts attributes Show uri_parts attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • If true, the processor copies the unparsed URI to <target_field>.original.

        • If true, the processor removes the field after parsing the URI string. If parsing fails, the processor does not remove the field.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide user_agent attributes Show user_agent attributes object
        • Description of the processor. Useful for describing the purpose of the processor or its configuration.

        • if object
          Hide if attributes Show if attributes object
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          • options object
        • Ignore failures for the processor.

        • on_failure array[object]

          Handle failures for the processor.

        • tag string

          Identifier for the processor. Useful for debugging and metrics.

        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If true and field does not exist, the processor quietly exits without modifying the document.

        • The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the regexes.yaml from uap-core it ships with.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • properties array[string]

          Controls what properties are added to target_field.

          Values are name, os, device, original, or version.

        • Extracts device type from the user agent string on a best-effort basis.

    • version number
    • deprecated boolean

      Marks this ingest pipeline as deprecated. When a deprecated ingest pipeline is referenced as the default or final pipeline when creating or updating a non-deprecated index template, Elasticsearch will emit a deprecation warning.

    • _meta object
      Hide _meta attribute Show _meta attribute object
      • * object Additional properties

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • docs array[object] Required
      Hide docs attributes Show docs attributes object
      • doc object
        Hide doc attributes Show doc attributes object
        • _id string Required
        • _index string Required
        • _ingest object Required
          Hide _ingest attributes Show _ingest attributes object
        • _routing string

          Value used to send the document to a specific primary shard.

        • _source object Required

          JSON body for the document.

          Hide _source attribute Show _source attribute object
          • * object Additional properties
        • _version number | string

          Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

          Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

        • Values are internal, external, external_gte, or force.

      • error object
        Hide error attributes Show error attributes object
      • processor_results array[object]
        Hide processor_results attributes Show processor_results attributes object
POST /_ingest/pipeline/{id}/_simulate
curl \
 --request POST 'http://api.example.com/_ingest/pipeline/{id}/_simulate' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"pipeline\" :\n  {\n    \"description\": \"_description\",\n    \"processors\": [\n      {\n        \"set\" : {\n          \"field\" : \"field2\",\n          \"value\" : \"_value\"\n        }\n      }\n    ]\n  },\n  \"docs\": [\n    {\n      \"_index\": \"index\",\n      \"_id\": \"id\",\n      \"_source\": {\n        \"foo\": \"bar\"\n      }\n    },\n    {\n      \"_index\": \"index\",\n      \"_id\": \"id\",\n      \"_source\": {\n        \"foo\": \"rab\"\n      }\n    }\n  ]\n}"'
Request example
You can specify the used pipeline either in the request body or as a path parameter.
{
  "pipeline" :
  {
    "description": "_description",
    "processors": [
      {
        "set" : {
          "field" : "field2",
          "value" : "_value"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_index": "index",
      "_id": "id",
      "_source": {
        "foo": "rab"
      }
    }
  ]
}
Response examples (200)
A successful response for running an ingest pipeline against a set of provided documents.
{
   "docs": [
      {
         "doc": {
            "_id": "id",
            "_index": "index",
            "_version": "-3",
            "_source": {
               "field2": "_value",
               "foo": "bar"
            },
            "_ingest": {
               "timestamp": "2017-05-04T22:30:03.187Z"
            }
         }
      },
      {
         "doc": {
            "_id": "id",
            "_index": "index",
            "_version": "-3",
            "_source": {
               "field2": "_value",
               "foo": "rab"
            },
            "_ingest": {
               "timestamp": "2017-05-04T22:30:03.188Z"
            }
         }
      }
   ]
}

Get license information

GET /_license

Get information about your Elastic license including its type, its status, when it was issued, and when it expires.


If the master node is generating a new cluster state, the get license API may return a 404 Not Found response. If you receive an unexpected 404 response after cluster startup, wait a short period and retry the request.

Query parameters

  • accept_enterprise boolean Deprecated

    If true, this parameter returns enterprise for Enterprise license types. If false, this parameter returns platinum for both platinum and enterprise license types. This behavior is maintained for backwards compatibility. This parameter is deprecated and will always be set to true in 8.x.

  • local boolean

    Specifies whether to retrieve local information. The default value is false, which means the information is retrieved from the master node.

Responses

GET /_license
curl \
 --request GET 'http://api.example.com/_license' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_license`.
{
  "license" : {
    "status" : "active",
    "uid" : "cbff45e7-c553-41f7-ae4f-9205eabd80xx",
    "type" : "trial",
    "issue_date" : "2018-10-20T22:05:12.332Z",
    "issue_date_in_millis" : 1540073112332,
    "expiry_date" : "2018-11-19T22:05:12.332Z",
    "expiry_date_in_millis" : 1542665112332,
    "max_nodes" : 1000,
    "max_resource_units" : null,
    "issued_to" : "test",
    "issuer" : "elasticsearch",
    "start_date_in_millis" : -1
  }
}

Logstash

Logstash APIs enable you to manage pipelines that are used by Logstash Central Management.

Learn more about centralized pipeline management

Get Logstash pipelines Added in 7.12.0

GET /_logstash/pipeline/{id}

Get pipelines that are used for Logstash Central Management.

External documentation

Path parameters

  • id string | array[string] Required

    A comma-separated list of pipeline identifiers.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • description string Required

        A description of the pipeline. This description is not used by Elasticsearch or Logstash.

      • last_modified string | number Required

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • pipeline string Required

        The configuration for the pipeline.

        External documentation
      • pipeline_metadata object Required
        Hide pipeline_metadata attributes Show pipeline_metadata attributes object
      • pipeline_settings object Required
        Hide pipeline_settings attributes Show pipeline_settings attributes object
        • pipeline.workers number Required

          The number of workers that will, in parallel, execute the filter and output stages of the pipeline.

        • pipeline.batch.size number Required

          The maximum number of events an individual worker thread will collect from inputs before attempting to execute its filters and outputs.

        • pipeline.batch.delay number Required

          When creating pipeline event batches, how long in milliseconds to wait for each event before dispatching an undersized batch to pipeline workers.

        • queue.type string Required

          The internal queuing model to use for event buffering.

        • queue.max_bytes string Required

          The total capacity of the queue (queue.type: persisted) in number of bytes.

        • The maximum number of written events before forcing a checkpoint when persistent queues are enabled (queue.type: persisted).

      • username string Required

        The user who last updated the pipeline.

GET /_logstash/pipeline/{id}
curl \
 --request GET 'http://api.example.com/_logstash/pipeline/{id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _logstash/pipeline/my_pipeline`.
{
  "my_pipeline": {
    "description": "Sample pipeline for illustration purposes",
    "last_modified": "2021-01-02T02:50:51.250Z",
    "pipeline_metadata": {
      "type": "logstash_pipeline",
      "version": "1"
    },
    "username": "elastic",
    "pipeline": "input {}\\n filter { grok {} }\\n output {}",
    "pipeline_settings": {
      "pipeline.workers": 1,
      "pipeline.batch.size": 125,
      "pipeline.batch.delay": 50,
      "queue.type": "memory",
      "queue.max_bytes": "1gb",
      "queue.checkpoint.writes": 1024
    }
  }
}

Create or update a Logstash pipeline Added in 7.12.0

PUT /_logstash/pipeline/{id}

Create a pipeline that is used for Logstash Central Management. If the specified pipeline exists, it is replaced.

External documentation

Path parameters

  • id string Required

    An identifier for the pipeline.

application/json

Body Required

  • description string Required

    A description of the pipeline. This description is not used by Elasticsearch or Logstash.

  • last_modified string | number Required

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • pipeline string Required

    The configuration for the pipeline.

    External documentation
  • pipeline_metadata object Required
    Hide pipeline_metadata attributes Show pipeline_metadata attributes object
  • pipeline_settings object Required
    Hide pipeline_settings attributes Show pipeline_settings attributes object
    • pipeline.workers number Required

      The number of workers that will, in parallel, execute the filter and output stages of the pipeline.

    • pipeline.batch.size number Required

      The maximum number of events an individual worker thread will collect from inputs before attempting to execute its filters and outputs.

    • pipeline.batch.delay number Required

      When creating pipeline event batches, how long in milliseconds to wait for each event before dispatching an undersized batch to pipeline workers.

    • queue.type string Required

      The internal queuing model to use for event buffering.

    • queue.max_bytes string Required

      The total capacity of the queue (queue.type: persisted) in number of bytes.

    • The maximum number of written events before forcing a checkpoint when persistent queues are enabled (queue.type: persisted).

  • username string Required

    The user who last updated the pipeline.

Responses

PUT /_logstash/pipeline/{id}
curl \
 --request PUT 'http://api.example.com/_logstash/pipeline/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"description\": \"Sample pipeline for illustration purposes\",\n  \"last_modified\": \"2021-01-02T02:50:51.250Z\",\n  \"pipeline_metadata\": {\n    \"type\": \"logstash_pipeline\",\n    \"version\": 1\n  },\n  \"username\": \"elastic\",\n  \"pipeline\": \"input {}\\\\n filter { grok {} }\\\\n output {}\",\n  \"pipeline_settings\": {\n    \"pipeline.workers\": 1,\n    \"pipeline.batch.size\": 125,\n    \"pipeline.batch.delay\": 50,\n    \"queue.type\": \"memory\",\n    \"queue.max_bytes\": \"1gb\",\n    \"queue.checkpoint.writes\": 1024\n  }\n}"'
Request example
Run `PUT _logstash/pipeline/my_pipeline` to create a pipeline.
{
  "description": "Sample pipeline for illustration purposes",
  "last_modified": "2021-01-02T02:50:51.250Z",
  "pipeline_metadata": {
    "type": "logstash_pipeline",
    "version": 1
  },
  "username": "elastic",
  "pipeline": "input {}\\n filter { grok {} }\\n output {}",
  "pipeline_settings": {
    "pipeline.workers": 1,
    "pipeline.batch.size": 125,
    "pipeline.batch.delay": 50,
    "queue.type": "memory",
    "queue.max_bytes": "1gb",
    "queue.checkpoint.writes": 1024
  }
}

Delete a Logstash pipeline Added in 7.12.0

DELETE /_logstash/pipeline/{id}

Delete a pipeline that is used for Logstash Central Management. If the request succeeds, you receive an empty response with an appropriate status code.

External documentation

Path parameters

  • id string Required

    An identifier for the pipeline.

Responses

DELETE /_logstash/pipeline/{id}
curl \
 --request DELETE 'http://api.example.com/_logstash/pipeline/{id}' \
 --header "Authorization: $API_KEY"

Get Logstash pipelines Added in 7.12.0

GET /_logstash/pipeline

Get pipelines that are used for Logstash Central Management.

External documentation

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • description string Required

        A description of the pipeline. This description is not used by Elasticsearch or Logstash.

      • last_modified string | number Required

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • pipeline string Required

        The configuration for the pipeline.

        External documentation
      • pipeline_metadata object Required
        Hide pipeline_metadata attributes Show pipeline_metadata attributes object
      • pipeline_settings object Required
        Hide pipeline_settings attributes Show pipeline_settings attributes object
        • pipeline.workers number Required

          The number of workers that will, in parallel, execute the filter and output stages of the pipeline.

        • pipeline.batch.size number Required

          The maximum number of events an individual worker thread will collect from inputs before attempting to execute its filters and outputs.

        • pipeline.batch.delay number Required

          When creating pipeline event batches, how long in milliseconds to wait for each event before dispatching an undersized batch to pipeline workers.

        • queue.type string Required

          The internal queuing model to use for event buffering.

        • queue.max_bytes string Required

          The total capacity of the queue (queue.type: persisted) in number of bytes.

        • The maximum number of written events before forcing a checkpoint when persistent queues are enabled (queue.type: persisted).

      • username string Required

        The user who last updated the pipeline.

GET /_logstash/pipeline
curl \
 --request GET 'http://api.example.com/_logstash/pipeline' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _logstash/pipeline/my_pipeline`.
{
  "my_pipeline": {
    "description": "Sample pipeline for illustration purposes",
    "last_modified": "2021-01-02T02:50:51.250Z",
    "pipeline_metadata": {
      "type": "logstash_pipeline",
      "version": "1"
    },
    "username": "elastic",
    "pipeline": "input {}\\n filter { grok {} }\\n output {}",
    "pipeline_settings": {
      "pipeline.workers": 1,
      "pipeline.batch.size": 125,
      "pipeline.batch.delay": 50,
      "queue.type": "memory",
      "queue.max_bytes": "1gb",
      "queue.checkpoint.writes": 1024
    }
  }
}

Close anomaly detection jobs Added in 5.4.0

POST /_ml/anomaly_detectors/{job_id}/_close

A job can be opened and closed multiple times throughout its lifecycle. A closed job cannot receive data or perform analysis operations, but you can still explore and navigate results. When you close a job, it runs housekeeping tasks such as pruning the model history, flushing buffers, calculating final results and persisting the model snapshots. Depending upon the size of the job, it could take several minutes to close and the equivalent time to re-open. After it is closed, the job has a minimal overhead on the cluster except for maintaining its meta data. Therefore it is a best practice to close jobs that are no longer required to process data. If you close an anomaly detection job whose datafeed is running, the request first tries to stop the datafeed. This behavior is equivalent to calling stop datafeed API with the same timeout and force parameters as the close job request. When a datafeed that has a specified end date stops, it automatically closes its associated job.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job. It can be a job identifier, a group name, or a wildcard expression. You can close multiple anomaly detection jobs in a single API request by using a group name, a comma-separated list of jobs, or a wildcard expression. You can close all jobs by using _all or by specifying * as the job identifier.

Query parameters

  • Specifies what to do when the request: contains wildcard expressions and there are no jobs that match; contains the _all string or no identifiers and there are no matches; or contains wildcard expressions and there are only partial matches. By default, it returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If false, the request returns a 404 status code when there are no matches or only partial matches.

  • force boolean

    Use to close a failed job, or to forcefully close a job which has not responded to its initial close request; the request returns without performing the associated actions such as flushing buffers and persisting the model snapshots. If you want the job to be in a consistent state after the close job API returns, do not set to true. This parameter should be used only in situations where the job has already failed or where you are not interested in results the job might have recently produced or might produce in the future.

  • timeout string

    Controls the time to wait until a job has closed.

application/json

Body

  • Refer to the description for the allow_no_match query parameter.

  • force boolean

    Refer to the descriptiion for the force query parameter.

  • timeout string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
POST /_ml/anomaly_detectors/{job_id}/_close
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/{job_id}/_close' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"allow_no_match":true,"force":true,"timeout":"string"}'
Response examples (200)
A successful response when closing anomaly detection jobs.
{
  "closed": true
}

Get calendar configuration info Added in 6.2.0

GET /_ml/calendars/{calendar_id}

Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar. You can get information for multiple calendars by using a comma-separated list of ids or a wildcard expression. You can get information for all calendars by using _all or * or by omitting the calendar identifier.

Query parameters

  • from number

    Skips the specified number of calendars. This parameter is supported only when you omit the calendar identifier.

  • size number

    Specifies the maximum number of calendars to obtain. This parameter is supported only when you omit the calendar identifier.

application/json

Body

  • page object
    Hide page attributes Show page attributes object
    • from number

      Skips the specified number of items.

    • size number

      Specifies the maximum number of items to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • calendars array[object] Required
      Hide calendars attributes Show calendars attributes object
    • count number Required
GET /_ml/calendars/{calendar_id}
curl \
 --request GET 'http://api.example.com/_ml/calendars/{calendar_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"page":{"from":42.0,"size":42.0}}'

Create a calendar Added in 6.2.0

PUT /_ml/calendars/{calendar_id}

Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar.

application/json

Body

  • job_ids array[string]

    An array of anomaly detection job identifiers.

  • A description of the calendar.

Responses

PUT /_ml/calendars/{calendar_id}
curl \
 --request PUT 'http://api.example.com/_ml/calendars/{calendar_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"job_ids":["string"],"description":"string"}'

Get calendar configuration info Added in 6.2.0

POST /_ml/calendars/{calendar_id}

Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar. You can get information for multiple calendars by using a comma-separated list of ids or a wildcard expression. You can get information for all calendars by using _all or * or by omitting the calendar identifier.

Query parameters

  • from number

    Skips the specified number of calendars. This parameter is supported only when you omit the calendar identifier.

  • size number

    Specifies the maximum number of calendars to obtain. This parameter is supported only when you omit the calendar identifier.

application/json

Body

  • page object
    Hide page attributes Show page attributes object
    • from number

      Skips the specified number of items.

    • size number

      Specifies the maximum number of items to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • calendars array[object] Required
      Hide calendars attributes Show calendars attributes object
    • count number Required
POST /_ml/calendars/{calendar_id}
curl \
 --request POST 'http://api.example.com/_ml/calendars/{calendar_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"page":{"from":42.0,"size":42.0}}'

Delete a calendar Added in 6.2.0

DELETE /_ml/calendars/{calendar_id}

Remove all scheduled events from a calendar, then delete it.

Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/calendars/{calendar_id}
curl \
 --request DELETE 'http://api.example.com/_ml/calendars/{calendar_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting a calendar.
{
  "acknowledged": true
}

Delete events from a calendar Added in 6.2.0

DELETE /_ml/calendars/{calendar_id}/events/{event_id}

Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar.

  • event_id string Required

    Identifier for the scheduled event. You can obtain this identifier by using the get calendar events API.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/calendars/{calendar_id}/events/{event_id}
curl \
 --request DELETE 'http://api.example.com/_ml/calendars/{calendar_id}/events/{event_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting a calendar event.
{
  "acknowledged": true
}

Add anomaly detection job to calendar Added in 6.2.0

PUT /_ml/calendars/{calendar_id}/jobs/{job_id}

Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar.

  • job_id string | array[string] Required

    An identifier for the anomaly detection jobs. It can be a job identifier, a group name, or a comma-separated list of jobs or groups.

Responses

PUT /_ml/calendars/{calendar_id}/jobs/{job_id}
curl \
 --request PUT 'http://api.example.com/_ml/calendars/{calendar_id}/jobs/{job_id}' \
 --header "Authorization: $API_KEY"

Delete anomaly jobs from a calendar Added in 6.2.0

DELETE /_ml/calendars/{calendar_id}/jobs/{job_id}

Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar.

  • job_id string | array[string] Required

    An identifier for the anomaly detection jobs. It can be a job identifier, a group name, or a comma-separated list of jobs or groups.

Responses

DELETE /_ml/calendars/{calendar_id}/jobs/{job_id}
curl \
 --request DELETE 'http://api.example.com/_ml/calendars/{calendar_id}/jobs/{job_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting an anomaly detection job from a calendar.
{
  "calendar_id": "planned-outages",
  "job_ids": []
}

Get datafeeds configuration info Added in 5.5.0

GET /_ml/datafeeds/{datafeed_id}

You can get information for multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can get information for all datafeeds by using _all, by specifying * as the <feed_id>, or by omitting the <feed_id>. This API returns a maximum of 10,000 datafeeds.

Path parameters

  • datafeed_id string | array[string] Required

    Identifier for the datafeed. It can be a datafeed identifier or a wildcard expression. If you do not specify one of these options, the API returns information about all datafeeds.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no datafeeds that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • datafeeds array[object] Required
      Hide datafeeds attributes Show datafeeds attributes object
      • Hide authorization attributes Show authorization attributes object
        • api_key object
          Hide api_key attributes Show api_key attributes object
          • id string Required

            The identifier for the API key.

          • name string Required

            The name of the API key.

        • roles array[string]

          If a user ID was used for the most recent update to the datafeed, its roles at the time of the update are listed in the response.

        • If a service account was used for the most recent update to the datafeed, the account name is listed in the response.

      • Hide chunking_config attributes Show chunking_config attributes object
        • mode string Required

          Values are auto, manual, or off.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • datafeed_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • indices array[string] Required
      • indexes array[string]
      • job_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide script_fields attribute Show script_fields attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • script object Required
            Hide script attributes Show script attributes object
            • id string
            • params object

              Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            • options object
      • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • enabled boolean Required

          Specifies whether the datafeed periodically checks for delayed data.

      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
          • fetch_fields array[object]

            For type lookup

          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
            • id string
            • params object

              Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            • options object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • Hide indices_options attributes Show indices_options attributes object
        • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

        • expand_wildcards string | array[string]
        • If true, missing or closed indices are not included in the response.

        • If true, concrete, expanded or aliased indices are ignored when frozen.

      • query object Required

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {"boost": 1}}.

        Query DSL
GET /_ml/datafeeds/{datafeed_id}
curl \
 --request GET 'http://api.example.com/_ml/datafeeds/{datafeed_id}' \
 --header "Authorization: $API_KEY"

Create a datafeed Added in 5.4.0

PUT /_ml/datafeeds/{datafeed_id}

Datafeeds retrieve data from Elasticsearch for analysis by an anomaly detection job. You can associate only one datafeed with each anomaly detection job. The datafeed contains a query that runs at a defined interval (frequency). If you are concerned about delayed data, you can add a delay (query_delay') at each interval. By default, the datafeed uses the following query:{"match_all": {"boost": 1}}`.

When Elasticsearch security features are enabled, your datafeed remembers which roles the user who created it had at the time of creation and runs the query using those same roles. If you provide secondary authorization headers, those credentials are used instead. You must use Kibana, this API, or the create anomaly detection jobs API to create a datafeed. Do not add a datafeed directly to the .ml-config index. Do not give users write privileges on the .ml-config index.

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

Query parameters

  • If true, wildcard indices expressions that resolve into no concrete indices are ignored. This includes the _all string or when no indices are specified.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • ignore_throttled boolean Deprecated

    If true, concrete, expanded, or aliased indices are ignored when frozen.

  • If true, unavailable indices (missing or closed) are ignored.

application/json

Body Required

  • If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.

  • Hide chunking_config attributes Show chunking_config attributes object
    • mode string Required

      Values are auto, manual, or off.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • enabled boolean Required

      Specifies whether the datafeed periodically checks for delayed data.

  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • indices string | array[string]
  • Hide indices_options attributes Show indices_options attributes object
    • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

    • expand_wildcards string | array[string]
    • If true, missing or closed indices are not included in the response.

    • If true, concrete, expanded or aliased indices are ignored when frozen.

  • job_id string
  • If a real-time datafeed has never seen any data (including during any initial training period), it automatically stops and closes the associated job after this many real-time searches return no documents. In other words, it stops after frequency times max_empty_searches of real-time operation. If not set, a datafeed with no end time that sees no data remains started until it is explicitly stopped. By default, it is not set.

  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Hide runtime_mappings attribute Show runtime_mappings attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • fields object

        For type composite

        Hide fields attribute Show fields attribute object
        • * object Additional properties
          Hide * attribute Show * attribute object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • fetch_fields array[object]

        For type lookup

        Hide fetch_fields attributes Show fetch_fields attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • format string
      • format string

        A custom format for date type runtime fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • script object
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
      • type string Required

        Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

  • Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.

    Hide script_fields attribute Show script_fields attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • script object Required
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
  • The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window, which is 10,000 by default.

  • headers object

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide authorization attributes Show authorization attributes object
      • api_key object
        Hide api_key attributes Show api_key attributes object
        • id string Required

          The identifier for the API key.

        • name string Required

          The name of the API key.

      • roles array[string]

        If a user ID was used for the most recent update to the datafeed, its roles at the time of the update are listed in the response.

      • If a service account was used for the most recent update to the datafeed, the account name is listed in the response.

    • chunking_config object Required
      Hide chunking_config attributes Show chunking_config attributes object
      • mode string Required

        Values are auto, manual, or off.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • enabled boolean Required

        Specifies whether the datafeed periodically checks for delayed data.

    • datafeed_id string Required
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • indices array[string] Required
    • job_id string Required
    • Hide indices_options attributes Show indices_options attributes object
      • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

      • expand_wildcards string | array[string]
      • If true, missing or closed indices are not included in the response.

      • If true, concrete, expanded or aliased indices are ignored when frozen.

    • query object Required

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • query_delay string Required

      A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide runtime_mappings attribute Show runtime_mappings attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • Hide script_fields attribute Show script_fields attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • script object Required
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
    • scroll_size number Required
PUT /_ml/datafeeds/{datafeed_id}
curl \
 --request PUT 'http://api.example.com/_ml/datafeeds/{datafeed_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"aggregations":{},"chunking_config":{"mode":"auto","time_span":"string"},"delayed_data_check_config":{"check_window":"string","enabled":true},"frequency":"string","indices":"string","indices_options":{"allow_no_indices":true,"expand_wildcards":"string","ignore_unavailable":true,"ignore_throttled":true},"job_id":"string","max_empty_searches":42.0,"query":{},"query_delay":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"script_fields":{"additionalProperty1":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true},"additionalProperty2":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true}},"scroll_size":42.0,"headers":{}}'

Delete a datafeed Added in 5.4.0

DELETE /_ml/datafeeds/{datafeed_id}

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

Query parameters

  • force boolean

    Use to forcefully delete a started datafeed; this method is quicker than stopping and deleting the datafeed.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/datafeeds/{datafeed_id}
curl \
 --request DELETE 'http://api.example.com/_ml/datafeeds/{datafeed_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting a datafeed.
{
  "acknowledged": true
}

Get filters Added in 5.5.0

GET /_ml/filters/{filter_id}

You can get a single filter or all filters.

Path parameters

  • filter_id string | array[string] Required

    A string that uniquely identifies a filter.

Query parameters

  • from number

    Skips the specified number of filters.

  • size number

    Specifies the maximum number of filters to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • filters array[object] Required
      Hide filters attributes Show filters attributes object
      • A description of the filter.

      • filter_id string Required
      • items array[string] Required

        An array of strings which is the filter item list.

GET /_ml/filters/{filter_id}
curl \
 --request GET 'http://api.example.com/_ml/filters/{filter_id}' \
 --header "Authorization: $API_KEY"

Create a filter Added in 5.4.0

PUT /_ml/filters/{filter_id}

A filter contains a list of strings. It can be used by one or more anomaly detection jobs. Specifically, filters are referenced in the custom_rules property of detector configuration objects.

Path parameters

  • filter_id string Required

    A string that uniquely identifies a filter.

application/json

Body Required

  • A description of the filter.

  • items array[string]

    The items of the filter. A wildcard * can be used at the beginning or the end of an item. Up to 10000 items are allowed in each filter.

Responses

PUT /_ml/filters/{filter_id}
curl \
 --request PUT 'http://api.example.com/_ml/filters/{filter_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"description":"string","items":["string"]}'

Delete a filter Added in 5.4.0

DELETE /_ml/filters/{filter_id}

If an anomaly detection job references the filter, you cannot delete the filter. You must update or delete the job before you can delete the filter.

Path parameters

  • filter_id string Required

    A string that uniquely identifies a filter.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/filters/{filter_id}
curl \
 --request DELETE 'http://api.example.com/_ml/filters/{filter_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting a filter.
{
  "acknowledged": true
}

Get anomaly detection jobs configuration info Added in 5.5.0

GET /_ml/anomaly_detectors/{job_id}

You can get information for multiple anomaly detection jobs in a single API request by using a group name, a comma-separated list of jobs, or a wildcard expression. You can get information for all anomaly detection jobs by using _all, by specifying * as the <job_id>, or by omitting the <job_id>.

Path parameters

  • job_id string | array[string] Required

    Identifier for the anomaly detection job. It can be a job identifier, a group name, or a wildcard expression. If you do not specify one of these options, the API returns information for all anomaly detection jobs.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • jobs array[object] Required
      Hide jobs attributes Show jobs attributes object
      • allow_lazy_open boolean Required

        Advanced configuration option. Specifies whether this job can open when there is insufficient machine learning node capacity for it to be immediately assigned to a node.

      • analysis_config object Required
        Hide analysis_config attributes Show analysis_config attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • categorization_analyzer string | object

          One of:
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters. The effect is exactly the same.

        • detectors array[object] Required

          Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job. If the detectors array does not contain at least one detector, no analysis can occur and an error is returned.

          Hide detectors attributes Show detectors attributes object
          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • custom_rules array[object]

            Custom rules enable you to customize the way detectors operate. For example, a rule may dictate conditions under which results should be skipped. Kibana refers to custom rules as job rules.

          • A description of the detector.

          • A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero. If you specify a value for this property, it is ignored.

          • Values are all, none, by, or over.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • function string

            The analysis function that is used. For example, count, rare, mean, min, max, or sum.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • use_null boolean

            Defines whether a new series is used as the null series when there is no value for the by or partition fields.

        • influencers array[string]

          A comma separated list of influencer field names. Typically these can be the by, over, or partition fields that are used in the detector configuration. You might also want to use a field name that is not specifically named in a detector, but is available as part of the input data. When you use multiple detectors, the use of influencers is recommended as it aggregates results for each influencer entity.

        • latency string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold. For example, suppose CPU and memory usage on host A is usually highly correlated with the same metrics on host B. Perhaps this correlation occurs because they are running a load-balanced application. If you enable this property, anomalies will be reported when, for example, CPU usage on host A is high and the value of CPU usage on host B is low. That is to say, you’ll see an anomaly when the CPU of host A is unusual given the CPU of host B. To use the multivariate_by_fields property, you must also specify by_field_name in your detector.

        • Hide per_partition_categorization attributes Show per_partition_categorization attributes object
          • enabled boolean

            To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

          • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide analysis_limits attributes Show analysis_limits attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • blocked object
        Hide blocked attributes Show blocked attributes object
      • create_time string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • Custom metadata about the job

      • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies a period of time (in days) after which only the first snapshot per day is retained. This period is relative to the timestamp of the most recent snapshot for this job. Valid values range from 0 to model_snapshot_retention_days.

      • data_description object Required
        Hide data_description attributes Show data_description attributes object
        • format string

          Only JSON format is supported at this time.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • The time format, which can be epoch, epoch_ms, or a custom pattern. The value epoch refers to UNIX or Epoch time (the number of seconds since 1 Jan 1970). The value epoch_ms indicates that time is measured in milliseconds since the epoch. The epoch and epoch_ms time formats accept either integer or real values. Custom patterns must conform to the Java DateTimeFormatter class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: yyyy-MM-dd'T'HH:mm:ssX. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.

      • Hide datafeed_config attributes Show datafeed_config attributes object
        • Hide authorization attributes Show authorization attributes object
          • api_key object
            Hide api_key attributes Show api_key attributes object
            • id string Required

              The identifier for the API key.

            • name string Required

              The name of the API key.

          • roles array[string]

            If a user ID was used for the most recent update to the datafeed, its roles at the time of the update are listed in the response.

          • If a service account was used for the most recent update to the datafeed, the account name is listed in the response.

        • Hide chunking_config attributes Show chunking_config attributes object
          • mode string Required

            Values are auto, manual, or off.

          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • datafeed_id string Required
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • indices array[string] Required
        • indexes array[string]
        • job_id string Required
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide script_fields attribute Show script_fields attribute object
        • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • enabled boolean Required

            Specifies whether the datafeed periodically checks for delayed data.

        • Hide runtime_mappings attribute Show runtime_mappings attribute object
          • * object Additional properties
            Hide * attributes Show * attributes object
            • fields object

              For type composite

            • fetch_fields array[object]

              For type lookup

            • format string

              A custom format for date type runtime fields.

            • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • script object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • Hide indices_options attributes Show indices_options attributes object
          • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

          • expand_wildcards string | array[string]
          • If true, missing or closed indices are not included in the response.

          • If true, concrete, expanded or aliased indices are ignored when frozen.

        • query object Required

          The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {"boost": 1}}.

          Query DSL
      • deleting boolean

        Indicates that the process of deleting the job is in progress but not yet completed. It is only reported when true.

      • A description of the job.

      • finished_time string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • groups array[string]

        A list of job groups. A job can belong to no groups or many.

      • job_id string Required
      • job_type string

        Reserved for future use, currently set to anomaly_detector.

      • Hide model_plot_config attributes Show model_plot_config attributes object
        • If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.

        • enabled boolean

          If true, enables calculation and storage of the model bounds for each entity that is being analyzed.

        • terms string

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies the maximum period of time (in days) that snapshots are retained. This period is relative to the timestamp of the most recent snapshot for this job. By default, snapshots ten days older than the newest snapshot are deleted.

      • Advanced configuration option. The period over which adjustments to the score are applied, as new data is seen. The default value is the longer of 30 days or 100 bucket_spans.

      • results_index_name string Required
      • Advanced configuration option. The period of time (in days) that results are retained. Age is calculated relative to the timestamp of the latest bucket result. If this property has a non-null value, once per day at 00:30 (server time), results that are the specified number of days older than the latest bucket result are deleted from Elasticsearch. The default value is null, which means all results are retained. Annotations generated by the system also count as results for retention purposes; they are deleted after the same number of days as results. Annotations added by users are retained forever.

GET /_ml/anomaly_detectors/{job_id}
curl \
 --request GET 'http://api.example.com/_ml/anomaly_detectors/{job_id}' \
 --header "Authorization: $API_KEY"

Create an anomaly detection job Added in 5.4.0

PUT /_ml/anomaly_detectors/{job_id}

If you include a datafeed_config, you must have read index privileges on the source index. If you include a datafeed_config but do not provide a query, the datafeed uses {"match_all": {"boost": 1}}.

Path parameters

  • job_id string Required

    The identifier for the anomaly detection job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

Query parameters

  • If true, wildcard indices expressions that resolve into no concrete indices are ignored. This includes the _all string or when no indices are specified.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values. Valid values are:

    • all: Match any data stream or index, including hidden ones.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard patterns are not accepted.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • ignore_throttled boolean Deprecated

    If true, concrete, expanded or aliased indices are ignored when frozen.

  • If true, unavailable indices (missing or closed) are ignored.

application/json

Body Required

  • Advanced configuration option. Specifies whether this job can open when there is insufficient machine learning node capacity for it to be immediately assigned to a node. By default, if a machine learning node with capacity to run the job cannot immediately be found, the open anomaly detection jobs API returns an error. However, this is also subject to the cluster-wide xpack.ml.max_lazy_ml_nodes setting. If this option is set to true, the open anomaly detection jobs API does not return an error and the job waits in the opening state until sufficient machine learning node capacity is available.

  • analysis_config object Required
    Hide analysis_config attributes Show analysis_config attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • categorization_analyzer string | object

      One of:
    • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters. The effect is exactly the same.

    • detectors array[object] Required

      Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job. If the detectors array does not contain at least one detector, no analysis can occur and an error is returned.

      Hide detectors attributes Show detectors attributes object
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • custom_rules array[object]

        Custom rules enable you to customize the way detectors operate. For example, a rule may dictate conditions under which results should be skipped. Kibana refers to custom rules as job rules.

        Hide custom_rules attributes Show custom_rules attributes object
        • actions array[string]

          The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.

          Supported values include:

          • skip_result: The result will not be created. Unless you also specify skip_model_update, the model will be updated as usual with the corresponding series value.
          • skip_model_update: The value for that series will not be used to update the model. Unless you also specify skip_result, the results will be created as usual. This action is suitable when certain values are expected to be consistently anomalous and they affect the model in a way that negatively impacts the rest of the results.

          Values are skip_result or skip_model_update.

        • conditions array[object]

          An array of numeric conditions when the rule applies. A rule must either have a non-empty scope or at least one condition. Multiple conditions are combined together with a logical AND.

        • scope object

          A scope of series where the rule applies. A rule must either have a non-empty scope or at least one condition. By default, the scope includes all series. Scoping is allowed for any of the fields that are also specified in by_field_name, over_field_name, or partition_field_name.

          Hide scope attribute Show scope attribute object
          • * object Additional properties
      • A description of the detector.

      • A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero. If you specify a value for this property, it is ignored.

      • Values are all, none, by, or over.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • function string

        The analysis function that is used. For example, count, rare, mean, min, max, or sum.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • use_null boolean

        Defines whether a new series is used as the null series when there is no value for the by or partition fields.

    • influencers array[string]

      A comma separated list of influencer field names. Typically these can be the by, over, or partition fields that are used in the detector configuration. You might also want to use a field name that is not specifically named in a detector, but is available as part of the input data. When you use multiple detectors, the use of influencers is recommended as it aggregates results for each influencer entity.

    • latency string

      A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold. For example, suppose CPU and memory usage on host A is usually highly correlated with the same metrics on host B. Perhaps this correlation occurs because they are running a load-balanced application. If you enable this property, anomalies will be reported when, for example, CPU usage on host A is high and the value of CPU usage on host B is low. That is to say, you’ll see an anomaly when the CPU of host A is unusual given the CPU of host B. To use the multivariate_by_fields property, you must also specify by_field_name in your detector.

    • Hide per_partition_categorization attributes Show per_partition_categorization attributes object
      • enabled boolean

        To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

      • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

    • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • Hide analysis_limits attributes Show analysis_limits attributes object
  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Custom metadata about the job

  • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies a period of time (in days) after which only the first snapshot per day is retained. This period is relative to the timestamp of the most recent snapshot for this job. Valid values range from 0 to model_snapshot_retention_days.

  • data_description object Required
    Hide data_description attributes Show data_description attributes object
    • format string

      Only JSON format is supported at this time.

    • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • The time format, which can be epoch, epoch_ms, or a custom pattern. The value epoch refers to UNIX or Epoch time (the number of seconds since 1 Jan 1970). The value epoch_ms indicates that time is measured in milliseconds since the epoch. The epoch and epoch_ms time formats accept either integer or real values. Custom patterns must conform to the Java DateTimeFormatter class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: yyyy-MM-dd'T'HH:mm:ssX. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.

  • Hide datafeed_config attributes Show datafeed_config attributes object
    • If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.

    • Hide chunking_config attributes Show chunking_config attributes object
      • mode string Required

        Values are auto, manual, or off.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • enabled boolean Required

        Specifies whether the datafeed periodically checks for delayed data.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • indices string | array[string]
    • Hide indices_options attributes Show indices_options attributes object
      • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

      • expand_wildcards string | array[string]
      • If true, missing or closed indices are not included in the response.

      • If true, concrete, expanded or aliased indices are ignored when frozen.

    • job_id string
    • If a real-time datafeed has never seen any data (including during any initial training period) then it will automatically stop itself and close its associated job after this many real-time searches that return no documents. In other words, it will stop after frequency times max_empty_searches of real-time operation. If not set then a datafeed with no end time that sees no data will remain started until it is explicitly stopped.

    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide runtime_mappings attribute Show runtime_mappings attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.

      Hide script_fields attribute Show script_fields attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • script object Required
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
    • The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window, which is 10,000 by default.

  • A description of the job.

  • job_id string
  • groups array[string]

    A list of job groups. A job can belong to no groups or many.

  • Hide model_plot_config attributes Show model_plot_config attributes object
    • If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.

    • enabled boolean

      If true, enables calculation and storage of the model bounds for each entity that is being analyzed.

    • terms string

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies the maximum period of time (in days) that snapshots are retained. This period is relative to the timestamp of the most recent snapshot for this job. By default, snapshots ten days older than the newest snapshot are deleted.

  • Advanced configuration option. The period over which adjustments to the score are applied, as new data is seen. The default value is the longer of 30 days or 100 bucket spans.

  • Advanced configuration option. The period of time (in days) that results are retained. Age is calculated relative to the timestamp of the latest bucket result. If this property has a non-null value, once per day at 00:30 (server time), results that are the specified number of days older than the latest bucket result are deleted from Elasticsearch. The default value is null, which means all results are retained. Annotations generated by the system also count as results for retention purposes; they are deleted after the same number of days as results. Annotations added by users are retained forever.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • allow_lazy_open boolean Required
    • analysis_config object Required
      Hide analysis_config attributes Show analysis_config attributes object
      • bucket_span string Required

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • categorization_analyzer string | object

        One of:
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values.

      • detectors array[object] Required

        An array of detector configuration objects. Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job.

        Hide detectors attributes Show detectors attributes object
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • custom_rules array[object]

          An array of custom rule objects, which enable you to customize the way detectors operate. For example, a rule may dictate to the detector conditions under which results should be skipped. Kibana refers to custom rules as job rules.

          Hide custom_rules attributes Show custom_rules attributes object
          • actions array[string]

            The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.

            Supported values include:

            • skip_result: The result will not be created. Unless you also specify skip_model_update, the model will be updated as usual with the corresponding series value.
            • skip_model_update: The value for that series will not be used to update the model. Unless you also specify skip_result, the results will be created as usual. This action is suitable when certain values are expected to be consistently anomalous and they affect the model in a way that negatively impacts the rest of the results.

            Values are skip_result or skip_model_update.

          • conditions array[object]

            An array of numeric conditions when the rule applies. A rule must either have a non-empty scope or at least one condition. Multiple conditions are combined together with a logical AND.

          • scope object

            A scope of series where the rule applies. A rule must either have a non-empty scope or at least one condition. By default, the scope includes all series. Scoping is allowed for any of the fields that are also specified in by_field_name, over_field_name, or partition_field_name.

        • A description of the detector.

        • A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero.

        • Values are all, none, by, or over.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • function string Required

          The analysis function that is used. For example, count, rare, mean, min, max, and sum.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • use_null boolean

          Defines whether a new series is used as the null series when there is no value for the by or partition fields.

      • influencers array[string] Required

        A comma separated list of influencer field names. Typically these can be the by, over, or partition fields that are used in the detector configuration. You might also want to use a field name that is not specifically named in a detector, but is available as part of the input data. When you use multiple detectors, the use of influencers is recommended as it aggregates results for each influencer entity.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • latency string

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold.

      • Hide per_partition_categorization attributes Show per_partition_categorization attributes object
        • enabled boolean

          To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

        • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • analysis_limits object Required
      Hide analysis_limits attributes Show analysis_limits attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • create_time string | number Required

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • Custom metadata about the job

    • data_description object Required
      Hide data_description attributes Show data_description attributes object
      • format string

        Only JSON format is supported at this time.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • The time format, which can be epoch, epoch_ms, or a custom pattern. The value epoch refers to UNIX or Epoch time (the number of seconds since 1 Jan 1970). The value epoch_ms indicates that time is measured in milliseconds since the epoch. The epoch and epoch_ms time formats accept either integer or real values. Custom patterns must conform to the Java DateTimeFormatter class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: yyyy-MM-dd'T'HH:mm:ssX. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.

    • Hide datafeed_config attributes Show datafeed_config attributes object
      • Hide authorization attributes Show authorization attributes object
        • api_key object
          Hide api_key attributes Show api_key attributes object
          • id string Required

            The identifier for the API key.

          • name string Required

            The name of the API key.

        • roles array[string]

          If a user ID was used for the most recent update to the datafeed, its roles at the time of the update are listed in the response.

        • If a service account was used for the most recent update to the datafeed, the account name is listed in the response.

      • Hide chunking_config attributes Show chunking_config attributes object
        • mode string Required

          Values are auto, manual, or off.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • datafeed_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • indices array[string] Required
      • indexes array[string]
      • job_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide script_fields attribute Show script_fields attribute object
      • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • enabled boolean Required

          Specifies whether the datafeed periodically checks for delayed data.

      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • Hide indices_options attributes Show indices_options attributes object
        • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

        • expand_wildcards string | array[string]
        • If true, missing or closed indices are not included in the response.

        • If true, concrete, expanded or aliased indices are ignored when frozen.

      • query object Required

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {"boost": 1}}.

        Query DSL
    • groups array[string]
    • job_id string Required
    • job_type string Required
    • job_version string Required
    • Hide model_plot_config attributes Show model_plot_config attributes object
      • If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.

      • enabled boolean

        If true, enables calculation and storage of the model bounds for each entity that is being analyzed.

      • terms string

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • results_index_name string Required
PUT /_ml/anomaly_detectors/{job_id}
curl \
 --request PUT 'http://api.example.com/_ml/anomaly_detectors/{job_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"analysis_config\": {\n    \"bucket_span\": \"15m\",\n    \"detectors\": [\n      {\n        \"detector_description\": \"Sum of bytes\",\n        \"function\": \"sum\",\n        \"field_name\": \"bytes\"\n      }\n    ]\n  },\n  \"data_description\": {\n    \"time_field\": \"timestamp\",\n    \"time_format\": \"epoch_ms\"\n  },\n  \"analysis_limits\": {\n    \"model_memory_limit\": \"11MB\"\n  },\n  \"model_plot_config\": {\n    \"enabled\": true,\n    \"annotations_enabled\": true\n  },\n  \"results_index_name\": \"test-job1\",\n  \"datafeed_config\": {\n    \"indices\": [\n      \"kibana_sample_data_logs\"\n    ],\n    \"query\": {\n      \"bool\": {\n        \"must\": [\n          {\n            \"match_all\": {}\n          }\n        ]\n      }\n    },\n    \"runtime_mappings\": {\n      \"hour_of_day\": {\n        \"type\": \"long\",\n        \"script\": {\n          \"source\": \"emit(doc['timestamp'].value.getHour());\"\n        }\n      }\n    },\n    \"datafeed_id\": \"datafeed-test-job1\"\n  }\n}"'
Request example
A request to create an anomaly detection job and datafeed.
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "detector_description": "Sum of bytes",
        "function": "sum",
        "field_name": "bytes"
      }
    ]
  },
  "data_description": {
    "time_field": "timestamp",
    "time_format": "epoch_ms"
  },
  "analysis_limits": {
    "model_memory_limit": "11MB"
  },
  "model_plot_config": {
    "enabled": true,
    "annotations_enabled": true
  },
  "results_index_name": "test-job1",
  "datafeed_config": {
    "indices": [
      "kibana_sample_data_logs"
    ],
    "query": {
      "bool": {
        "must": [
          {
            "match_all": {}
          }
        ]
      }
    },
    "runtime_mappings": {
      "hour_of_day": {
        "type": "long",
        "script": {
          "source": "emit(doc['timestamp'].value.getHour());"
        }
      }
    },
    "datafeed_id": "datafeed-test-job1"
  }
}
Response examples (200)
A successful response when creating an anomaly detection job and datafeed.
{
  "job_id": "test-job1",
  "job_type": "anomaly_detector",
  "job_version": "8.4.0",
  "create_time": 1656087283340,
  "datafeed_config": {
    "datafeed_id": "datafeed-test-job1",
    "job_id": "test-job1",
    "authorization": {
      "roles": [
        "superuser"
      ]
    },
    "query_delay": "61499ms",
    "chunking_config": {
      "mode": "auto"
    },
    "indices_options": {
      "expand_wildcards": [
        "open"
      ],
      "ignore_unavailable": false,
      "allow_no_indices": true,
      "ignore_throttled": true
    },
    "query": {
      "bool": {
        "must": [
          {
            "match_all": {}
          }
        ]
      }
    },
    "indices": [
      "kibana_sample_data_logs"
    ],
    "scroll_size": 1000,
    "delayed_data_check_config": {
      "enabled": true
    },
    "runtime_mappings": {
      "hour_of_day": {
        "type": "long",
        "script": {
          "source": "emit(doc['timestamp'].value.getHour());"
        }
      }
    }
  },
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      {
        "detector_description": "Sum of bytes",
        "function": "sum",
        "field_name": "bytes",
        "detector_index": 0
      }
    ],
    "influencers": [],
    "model_prune_window": "30d"
  },
  "analysis_limits": {
    "model_memory_limit": "11mb",
    "categorization_examples_limit": 4
  },
  "data_description": {
    "time_field": "timestamp",
    "time_format": "epoch_ms"
  },
  "model_plot_config": {
    "enabled": true,
    "annotations_enabled": true
  },
  "model_snapshot_retention_days": 10,
  "daily_model_snapshot_retention_after_days": 1,
  "results_index_name": "custom-test-job1",
  "allow_lazy_open": false
}

Delete an anomaly detection job Added in 5.4.0

DELETE /_ml/anomaly_detectors/{job_id}

All job configuration, model state and results are deleted. It is not currently possible to delete multiple jobs using wildcards or a comma separated list. If you delete a job that has a datafeed, the request first tries to delete the datafeed. This behavior is equivalent to calling the delete datafeed API with the same timeout and force parameters as the delete job request.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • force boolean

    Use to forcefully delete an opened job; this method is quicker than closing and deleting the job.

  • Specifies whether annotations that have been added by the user should be deleted along with any auto-generated annotations when the job is reset.

  • Specifies whether the request should return immediately or wait until the job deletion completes.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/anomaly_detectors/{job_id}
curl \
 --request DELETE 'http://api.example.com/_ml/anomaly_detectors/{job_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting an anomaly detection job.
{
  "acknowledged": true
}
A successful response when deleting an anomaly detection job asynchronously. When the `wait_for_completion` query parameter is set to `false`, the response contains an identifier for the job deletion task.
{
  "task": "oTUltX4IQMOUUVeiohTt8A:39"
}

Estimate job model memory usage Added in 7.7.0

POST /_ml/anomaly_detectors/_estimate_model_memory

Make an estimation of the memory usage for an anomaly detection job model. The estimate is based on analysis configuration details for the job and cardinality estimates for the fields it references.

application/json

Body Required

  • Hide analysis_config attributes Show analysis_config attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • categorization_analyzer string | object

      One of:
    • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters. The effect is exactly the same.

    • detectors array[object] Required

      Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job. If the detectors array does not contain at least one detector, no analysis can occur and an error is returned.

      Hide detectors attributes Show detectors attributes object
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • custom_rules array[object]

        Custom rules enable you to customize the way detectors operate. For example, a rule may dictate conditions under which results should be skipped. Kibana refers to custom rules as job rules.

        Hide custom_rules attributes Show custom_rules attributes object
        • actions array[string]

          The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.

          Supported values include:

          • skip_result: The result will not be created. Unless you also specify skip_model_update, the model will be updated as usual with the corresponding series value.
          • skip_model_update: The value for that series will not be used to update the model. Unless you also specify skip_result, the results will be created as usual. This action is suitable when certain values are expected to be consistently anomalous and they affect the model in a way that negatively impacts the rest of the results.

          Values are skip_result or skip_model_update.

        • conditions array[object]

          An array of numeric conditions when the rule applies. A rule must either have a non-empty scope or at least one condition. Multiple conditions are combined together with a logical AND.

        • scope object

          A scope of series where the rule applies. A rule must either have a non-empty scope or at least one condition. By default, the scope includes all series. Scoping is allowed for any of the fields that are also specified in by_field_name, over_field_name, or partition_field_name.

          Hide scope attribute Show scope attribute object
          • * object Additional properties
      • A description of the detector.

      • A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero. If you specify a value for this property, it is ignored.

      • Values are all, none, by, or over.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • function string

        The analysis function that is used. For example, count, rare, mean, min, max, or sum.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • use_null boolean

        Defines whether a new series is used as the null series when there is no value for the by or partition fields.

    • influencers array[string]

      A comma separated list of influencer field names. Typically these can be the by, over, or partition fields that are used in the detector configuration. You might also want to use a field name that is not specifically named in a detector, but is available as part of the input data. When you use multiple detectors, the use of influencers is recommended as it aggregates results for each influencer entity.

    • latency string

      A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold. For example, suppose CPU and memory usage on host A is usually highly correlated with the same metrics on host B. Perhaps this correlation occurs because they are running a load-balanced application. If you enable this property, anomalies will be reported when, for example, CPU usage on host A is high and the value of CPU usage on host B is low. That is to say, you’ll see an anomaly when the CPU of host A is unusual given the CPU of host B. To use the multivariate_by_fields property, you must also specify by_field_name in your detector.

    • Hide per_partition_categorization attributes Show per_partition_categorization attributes object
      • enabled boolean

        To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

      • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

    • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • Estimates of the highest cardinality in a single bucket that is observed for influencer fields over the time period that the job analyzes data. To produce a good answer, values must be provided for all influencer fields. Providing values for fields that are not listed as influencers has no effect on the estimation.

    Hide max_bucket_cardinality attribute Show max_bucket_cardinality attribute object
    • * number Additional properties
  • Estimates of the cardinality that is observed for fields over the whole time period that the job analyzes data. To produce a good answer, values must be provided for fields referenced in the by_field_name, over_field_name and partition_field_name of any detectors. Providing values for other fields has no effect on the estimation. It can be omitted from the request if no detectors have a by_field_name, over_field_name or partition_field_name.

    Hide overall_cardinality attribute Show overall_cardinality attribute object
    • * number Additional properties

Responses

POST /_ml/anomaly_detectors/_estimate_model_memory
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/_estimate_model_memory' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"analysis_config\": {\n    \"bucket_span\": \"5m\",\n    \"detectors\": [\n      {\n        \"function\": \"sum\",\n        \"field_name\": \"bytes\",\n        \"by_field_name\": \"status\",\n        \"partition_field_name\": \"app\"\n      }\n    ],\n    \"influencers\": [\n      \"source_ip\",\n      \"dest_ip\"\n    ]\n  },\n  \"overall_cardinality\": {\n    \"status\": 10,\n    \"app\": 50\n  },\n  \"max_bucket_cardinality\": {\n    \"source_ip\": 300,\n    \"dest_ip\": 30\n  }\n}"'
Request example
Run `POST _ml/anomaly_detectors/_estimate_model_memory` to estimate the model memory limit based on the analysis configuration details provided in the request body.
{
  "analysis_config": {
    "bucket_span": "5m",
    "detectors": [
      {
        "function": "sum",
        "field_name": "bytes",
        "by_field_name": "status",
        "partition_field_name": "app"
      }
    ],
    "influencers": [
      "source_ip",
      "dest_ip"
    ]
  },
  "overall_cardinality": {
    "status": 10,
    "app": 50
  },
  "max_bucket_cardinality": {
    "source_ip": 300,
    "dest_ip": 30
  }
}
Response examples (200)
A successful response from `POST _ml/anomaly_detectors/_estimate_model_memory`.
{
  "model_memory_estimate": "21mb"
}

Force buffered data to be processed Deprecated Added in 5.4.0

POST /_ml/anomaly_detectors/{job_id}/_flush

The flush jobs API is only applicable when sending data for analysis using the post data API. Depending on the content of the buffer, then it might additionally calculate new results. Both flush and close operations are similar, however the flush is more efficient if you are expecting to send more data for analysis. When flushing, the job remains open and is available to continue analyzing data. A close operation additionally prunes and persists the model state to disk and the job must be opened again before analyzing further data.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • advance_time string | number

    Specifies to advance to a particular time value. Results are generated and the model is updated for data from the specified time interval.

  • If true, calculates the interim results for the most recent bucket or all buckets within the latency period.

  • end string | number

    When used in conjunction with calc_interim and start, specifies the range of buckets on which to calculate interim results.

  • skip_time string | number

    Specifies to skip to a particular time value. Results are not generated and the model is not updated for data from the specified time interval.

  • start string | number

    When used in conjunction with calc_interim, specifies the range of buckets on which to calculate interim results.

application/json

Body

  • advance_time string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • Refer to the description for the calc_interim query parameter.

  • end string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • skip_time string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • start string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
POST /_ml/anomaly_detectors/{job_id}/_flush
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/{job_id}/_flush' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"":"string","calc_interim":true}'

Get info about events in calendars Added in 6.2.0

GET /_ml/calendars/{calendar_id}/events

Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar. You can get information for multiple calendars by using a comma-separated list of ids or a wildcard expression. You can get information for all calendars by using _all or * or by omitting the calendar identifier.

Query parameters

  • end string | number

    Specifies to get events with timestamps earlier than this time.

  • from number

    Skips the specified number of events.

  • job_id string

    Specifies to get events for a specific anomaly detection job identifier or job group. It must be used with a calendar identifier of _all or *.

  • size number

    Specifies the maximum number of events to obtain.

  • start string | number

    Specifies to get events with timestamps after this time.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • events array[object] Required
      Hide events attributes Show events attributes object
      • event_id string
      • description string Required

        A description of the scheduled event.

      • end_time string | number Required

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • start_time string | number Required

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • When true the model will not create results for this calendar period.

      • When true the model will not be updated for this calendar period.

      • Shift time by this many seconds. For example adjust time for daylight savings changes

GET /_ml/calendars/{calendar_id}/events
curl \
 --request GET 'http://api.example.com/_ml/calendars/{calendar_id}/events' \
 --header "Authorization: $API_KEY"

Add scheduled events to the calendar Added in 6.2.0

POST /_ml/calendars/{calendar_id}/events

Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar.

application/json

Body Required

  • events array[object] Required

    A list of one of more scheduled events. The event’s start and end times can be specified as integer milliseconds since the epoch or as a string in ISO 8601 format.

    Hide events attributes Show events attributes object
    • event_id string
    • description string Required

      A description of the scheduled event.

    • end_time string | number Required

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • start_time string | number Required

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • When true the model will not create results for this calendar period.

    • When true the model will not be updated for this calendar period.

    • Shift time by this many seconds. For example adjust time for daylight savings changes

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • events array[object] Required
      Hide events attributes Show events attributes object
      • event_id string
      • description string Required

        A description of the scheduled event.

      • end_time string | number Required

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • start_time string | number Required

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • When true the model will not create results for this calendar period.

      • When true the model will not be updated for this calendar period.

      • Shift time by this many seconds. For example adjust time for daylight savings changes

POST /_ml/calendars/{calendar_id}/events
curl \
 --request POST 'http://api.example.com/_ml/calendars/{calendar_id}/events' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"events":[{"calendar_id":"string","event_id":"string","description":"string","":"string","skip_result":true,"skip_model_update":true,"force_time_shift":42.0}]}'

Get calendar configuration info Added in 6.2.0

GET /_ml/calendars

Query parameters

  • from number

    Skips the specified number of calendars. This parameter is supported only when you omit the calendar identifier.

  • size number

    Specifies the maximum number of calendars to obtain. This parameter is supported only when you omit the calendar identifier.

application/json

Body

  • page object
    Hide page attributes Show page attributes object
    • from number

      Skips the specified number of items.

    • size number

      Specifies the maximum number of items to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • calendars array[object] Required
      Hide calendars attributes Show calendars attributes object
    • count number Required
GET /_ml/calendars
curl \
 --request GET 'http://api.example.com/_ml/calendars' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"page":{"from":42.0,"size":42.0}}'

Get calendar configuration info Added in 6.2.0

POST /_ml/calendars

Query parameters

  • from number

    Skips the specified number of calendars. This parameter is supported only when you omit the calendar identifier.

  • size number

    Specifies the maximum number of calendars to obtain. This parameter is supported only when you omit the calendar identifier.

application/json

Body

  • page object
    Hide page attributes Show page attributes object
    • from number

      Skips the specified number of items.

    • size number

      Specifies the maximum number of items to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • calendars array[object] Required
      Hide calendars attributes Show calendars attributes object
    • count number Required
POST /_ml/calendars
curl \
 --request POST 'http://api.example.com/_ml/calendars' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"page":{"from":42.0,"size":42.0}}'

Get datafeed stats Added in 5.5.0

GET /_ml/datafeeds/{datafeed_id}/_stats

You can get statistics for multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can get statistics for all datafeeds by using _all, by specifying * as the <feed_id>, or by omitting the <feed_id>. If the datafeed is stopped, the only information you receive is the datafeed_id and the state. This API returns a maximum of 10,000 datafeeds.

Path parameters

  • datafeed_id string | array[string] Required

    Identifier for the datafeed. It can be a datafeed identifier or a wildcard expression. If you do not specify one of these options, the API returns information about all datafeeds.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no datafeeds that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • datafeeds array[object] Required
      Hide datafeeds attributes Show datafeeds attributes object
GET /_ml/datafeeds/{datafeed_id}/_stats
curl \
 --request GET 'http://api.example.com/_ml/datafeeds/{datafeed_id}/_stats' \
 --header "Authorization: $API_KEY"

Get datafeed stats Added in 5.5.0

GET /_ml/datafeeds/_stats

You can get statistics for multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can get statistics for all datafeeds by using _all, by specifying * as the <feed_id>, or by omitting the <feed_id>. If the datafeed is stopped, the only information you receive is the datafeed_id and the state. This API returns a maximum of 10,000 datafeeds.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no datafeeds that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • datafeeds array[object] Required
      Hide datafeeds attributes Show datafeeds attributes object
GET /_ml/datafeeds/_stats
curl \
 --request GET 'http://api.example.com/_ml/datafeeds/_stats' \
 --header "Authorization: $API_KEY"

Get datafeeds configuration info Added in 5.5.0

GET /_ml/datafeeds

You can get information for multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can get information for all datafeeds by using _all, by specifying * as the <feed_id>, or by omitting the <feed_id>. This API returns a maximum of 10,000 datafeeds.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no datafeeds that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • datafeeds array[object] Required
      Hide datafeeds attributes Show datafeeds attributes object
      • Hide authorization attributes Show authorization attributes object
        • api_key object
          Hide api_key attributes Show api_key attributes object
          • id string Required

            The identifier for the API key.

          • name string Required

            The name of the API key.

        • roles array[string]

          If a user ID was used for the most recent update to the datafeed, its roles at the time of the update are listed in the response.

        • If a service account was used for the most recent update to the datafeed, the account name is listed in the response.

      • Hide chunking_config attributes Show chunking_config attributes object
        • mode string Required

          Values are auto, manual, or off.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • datafeed_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • indices array[string] Required
      • indexes array[string]
      • job_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide script_fields attribute Show script_fields attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • script object Required
            Hide script attributes Show script attributes object
            • id string
            • params object

              Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            • options object
      • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • enabled boolean Required

          Specifies whether the datafeed periodically checks for delayed data.

      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
          • fetch_fields array[object]

            For type lookup

          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
            • id string
            • params object

              Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            • options object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • Hide indices_options attributes Show indices_options attributes object
        • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

        • expand_wildcards string | array[string]
        • If true, missing or closed indices are not included in the response.

        • If true, concrete, expanded or aliased indices are ignored when frozen.

      • query object Required

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {"boost": 1}}.

        Query DSL
GET /_ml/datafeeds
curl \
 --request GET 'http://api.example.com/_ml/datafeeds' \
 --header "Authorization: $API_KEY"

Get filters Added in 5.5.0

GET /_ml/filters

You can get a single filter or all filters.

Query parameters

  • from number

    Skips the specified number of filters.

  • size number

    Specifies the maximum number of filters to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • filters array[object] Required
      Hide filters attributes Show filters attributes object
      • A description of the filter.

      • filter_id string Required
      • items array[string] Required

        An array of strings which is the filter item list.

GET /_ml/filters
curl \
 --request GET 'http://api.example.com/_ml/filters' \
 --header "Authorization: $API_KEY"

Get anomaly detection job stats Added in 5.5.0

GET /_ml/anomaly_detectors/_stats

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

Responses

GET /_ml/anomaly_detectors/_stats
curl \
 --request GET 'http://api.example.com/_ml/anomaly_detectors/_stats' \
 --header "Authorization: $API_KEY"

Get anomaly detection job stats Added in 5.5.0

GET /_ml/anomaly_detectors/{job_id}/_stats

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job. It can be a job identifier, a group name, a comma-separated list of jobs, or a wildcard expression. If you do not specify one of these options, the API returns information for all anomaly detection jobs.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

Responses

GET /_ml/anomaly_detectors/{job_id}/_stats
curl \
 --request GET 'http://api.example.com/_ml/anomaly_detectors/{job_id}/_stats' \
 --header "Authorization: $API_KEY"

Get anomaly detection jobs configuration info Added in 5.5.0

GET /_ml/anomaly_detectors

You can get information for multiple anomaly detection jobs in a single API request by using a group name, a comma-separated list of jobs, or a wildcard expression. You can get information for all anomaly detection jobs by using _all, by specifying * as the <job_id>, or by omitting the <job_id>.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • jobs array[object] Required
      Hide jobs attributes Show jobs attributes object
      • allow_lazy_open boolean Required

        Advanced configuration option. Specifies whether this job can open when there is insufficient machine learning node capacity for it to be immediately assigned to a node.

      • analysis_config object Required
        Hide analysis_config attributes Show analysis_config attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • categorization_analyzer string | object

          One of:
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters. The effect is exactly the same.

        • detectors array[object] Required

          Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job. If the detectors array does not contain at least one detector, no analysis can occur and an error is returned.

          Hide detectors attributes Show detectors attributes object
          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • custom_rules array[object]

            Custom rules enable you to customize the way detectors operate. For example, a rule may dictate conditions under which results should be skipped. Kibana refers to custom rules as job rules.

          • A description of the detector.

          • A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero. If you specify a value for this property, it is ignored.

          • Values are all, none, by, or over.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • function string

            The analysis function that is used. For example, count, rare, mean, min, max, or sum.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • use_null boolean

            Defines whether a new series is used as the null series when there is no value for the by or partition fields.

        • influencers array[string]

          A comma separated list of influencer field names. Typically these can be the by, over, or partition fields that are used in the detector configuration. You might also want to use a field name that is not specifically named in a detector, but is available as part of the input data. When you use multiple detectors, the use of influencers is recommended as it aggregates results for each influencer entity.

        • latency string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold. For example, suppose CPU and memory usage on host A is usually highly correlated with the same metrics on host B. Perhaps this correlation occurs because they are running a load-balanced application. If you enable this property, anomalies will be reported when, for example, CPU usage on host A is high and the value of CPU usage on host B is low. That is to say, you’ll see an anomaly when the CPU of host A is unusual given the CPU of host B. To use the multivariate_by_fields property, you must also specify by_field_name in your detector.

        • Hide per_partition_categorization attributes Show per_partition_categorization attributes object
          • enabled boolean

            To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

          • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Hide analysis_limits attributes Show analysis_limits attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • blocked object
        Hide blocked attributes Show blocked attributes object
      • create_time string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • Custom metadata about the job

      • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies a period of time (in days) after which only the first snapshot per day is retained. This period is relative to the timestamp of the most recent snapshot for this job. Valid values range from 0 to model_snapshot_retention_days.

      • data_description object Required
        Hide data_description attributes Show data_description attributes object
        • format string

          Only JSON format is supported at this time.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • The time format, which can be epoch, epoch_ms, or a custom pattern. The value epoch refers to UNIX or Epoch time (the number of seconds since 1 Jan 1970). The value epoch_ms indicates that time is measured in milliseconds since the epoch. The epoch and epoch_ms time formats accept either integer or real values. Custom patterns must conform to the Java DateTimeFormatter class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: yyyy-MM-dd'T'HH:mm:ssX. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.

      • Hide datafeed_config attributes Show datafeed_config attributes object
        • Hide authorization attributes Show authorization attributes object
          • api_key object
            Hide api_key attributes Show api_key attributes object
            • id string Required

              The identifier for the API key.

            • name string Required

              The name of the API key.

          • roles array[string]

            If a user ID was used for the most recent update to the datafeed, its roles at the time of the update are listed in the response.

          • If a service account was used for the most recent update to the datafeed, the account name is listed in the response.

        • Hide chunking_config attributes Show chunking_config attributes object
          • mode string Required

            Values are auto, manual, or off.

          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • datafeed_id string Required
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • indices array[string] Required
        • indexes array[string]
        • job_id string Required
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • Hide script_fields attribute Show script_fields attribute object
        • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
          • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

          • enabled boolean Required

            Specifies whether the datafeed periodically checks for delayed data.

        • Hide runtime_mappings attribute Show runtime_mappings attribute object
          • * object Additional properties
            Hide * attributes Show * attributes object
            • fields object

              For type composite

            • fetch_fields array[object]

              For type lookup

            • format string

              A custom format for date type runtime fields.

            • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • script object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • Hide indices_options attributes Show indices_options attributes object
          • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

          • expand_wildcards string | array[string]
          • If true, missing or closed indices are not included in the response.

          • If true, concrete, expanded or aliased indices are ignored when frozen.

        • query object Required

          The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {"boost": 1}}.

          Query DSL
      • deleting boolean

        Indicates that the process of deleting the job is in progress but not yet completed. It is only reported when true.

      • A description of the job.

      • finished_time string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • groups array[string]

        A list of job groups. A job can belong to no groups or many.

      • job_id string Required
      • job_type string

        Reserved for future use, currently set to anomaly_detector.

      • Hide model_plot_config attributes Show model_plot_config attributes object
        • If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.

        • enabled boolean

          If true, enables calculation and storage of the model bounds for each entity that is being analyzed.

        • terms string

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies the maximum period of time (in days) that snapshots are retained. This period is relative to the timestamp of the most recent snapshot for this job. By default, snapshots ten days older than the newest snapshot are deleted.

      • Advanced configuration option. The period over which adjustments to the score are applied, as new data is seen. The default value is the longer of 30 days or 100 bucket_spans.

      • results_index_name string Required
      • Advanced configuration option. The period of time (in days) that results are retained. Age is calculated relative to the timestamp of the latest bucket result. If this property has a non-null value, once per day at 00:30 (server time), results that are the specified number of days older than the latest bucket result are deleted from Elasticsearch. The default value is null, which means all results are retained. Annotations generated by the system also count as results for retention purposes; they are deleted after the same number of days as results. Annotations added by users are retained forever.

GET /_ml/anomaly_detectors
curl \
 --request GET 'http://api.example.com/_ml/anomaly_detectors' \
 --header "Authorization: $API_KEY"

Get overall bucket results Added in 6.1.0

GET /_ml/anomaly_detectors/{job_id}/results/overall_buckets

Retrievs overall bucket results that summarize the bucket results of multiple anomaly detection jobs.

The overall_score is calculated by combining the scores of all the buckets within the overall bucket span. First, the maximum anomaly_score per anomaly detection job in the overall bucket is calculated. Then the top_n of those scores are averaged to result in the overall_score. This means that you can fine-tune the overall_score so that it is more or less sensitive to the number of jobs that detect an anomaly at the same time. For example, if you set top_n to 1, the overall_score is the maximum bucket score in the overall bucket. Alternatively, if you set top_n to the number of jobs, the overall_score is high only when all jobs detect anomalies in that overall bucket. If you set the bucket_span parameter (to a value greater than its default), the overall_score is the maximum overall_score of the overall buckets that have a span equal to the jobs' largest bucket span.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job. It can be a job identifier, a group name, a comma-separated list of jobs or groups, or a wildcard expression.

    You can summarize the bucket results for all anomaly detection jobs by using _all or by specifying * as the <job_id>.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    If true, the request returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • The span of the overall buckets. Must be greater or equal to the largest bucket span of the specified anomaly detection jobs, which is the default value.

    By default, an overall bucket has a span equal to the largest bucket span of the specified anomaly detection jobs. To override that behavior, use the optional bucket_span parameter.

  • end string | number

    Returns overall buckets with timestamps earlier than this time.

  • If true, the output excludes interim results.

  • overall_score number | string

    Returns overall buckets with overall scores greater than or equal to this value.

  • start string | number

    Returns overall buckets with timestamps after this time.

  • top_n number

    The number of top anomaly detection job bucket scores to be used in the overall_score calculation.

application/json

Body

  • Refer to the description for the allow_no_match query parameter.

  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • end string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • Refer to the description for the exclude_interim query parameter.

  • overall_score number | string

    Refer to the description for the overall_score query parameter.

  • start string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • top_n number

    Refer to the description for the top_n query parameter.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • overall_buckets array[object] Required

      Array of overall bucket objects

      Hide overall_buckets attributes Show overall_buckets attributes object
      • Time unit for seconds

      • is_interim boolean Required

        If true, this is an interim result. In other words, the results are calculated based on partial input data.

      • jobs array[object] Required

        An array of objects that contain the max_anomaly_score per job_id.

        Hide jobs attributes Show jobs attributes object
      • overall_score number Required

        The top_n average of the maximum bucket anomaly_score per job.

      • result_type string Required

        Internal. This is always set to overall_bucket.

      • Time unit for milliseconds

      • timestamp_string string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

GET /_ml/anomaly_detectors/{job_id}/results/overall_buckets
curl \
 --request GET 'http://api.example.com/_ml/anomaly_detectors/{job_id}/results/overall_buckets' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"allow_no_match":true,"bucket_span":"string","":"string","exclude_interim":true,"overall_score":42.0,"top_n":42.0}'

Get overall bucket results Added in 6.1.0

POST /_ml/anomaly_detectors/{job_id}/results/overall_buckets

Retrievs overall bucket results that summarize the bucket results of multiple anomaly detection jobs.

The overall_score is calculated by combining the scores of all the buckets within the overall bucket span. First, the maximum anomaly_score per anomaly detection job in the overall bucket is calculated. Then the top_n of those scores are averaged to result in the overall_score. This means that you can fine-tune the overall_score so that it is more or less sensitive to the number of jobs that detect an anomaly at the same time. For example, if you set top_n to 1, the overall_score is the maximum bucket score in the overall bucket. Alternatively, if you set top_n to the number of jobs, the overall_score is high only when all jobs detect anomalies in that overall bucket. If you set the bucket_span parameter (to a value greater than its default), the overall_score is the maximum overall_score of the overall buckets that have a span equal to the jobs' largest bucket span.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job. It can be a job identifier, a group name, a comma-separated list of jobs or groups, or a wildcard expression.

    You can summarize the bucket results for all anomaly detection jobs by using _all or by specifying * as the <job_id>.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    If true, the request returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • The span of the overall buckets. Must be greater or equal to the largest bucket span of the specified anomaly detection jobs, which is the default value.

    By default, an overall bucket has a span equal to the largest bucket span of the specified anomaly detection jobs. To override that behavior, use the optional bucket_span parameter.

  • end string | number

    Returns overall buckets with timestamps earlier than this time.

  • If true, the output excludes interim results.

  • overall_score number | string

    Returns overall buckets with overall scores greater than or equal to this value.

  • start string | number

    Returns overall buckets with timestamps after this time.

  • top_n number

    The number of top anomaly detection job bucket scores to be used in the overall_score calculation.

application/json

Body

  • Refer to the description for the allow_no_match query parameter.

  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • end string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • Refer to the description for the exclude_interim query parameter.

  • overall_score number | string

    Refer to the description for the overall_score query parameter.

  • start string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • top_n number

    Refer to the description for the top_n query parameter.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • overall_buckets array[object] Required

      Array of overall bucket objects

      Hide overall_buckets attributes Show overall_buckets attributes object
      • Time unit for seconds

      • is_interim boolean Required

        If true, this is an interim result. In other words, the results are calculated based on partial input data.

      • jobs array[object] Required

        An array of objects that contain the max_anomaly_score per job_id.

        Hide jobs attributes Show jobs attributes object
      • overall_score number Required

        The top_n average of the maximum bucket anomaly_score per job.

      • result_type string Required

        Internal. This is always set to overall_bucket.

      • Time unit for milliseconds

      • timestamp_string string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

POST /_ml/anomaly_detectors/{job_id}/results/overall_buckets
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/{job_id}/results/overall_buckets' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"allow_no_match":true,"bucket_span":"string","":"string","exclude_interim":true,"overall_score":42.0,"top_n":42.0}'

Open anomaly detection jobs Added in 5.4.0

POST /_ml/anomaly_detectors/{job_id}/_open

An anomaly detection job must be opened to be ready to receive and analyze data. It can be opened and closed multiple times throughout its lifecycle. When you open a new job, it starts with an empty model. When you open an existing job, the most recent model state is automatically loaded. The job is ready to resume its analysis from where it left off, once new data is received.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • timeout string

    Controls the time to wait until a job has opened.

application/json

Body

  • timeout string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
POST /_ml/anomaly_detectors/{job_id}/_open
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/{job_id}/_open' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"timeout\": \"35m\"\n}"'
Request example
A request to open anomaly detection jobs. The timeout specifies to wait 35 minutes for the job to open.
{
  "timeout": "35m"
}
Response examples (200)
A successful response when opening an anomaly detection job.
{
  "opened": true,
  "node": "node-1"
}

Preview a datafeed Added in 5.4.0

GET /_ml/datafeeds/{datafeed_id}/_preview

This API returns the first "page" of search results from a datafeed. You can preview an existing datafeed or provide configuration details for a datafeed and anomaly detection job in the API. The preview shows the structure of the data that will be passed to the anomaly detection engine. IMPORTANT: When Elasticsearch security features are enabled, the preview uses the credentials of the user that called the API. However, when the datafeed starts it uses the roles of the last user that created or updated the datafeed. To get a preview that accurately reflects the behavior of the datafeed, use the appropriate credentials. You can also use secondary authorization headers to supply the credentials.

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters. NOTE: If you use this path parameter, you cannot provide datafeed or anomaly detection job configuration details in the request body.

Query parameters

  • start string | number

    The start time from where the datafeed preview should begin

  • end string | number

    The end time when the datafeed preview should stop

application/json

Body

  • Hide datafeed_config attributes Show datafeed_config attributes object
    • If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.

    • Hide chunking_config attributes Show chunking_config attributes object
      • mode string Required

        Values are auto, manual, or off.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • enabled boolean Required

        Specifies whether the datafeed periodically checks for delayed data.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • indices string | array[string]
    • Hide indices_options attributes Show indices_options attributes object
      • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

      • expand_wildcards string | array[string]
      • If true, missing or closed indices are not included in the response.

      • If true, concrete, expanded or aliased indices are ignored when frozen.

    • job_id string
    • If a real-time datafeed has never seen any data (including during any initial training period) then it will automatically stop itself and close its associated job after this many real-time searches that return no documents. In other words, it will stop after frequency times max_empty_searches of real-time operation. If not set then a datafeed with no end time that sees no data will remain started until it is explicitly stopped.

    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide runtime_mappings attribute Show runtime_mappings attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.

      Hide script_fields attribute Show script_fields attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • script object Required
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
    • The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window, which is 10,000 by default.

  • Hide job_config attributes Show job_config attributes object
    • Advanced configuration option. Specifies whether this job can open when there is insufficient machine learning node capacity for it to be immediately assigned to a node.

    • analysis_config object Required
      Hide analysis_config attributes Show analysis_config attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • categorization_analyzer string | object

        One of:
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters. The effect is exactly the same.

      • detectors array[object] Required

        Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job. If the detectors array does not contain at least one detector, no analysis can occur and an error is returned.

        Hide detectors attributes Show detectors attributes object
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • custom_rules array[object]

          Custom rules enable you to customize the way detectors operate. For example, a rule may dictate conditions under which results should be skipped. Kibana refers to custom rules as job rules.

          Hide custom_rules attributes Show custom_rules attributes object
          • actions array[string]

            The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.

            Supported values include:

            • skip_result: The result will not be created. Unless you also specify skip_model_update, the model will be updated as usual with the corresponding series value.
            • skip_model_update: The value for that series will not be used to update the model. Unless you also specify skip_result, the results will be created as usual. This action is suitable when certain values are expected to be consistently anomalous and they affect the model in a way that negatively impacts the rest of the results.

            Values are skip_result or skip_model_update.

          • conditions array[object]

            An array of numeric conditions when the rule applies. A rule must either have a non-empty scope or at least one condition. Multiple conditions are combined together with a logical AND.

          • scope object

            A scope of series where the rule applies. A rule must either have a non-empty scope or at least one condition. By default, the scope includes all series. Scoping is allowed for any of the fields that are also specified in by_field_name, over_field_name, or partition_field_name.

        • A description of the detector.

        • A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero. If you specify a value for this property, it is ignored.

        • Values are all, none, by, or over.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • function string

          The analysis function that is used. For example, count, rare, mean, min, max, or sum.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • use_null boolean

          Defines whether a new series is used as the null series when there is no value for the by or partition fields.

      • influencers array[string]

        A comma separated list of influencer field names. Typically these can be the by, over, or partition fields that are used in the detector configuration. You might also want to use a field name that is not specifically named in a detector, but is available as part of the input data. When you use multiple detectors, the use of influencers is recommended as it aggregates results for each influencer entity.

      • latency string

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold. For example, suppose CPU and memory usage on host A is usually highly correlated with the same metrics on host B. Perhaps this correlation occurs because they are running a load-balanced application. If you enable this property, anomalies will be reported when, for example, CPU usage on host A is high and the value of CPU usage on host B is low. That is to say, you’ll see an anomaly when the CPU of host A is unusual given the CPU of host B. To use the multivariate_by_fields property, you must also specify by_field_name in your detector.

      • Hide per_partition_categorization attributes Show per_partition_categorization attributes object
        • enabled boolean

          To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

        • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide analysis_limits attributes Show analysis_limits attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Custom metadata about the job

    • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies a period of time (in days) after which only the first snapshot per day is retained. This period is relative to the timestamp of the most recent snapshot for this job.

    • data_description object Required
      Hide data_description attributes Show data_description attributes object
      • format string

        Only JSON format is supported at this time.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • The time format, which can be epoch, epoch_ms, or a custom pattern. The value epoch refers to UNIX or Epoch time (the number of seconds since 1 Jan 1970). The value epoch_ms indicates that time is measured in milliseconds since the epoch. The epoch and epoch_ms time formats accept either integer or real values. Custom patterns must conform to the Java DateTimeFormatter class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: yyyy-MM-dd'T'HH:mm:ssX. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.

    • Hide datafeed_config attributes Show datafeed_config attributes object
      • If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.

      • Hide chunking_config attributes Show chunking_config attributes object
        • mode string Required

          Values are auto, manual, or off.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • enabled boolean Required

          Specifies whether the datafeed periodically checks for delayed data.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • indices string | array[string]
      • Hide indices_options attributes Show indices_options attributes object
        • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

        • expand_wildcards string | array[string]
        • If true, missing or closed indices are not included in the response.

        • If true, concrete, expanded or aliased indices are ignored when frozen.

      • job_id string
      • If a real-time datafeed has never seen any data (including during any initial training period) then it will automatically stop itself and close its associated job after this many real-time searches that return no documents. In other words, it will stop after frequency times max_empty_searches of real-time operation. If not set then a datafeed with no end time that sees no data will remain started until it is explicitly stopped.

      • query object

        An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

        External documentation
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.

        Hide script_fields attribute Show script_fields attribute object
      • The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window, which is 10,000 by default.

    • A description of the job.

    • groups array[string]

      A list of job groups. A job can belong to no groups or many.

    • job_id string
    • job_type string

      Reserved for future use, currently set to anomaly_detector.

    • Hide model_plot_config attributes Show model_plot_config attributes object
      • If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.

      • enabled boolean

        If true, enables calculation and storage of the model bounds for each entity that is being analyzed.

      • terms string

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies the maximum period of time (in days) that snapshots are retained. This period is relative to the timestamp of the most recent snapshot for this job. The default value is 10, which means snapshots ten days older than the newest snapshot are deleted.

    • Advanced configuration option. The period over which adjustments to the score are applied, as new data is seen. The default value is the longer of 30 days or 100 bucket_spans.

    • Advanced configuration option. The period of time (in days) that results are retained. Age is calculated relative to the timestamp of the latest bucket result. If this property has a non-null value, once per day at 00:30 (server time), results that are the specified number of days older than the latest bucket result are deleted from Elasticsearch. The default value is null, which means all results are retained. Annotations generated by the system also count as results for retention purposes; they are deleted after the same number of days as results. Annotations added by users are retained forever.

Responses

GET /_ml/datafeeds/{datafeed_id}/_preview
curl \
 --request GET 'http://api.example.com/_ml/datafeeds/{datafeed_id}/_preview' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"datafeed_config":{"aggregations":{},"chunking_config":{"mode":"auto","time_span":"string"},"datafeed_id":"string","delayed_data_check_config":{"check_window":"string","enabled":true},"frequency":"string","indices":"string","indices_options":{"allow_no_indices":true,"expand_wildcards":"string","ignore_unavailable":true,"ignore_throttled":true},"job_id":"string","max_empty_searches":42.0,"query":{},"query_delay":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"script_fields":{"additionalProperty1":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true},"additionalProperty2":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true}},"scroll_size":42.0},"job_config":{"allow_lazy_open":true,"analysis_config":{"bucket_span":"string","":"string","categorization_field_name":"string","categorization_filters":["string"],"detectors":[{"by_field_name":"string","custom_rules":[{"actions":["skip_result"],"conditions":[{}],"scope":{}}],"detector_description":"string","detector_index":42.0,"exclude_frequent":"all","field_name":"string","function":"string","over_field_name":"string","partition_field_name":"string","use_null":true}],"influencers":["string"],"latency":"string","model_prune_window":"string","multivariate_by_fields":true,"per_partition_categorization":{"enabled":true,"stop_on_warn":true},"summary_count_field_name":"string"},"analysis_limits":{"categorization_examples_limit":42.0,"":42.0},"background_persist_interval":"string","custom_settings":{},"daily_model_snapshot_retention_after_days":42.0,"data_description":{"format":"string","time_field":"string","time_format":"string","field_delimiter":"string"},"datafeed_config":{"aggregations":{},"chunking_config":{"mode":"auto","time_span":"string"},"datafeed_id":"string","delayed_data_check_config":{"check_window":"string","enabled":true},"frequency":"string","indices":"string","indices_options":{"allow_no_indices":true,"expand_wildcards":"string","ignore_unavailable":true,"ignore_throttled":true},"job_id":"string","max_empty_searches":42.0,"query":{},"query_delay":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"script_fields":{"additionalProperty1":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true},"additionalProperty2":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true}},"scroll_size":42.0},"description":"string","groups":["string"],"job_id":"string","job_type":"string","model_plot_config":{"annotations_enabled":true,"enabled":true,"terms":"string"},"model_snapshot_retention_days":42.0,"renormalization_window_days":42.0,"results_index_name":"string","results_retention_days":42.0}}'

Preview a datafeed Added in 5.4.0

POST /_ml/datafeeds/{datafeed_id}/_preview

This API returns the first "page" of search results from a datafeed. You can preview an existing datafeed or provide configuration details for a datafeed and anomaly detection job in the API. The preview shows the structure of the data that will be passed to the anomaly detection engine. IMPORTANT: When Elasticsearch security features are enabled, the preview uses the credentials of the user that called the API. However, when the datafeed starts it uses the roles of the last user that created or updated the datafeed. To get a preview that accurately reflects the behavior of the datafeed, use the appropriate credentials. You can also use secondary authorization headers to supply the credentials.

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters. NOTE: If you use this path parameter, you cannot provide datafeed or anomaly detection job configuration details in the request body.

Query parameters

  • start string | number

    The start time from where the datafeed preview should begin

  • end string | number

    The end time when the datafeed preview should stop

application/json

Body

  • Hide datafeed_config attributes Show datafeed_config attributes object
    • If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.

    • Hide chunking_config attributes Show chunking_config attributes object
      • mode string Required

        Values are auto, manual, or off.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • enabled boolean Required

        Specifies whether the datafeed periodically checks for delayed data.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • indices string | array[string]
    • Hide indices_options attributes Show indices_options attributes object
      • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

      • expand_wildcards string | array[string]
      • If true, missing or closed indices are not included in the response.

      • If true, concrete, expanded or aliased indices are ignored when frozen.

    • job_id string
    • If a real-time datafeed has never seen any data (including during any initial training period) then it will automatically stop itself and close its associated job after this many real-time searches that return no documents. In other words, it will stop after frequency times max_empty_searches of real-time operation. If not set then a datafeed with no end time that sees no data will remain started until it is explicitly stopped.

    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide runtime_mappings attribute Show runtime_mappings attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.

      Hide script_fields attribute Show script_fields attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • script object Required
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
    • The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window, which is 10,000 by default.

  • Hide job_config attributes Show job_config attributes object
    • Advanced configuration option. Specifies whether this job can open when there is insufficient machine learning node capacity for it to be immediately assigned to a node.

    • analysis_config object Required
      Hide analysis_config attributes Show analysis_config attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • categorization_analyzer string | object

        One of:
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters. The effect is exactly the same.

      • detectors array[object] Required

        Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job. If the detectors array does not contain at least one detector, no analysis can occur and an error is returned.

        Hide detectors attributes Show detectors attributes object
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • custom_rules array[object]

          Custom rules enable you to customize the way detectors operate. For example, a rule may dictate conditions under which results should be skipped. Kibana refers to custom rules as job rules.

          Hide custom_rules attributes Show custom_rules attributes object
          • actions array[string]

            The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.

            Supported values include:

            • skip_result: The result will not be created. Unless you also specify skip_model_update, the model will be updated as usual with the corresponding series value.
            • skip_model_update: The value for that series will not be used to update the model. Unless you also specify skip_result, the results will be created as usual. This action is suitable when certain values are expected to be consistently anomalous and they affect the model in a way that negatively impacts the rest of the results.

            Values are skip_result or skip_model_update.

          • conditions array[object]

            An array of numeric conditions when the rule applies. A rule must either have a non-empty scope or at least one condition. Multiple conditions are combined together with a logical AND.

          • scope object

            A scope of series where the rule applies. A rule must either have a non-empty scope or at least one condition. By default, the scope includes all series. Scoping is allowed for any of the fields that are also specified in by_field_name, over_field_name, or partition_field_name.

        • A description of the detector.

        • A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero. If you specify a value for this property, it is ignored.

        • Values are all, none, by, or over.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • function string

          The analysis function that is used. For example, count, rare, mean, min, max, or sum.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • use_null boolean

          Defines whether a new series is used as the null series when there is no value for the by or partition fields.

      • influencers array[string]

        A comma separated list of influencer field names. Typically these can be the by, over, or partition fields that are used in the detector configuration. You might also want to use a field name that is not specifically named in a detector, but is available as part of the input data. When you use multiple detectors, the use of influencers is recommended as it aggregates results for each influencer entity.

      • latency string

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold. For example, suppose CPU and memory usage on host A is usually highly correlated with the same metrics on host B. Perhaps this correlation occurs because they are running a load-balanced application. If you enable this property, anomalies will be reported when, for example, CPU usage on host A is high and the value of CPU usage on host B is low. That is to say, you’ll see an anomaly when the CPU of host A is unusual given the CPU of host B. To use the multivariate_by_fields property, you must also specify by_field_name in your detector.

      • Hide per_partition_categorization attributes Show per_partition_categorization attributes object
        • enabled boolean

          To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

        • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide analysis_limits attributes Show analysis_limits attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Custom metadata about the job

    • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies a period of time (in days) after which only the first snapshot per day is retained. This period is relative to the timestamp of the most recent snapshot for this job.

    • data_description object Required
      Hide data_description attributes Show data_description attributes object
      • format string

        Only JSON format is supported at this time.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • The time format, which can be epoch, epoch_ms, or a custom pattern. The value epoch refers to UNIX or Epoch time (the number of seconds since 1 Jan 1970). The value epoch_ms indicates that time is measured in milliseconds since the epoch. The epoch and epoch_ms time formats accept either integer or real values. Custom patterns must conform to the Java DateTimeFormatter class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: yyyy-MM-dd'T'HH:mm:ssX. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.

    • Hide datafeed_config attributes Show datafeed_config attributes object
      • If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.

      • Hide chunking_config attributes Show chunking_config attributes object
        • mode string Required

          Values are auto, manual, or off.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • enabled boolean Required

          Specifies whether the datafeed periodically checks for delayed data.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • indices string | array[string]
      • Hide indices_options attributes Show indices_options attributes object
        • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

        • expand_wildcards string | array[string]
        • If true, missing or closed indices are not included in the response.

        • If true, concrete, expanded or aliased indices are ignored when frozen.

      • job_id string
      • If a real-time datafeed has never seen any data (including during any initial training period) then it will automatically stop itself and close its associated job after this many real-time searches that return no documents. In other words, it will stop after frequency times max_empty_searches of real-time operation. If not set then a datafeed with no end time that sees no data will remain started until it is explicitly stopped.

      • query object

        An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

        External documentation
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.

        Hide script_fields attribute Show script_fields attribute object
      • The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window, which is 10,000 by default.

    • A description of the job.

    • groups array[string]

      A list of job groups. A job can belong to no groups or many.

    • job_id string
    • job_type string

      Reserved for future use, currently set to anomaly_detector.

    • Hide model_plot_config attributes Show model_plot_config attributes object
      • If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.

      • enabled boolean

        If true, enables calculation and storage of the model bounds for each entity that is being analyzed.

      • terms string

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies the maximum period of time (in days) that snapshots are retained. This period is relative to the timestamp of the most recent snapshot for this job. The default value is 10, which means snapshots ten days older than the newest snapshot are deleted.

    • Advanced configuration option. The period over which adjustments to the score are applied, as new data is seen. The default value is the longer of 30 days or 100 bucket_spans.

    • Advanced configuration option. The period of time (in days) that results are retained. Age is calculated relative to the timestamp of the latest bucket result. If this property has a non-null value, once per day at 00:30 (server time), results that are the specified number of days older than the latest bucket result are deleted from Elasticsearch. The default value is null, which means all results are retained. Annotations generated by the system also count as results for retention purposes; they are deleted after the same number of days as results. Annotations added by users are retained forever.

Responses

POST /_ml/datafeeds/{datafeed_id}/_preview
curl \
 --request POST 'http://api.example.com/_ml/datafeeds/{datafeed_id}/_preview' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"datafeed_config":{"aggregations":{},"chunking_config":{"mode":"auto","time_span":"string"},"datafeed_id":"string","delayed_data_check_config":{"check_window":"string","enabled":true},"frequency":"string","indices":"string","indices_options":{"allow_no_indices":true,"expand_wildcards":"string","ignore_unavailable":true,"ignore_throttled":true},"job_id":"string","max_empty_searches":42.0,"query":{},"query_delay":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"script_fields":{"additionalProperty1":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true},"additionalProperty2":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true}},"scroll_size":42.0},"job_config":{"allow_lazy_open":true,"analysis_config":{"bucket_span":"string","":"string","categorization_field_name":"string","categorization_filters":["string"],"detectors":[{"by_field_name":"string","custom_rules":[{"actions":["skip_result"],"conditions":[{}],"scope":{}}],"detector_description":"string","detector_index":42.0,"exclude_frequent":"all","field_name":"string","function":"string","over_field_name":"string","partition_field_name":"string","use_null":true}],"influencers":["string"],"latency":"string","model_prune_window":"string","multivariate_by_fields":true,"per_partition_categorization":{"enabled":true,"stop_on_warn":true},"summary_count_field_name":"string"},"analysis_limits":{"categorization_examples_limit":42.0,"":42.0},"background_persist_interval":"string","custom_settings":{},"daily_model_snapshot_retention_after_days":42.0,"data_description":{"format":"string","time_field":"string","time_format":"string","field_delimiter":"string"},"datafeed_config":{"aggregations":{},"chunking_config":{"mode":"auto","time_span":"string"},"datafeed_id":"string","delayed_data_check_config":{"check_window":"string","enabled":true},"frequency":"string","indices":"string","indices_options":{"allow_no_indices":true,"expand_wildcards":"string","ignore_unavailable":true,"ignore_throttled":true},"job_id":"string","max_empty_searches":42.0,"query":{},"query_delay":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"script_fields":{"additionalProperty1":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true},"additionalProperty2":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true}},"scroll_size":42.0},"description":"string","groups":["string"],"job_id":"string","job_type":"string","model_plot_config":{"annotations_enabled":true,"enabled":true,"terms":"string"},"model_snapshot_retention_days":42.0,"renormalization_window_days":42.0,"results_index_name":"string","results_retention_days":42.0}}'

Preview a datafeed Added in 5.4.0

GET /_ml/datafeeds/_preview

This API returns the first "page" of search results from a datafeed. You can preview an existing datafeed or provide configuration details for a datafeed and anomaly detection job in the API. The preview shows the structure of the data that will be passed to the anomaly detection engine. IMPORTANT: When Elasticsearch security features are enabled, the preview uses the credentials of the user that called the API. However, when the datafeed starts it uses the roles of the last user that created or updated the datafeed. To get a preview that accurately reflects the behavior of the datafeed, use the appropriate credentials. You can also use secondary authorization headers to supply the credentials.

Query parameters

  • start string | number

    The start time from where the datafeed preview should begin

  • end string | number

    The end time when the datafeed preview should stop

application/json

Body

  • Hide datafeed_config attributes Show datafeed_config attributes object
    • If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.

    • Hide chunking_config attributes Show chunking_config attributes object
      • mode string Required

        Values are auto, manual, or off.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • enabled boolean Required

        Specifies whether the datafeed periodically checks for delayed data.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • indices string | array[string]
    • Hide indices_options attributes Show indices_options attributes object
      • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

      • expand_wildcards string | array[string]
      • If true, missing or closed indices are not included in the response.

      • If true, concrete, expanded or aliased indices are ignored when frozen.

    • job_id string
    • If a real-time datafeed has never seen any data (including during any initial training period) then it will automatically stop itself and close its associated job after this many real-time searches that return no documents. In other words, it will stop after frequency times max_empty_searches of real-time operation. If not set then a datafeed with no end time that sees no data will remain started until it is explicitly stopped.

    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide runtime_mappings attribute Show runtime_mappings attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.

      Hide script_fields attribute Show script_fields attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • script object Required
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
    • The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window, which is 10,000 by default.

  • Hide job_config attributes Show job_config attributes object
    • Advanced configuration option. Specifies whether this job can open when there is insufficient machine learning node capacity for it to be immediately assigned to a node.

    • analysis_config object Required
      Hide analysis_config attributes Show analysis_config attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • categorization_analyzer string | object

        One of:
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters. The effect is exactly the same.

      • detectors array[object] Required

        Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job. If the detectors array does not contain at least one detector, no analysis can occur and an error is returned.

        Hide detectors attributes Show detectors attributes object
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • custom_rules array[object]

          Custom rules enable you to customize the way detectors operate. For example, a rule may dictate conditions under which results should be skipped. Kibana refers to custom rules as job rules.

          Hide custom_rules attributes Show custom_rules attributes object
          • actions array[string]

            The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.

            Supported values include:

            • skip_result: The result will not be created. Unless you also specify skip_model_update, the model will be updated as usual with the corresponding series value.
            • skip_model_update: The value for that series will not be used to update the model. Unless you also specify skip_result, the results will be created as usual. This action is suitable when certain values are expected to be consistently anomalous and they affect the model in a way that negatively impacts the rest of the results.

            Values are skip_result or skip_model_update.

          • conditions array[object]

            An array of numeric conditions when the rule applies. A rule must either have a non-empty scope or at least one condition. Multiple conditions are combined together with a logical AND.

          • scope object

            A scope of series where the rule applies. A rule must either have a non-empty scope or at least one condition. By default, the scope includes all series. Scoping is allowed for any of the fields that are also specified in by_field_name, over_field_name, or partition_field_name.

        • A description of the detector.

        • A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero. If you specify a value for this property, it is ignored.

        • Values are all, none, by, or over.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • function string

          The analysis function that is used. For example, count, rare, mean, min, max, or sum.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • use_null boolean

          Defines whether a new series is used as the null series when there is no value for the by or partition fields.

      • influencers array[string]

        A comma separated list of influencer field names. Typically these can be the by, over, or partition fields that are used in the detector configuration. You might also want to use a field name that is not specifically named in a detector, but is available as part of the input data. When you use multiple detectors, the use of influencers is recommended as it aggregates results for each influencer entity.

      • latency string

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold. For example, suppose CPU and memory usage on host A is usually highly correlated with the same metrics on host B. Perhaps this correlation occurs because they are running a load-balanced application. If you enable this property, anomalies will be reported when, for example, CPU usage on host A is high and the value of CPU usage on host B is low. That is to say, you’ll see an anomaly when the CPU of host A is unusual given the CPU of host B. To use the multivariate_by_fields property, you must also specify by_field_name in your detector.

      • Hide per_partition_categorization attributes Show per_partition_categorization attributes object
        • enabled boolean

          To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

        • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide analysis_limits attributes Show analysis_limits attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Custom metadata about the job

    • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies a period of time (in days) after which only the first snapshot per day is retained. This period is relative to the timestamp of the most recent snapshot for this job.

    • data_description object Required
      Hide data_description attributes Show data_description attributes object
      • format string

        Only JSON format is supported at this time.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • The time format, which can be epoch, epoch_ms, or a custom pattern. The value epoch refers to UNIX or Epoch time (the number of seconds since 1 Jan 1970). The value epoch_ms indicates that time is measured in milliseconds since the epoch. The epoch and epoch_ms time formats accept either integer or real values. Custom patterns must conform to the Java DateTimeFormatter class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: yyyy-MM-dd'T'HH:mm:ssX. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.

    • Hide datafeed_config attributes Show datafeed_config attributes object
      • If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.

      • Hide chunking_config attributes Show chunking_config attributes object
        • mode string Required

          Values are auto, manual, or off.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • enabled boolean Required

          Specifies whether the datafeed periodically checks for delayed data.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • indices string | array[string]
      • Hide indices_options attributes Show indices_options attributes object
        • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

        • expand_wildcards string | array[string]
        • If true, missing or closed indices are not included in the response.

        • If true, concrete, expanded or aliased indices are ignored when frozen.

      • job_id string
      • If a real-time datafeed has never seen any data (including during any initial training period) then it will automatically stop itself and close its associated job after this many real-time searches that return no documents. In other words, it will stop after frequency times max_empty_searches of real-time operation. If not set then a datafeed with no end time that sees no data will remain started until it is explicitly stopped.

      • query object

        An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

        External documentation
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.

        Hide script_fields attribute Show script_fields attribute object
      • The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window, which is 10,000 by default.

    • A description of the job.

    • groups array[string]

      A list of job groups. A job can belong to no groups or many.

    • job_id string
    • job_type string

      Reserved for future use, currently set to anomaly_detector.

    • Hide model_plot_config attributes Show model_plot_config attributes object
      • If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.

      • enabled boolean

        If true, enables calculation and storage of the model bounds for each entity that is being analyzed.

      • terms string

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies the maximum period of time (in days) that snapshots are retained. This period is relative to the timestamp of the most recent snapshot for this job. The default value is 10, which means snapshots ten days older than the newest snapshot are deleted.

    • Advanced configuration option. The period over which adjustments to the score are applied, as new data is seen. The default value is the longer of 30 days or 100 bucket_spans.

    • Advanced configuration option. The period of time (in days) that results are retained. Age is calculated relative to the timestamp of the latest bucket result. If this property has a non-null value, once per day at 00:30 (server time), results that are the specified number of days older than the latest bucket result are deleted from Elasticsearch. The default value is null, which means all results are retained. Annotations generated by the system also count as results for retention purposes; they are deleted after the same number of days as results. Annotations added by users are retained forever.

Responses

GET /_ml/datafeeds/_preview
curl \
 --request GET 'http://api.example.com/_ml/datafeeds/_preview' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"datafeed_config":{"aggregations":{},"chunking_config":{"mode":"auto","time_span":"string"},"datafeed_id":"string","delayed_data_check_config":{"check_window":"string","enabled":true},"frequency":"string","indices":"string","indices_options":{"allow_no_indices":true,"expand_wildcards":"string","ignore_unavailable":true,"ignore_throttled":true},"job_id":"string","max_empty_searches":42.0,"query":{},"query_delay":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"script_fields":{"additionalProperty1":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true},"additionalProperty2":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true}},"scroll_size":42.0},"job_config":{"allow_lazy_open":true,"analysis_config":{"bucket_span":"string","":"string","categorization_field_name":"string","categorization_filters":["string"],"detectors":[{"by_field_name":"string","custom_rules":[{"actions":["skip_result"],"conditions":[{}],"scope":{}}],"detector_description":"string","detector_index":42.0,"exclude_frequent":"all","field_name":"string","function":"string","over_field_name":"string","partition_field_name":"string","use_null":true}],"influencers":["string"],"latency":"string","model_prune_window":"string","multivariate_by_fields":true,"per_partition_categorization":{"enabled":true,"stop_on_warn":true},"summary_count_field_name":"string"},"analysis_limits":{"categorization_examples_limit":42.0,"":42.0},"background_persist_interval":"string","custom_settings":{},"daily_model_snapshot_retention_after_days":42.0,"data_description":{"format":"string","time_field":"string","time_format":"string","field_delimiter":"string"},"datafeed_config":{"aggregations":{},"chunking_config":{"mode":"auto","time_span":"string"},"datafeed_id":"string","delayed_data_check_config":{"check_window":"string","enabled":true},"frequency":"string","indices":"string","indices_options":{"allow_no_indices":true,"expand_wildcards":"string","ignore_unavailable":true,"ignore_throttled":true},"job_id":"string","max_empty_searches":42.0,"query":{},"query_delay":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"script_fields":{"additionalProperty1":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true},"additionalProperty2":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true}},"scroll_size":42.0},"description":"string","groups":["string"],"job_id":"string","job_type":"string","model_plot_config":{"annotations_enabled":true,"enabled":true,"terms":"string"},"model_snapshot_retention_days":42.0,"renormalization_window_days":42.0,"results_index_name":"string","results_retention_days":42.0}}'

Preview a datafeed Added in 5.4.0

POST /_ml/datafeeds/_preview

This API returns the first "page" of search results from a datafeed. You can preview an existing datafeed or provide configuration details for a datafeed and anomaly detection job in the API. The preview shows the structure of the data that will be passed to the anomaly detection engine. IMPORTANT: When Elasticsearch security features are enabled, the preview uses the credentials of the user that called the API. However, when the datafeed starts it uses the roles of the last user that created or updated the datafeed. To get a preview that accurately reflects the behavior of the datafeed, use the appropriate credentials. You can also use secondary authorization headers to supply the credentials.

Query parameters

  • start string | number

    The start time from where the datafeed preview should begin

  • end string | number

    The end time when the datafeed preview should stop

application/json

Body

  • Hide datafeed_config attributes Show datafeed_config attributes object
    • If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.

    • Hide chunking_config attributes Show chunking_config attributes object
      • mode string Required

        Values are auto, manual, or off.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • enabled boolean Required

        Specifies whether the datafeed periodically checks for delayed data.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • indices string | array[string]
    • Hide indices_options attributes Show indices_options attributes object
      • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

      • expand_wildcards string | array[string]
      • If true, missing or closed indices are not included in the response.

      • If true, concrete, expanded or aliased indices are ignored when frozen.

    • job_id string
    • If a real-time datafeed has never seen any data (including during any initial training period) then it will automatically stop itself and close its associated job after this many real-time searches that return no documents. In other words, it will stop after frequency times max_empty_searches of real-time operation. If not set then a datafeed with no end time that sees no data will remain started until it is explicitly stopped.

    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide runtime_mappings attribute Show runtime_mappings attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.

      Hide script_fields attribute Show script_fields attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • script object Required
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
    • The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window, which is 10,000 by default.

  • Hide job_config attributes Show job_config attributes object
    • Advanced configuration option. Specifies whether this job can open when there is insufficient machine learning node capacity for it to be immediately assigned to a node.

    • analysis_config object Required
      Hide analysis_config attributes Show analysis_config attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • categorization_analyzer string | object

        One of:
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values. You can use this functionality to fine tune the categorization by excluding sequences from consideration when categories are defined. For example, you can exclude SQL statements that appear in your log files. This property cannot be used at the same time as categorization_analyzer. If you only want to define simple regular expression filters that are applied prior to tokenization, setting this property is the easiest method. If you also want to customize the tokenizer or post-tokenization filtering, use the categorization_analyzer property instead and include the filters as pattern_replace character filters. The effect is exactly the same.

      • detectors array[object] Required

        Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job. If the detectors array does not contain at least one detector, no analysis can occur and an error is returned.

        Hide detectors attributes Show detectors attributes object
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • custom_rules array[object]

          Custom rules enable you to customize the way detectors operate. For example, a rule may dictate conditions under which results should be skipped. Kibana refers to custom rules as job rules.

          Hide custom_rules attributes Show custom_rules attributes object
          • actions array[string]

            The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.

            Supported values include:

            • skip_result: The result will not be created. Unless you also specify skip_model_update, the model will be updated as usual with the corresponding series value.
            • skip_model_update: The value for that series will not be used to update the model. Unless you also specify skip_result, the results will be created as usual. This action is suitable when certain values are expected to be consistently anomalous and they affect the model in a way that negatively impacts the rest of the results.

            Values are skip_result or skip_model_update.

          • conditions array[object]

            An array of numeric conditions when the rule applies. A rule must either have a non-empty scope or at least one condition. Multiple conditions are combined together with a logical AND.

          • scope object

            A scope of series where the rule applies. A rule must either have a non-empty scope or at least one condition. By default, the scope includes all series. Scoping is allowed for any of the fields that are also specified in by_field_name, over_field_name, or partition_field_name.

        • A description of the detector.

        • A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero. If you specify a value for this property, it is ignored.

        • Values are all, none, by, or over.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • function string

          The analysis function that is used. For example, count, rare, mean, min, max, or sum.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • use_null boolean

          Defines whether a new series is used as the null series when there is no value for the by or partition fields.

      • influencers array[string]

        A comma separated list of influencer field names. Typically these can be the by, over, or partition fields that are used in the detector configuration. You might also want to use a field name that is not specifically named in a detector, but is available as part of the input data. When you use multiple detectors, the use of influencers is recommended as it aggregates results for each influencer entity.

      • latency string

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold. For example, suppose CPU and memory usage on host A is usually highly correlated with the same metrics on host B. Perhaps this correlation occurs because they are running a load-balanced application. If you enable this property, anomalies will be reported when, for example, CPU usage on host A is high and the value of CPU usage on host B is low. That is to say, you’ll see an anomaly when the CPU of host A is unusual given the CPU of host B. To use the multivariate_by_fields property, you must also specify by_field_name in your detector.

      • Hide per_partition_categorization attributes Show per_partition_categorization attributes object
        • enabled boolean

          To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

        • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Hide analysis_limits attributes Show analysis_limits attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Custom metadata about the job

    • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies a period of time (in days) after which only the first snapshot per day is retained. This period is relative to the timestamp of the most recent snapshot for this job.

    • data_description object Required
      Hide data_description attributes Show data_description attributes object
      • format string

        Only JSON format is supported at this time.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • The time format, which can be epoch, epoch_ms, or a custom pattern. The value epoch refers to UNIX or Epoch time (the number of seconds since 1 Jan 1970). The value epoch_ms indicates that time is measured in milliseconds since the epoch. The epoch and epoch_ms time formats accept either integer or real values. Custom patterns must conform to the Java DateTimeFormatter class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: yyyy-MM-dd'T'HH:mm:ssX. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.

    • Hide datafeed_config attributes Show datafeed_config attributes object
      • If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.

      • Hide chunking_config attributes Show chunking_config attributes object
        • mode string Required

          Values are auto, manual, or off.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • enabled boolean Required

          Specifies whether the datafeed periodically checks for delayed data.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • indices string | array[string]
      • Hide indices_options attributes Show indices_options attributes object
        • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

        • expand_wildcards string | array[string]
        • If true, missing or closed indices are not included in the response.

        • If true, concrete, expanded or aliased indices are ignored when frozen.

      • job_id string
      • If a real-time datafeed has never seen any data (including during any initial training period) then it will automatically stop itself and close its associated job after this many real-time searches that return no documents. In other words, it will stop after frequency times max_empty_searches of real-time operation. If not set then a datafeed with no end time that sees no data will remain started until it is explicitly stopped.

      • query object

        An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

        External documentation
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.

        Hide script_fields attribute Show script_fields attribute object
      • The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window, which is 10,000 by default.

    • A description of the job.

    • groups array[string]

      A list of job groups. A job can belong to no groups or many.

    • job_id string
    • job_type string

      Reserved for future use, currently set to anomaly_detector.

    • Hide model_plot_config attributes Show model_plot_config attributes object
      • If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.

      • enabled boolean

        If true, enables calculation and storage of the model bounds for each entity that is being analyzed.

      • terms string

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies the maximum period of time (in days) that snapshots are retained. This period is relative to the timestamp of the most recent snapshot for this job. The default value is 10, which means snapshots ten days older than the newest snapshot are deleted.

    • Advanced configuration option. The period over which adjustments to the score are applied, as new data is seen. The default value is the longer of 30 days or 100 bucket_spans.

    • Advanced configuration option. The period of time (in days) that results are retained. Age is calculated relative to the timestamp of the latest bucket result. If this property has a non-null value, once per day at 00:30 (server time), results that are the specified number of days older than the latest bucket result are deleted from Elasticsearch. The default value is null, which means all results are retained. Annotations generated by the system also count as results for retention purposes; they are deleted after the same number of days as results. Annotations added by users are retained forever.

Responses

POST /_ml/datafeeds/_preview
curl \
 --request POST 'http://api.example.com/_ml/datafeeds/_preview' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"datafeed_config":{"aggregations":{},"chunking_config":{"mode":"auto","time_span":"string"},"datafeed_id":"string","delayed_data_check_config":{"check_window":"string","enabled":true},"frequency":"string","indices":"string","indices_options":{"allow_no_indices":true,"expand_wildcards":"string","ignore_unavailable":true,"ignore_throttled":true},"job_id":"string","max_empty_searches":42.0,"query":{},"query_delay":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"script_fields":{"additionalProperty1":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true},"additionalProperty2":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true}},"scroll_size":42.0},"job_config":{"allow_lazy_open":true,"analysis_config":{"bucket_span":"string","":"string","categorization_field_name":"string","categorization_filters":["string"],"detectors":[{"by_field_name":"string","custom_rules":[{"actions":["skip_result"],"conditions":[{}],"scope":{}}],"detector_description":"string","detector_index":42.0,"exclude_frequent":"all","field_name":"string","function":"string","over_field_name":"string","partition_field_name":"string","use_null":true}],"influencers":["string"],"latency":"string","model_prune_window":"string","multivariate_by_fields":true,"per_partition_categorization":{"enabled":true,"stop_on_warn":true},"summary_count_field_name":"string"},"analysis_limits":{"categorization_examples_limit":42.0,"":42.0},"background_persist_interval":"string","custom_settings":{},"daily_model_snapshot_retention_after_days":42.0,"data_description":{"format":"string","time_field":"string","time_format":"string","field_delimiter":"string"},"datafeed_config":{"aggregations":{},"chunking_config":{"mode":"auto","time_span":"string"},"datafeed_id":"string","delayed_data_check_config":{"check_window":"string","enabled":true},"frequency":"string","indices":"string","indices_options":{"allow_no_indices":true,"expand_wildcards":"string","ignore_unavailable":true,"ignore_throttled":true},"job_id":"string","max_empty_searches":42.0,"query":{},"query_delay":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"script_fields":{"additionalProperty1":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true},"additionalProperty2":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true}},"scroll_size":42.0},"description":"string","groups":["string"],"job_id":"string","job_type":"string","model_plot_config":{"annotations_enabled":true,"enabled":true,"terms":"string"},"model_snapshot_retention_days":42.0,"renormalization_window_days":42.0,"results_index_name":"string","results_retention_days":42.0}}'

Reset an anomaly detection job Added in 7.14.0

POST /_ml/anomaly_detectors/{job_id}/_reset

All model state and results are deleted. The job is ready to start over as if it had just been created. It is not currently possible to reset multiple jobs using wildcards or a comma separated list.

Path parameters

  • job_id string Required

    The ID of the job to reset.

Query parameters

  • Should this request wait until the operation has completed before returning.

  • Specifies whether annotations that have been added by the user should be deleted along with any auto-generated annotations when the job is reset.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_ml/anomaly_detectors/{job_id}/_reset
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/{job_id}/_reset' \
 --header "Authorization: $API_KEY"

Start datafeeds Added in 5.5.0

POST /_ml/datafeeds/{datafeed_id}/_start

A datafeed must be started in order to retrieve data from Elasticsearch. A datafeed can be started and stopped multiple times throughout its lifecycle.

Before you can start a datafeed, the anomaly detection job must be open. Otherwise, an error occurs.

If you restart a stopped datafeed, it continues processing input data from the next millisecond after it was stopped. If new data was indexed for that exact millisecond between stopping and starting, it will be ignored.

When Elasticsearch security features are enabled, your datafeed remembers which roles the last user to create or update it had at the time of creation or update and runs the query using those same roles. If you provided secondary authorization headers when you created or updated the datafeed, those credentials are used instead.

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

Query parameters

  • end string | number

    The time that the datafeed should end, which can be specified by using one of the following formats:

    • ISO 8601 format with milliseconds, for example 2017-01-22T06:00:00.000Z
    • ISO 8601 format without milliseconds, for example 2017-01-22T06:00:00+00:00
    • Milliseconds since the epoch, for example 1485061200000

    Date-time arguments using either of the ISO 8601 formats must have a time zone designator, where Z is accepted as an abbreviation for UTC time. When a URL is expected (for example, in browsers), the + used in time zone designators must be encoded as %2B. The end time value is exclusive. If you do not specify an end time, the datafeed runs continuously.

  • start string | number

    The time that the datafeed should begin, which can be specified by using the same formats as the end parameter. This value is inclusive. If you do not specify a start time and the datafeed is associated with a new anomaly detection job, the analysis starts from the earliest time for which data is available. If you restart a stopped datafeed and specify a start value that is earlier than the timestamp of the latest processed record, the datafeed continues from 1 millisecond after the timestamp of the latest processed record.

  • timeout string

    Specifies the amount of time to wait until a datafeed starts.

application/json

Body

  • end string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • start string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • timeout string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

Responses

POST /_ml/datafeeds/{datafeed_id}/_start
curl \
 --request POST 'http://api.example.com/_ml/datafeeds/{datafeed_id}/_start' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"":"string","timeout":"string"}'

Stop datafeeds Added in 5.4.0

POST /_ml/datafeeds/{datafeed_id}/_stop

A datafeed that is stopped ceases to retrieve data from Elasticsearch. A datafeed can be started and stopped multiple times throughout its lifecycle.

Path parameters

  • datafeed_id string Required

    Identifier for the datafeed. You can stop multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can close all datafeeds by using _all or by specifying * as the identifier.

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no datafeeds that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

  • force boolean

    If true, the datafeed is stopped forcefully.

  • timeout string

    Specifies the amount of time to wait until a datafeed stops.

application/json

Body

  • Refer to the description for the allow_no_match query parameter.

  • force boolean

    Refer to the description for the force query parameter.

  • timeout string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
POST /_ml/datafeeds/{datafeed_id}/_stop
curl \
 --request POST 'http://api.example.com/_ml/datafeeds/{datafeed_id}/_stop' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"allow_no_match":true,"force":true,"timeout":"string"}'

Update a datafeed Added in 6.4.0

POST /_ml/datafeeds/{datafeed_id}/_update

You must stop and start the datafeed for the changes to be applied. When Elasticsearch security features are enabled, your datafeed remembers which roles the user who updated it had at the time of the update and runs the query using those same roles. If you provide secondary authorization headers, those credentials are used instead.

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

Query parameters

  • If true, wildcard indices expressions that resolve into no concrete indices are ignored. This includes the _all string or when no indices are specified.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values. Valid values are:

    • all: Match any data stream or index, including hidden ones.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard patterns are not accepted.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • ignore_throttled boolean Deprecated

    If true, concrete, expanded or aliased indices are ignored when frozen.

  • If true, unavailable indices (missing or closed) are ignored.

application/json

Body Required

  • If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.

  • Hide chunking_config attributes Show chunking_config attributes object
    • mode string Required

      Values are auto, manual, or off.

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • enabled boolean Required

      Specifies whether the datafeed periodically checks for delayed data.

  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • indices array[string]

    An array of index names. Wildcards are supported. If any of the indices are in remote clusters, the machine learning nodes must have the remote_cluster_client role.

  • Hide indices_options attributes Show indices_options attributes object
    • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

    • expand_wildcards string | array[string]
    • If true, missing or closed indices are not included in the response.

    • If true, concrete, expanded or aliased indices are ignored when frozen.

  • job_id string
  • If a real-time datafeed has never seen any data (including during any initial training period), it automatically stops and closes the associated job after this many real-time searches return no documents. In other words, it stops after frequency times max_empty_searches of real-time operation. If not set, a datafeed with no end time that sees no data remains started until it is explicitly stopped. By default, it is not set.

  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Hide runtime_mappings attribute Show runtime_mappings attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • fields object

        For type composite

        Hide fields attribute Show fields attribute object
        • * object Additional properties
          Hide * attribute Show * attribute object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • fetch_fields array[object]

        For type lookup

        Hide fetch_fields attributes Show fetch_fields attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • format string
      • format string

        A custom format for date type runtime fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • script object
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
      • type string Required

        Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

  • Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.

    Hide script_fields attribute Show script_fields attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • script object Required
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
  • The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide authorization attributes Show authorization attributes object
      • api_key object
        Hide api_key attributes Show api_key attributes object
        • id string Required

          The identifier for the API key.

        • name string Required

          The name of the API key.

      • roles array[string]

        If a user ID was used for the most recent update to the datafeed, its roles at the time of the update are listed in the response.

      • If a service account was used for the most recent update to the datafeed, the account name is listed in the response.

    • chunking_config object Required
      Hide chunking_config attributes Show chunking_config attributes object
      • mode string Required

        Values are auto, manual, or off.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • enabled boolean Required

        Specifies whether the datafeed periodically checks for delayed data.

    • datafeed_id string Required
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • indices array[string] Required
    • Hide indices_options attributes Show indices_options attributes object
      • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

      • expand_wildcards string | array[string]
      • If true, missing or closed indices are not included in the response.

      • If true, concrete, expanded or aliased indices are ignored when frozen.

    • job_id string Required
    • query object Required

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • query_delay string Required

      A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Hide runtime_mappings attribute Show runtime_mappings attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • Hide script_fields attribute Show script_fields attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • script object Required
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
    • scroll_size number Required
POST /_ml/datafeeds/{datafeed_id}/_update
curl \
 --request POST 'http://api.example.com/_ml/datafeeds/{datafeed_id}/_update' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"aggregations":{},"chunking_config":{"mode":"auto","time_span":"string"},"delayed_data_check_config":{"check_window":"string","enabled":true},"frequency":"string","indices":["string"],"indices_options":{"allow_no_indices":true,"expand_wildcards":"string","ignore_unavailable":true,"ignore_throttled":true},"job_id":"string","max_empty_searches":42.0,"query":{},"query_delay":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"script_fields":{"additionalProperty1":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true},"additionalProperty2":{"script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"ignore_failure":true}},"scroll_size":42.0}'

Update a filter Added in 6.4.0

POST /_ml/filters/{filter_id}/_update

Updates the description of a filter, adds items, or removes items from the list.

Path parameters

  • filter_id string Required

    A string that uniquely identifies a filter.

application/json

Body Required

Responses

POST /_ml/filters/{filter_id}/_update
curl \
 --request POST 'http://api.example.com/_ml/filters/{filter_id}/_update' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"add_items":["string"],"description":"string","remove_items":["string"]}'

Update an anomaly detection job Added in 5.5.0

POST /_ml/anomaly_detectors/{job_id}/_update

Updates certain properties of an anomaly detection job.

Path parameters

  • job_id string Required

    Identifier for the job.

application/json

Body Required

  • Advanced configuration option. Specifies whether this job can open when there is insufficient machine learning node capacity for it to be immediately assigned to a node. If false and a machine learning node with capacity to run the job cannot immediately be found, the open anomaly detection jobs API returns an error. However, this is also subject to the cluster-wide xpack.ml.max_lazy_ml_nodes setting. If this option is set to true, the open anomaly detection jobs API does not return an error and the job waits in the opening state until sufficient machine learning node capacity is available.

  • Hide analysis_limits attribute Show analysis_limits attribute object
    • model_memory_limit string Required

      Limits can be applied for the resources required to hold the mathematical models in memory. These limits are approximate and can be set per job. They do not control the memory used by other processes, for example the Elasticsearch Java processes.

  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Advanced configuration option. Contains custom meta data about the job. For example, it can contain custom URL information as shown in Adding custom URLs to machine learning results.

    Hide custom_settings attribute Show custom_settings attribute object
    • * object Additional properties
  • A description of the job.

  • Hide model_plot_config attributes Show model_plot_config attributes object
    • If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.

    • enabled boolean

      If true, enables calculation and storage of the model bounds for each entity that is being analyzed.

    • terms string

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies a period of time (in days) after which only the first snapshot per day is retained. This period is relative to the timestamp of the most recent snapshot for this job. Valid values range from 0 to model_snapshot_retention_days. For jobs created before version 7.8.0, the default value matches model_snapshot_retention_days.

  • Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies the maximum period of time (in days) that snapshots are retained. This period is relative to the timestamp of the most recent snapshot for this job.

  • Advanced configuration option. The period over which adjustments to the score are applied, as new data is seen.

  • Advanced configuration option. The period of time (in days) that results are retained. Age is calculated relative to the timestamp of the latest bucket result. If this property has a non-null value, once per day at 00:30 (server time), results that are the specified number of days older than the latest bucket result are deleted from Elasticsearch. The default value is null, which means all results are retained.

  • groups array[string]

    A list of job groups. A job can belong to no groups or many.

  • detectors array[object]

    An array of detector update objects.

    Hide detectors attributes Show detectors attributes object
    • detector_index number Required

      A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero.

    • A description of the detector.

    • custom_rules array[object]

      An array of custom rule objects, which enable you to customize the way detectors operate. For example, a rule may dictate to the detector conditions under which results should be skipped. Kibana refers to custom rules as job rules.

      Hide custom_rules attributes Show custom_rules attributes object
      • actions array[string]

        The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.

        Supported values include:

        • skip_result: The result will not be created. Unless you also specify skip_model_update, the model will be updated as usual with the corresponding series value.
        • skip_model_update: The value for that series will not be used to update the model. Unless you also specify skip_result, the results will be created as usual. This action is suitable when certain values are expected to be consistently anomalous and they affect the model in a way that negatively impacts the rest of the results.

        Values are skip_result or skip_model_update.

      • conditions array[object]

        An array of numeric conditions when the rule applies. A rule must either have a non-empty scope or at least one condition. Multiple conditions are combined together with a logical AND.

        Hide conditions attributes Show conditions attributes object
        • applies_to string Required

          Values are actual, typical, diff_from_typical, or time.

        • operator string Required

          Values are gt, gte, lt, or lte.

        • value number Required

          The value that is compared against the applies_to field using the operator.

      • scope object

        A scope of series where the rule applies. A rule must either have a non-empty scope or at least one condition. By default, the scope includes all series. Scoping is allowed for any of the fields that are also specified in by_field_name, over_field_name, or partition_field_name.

        Hide scope attribute Show scope attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
  • Hide per_partition_categorization attributes Show per_partition_categorization attributes object
    • enabled boolean

      To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

    • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • allow_lazy_open boolean Required
    • analysis_config object Required
      Hide analysis_config attributes Show analysis_config attributes object
      • bucket_span string Required

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • categorization_analyzer string | object

        One of:
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • If categorization_field_name is specified, you can also define optional filters. This property expects an array of regular expressions. The expressions are used to filter out matching sequences from the categorization field values.

      • detectors array[object] Required

        An array of detector configuration objects. Detector configuration objects specify which data fields a job analyzes. They also specify which analytical functions are used. You can specify multiple detectors for a job.

        Hide detectors attributes Show detectors attributes object
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • custom_rules array[object]

          An array of custom rule objects, which enable you to customize the way detectors operate. For example, a rule may dictate to the detector conditions under which results should be skipped. Kibana refers to custom rules as job rules.

          Hide custom_rules attributes Show custom_rules attributes object
          • actions array[string]

            The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.

            Supported values include:

            • skip_result: The result will not be created. Unless you also specify skip_model_update, the model will be updated as usual with the corresponding series value.
            • skip_model_update: The value for that series will not be used to update the model. Unless you also specify skip_result, the results will be created as usual. This action is suitable when certain values are expected to be consistently anomalous and they affect the model in a way that negatively impacts the rest of the results.

            Values are skip_result or skip_model_update.

          • conditions array[object]

            An array of numeric conditions when the rule applies. A rule must either have a non-empty scope or at least one condition. Multiple conditions are combined together with a logical AND.

          • scope object

            A scope of series where the rule applies. A rule must either have a non-empty scope or at least one condition. By default, the scope includes all series. Scoping is allowed for any of the fields that are also specified in by_field_name, over_field_name, or partition_field_name.

        • A description of the detector.

        • A unique identifier for the detector. This identifier is based on the order of the detectors in the analysis_config, starting at zero.

        • Values are all, none, by, or over.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • function string Required

          The analysis function that is used. For example, count, rare, mean, min, max, and sum.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • use_null boolean

          Defines whether a new series is used as the null series when there is no value for the by or partition fields.

      • influencers array[string] Required

        A comma separated list of influencer field names. Typically these can be the by, over, or partition fields that are used in the detector configuration. You might also want to use a field name that is not specifically named in a detector, but is available as part of the input data. When you use multiple detectors, the use of influencers is recommended as it aggregates results for each influencer entity.

      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • latency string

        A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features. If set to true, the analysis will automatically find correlations between metrics for a given by field value and report anomalies when those correlations cease to hold.

      • Hide per_partition_categorization attributes Show per_partition_categorization attributes object
        • enabled boolean

          To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.

        • This setting can be set to true only if per-partition categorization is enabled. If true, both categorization and subsequent anomaly detection stops for partitions where the categorization status changes to warn. This setting makes it viable to have a job where it is expected that categorization works well for some partitions but not others; you do not pay the cost of bad categorization forever in the partitions where it works badly.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • analysis_limits object Required
      Hide analysis_limits attributes Show analysis_limits attributes object
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Time unit for milliseconds

    • Time unit for milliseconds

    • Hide custom_settings attribute Show custom_settings attribute object
      • * string Additional properties
    • data_description object Required
      Hide data_description attributes Show data_description attributes object
      • format string

        Only JSON format is supported at this time.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • The time format, which can be epoch, epoch_ms, or a custom pattern. The value epoch refers to UNIX or Epoch time (the number of seconds since 1 Jan 1970). The value epoch_ms indicates that time is measured in milliseconds since the epoch. The epoch and epoch_ms time formats accept either integer or real values. Custom patterns must conform to the Java DateTimeFormatter class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: yyyy-MM-dd'T'HH:mm:ssX. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.

    • Hide datafeed_config attributes Show datafeed_config attributes object
      • Hide authorization attributes Show authorization attributes object
        • api_key object
          Hide api_key attributes Show api_key attributes object
          • id string Required

            The identifier for the API key.

          • name string Required

            The name of the API key.

        • roles array[string]

          If a user ID was used for the most recent update to the datafeed, its roles at the time of the update are listed in the response.

        • If a service account was used for the most recent update to the datafeed, the account name is listed in the response.

      • Hide chunking_config attributes Show chunking_config attributes object
        • mode string Required

          Values are auto, manual, or off.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • datafeed_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • indices array[string] Required
      • indexes array[string]
      • job_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide script_fields attribute Show script_fields attribute object
      • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • enabled boolean Required

          Specifies whether the datafeed periodically checks for delayed data.

      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • Hide indices_options attributes Show indices_options attributes object
        • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

        • expand_wildcards string | array[string]
        • If true, missing or closed indices are not included in the response.

        • If true, concrete, expanded or aliased indices are ignored when frozen.

      • query object Required

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {"boost": 1}}.

        Query DSL
    • groups array[string]
    • job_id string Required
    • job_type string Required
    • job_version string Required
    • Hide model_plot_config attributes Show model_plot_config attributes object
      • If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.

      • enabled boolean

        If true, enables calculation and storage of the model bounds for each entity that is being analyzed.

      • terms string

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • results_index_name string Required
POST /_ml/anomaly_detectors/{job_id}/_update
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/{job_id}/_update' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"allow_lazy_open":true,"analysis_limits":{"model_memory_limit":"string"},"background_persist_interval":"string","custom_settings":{"additionalProperty1":{},"additionalProperty2":{}},"categorization_filters":["string"],"description":"string","model_plot_config":{"annotations_enabled":true,"enabled":true,"terms":"string"},"model_prune_window":"string","daily_model_snapshot_retention_after_days":42.0,"model_snapshot_retention_days":42.0,"renormalization_window_days":42.0,"results_retention_days":42.0,"groups":["string"],"detectors":[{"detector_index":42.0,"description":"string","custom_rules":[{"actions":["skip_result"],"conditions":[{"applies_to":"actual","operator":"gt","value":42.0}],"scope":{"additionalProperty1":{"filter_id":"string","filter_type":"include"},"additionalProperty2":{"filter_id":"string","filter_type":"include"}}}]}],"per_partition_categorization":{"enabled":true,"stop_on_warn":true}}'

Get data frame analytics job configuration info Added in 7.3.0

GET /_ml/data_frame/analytics/{id}

You can get information for multiple data frame analytics jobs in a single API request by using a comma-separated list of data frame analytics jobs or a wildcard expression.

Path parameters

  • id string Required

    Identifier for the data frame analytics job. If you do not specify this option, the API returns information for the first hundred data frame analytics jobs.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no data frame analytics jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value returns an empty data_frame_analytics array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • from number

    Skips the specified number of data frame analytics jobs.

  • size number

    Specifies the maximum number of data frame analytics jobs to obtain.

  • Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • data_frame_analytics array[object] Required

      An array of data frame analytics job resources, which are sorted by the id value in ascending order.

      Hide data_frame_analytics attributes Show data_frame_analytics attributes object
      • analysis object Required
        Hide analysis attributes Show analysis attributes object
        • Hide classification attributes Show classification attributes object
          • alpha number

            Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

          • dependent_variable string Required

            Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

          • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

          • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

          • eta number

            Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

          • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

          • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

          • feature_processors array[object]

            Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          • gamma number

            Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

          • lambda number

            Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

          • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

          • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

          • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

          • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

          • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

          • Defines the number of categories for which the predicted probabilities are reported. It must be non-negative or -1. If it is -1 or greater than the total number of categories, probabilities are reported for all categories; if you have a large number of categories, there could be a significant effect on the size of your destination index. NOTE: To use the AUC ROC evaluation method, num_top_classes must be set to -1 or a value greater than or equal to the total number of categories.

        • Hide outlier_detection attributes Show outlier_detection attributes object
          • Specifies whether the feature influence calculation is enabled.

          • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1.

          • method string

            The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

          • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

          • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

          • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

        • Hide regression attributes Show regression attributes object
          • alpha number

            Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

          • dependent_variable string Required

            Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

          • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

          • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

          • eta number

            Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

          • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

          • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

          • feature_processors array[object]

            Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          • gamma number

            Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

          • lambda number

            Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

          • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

          • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

          • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

          • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

          • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

          • The loss function used during regression. Available options are mse (mean squared error), msle (mean squared logarithmic error), huber (Pseudo-Huber loss).

          • A positive number that is used as a parameter to the loss_function.

      • Hide analyzed_fields attributes Show analyzed_fields attributes object
        • includes array[string]

          An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

        • excludes array[string]

          An array of strings that defines the fields that will be included in the analysis.

      • Hide authorization attributes Show authorization attributes object
        • api_key object
          Hide api_key attributes Show api_key attributes object
          • id string Required

            The identifier for the API key.

          • name string Required

            The name of the API key.

        • roles array[string]

          If a user ID was used for the most recent update to the job, its roles at the time of the update are listed in the response.

        • If a service account was used for the most recent update to the job, the account name is listed in the response.

      • Time unit for milliseconds

      • dest object Required
        Hide dest attributes Show dest attributes object
        • index string Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • id string Required
      • source object Required
        Hide source attributes Show source attributes object
        • index string | array[string] Required
        • Hide runtime_mappings attribute Show runtime_mappings attribute object
          • * object Additional properties
            Hide * attributes Show * attributes object
            • fields object

              For type composite

            • fetch_fields array[object]

              For type lookup

            • format string

              A custom format for date type runtime fields.

            • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • script object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • _source object
          Hide _source attributes Show _source attributes object
          • includes array[string]

            An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

          • excludes array[string]

            An array of strings that defines the fields that will be included in the analysis.

        • query object

          The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {}}.

          Query DSL
      • version string
      • _meta object
        Hide _meta attribute Show _meta attribute object
        • * object Additional properties
GET /_ml/data_frame/analytics/{id}
curl \
 --request GET 'http://api.example.com/_ml/data_frame/analytics/{id}' \
 --header "Authorization: $API_KEY"

Create a data frame analytics job Added in 7.3.0

PUT /_ml/data_frame/analytics/{id}

This API creates a data frame analytics job that performs an analysis on the source indices and stores the outcome in a destination index. By default, the query used in the source configuration is {"match_all": {}}.

If the destination index does not exist, it is created automatically when you start the job.

If you supply only a subset of the regression or classification parameters, hyperparameter optimization occurs. It determines a value for each of the undefined parameters.

Path parameters

  • id string Required

    Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

application/json

Body Required

  • Specifies whether this job can start when there is insufficient machine learning node capacity for it to be immediately assigned to a node. If set to false and a machine learning node with capacity to run the job cannot be immediately found, the API returns an error. If set to true, the API does not return an error; the job waits in the starting state until sufficient machine learning node capacity is available. This behavior is also affected by the cluster-wide xpack.ml.max_lazy_ml_nodes setting.

  • analysis object Required
    Hide analysis attributes Show analysis attributes object
    • Hide classification attributes Show classification attributes object
      • alpha number

        Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

      • dependent_variable string Required

        Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

      • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

      • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

      • eta number

        Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

      • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

      • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

      • feature_processors array[object]

        Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

        Hide feature_processors attributes Show feature_processors attributes object
        • Hide frequency_encoding attributes Show frequency_encoding attributes object
          • feature_name string Required
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • frequency_map object Required

            The resulting frequency map for the field value. If the field value is missing from the frequency_map, the resulting value is 0.

        • Hide multi_encoding attribute Show multi_encoding attribute object
          • processors array[number] Required

            The ordered array of custom processors to execute. Must be more than 1.

        • Hide n_gram_encoding attributes Show n_gram_encoding attributes object
          • The feature name prefix. Defaults to ngram__.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • length number

            Specifies the length of the n-gram substring. Defaults to 50. Must be greater than 0.

          • n_grams array[number] Required

            Specifies which n-grams to gather. It’s an array of integer values where the minimum value is 1, and a maximum value is 5.

          • start number

            Specifies the zero-indexed start of the n-gram substring. Negative values are allowed for encoding n-grams of string suffixes. Defaults to 0.

          • custom boolean
        • Hide one_hot_encoding attributes Show one_hot_encoding attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • hot_map string Required

            The one hot map mapping the field value with the column name.

        • Hide target_mean_encoding attributes Show target_mean_encoding attributes object
          • default_value number Required

            The default value if field value is not found in the target_map.

          • feature_name string Required
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • target_map object Required

            The field value to target mean transition map.

      • gamma number

        Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

      • lambda number

        Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

      • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

      • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

      • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

      • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

      • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

      • Defines the number of categories for which the predicted probabilities are reported. It must be non-negative or -1. If it is -1 or greater than the total number of categories, probabilities are reported for all categories; if you have a large number of categories, there could be a significant effect on the size of your destination index. NOTE: To use the AUC ROC evaluation method, num_top_classes must be set to -1 or a value greater than or equal to the total number of categories.

    • Hide outlier_detection attributes Show outlier_detection attributes object
      • Specifies whether the feature influence calculation is enabled.

      • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1.

      • method string

        The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

      • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

      • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

      • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

    • Hide regression attributes Show regression attributes object
      • alpha number

        Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

      • dependent_variable string Required

        Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

      • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

      • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

      • eta number

        Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

      • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

      • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

      • feature_processors array[object]

        Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

        Hide feature_processors attributes Show feature_processors attributes object
        • Hide frequency_encoding attributes Show frequency_encoding attributes object
          • feature_name string Required
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • frequency_map object Required

            The resulting frequency map for the field value. If the field value is missing from the frequency_map, the resulting value is 0.

        • Hide multi_encoding attribute Show multi_encoding attribute object
          • processors array[number] Required

            The ordered array of custom processors to execute. Must be more than 1.

        • Hide n_gram_encoding attributes Show n_gram_encoding attributes object
          • The feature name prefix. Defaults to ngram__.

          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • length number

            Specifies the length of the n-gram substring. Defaults to 50. Must be greater than 0.

          • n_grams array[number] Required

            Specifies which n-grams to gather. It’s an array of integer values where the minimum value is 1, and a maximum value is 5.

          • start number

            Specifies the zero-indexed start of the n-gram substring. Negative values are allowed for encoding n-grams of string suffixes. Defaults to 0.

          • custom boolean
        • Hide one_hot_encoding attributes Show one_hot_encoding attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • hot_map string Required

            The one hot map mapping the field value with the column name.

        • Hide target_mean_encoding attributes Show target_mean_encoding attributes object
          • default_value number Required

            The default value if field value is not found in the target_map.

          • feature_name string Required
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • target_map object Required

            The field value to target mean transition map.

      • gamma number

        Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

      • lambda number

        Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

      • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

      • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

      • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

      • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

      • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

      • The loss function used during regression. Available options are mse (mean squared error), msle (mean squared logarithmic error), huber (Pseudo-Huber loss).

      • A positive number that is used as a parameter to the loss_function.

  • Hide analyzed_fields attributes Show analyzed_fields attributes object
    • includes array[string]

      An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

    • excludes array[string]

      An array of strings that defines the fields that will be included in the analysis.

  • A description of the job.

  • dest object Required
    Hide dest attributes Show dest attributes object
    • index string Required
    • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • The maximum number of threads to be used by the analysis. Using more threads may decrease the time necessary to complete the analysis at the cost of using more CPU. Note that the process may use additional threads for operational functionality other than the analysis itself.

  • _meta object
    Hide _meta attribute Show _meta attribute object
    • * object Additional properties
  • The approximate maximum amount of memory resources that are permitted for analytical processing. If your elasticsearch.yml file contains an xpack.ml.max_model_memory_limit setting, an error occurs when you try to create data frame analytics jobs that have model_memory_limit values greater than that setting.

  • source object Required
    Hide source attributes Show source attributes object
    • index string | array[string] Required
    • Hide runtime_mappings attribute Show runtime_mappings attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • fields object

          For type composite

          Hide fields attribute Show fields attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • fetch_fields array[object]

          For type lookup

          Hide fetch_fields attributes Show fetch_fields attributes object
          • field string Required

            Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • format string
        • format string

          A custom format for date type runtime fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • script object
          Hide script attributes Show script attributes object
          • source string | object

            One of:
          • id string
          • params object

            Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            Hide params attribute Show params attribute object
            • * object Additional properties
          • lang string

            Any of:

            Values are painless, expression, mustache, or java.

          • options object
            Hide options attribute Show options attribute object
            • * string Additional properties
        • type string Required

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

    • _source object
      Hide _source attributes Show _source attributes object
      • includes array[string]

        An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

      • excludes array[string]

        An array of strings that defines the fields that will be included in the analysis.

    • query object

      The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {}}.

      Query DSL
  • headers object
  • version string

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide authorization attributes Show authorization attributes object
      • api_key object
        Hide api_key attributes Show api_key attributes object
        • id string Required

          The identifier for the API key.

        • name string Required

          The name of the API key.

      • roles array[string]

        If a user ID was used for the most recent update to the job, its roles at the time of the update are listed in the response.

      • If a service account was used for the most recent update to the job, the account name is listed in the response.

    • allow_lazy_start boolean Required
    • analysis object Required
      Hide analysis attributes Show analysis attributes object
      • Hide classification attributes Show classification attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • Defines the number of categories for which the predicted probabilities are reported. It must be non-negative or -1. If it is -1 or greater than the total number of categories, probabilities are reported for all categories; if you have a large number of categories, there could be a significant effect on the size of your destination index. NOTE: To use the AUC ROC evaluation method, num_top_classes must be set to -1 or a value greater than or equal to the total number of categories.

      • Hide outlier_detection attributes Show outlier_detection attributes object
        • Specifies whether the feature influence calculation is enabled.

        • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1.

        • method string

          The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

        • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

        • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

        • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

      • Hide regression attributes Show regression attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • The loss function used during regression. Available options are mse (mean squared error), msle (mean squared logarithmic error), huber (Pseudo-Huber loss).

        • A positive number that is used as a parameter to the loss_function.

    • Hide analyzed_fields attributes Show analyzed_fields attributes object
      • includes array[string]

        An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

      • excludes array[string]

        An array of strings that defines the fields that will be included in the analysis.

    • Time unit for milliseconds

    • dest object Required
      Hide dest attributes Show dest attributes object
      • index string Required
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • id string Required
    • max_num_threads number Required
    • _meta object
      Hide _meta attribute Show _meta attribute object
      • * object Additional properties
    • model_memory_limit string Required
    • source object Required
      Hide source attributes Show source attributes object
      • index string | array[string] Required
      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • _source object
        Hide _source attributes Show _source attributes object
        • includes array[string]

          An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

        • excludes array[string]

          An array of strings that defines the fields that will be included in the analysis.

      • query object

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {}}.

        Query DSL
    • version string Required
PUT /_ml/data_frame/analytics/{id}
curl \
 --request PUT 'http://api.example.com/_ml/data_frame/analytics/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"allow_lazy_start":true,"analysis":{"classification":{"alpha":42.0,"dependent_variable":"string","downsample_factor":42.0,"early_stopping_enabled":true,"eta":42.0,"eta_growth_rate_per_tree":42.0,"feature_bag_fraction":42.0,"feature_processors":[{"frequency_encoding":{"feature_name":"string","field":"string","frequency_map":{}},"multi_encoding":{"processors":[42.0]},"n_gram_encoding":{"feature_prefix":"string","field":"string","length":42.0,"n_grams":[42.0],"start":42.0,"custom":true},"one_hot_encoding":{"field":"string","hot_map":"string"},"target_mean_encoding":{"default_value":42.0,"feature_name":"string","field":"string","target_map":{}}}],"gamma":42.0,"lambda":42.0,"max_optimization_rounds_per_hyperparameter":42.0,"max_trees":42.0,"num_top_feature_importance_values":42.0,"prediction_field_name":"string","randomize_seed":42.0,"soft_tree_depth_limit":42.0,"soft_tree_depth_tolerance":42.0,"":"string","class_assignment_objective":"string","num_top_classes":42.0},"outlier_detection":{"compute_feature_influence":true,"feature_influence_threshold":42.0,"method":"string","n_neighbors":42.0,"outlier_fraction":42.0,"standardization_enabled":true},"regression":{"alpha":42.0,"dependent_variable":"string","downsample_factor":42.0,"early_stopping_enabled":true,"eta":42.0,"eta_growth_rate_per_tree":42.0,"feature_bag_fraction":42.0,"feature_processors":[{"frequency_encoding":{"feature_name":"string","field":"string","frequency_map":{}},"multi_encoding":{"processors":[42.0]},"n_gram_encoding":{"feature_prefix":"string","field":"string","length":42.0,"n_grams":[42.0],"start":42.0,"custom":true},"one_hot_encoding":{"field":"string","hot_map":"string"},"target_mean_encoding":{"default_value":42.0,"feature_name":"string","field":"string","target_map":{}}}],"gamma":42.0,"lambda":42.0,"max_optimization_rounds_per_hyperparameter":42.0,"max_trees":42.0,"num_top_feature_importance_values":42.0,"prediction_field_name":"string","randomize_seed":42.0,"soft_tree_depth_limit":42.0,"soft_tree_depth_tolerance":42.0,"":"string","loss_function":"string","loss_function_parameter":42.0}},"analyzed_fields":{"includes":["string"],"excludes":["string"]},"description":"string","dest":{"index":"string","results_field":"string"},"max_num_threads":42.0,"_meta":{"additionalProperty1":{},"additionalProperty2":{}},"model_memory_limit":"string","source":{"index":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"_source":{"includes":["string"],"excludes":["string"]},"query":{}},"headers":{},"version":"string"}'

Delete a data frame analytics job Added in 7.3.0

DELETE /_ml/data_frame/analytics/{id}

Path parameters

  • id string Required

    Identifier for the data frame analytics job.

Query parameters

  • force boolean

    If true, it deletes a job that is not stopped; this method is quicker than stopping and deleting the job.

  • timeout string

    The time to wait for the job to be deleted.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/data_frame/analytics/{id}
curl \
 --request DELETE 'http://api.example.com/_ml/data_frame/analytics/{id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting a data frame analytics job.
{
  "acknowledged": true
}

Evaluate data frame analytics Added in 7.3.0

POST /_ml/data_frame/_evaluate

The API packages together commonly used evaluation metrics for various types of machine learning features. This has been designed for use on indexes created by data frame analytics. Evaluation requires both a ground truth field and an analytics result field to be present.

application/json

Body Required

  • evaluation object Required
    Hide evaluation attributes Show evaluation attributes object
    • Hide classification attributes Show classification attributes object
      • actual_field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • metrics object
        Hide metrics attributes Show metrics attributes object
        • auc_roc object
          Hide auc_roc attributes Show auc_roc attributes object
          • Whether or not the curve should be returned in addition to the score. Default value is false.

        • Precision of predictions (per-class and average).

          Hide precision attribute Show precision attribute object
          • * object Additional properties
        • recall object

          Recall of predictions (per-class and average).

          Hide recall attribute Show recall attribute object
          • * object Additional properties
        • accuracy object

          Accuracy of predictions (per-class and overall).

          Hide accuracy attribute Show accuracy attribute object
          • * object Additional properties
        • Multiclass confusion matrix.

          Hide multiclass_confusion_matrix attribute Show multiclass_confusion_matrix attribute object
          • * object Additional properties
    • Hide outlier_detection attributes Show outlier_detection attributes object
      • actual_field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • metrics object
        Hide metrics attributes Show metrics attributes object
        • auc_roc object
          Hide auc_roc attributes Show auc_roc attributes object
          • Whether or not the curve should be returned in addition to the score. Default value is false.

        • Precision of predictions (per-class and average).

          Hide precision attribute Show precision attribute object
          • * object Additional properties
        • recall object

          Recall of predictions (per-class and average).

          Hide recall attribute Show recall attribute object
          • * object Additional properties
        • Accuracy of predictions (per-class and overall).

          Hide confusion_matrix attribute Show confusion_matrix attribute object
          • * object Additional properties
    • Hide regression attributes Show regression attributes object
      • actual_field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • predicted_field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • metrics object
        Hide metrics attributes Show metrics attributes object
        • mse object

          Average squared difference between the predicted values and the actual (ground truth) value. For more information, read this wiki article.

          Hide mse attribute Show mse attribute object
          • * object Additional properties
        • msle object
          Hide msle attribute Show msle attribute object
          • offset number

            Defines the transition point at which you switch from minimizing quadratic error to minimizing quadratic log error. Defaults to 1.

        • huber object
          Hide huber attribute Show huber attribute object
          • delta number

            Approximates 1/2 (prediction - actual)2 for values much less than delta and approximates a straight line with slope delta for values much larger than delta. Defaults to 1. Delta needs to be greater than 0.

        • Proportion of the variance in the dependent variable that is predictable from the independent variables.

          Hide r_squared attribute Show r_squared attribute object
          • * object Additional properties
  • index string Required
  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide classification attributes Show classification attributes object
    • Hide outlier_detection attributes Show outlier_detection attributes object
      • auc_roc object
        Hide auc_roc attributes Show auc_roc attributes object
      • Set the different thresholds of the outlier score at where the metric is calculated.

        Hide precision attribute Show precision attribute object
        • * number Additional properties
      • recall object

        Set the different thresholds of the outlier score at where the metric is calculated.

        Hide recall attribute Show recall attribute object
        • * number Additional properties
      • Set the different thresholds of the outlier score at where the metrics (tp - true positive, fp - false positive, tn - true negative, fn - false negative) are calculated.

        Hide confusion_matrix attribute Show confusion_matrix attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • tp number Required

            True Positive

          • fp number Required

            False Positive

          • tn number Required

            True Negative

          • fn number Required

            False Negative

    • Hide regression attributes Show regression attributes object
      • huber object
        Hide huber attribute Show huber attribute object
      • mse object
        Hide mse attribute Show mse attribute object
      • msle object
        Hide msle attribute Show msle attribute object
      • Hide r_squared attribute Show r_squared attribute object
POST /_ml/data_frame/_evaluate
curl \
 --request POST 'http://api.example.com/_ml/data_frame/_evaluate' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"index\": \"animal_classification\",\n  \"evaluation\": {\n    \"classification\": {\n      \"actual_field\": \"animal_class\",\n      \"predicted_field\": \"ml.animal_class_prediction\",\n      \"metrics\": {\n        \"multiclass_confusion_matrix\": {}\n      }\n    }\n  }\n}"'
Run `POST _ml/data_frame/_evaluate` to evaluate a a classification job for an annotated index. The `actual_field` contains the ground truth for classification. The `predicted_field` contains the predicted value calculated by the classification analysis.
{
  "index": "animal_classification",
  "evaluation": {
    "classification": {
      "actual_field": "animal_class",
      "predicted_field": "ml.animal_class_prediction",
      "metrics": {
        "multiclass_confusion_matrix": {}
      }
    }
  }
}
Run `POST _ml/data_frame/_evaluate` to evaluate a classification job with AUC ROC metrics for an annotated index. The `actual_field` contains the ground truth value for the actual animal classification. This is required in order to evaluate results. The `class_name` specifies the class name that is treated as positive during the evaluation, all the other classes are treated as negative.
{
  "index": "animal_classification",
  "evaluation": {
    "classification": {
      "actual_field": "animal_class",
      "metrics": {
        "auc_roc": {
          "class_name": "dog"
        }
      }
    }
  }
}
Run `POST _ml/data_frame/_evaluate` to evaluate an outlier detection job for an annotated index.
{
  "index": "my_analytics_dest_index",
  "evaluation": {
    "outlier_detection": {
      "actual_field": "is_outlier",
      "predicted_probability_field": "ml.outlier_score"
    }
  }
}
Run `POST _ml/data_frame/_evaluate` to evaluate the testing error of a regression job for an annotated index. The term query in the body limits evaluation to be performed on the test split only. The `actual_field` contains the ground truth for house prices. The `predicted_field` contains the house price calculated by the regression analysis.
{
  "index": "house_price_predictions",
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "ml.is_training": false
          }
        }
      ]
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "price",
      "predicted_field": "ml.price_prediction",
      "metrics": {
        "r_squared": {},
        "mse": {},
        "msle": {
          "offset": 10
        },
        "huber": {
          "delta": 1.5
        }
      }
    }
  }
}
Run `POST _ml/data_frame/_evaluate` to evaluate the training error of a regression job for an annotated index. The term query in the body limits evaluation to be performed on the training split only. The `actual_field` contains the ground truth for house prices. The `predicted_field` contains the house price calculated by the regression analysis.
{
  "index": "house_price_predictions",
  "query": {
    "term": {
      "ml.is_training": {
        "value": true
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "price",
      "predicted_field": "ml.price_prediction",
      "metrics": {
        "r_squared": {},
        "mse": {},
        "msle": {},
        "huber": {}
      }
    }
  }
}
A succesful response from `POST _ml/data_frame/_evaluate` to evaluate a classification analysis job for an annotated index. The `actual_class` contains the name of the class the analysis tried to predict. The `actual_class_doc_count` is the number of documents in the index belonging to the `actual_class`. The `predicted_classes` object contains the list of the predicted classes and the number of predictions associated with the class.
{
  "classification": {
    "multiclass_confusion_matrix": {
      "confusion_matrix": [
        {
          "actual_class": "cat",
          "actual_class_doc_count": 12,
          "predicted_classes": [
            {
              "predicted_class": "cat",
              "count": 12
            },
            {
              "predicted_class": "dog",
              "count": 0
            }
          ],
          "other_predicted_class_doc_count": 0
        },
        {
          "actual_class": "dog",
          "actual_class_doc_count": 11,
          "predicted_classes": [
            {
              "predicted_class": "dog",
              "count": 7
            },
            {
              "predicted_class": "cat",
              "count": 4
            }
          ],
          "other_predicted_class_doc_count": 0
        }
      ],
      "other_actual_class_count": 0
    }
  }
}
A succesful response from `POST _ml/data_frame/_evaluate` to evaluate a classification analysis job with the AUC ROC metrics for an annotated index.
{
  "classification": {
    "auc_roc": {
      "value": 0.8941788639536681
    }
  }
}
A successful response from `POST _ml/data_frame/_evaluate` to evaluate an outlier detection job.
{
  "outlier_detection": {
    "auc_roc": {
      "value": 0.9258475774641445
    },
    "confusion_matrix": {
      "0.25": {
        "tp": 5,
        "fp": 9,
        "tn": 204,
        "fn": 5
      },
      "0.5": {
        "tp": 1,
        "fp": 5,
        "tn": 208,
        "fn": 9
      },
      "0.75": {
        "tp": 0,
        "fp": 4,
        "tn": 209,
        "fn": 10
      }
    },
    "precision": {
      "0.25": 0.35714285714285715,
      "0.5": 0.16666666666666666,
      "0.75": 0
    },
    "recall": {
      "0.25": 0.5,
      "0.5": 0.1,
      "0.75": 0
    }
  }
}

Get data frame analytics job configuration info Added in 7.3.0

GET /_ml/data_frame/analytics

You can get information for multiple data frame analytics jobs in a single API request by using a comma-separated list of data frame analytics jobs or a wildcard expression.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no data frame analytics jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value returns an empty data_frame_analytics array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • from number

    Skips the specified number of data frame analytics jobs.

  • size number

    Specifies the maximum number of data frame analytics jobs to obtain.

  • Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • data_frame_analytics array[object] Required

      An array of data frame analytics job resources, which are sorted by the id value in ascending order.

      Hide data_frame_analytics attributes Show data_frame_analytics attributes object
      • analysis object Required
        Hide analysis attributes Show analysis attributes object
        • Hide classification attributes Show classification attributes object
          • alpha number

            Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

          • dependent_variable string Required

            Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

          • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

          • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

          • eta number

            Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

          • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

          • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

          • feature_processors array[object]

            Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          • gamma number

            Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

          • lambda number

            Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

          • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

          • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

          • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

          • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

          • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

          • Defines the number of categories for which the predicted probabilities are reported. It must be non-negative or -1. If it is -1 or greater than the total number of categories, probabilities are reported for all categories; if you have a large number of categories, there could be a significant effect on the size of your destination index. NOTE: To use the AUC ROC evaluation method, num_top_classes must be set to -1 or a value greater than or equal to the total number of categories.

        • Hide outlier_detection attributes Show outlier_detection attributes object
          • Specifies whether the feature influence calculation is enabled.

          • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1.

          • method string

            The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

          • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

          • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

          • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

        • Hide regression attributes Show regression attributes object
          • alpha number

            Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

          • dependent_variable string Required

            Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

          • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

          • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

          • eta number

            Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

          • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

          • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

          • feature_processors array[object]

            Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          • gamma number

            Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

          • lambda number

            Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

          • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

          • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

          • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

          • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

          • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

          • The loss function used during regression. Available options are mse (mean squared error), msle (mean squared logarithmic error), huber (Pseudo-Huber loss).

          • A positive number that is used as a parameter to the loss_function.

      • Hide analyzed_fields attributes Show analyzed_fields attributes object
        • includes array[string]

          An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

        • excludes array[string]

          An array of strings that defines the fields that will be included in the analysis.

      • Hide authorization attributes Show authorization attributes object
        • api_key object
          Hide api_key attributes Show api_key attributes object
          • id string Required

            The identifier for the API key.

          • name string Required

            The name of the API key.

        • roles array[string]

          If a user ID was used for the most recent update to the job, its roles at the time of the update are listed in the response.

        • If a service account was used for the most recent update to the job, the account name is listed in the response.

      • Time unit for milliseconds

      • dest object Required
        Hide dest attributes Show dest attributes object
        • index string Required
        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • id string Required
      • source object Required
        Hide source attributes Show source attributes object
        • index string | array[string] Required
        • Hide runtime_mappings attribute Show runtime_mappings attribute object
          • * object Additional properties
            Hide * attributes Show * attributes object
            • fields object

              For type composite

            • fetch_fields array[object]

              For type lookup

            • format string

              A custom format for date type runtime fields.

            • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • script object
            • type string Required

              Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

        • _source object
          Hide _source attributes Show _source attributes object
          • includes array[string]

            An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

          • excludes array[string]

            An array of strings that defines the fields that will be included in the analysis.

        • query object

          The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {}}.

          Query DSL
      • version string
      • _meta object
        Hide _meta attribute Show _meta attribute object
        • * object Additional properties
GET /_ml/data_frame/analytics
curl \
 --request GET 'http://api.example.com/_ml/data_frame/analytics' \
 --header "Authorization: $API_KEY"

Get data frame analytics job stats Added in 7.3.0

GET /_ml/data_frame/analytics/_stats

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no data frame analytics jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value returns an empty data_frame_analytics array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • from number

    Skips the specified number of data frame analytics jobs.

  • size number

    Specifies the maximum number of data frame analytics jobs to obtain.

  • verbose boolean

    Defines whether the stats response should be verbose.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • data_frame_analytics array[object] Required

      An array of objects that contain usage information for data frame analytics jobs, which are sorted by the id value in ascending order.

      Hide data_frame_analytics attributes Show data_frame_analytics attributes object
      • Hide analysis_stats attributes Show analysis_stats attributes object
        • Hide classification_stats attributes Show classification_stats attributes object
          • hyperparameters object Required
            Hide hyperparameters attributes Show hyperparameters attributes object
            • alpha number

              Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

            • lambda number

              Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

            • gamma number

              Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

            • eta number

              Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

            • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

            • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

            • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

            • If the algorithm fails to determine a non-trivial tree (more than a single leaf), this parameter determines how many of such consecutive failures are tolerated. Once the number of attempts exceeds the threshold, the forest training stops.

            • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

            • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

            • The maximum number of folds for the cross-validation procedure.

            • Determines the maximum number of splits for every feature that can occur in a decision tree when the tree is trained.

            • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

            • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

          • iteration number Required

            The number of iterations on the analysis.

          • Time unit for milliseconds

          • timing_stats object Required
            Hide timing_stats attributes Show timing_stats attributes object
          • validation_loss object Required
            Hide validation_loss attributes Show validation_loss attributes object
            • fold_values array[string] Required

              Validation loss values for every added decision tree during the forest growing procedure.

            • loss_type string Required

              The type of the loss metric. For example, binomial_logistic.

        • Hide outlier_detection_stats attributes Show outlier_detection_stats attributes object
          • parameters object Required
            Hide parameters attributes Show parameters attributes object
            • Specifies whether the feature influence calculation is enabled.

            • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1

            • method string

              The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

            • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

            • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

            • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

          • Time unit for milliseconds

          • timing_stats object Required
            Hide timing_stats attributes Show timing_stats attributes object
        • Hide regression_stats attributes Show regression_stats attributes object
          • hyperparameters object Required
            Hide hyperparameters attributes Show hyperparameters attributes object
            • alpha number

              Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

            • lambda number

              Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

            • gamma number

              Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

            • eta number

              Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

            • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

            • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

            • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

            • If the algorithm fails to determine a non-trivial tree (more than a single leaf), this parameter determines how many of such consecutive failures are tolerated. Once the number of attempts exceeds the threshold, the forest training stops.

            • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

            • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

            • The maximum number of folds for the cross-validation procedure.

            • Determines the maximum number of splits for every feature that can occur in a decision tree when the tree is trained.

            • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

            • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

          • iteration number Required

            The number of iterations on the analysis.

          • Time unit for milliseconds

          • timing_stats object Required
            Hide timing_stats attributes Show timing_stats attributes object
          • validation_loss object Required
            Hide validation_loss attributes Show validation_loss attributes object
            • fold_values array[string] Required

              Validation loss values for every added decision tree during the forest growing procedure.

            • loss_type string Required

              The type of the loss metric. For example, binomial_logistic.

      • For running jobs only, contains messages relating to the selection of a node to run the job.

      • data_counts object Required
        Hide data_counts attributes Show data_counts attributes object
        • skipped_docs_count number Required

          The number of documents that are skipped during the analysis because they contained values that are not supported by the analysis. For example, outlier detection does not support missing fields so it skips documents with missing fields. Likewise, all types of analysis skip documents that contain arrays with more than one element.

        • test_docs_count number Required

          The number of documents that are not used for training the model and can be used for testing.

        • training_docs_count number Required

          The number of documents that are used for training the model.

      • id string Required
      • memory_usage object Required
        Hide memory_usage attributes Show memory_usage attributes object
        • This value is present when the status is hard_limit and it is a new estimate of how much memory the job needs.

        • peak_usage_bytes number Required

          The number of bytes used at the highest peak of memory usage.

        • status string Required

          The memory usage status.

        • Time unit for milliseconds

      • progress array[object] Required

        The progress report of the data frame analytics job by phase.

        Hide progress attributes Show progress attributes object
        • phase string Required

          Defines the phase of the data frame analytics job.

        • progress_percent number Required

          The progress that the data frame analytics job has made expressed in percentage.

      • state string Required

        Values are started, stopped, starting, stopping, or failed.

GET /_ml/data_frame/analytics/_stats
curl \
 --request GET 'http://api.example.com/_ml/data_frame/analytics/_stats' \
 --header "Authorization: $API_KEY"

Get data frame analytics job stats Added in 7.3.0

GET /_ml/data_frame/analytics/{id}/_stats

Path parameters

  • id string Required

    Identifier for the data frame analytics job. If you do not specify this option, the API returns information for the first hundred data frame analytics jobs.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no data frame analytics jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value returns an empty data_frame_analytics array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • from number

    Skips the specified number of data frame analytics jobs.

  • size number

    Specifies the maximum number of data frame analytics jobs to obtain.

  • verbose boolean

    Defines whether the stats response should be verbose.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • data_frame_analytics array[object] Required

      An array of objects that contain usage information for data frame analytics jobs, which are sorted by the id value in ascending order.

      Hide data_frame_analytics attributes Show data_frame_analytics attributes object
      • Hide analysis_stats attributes Show analysis_stats attributes object
        • Hide classification_stats attributes Show classification_stats attributes object
          • hyperparameters object Required
            Hide hyperparameters attributes Show hyperparameters attributes object
            • alpha number

              Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

            • lambda number

              Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

            • gamma number

              Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

            • eta number

              Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

            • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

            • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

            • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

            • If the algorithm fails to determine a non-trivial tree (more than a single leaf), this parameter determines how many of such consecutive failures are tolerated. Once the number of attempts exceeds the threshold, the forest training stops.

            • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

            • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

            • The maximum number of folds for the cross-validation procedure.

            • Determines the maximum number of splits for every feature that can occur in a decision tree when the tree is trained.

            • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

            • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

          • iteration number Required

            The number of iterations on the analysis.

          • Time unit for milliseconds

          • timing_stats object Required
            Hide timing_stats attributes Show timing_stats attributes object
          • validation_loss object Required
            Hide validation_loss attributes Show validation_loss attributes object
            • fold_values array[string] Required

              Validation loss values for every added decision tree during the forest growing procedure.

            • loss_type string Required

              The type of the loss metric. For example, binomial_logistic.

        • Hide outlier_detection_stats attributes Show outlier_detection_stats attributes object
          • parameters object Required
            Hide parameters attributes Show parameters attributes object
            • Specifies whether the feature influence calculation is enabled.

            • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1

            • method string

              The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

            • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

            • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

            • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

          • Time unit for milliseconds

          • timing_stats object Required
            Hide timing_stats attributes Show timing_stats attributes object
        • Hide regression_stats attributes Show regression_stats attributes object
          • hyperparameters object Required
            Hide hyperparameters attributes Show hyperparameters attributes object
            • alpha number

              Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

            • lambda number

              Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

            • gamma number

              Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

            • eta number

              Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

            • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

            • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

            • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

            • If the algorithm fails to determine a non-trivial tree (more than a single leaf), this parameter determines how many of such consecutive failures are tolerated. Once the number of attempts exceeds the threshold, the forest training stops.

            • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

            • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

            • The maximum number of folds for the cross-validation procedure.

            • Determines the maximum number of splits for every feature that can occur in a decision tree when the tree is trained.

            • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

            • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

          • iteration number Required

            The number of iterations on the analysis.

          • Time unit for milliseconds

          • timing_stats object Required
            Hide timing_stats attributes Show timing_stats attributes object
          • validation_loss object Required
            Hide validation_loss attributes Show validation_loss attributes object
            • fold_values array[string] Required

              Validation loss values for every added decision tree during the forest growing procedure.

            • loss_type string Required

              The type of the loss metric. For example, binomial_logistic.

      • For running jobs only, contains messages relating to the selection of a node to run the job.

      • data_counts object Required
        Hide data_counts attributes Show data_counts attributes object
        • skipped_docs_count number Required

          The number of documents that are skipped during the analysis because they contained values that are not supported by the analysis. For example, outlier detection does not support missing fields so it skips documents with missing fields. Likewise, all types of analysis skip documents that contain arrays with more than one element.

        • test_docs_count number Required

          The number of documents that are not used for training the model and can be used for testing.

        • training_docs_count number Required

          The number of documents that are used for training the model.

      • id string Required
      • memory_usage object Required
        Hide memory_usage attributes Show memory_usage attributes object
        • This value is present when the status is hard_limit and it is a new estimate of how much memory the job needs.

        • peak_usage_bytes number Required

          The number of bytes used at the highest peak of memory usage.

        • status string Required

          The memory usage status.

        • Time unit for milliseconds

      • progress array[object] Required

        The progress report of the data frame analytics job by phase.

        Hide progress attributes Show progress attributes object
        • phase string Required

          Defines the phase of the data frame analytics job.

        • progress_percent number Required

          The progress that the data frame analytics job has made expressed in percentage.

      • state string Required

        Values are started, stopped, starting, stopping, or failed.

GET /_ml/data_frame/analytics/{id}/_stats
curl \
 --request GET 'http://api.example.com/_ml/data_frame/analytics/{id}/_stats' \
 --header "Authorization: $API_KEY"

Preview features used by data frame analytics Added in 7.13.0

GET /_ml/data_frame/analytics/_preview

Preview the extracted features used by a data frame analytics config.

application/json

Body

  • config object
    Hide config attributes Show config attributes object
    • source object Required
      Hide source attributes Show source attributes object
      • index string | array[string] Required
      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • _source object
        Hide _source attributes Show _source attributes object
        • includes array[string]

          An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

        • excludes array[string]

          An array of strings that defines the fields that will be included in the analysis.

      • query object

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {}}.

        Query DSL
    • analysis object Required
      Hide analysis attributes Show analysis attributes object
      • Hide classification attributes Show classification attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • Defines the number of categories for which the predicted probabilities are reported. It must be non-negative or -1. If it is -1 or greater than the total number of categories, probabilities are reported for all categories; if you have a large number of categories, there could be a significant effect on the size of your destination index. NOTE: To use the AUC ROC evaluation method, num_top_classes must be set to -1 or a value greater than or equal to the total number of categories.

      • Hide outlier_detection attributes Show outlier_detection attributes object
        • Specifies whether the feature influence calculation is enabled.

        • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1.

        • method string

          The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

        • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

        • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

        • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

      • Hide regression attributes Show regression attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • The loss function used during regression. Available options are mse (mean squared error), msle (mean squared logarithmic error), huber (Pseudo-Huber loss).

        • A positive number that is used as a parameter to the loss_function.

    • Hide analyzed_fields attributes Show analyzed_fields attributes object
      • includes array[string]

        An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

      • excludes array[string]

        An array of strings that defines the fields that will be included in the analysis.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • feature_values array[object] Required

      An array of objects that contain feature name and value pairs. The features have been processed and indicate what will be sent to the model for training.

      Hide feature_values attribute Show feature_values attribute object
      • * string Additional properties
GET /_ml/data_frame/analytics/_preview
curl \
 --request GET 'http://api.example.com/_ml/data_frame/analytics/_preview' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"config":{"source":{"index":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"_source":{"includes":["string"],"excludes":["string"]},"query":{}},"analysis":{"classification":{"alpha":42.0,"dependent_variable":"string","downsample_factor":42.0,"early_stopping_enabled":true,"eta":42.0,"eta_growth_rate_per_tree":42.0,"feature_bag_fraction":42.0,"feature_processors":[{"frequency_encoding":{},"multi_encoding":{},"n_gram_encoding":{},"one_hot_encoding":{},"target_mean_encoding":{}}],"gamma":42.0,"lambda":42.0,"max_optimization_rounds_per_hyperparameter":42.0,"max_trees":42.0,"num_top_feature_importance_values":42.0,"prediction_field_name":"string","randomize_seed":42.0,"soft_tree_depth_limit":42.0,"soft_tree_depth_tolerance":42.0,"":"string","class_assignment_objective":"string","num_top_classes":42.0},"outlier_detection":{"compute_feature_influence":true,"feature_influence_threshold":42.0,"method":"string","n_neighbors":42.0,"outlier_fraction":42.0,"standardization_enabled":true},"regression":{"alpha":42.0,"dependent_variable":"string","downsample_factor":42.0,"early_stopping_enabled":true,"eta":42.0,"eta_growth_rate_per_tree":42.0,"feature_bag_fraction":42.0,"feature_processors":[{"frequency_encoding":{},"multi_encoding":{},"n_gram_encoding":{},"one_hot_encoding":{},"target_mean_encoding":{}}],"gamma":42.0,"lambda":42.0,"max_optimization_rounds_per_hyperparameter":42.0,"max_trees":42.0,"num_top_feature_importance_values":42.0,"prediction_field_name":"string","randomize_seed":42.0,"soft_tree_depth_limit":42.0,"soft_tree_depth_tolerance":42.0,"":"string","loss_function":"string","loss_function_parameter":42.0}},"model_memory_limit":"string","max_num_threads":42.0,"analyzed_fields":{"includes":["string"],"excludes":["string"]}}}'

Preview features used by data frame analytics Added in 7.13.0

POST /_ml/data_frame/analytics/_preview

Preview the extracted features used by a data frame analytics config.

application/json

Body

  • config object
    Hide config attributes Show config attributes object
    • source object Required
      Hide source attributes Show source attributes object
      • index string | array[string] Required
      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • _source object
        Hide _source attributes Show _source attributes object
        • includes array[string]

          An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

        • excludes array[string]

          An array of strings that defines the fields that will be included in the analysis.

      • query object

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {}}.

        Query DSL
    • analysis object Required
      Hide analysis attributes Show analysis attributes object
      • Hide classification attributes Show classification attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • Defines the number of categories for which the predicted probabilities are reported. It must be non-negative or -1. If it is -1 or greater than the total number of categories, probabilities are reported for all categories; if you have a large number of categories, there could be a significant effect on the size of your destination index. NOTE: To use the AUC ROC evaluation method, num_top_classes must be set to -1 or a value greater than or equal to the total number of categories.

      • Hide outlier_detection attributes Show outlier_detection attributes object
        • Specifies whether the feature influence calculation is enabled.

        • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1.

        • method string

          The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

        • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

        • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

        • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

      • Hide regression attributes Show regression attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • The loss function used during regression. Available options are mse (mean squared error), msle (mean squared logarithmic error), huber (Pseudo-Huber loss).

        • A positive number that is used as a parameter to the loss_function.

    • Hide analyzed_fields attributes Show analyzed_fields attributes object
      • includes array[string]

        An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

      • excludes array[string]

        An array of strings that defines the fields that will be included in the analysis.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • feature_values array[object] Required

      An array of objects that contain feature name and value pairs. The features have been processed and indicate what will be sent to the model for training.

      Hide feature_values attribute Show feature_values attribute object
      • * string Additional properties
POST /_ml/data_frame/analytics/_preview
curl \
 --request POST 'http://api.example.com/_ml/data_frame/analytics/_preview' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"config":{"source":{"index":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"_source":{"includes":["string"],"excludes":["string"]},"query":{}},"analysis":{"classification":{"alpha":42.0,"dependent_variable":"string","downsample_factor":42.0,"early_stopping_enabled":true,"eta":42.0,"eta_growth_rate_per_tree":42.0,"feature_bag_fraction":42.0,"feature_processors":[{"frequency_encoding":{},"multi_encoding":{},"n_gram_encoding":{},"one_hot_encoding":{},"target_mean_encoding":{}}],"gamma":42.0,"lambda":42.0,"max_optimization_rounds_per_hyperparameter":42.0,"max_trees":42.0,"num_top_feature_importance_values":42.0,"prediction_field_name":"string","randomize_seed":42.0,"soft_tree_depth_limit":42.0,"soft_tree_depth_tolerance":42.0,"":"string","class_assignment_objective":"string","num_top_classes":42.0},"outlier_detection":{"compute_feature_influence":true,"feature_influence_threshold":42.0,"method":"string","n_neighbors":42.0,"outlier_fraction":42.0,"standardization_enabled":true},"regression":{"alpha":42.0,"dependent_variable":"string","downsample_factor":42.0,"early_stopping_enabled":true,"eta":42.0,"eta_growth_rate_per_tree":42.0,"feature_bag_fraction":42.0,"feature_processors":[{"frequency_encoding":{},"multi_encoding":{},"n_gram_encoding":{},"one_hot_encoding":{},"target_mean_encoding":{}}],"gamma":42.0,"lambda":42.0,"max_optimization_rounds_per_hyperparameter":42.0,"max_trees":42.0,"num_top_feature_importance_values":42.0,"prediction_field_name":"string","randomize_seed":42.0,"soft_tree_depth_limit":42.0,"soft_tree_depth_tolerance":42.0,"":"string","loss_function":"string","loss_function_parameter":42.0}},"model_memory_limit":"string","max_num_threads":42.0,"analyzed_fields":{"includes":["string"],"excludes":["string"]}}}'

Preview features used by data frame analytics Added in 7.13.0

GET /_ml/data_frame/analytics/{id}/_preview

Preview the extracted features used by a data frame analytics config.

Path parameters

  • id string Required

    Identifier for the data frame analytics job.

application/json

Body

  • config object
    Hide config attributes Show config attributes object
    • source object Required
      Hide source attributes Show source attributes object
      • index string | array[string] Required
      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • _source object
        Hide _source attributes Show _source attributes object
        • includes array[string]

          An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

        • excludes array[string]

          An array of strings that defines the fields that will be included in the analysis.

      • query object

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {}}.

        Query DSL
    • analysis object Required
      Hide analysis attributes Show analysis attributes object
      • Hide classification attributes Show classification attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • Defines the number of categories for which the predicted probabilities are reported. It must be non-negative or -1. If it is -1 or greater than the total number of categories, probabilities are reported for all categories; if you have a large number of categories, there could be a significant effect on the size of your destination index. NOTE: To use the AUC ROC evaluation method, num_top_classes must be set to -1 or a value greater than or equal to the total number of categories.

      • Hide outlier_detection attributes Show outlier_detection attributes object
        • Specifies whether the feature influence calculation is enabled.

        • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1.

        • method string

          The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

        • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

        • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

        • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

      • Hide regression attributes Show regression attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • The loss function used during regression. Available options are mse (mean squared error), msle (mean squared logarithmic error), huber (Pseudo-Huber loss).

        • A positive number that is used as a parameter to the loss_function.

    • Hide analyzed_fields attributes Show analyzed_fields attributes object
      • includes array[string]

        An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

      • excludes array[string]

        An array of strings that defines the fields that will be included in the analysis.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • feature_values array[object] Required

      An array of objects that contain feature name and value pairs. The features have been processed and indicate what will be sent to the model for training.

      Hide feature_values attribute Show feature_values attribute object
      • * string Additional properties
GET /_ml/data_frame/analytics/{id}/_preview
curl \
 --request GET 'http://api.example.com/_ml/data_frame/analytics/{id}/_preview' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"config":{"source":{"index":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"_source":{"includes":["string"],"excludes":["string"]},"query":{}},"analysis":{"classification":{"alpha":42.0,"dependent_variable":"string","downsample_factor":42.0,"early_stopping_enabled":true,"eta":42.0,"eta_growth_rate_per_tree":42.0,"feature_bag_fraction":42.0,"feature_processors":[{"frequency_encoding":{},"multi_encoding":{},"n_gram_encoding":{},"one_hot_encoding":{},"target_mean_encoding":{}}],"gamma":42.0,"lambda":42.0,"max_optimization_rounds_per_hyperparameter":42.0,"max_trees":42.0,"num_top_feature_importance_values":42.0,"prediction_field_name":"string","randomize_seed":42.0,"soft_tree_depth_limit":42.0,"soft_tree_depth_tolerance":42.0,"":"string","class_assignment_objective":"string","num_top_classes":42.0},"outlier_detection":{"compute_feature_influence":true,"feature_influence_threshold":42.0,"method":"string","n_neighbors":42.0,"outlier_fraction":42.0,"standardization_enabled":true},"regression":{"alpha":42.0,"dependent_variable":"string","downsample_factor":42.0,"early_stopping_enabled":true,"eta":42.0,"eta_growth_rate_per_tree":42.0,"feature_bag_fraction":42.0,"feature_processors":[{"frequency_encoding":{},"multi_encoding":{},"n_gram_encoding":{},"one_hot_encoding":{},"target_mean_encoding":{}}],"gamma":42.0,"lambda":42.0,"max_optimization_rounds_per_hyperparameter":42.0,"max_trees":42.0,"num_top_feature_importance_values":42.0,"prediction_field_name":"string","randomize_seed":42.0,"soft_tree_depth_limit":42.0,"soft_tree_depth_tolerance":42.0,"":"string","loss_function":"string","loss_function_parameter":42.0}},"model_memory_limit":"string","max_num_threads":42.0,"analyzed_fields":{"includes":["string"],"excludes":["string"]}}}'

Preview features used by data frame analytics Added in 7.13.0

POST /_ml/data_frame/analytics/{id}/_preview

Preview the extracted features used by a data frame analytics config.

Path parameters

  • id string Required

    Identifier for the data frame analytics job.

application/json

Body

  • config object
    Hide config attributes Show config attributes object
    • source object Required
      Hide source attributes Show source attributes object
      • index string | array[string] Required
      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • _source object
        Hide _source attributes Show _source attributes object
        • includes array[string]

          An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

        • excludes array[string]

          An array of strings that defines the fields that will be included in the analysis.

      • query object

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {}}.

        Query DSL
    • analysis object Required
      Hide analysis attributes Show analysis attributes object
      • Hide classification attributes Show classification attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • Defines the number of categories for which the predicted probabilities are reported. It must be non-negative or -1. If it is -1 or greater than the total number of categories, probabilities are reported for all categories; if you have a large number of categories, there could be a significant effect on the size of your destination index. NOTE: To use the AUC ROC evaluation method, num_top_classes must be set to -1 or a value greater than or equal to the total number of categories.

      • Hide outlier_detection attributes Show outlier_detection attributes object
        • Specifies whether the feature influence calculation is enabled.

        • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1.

        • method string

          The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

        • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

        • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

        • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

      • Hide regression attributes Show regression attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • The loss function used during regression. Available options are mse (mean squared error), msle (mean squared logarithmic error), huber (Pseudo-Huber loss).

        • A positive number that is used as a parameter to the loss_function.

    • Hide analyzed_fields attributes Show analyzed_fields attributes object
      • includes array[string]

        An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

      • excludes array[string]

        An array of strings that defines the fields that will be included in the analysis.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • feature_values array[object] Required

      An array of objects that contain feature name and value pairs. The features have been processed and indicate what will be sent to the model for training.

      Hide feature_values attribute Show feature_values attribute object
      • * string Additional properties
POST /_ml/data_frame/analytics/{id}/_preview
curl \
 --request POST 'http://api.example.com/_ml/data_frame/analytics/{id}/_preview' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"config":{"source":{"index":"string","runtime_mappings":{"additionalProperty1":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"},"additionalProperty2":{"fields":{"additionalProperty1":{"type":"boolean"},"additionalProperty2":{"type":"boolean"}},"fetch_fields":[{"field":"string","format":"string"}],"format":"string","input_field":"string","target_field":"string","target_index":"string","script":{"":"painless","id":"string","params":{"additionalProperty1":{},"additionalProperty2":{}},"options":{"additionalProperty1":"string","additionalProperty2":"string"}},"type":"boolean"}},"_source":{"includes":["string"],"excludes":["string"]},"query":{}},"analysis":{"classification":{"alpha":42.0,"dependent_variable":"string","downsample_factor":42.0,"early_stopping_enabled":true,"eta":42.0,"eta_growth_rate_per_tree":42.0,"feature_bag_fraction":42.0,"feature_processors":[{"frequency_encoding":{},"multi_encoding":{},"n_gram_encoding":{},"one_hot_encoding":{},"target_mean_encoding":{}}],"gamma":42.0,"lambda":42.0,"max_optimization_rounds_per_hyperparameter":42.0,"max_trees":42.0,"num_top_feature_importance_values":42.0,"prediction_field_name":"string","randomize_seed":42.0,"soft_tree_depth_limit":42.0,"soft_tree_depth_tolerance":42.0,"":"string","class_assignment_objective":"string","num_top_classes":42.0},"outlier_detection":{"compute_feature_influence":true,"feature_influence_threshold":42.0,"method":"string","n_neighbors":42.0,"outlier_fraction":42.0,"standardization_enabled":true},"regression":{"alpha":42.0,"dependent_variable":"string","downsample_factor":42.0,"early_stopping_enabled":true,"eta":42.0,"eta_growth_rate_per_tree":42.0,"feature_bag_fraction":42.0,"feature_processors":[{"frequency_encoding":{},"multi_encoding":{},"n_gram_encoding":{},"one_hot_encoding":{},"target_mean_encoding":{}}],"gamma":42.0,"lambda":42.0,"max_optimization_rounds_per_hyperparameter":42.0,"max_trees":42.0,"num_top_feature_importance_values":42.0,"prediction_field_name":"string","randomize_seed":42.0,"soft_tree_depth_limit":42.0,"soft_tree_depth_tolerance":42.0,"":"string","loss_function":"string","loss_function_parameter":42.0}},"model_memory_limit":"string","max_num_threads":42.0,"analyzed_fields":{"includes":["string"],"excludes":["string"]}}}'

Start a data frame analytics job Added in 7.3.0

POST /_ml/data_frame/analytics/{id}/_start

A data frame analytics job can be started and stopped multiple times throughout its lifecycle. If the destination index does not exist, it is created automatically the first time you start the data frame analytics job. The index.number_of_shards and index.number_of_replicas settings for the destination index are copied from the source index. If there are multiple source indices, the destination index copies the highest setting values. The mappings for the destination index are also copied from the source indices. If there are any mapping conflicts, the job fails to start. If the destination index exists, it is used as is. You can therefore set up the destination index in advance with custom settings and mappings.

Path parameters

  • id string Required

    Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

Query parameters

  • timeout string

    Controls the amount of time to wait until the data frame analytics job starts.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
POST /_ml/data_frame/analytics/{id}/_start
curl \
 --request POST 'http://api.example.com/_ml/data_frame/analytics/{id}/_start' \
 --header "Authorization: $API_KEY"

Stop data frame analytics jobs Added in 7.3.0

POST /_ml/data_frame/analytics/{id}/_stop

A data frame analytics job can be started and stopped multiple times throughout its lifecycle.

Path parameters

  • id string Required

    Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no data frame analytics jobs that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty data_frame_analytics array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • force boolean

    If true, the data frame analytics job is stopped forcefully.

  • timeout string

    Controls the amount of time to wait until the data frame analytics job stops. Defaults to 20 seconds.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
POST /_ml/data_frame/analytics/{id}/_stop
curl \
 --request POST 'http://api.example.com/_ml/data_frame/analytics/{id}/_stop' \
 --header "Authorization: $API_KEY"

Update a data frame analytics job Added in 7.3.0

POST /_ml/data_frame/analytics/{id}/_update

Path parameters

  • id string Required

    Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

application/json

Body Required

  • A description of the job.

  • The approximate maximum amount of memory resources that are permitted for analytical processing. If your elasticsearch.yml file contains an xpack.ml.max_model_memory_limit setting, an error occurs when you try to create data frame analytics jobs that have model_memory_limit values greater than that setting.

  • The maximum number of threads to be used by the analysis. Using more threads may decrease the time necessary to complete the analysis at the cost of using more CPU. Note that the process may use additional threads for operational functionality other than the analysis itself.

  • Specifies whether this job can start when there is insufficient machine learning node capacity for it to be immediately assigned to a node.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide authorization attributes Show authorization attributes object
      • api_key object
        Hide api_key attributes Show api_key attributes object
        • id string Required

          The identifier for the API key.

        • name string Required

          The name of the API key.

      • roles array[string]

        If a user ID was used for the most recent update to the job, its roles at the time of the update are listed in the response.

      • If a service account was used for the most recent update to the job, the account name is listed in the response.

    • allow_lazy_start boolean Required
    • analysis object Required
      Hide analysis attributes Show analysis attributes object
      • Hide classification attributes Show classification attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • Defines the number of categories for which the predicted probabilities are reported. It must be non-negative or -1. If it is -1 or greater than the total number of categories, probabilities are reported for all categories; if you have a large number of categories, there could be a significant effect on the size of your destination index. NOTE: To use the AUC ROC evaluation method, num_top_classes must be set to -1 or a value greater than or equal to the total number of categories.

      • Hide outlier_detection attributes Show outlier_detection attributes object
        • Specifies whether the feature influence calculation is enabled.

        • The minimum outlier score that a document needs to have in order to calculate its feature influence score. Value range: 0-1.

        • method string

          The method that outlier detection uses. Available methods are lof, ldof, distance_kth_nn, distance_knn, and ensemble. The default value is ensemble, which means that outlier detection uses an ensemble of different methods and normalises and combines their individual outlier scores to obtain the overall outlier score.

        • Defines the value for how many nearest neighbors each method of outlier detection uses to calculate its outlier score. When the value is not set, different values are used for different ensemble members. This default behavior helps improve the diversity in the ensemble; only override it if you are confident that the value you choose is appropriate for the data set.

        • The proportion of the data set that is assumed to be outlying prior to outlier detection. For example, 0.05 means it is assumed that 5% of values are real outliers and 95% are inliers.

        • If true, the following operation is performed on the columns before computing outlier scores: (x_i - mean(x_i)) / sd(x_i).

      • Hide regression attributes Show regression attributes object
        • alpha number

          Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This parameter affects loss calculations by acting as a multiplier of the tree depth. Higher alpha values result in shallower trees and faster training times. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to zero.

        • dependent_variable string Required

          Defines which field of the document is to be predicted. It must match one of the fields in the index being used to train. If this field is missing from a document, then that document will not be used for training, but a prediction with the trained model will be generated for it. It is also known as continuous target variable. For classification analysis, the data type of the field must be numeric (integer, short, long, byte), categorical (ip or keyword), or boolean. There must be no more than 30 different values in this field. For regression analysis, the data type of the field must be numeric.

        • Advanced configuration option. Controls the fraction of data that is used to compute the derivatives of the loss function for tree training. A small value results in the use of a small fraction of the data. If this value is set to be less than 1, accuracy typically improves. However, too small a value may result in poor convergence for the ensemble and so require more trees. By default, this value is calculated during hyperparameter optimization. It must be greater than zero and less than or equal to 1.

        • Advanced configuration option. Specifies whether the training process should finish if it is not finding any better performing models. If disabled, the training process can take significantly longer and the chance of finding a better performing model is unremarkable.

        • eta number

          Advanced configuration option. The shrinkage applied to the weights. Smaller values result in larger forests which have a better generalization error. However, larger forests cause slower training. By default, this value is calculated during hyperparameter optimization. It must be a value between 0.001 and 1.

        • Advanced configuration option. Specifies the rate at which eta increases for each new tree that is added to the forest. For example, a rate of 1.05 increases eta by 5% for each extra tree. By default, this value is calculated during hyperparameter optimization. It must be between 0.5 and 2.

        • Advanced configuration option. Defines the fraction of features that will be used when selecting a random bag for each candidate split. By default, this value is calculated during hyperparameter optimization.

        • feature_processors array[object]

          Advanced configuration option. A collection of feature preprocessors that modify one or more included fields. The analysis uses the resulting one or more features instead of the original document field. However, these features are ephemeral; they are not stored in the destination index. Multiple feature_processors entries can refer to the same document fields. Automatic categorical feature encoding still occurs for the fields that are unprocessed by a custom processor or that have categorical values. Use this property only if you want to override the automatic feature encoding of the specified fields.

          Hide feature_processors attributes Show feature_processors attributes object
        • gamma number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies a linear penalty associated with the size of individual trees in the forest. A high gamma value causes training to prefer small trees. A small gamma value results in larger individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • lambda number

          Advanced configuration option. Regularization parameter to prevent overfitting on the training data set. Multiplies an L2 regularization term which applies to leaf weights of the individual trees in the forest. A high lambda value causes training to favor small leaf weights. This behavior makes the prediction function smoother at the expense of potentially not being able to capture relevant relationships between the features and the dependent variable. A small lambda value results in large individual trees and slower training. By default, this value is calculated during hyperparameter optimization. It must be a nonnegative value.

        • Advanced configuration option. A multiplier responsible for determining the maximum number of hyperparameter optimization steps in the Bayesian optimization procedure. The maximum number of steps is determined based on the number of undefined hyperparameters times the maximum optimization rounds per hyperparameter. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Defines the maximum number of decision trees in the forest. The maximum value is 2000. By default, this value is calculated during hyperparameter optimization.

        • Advanced configuration option. Specifies the maximum number of feature importance values per document to return. By default, no feature importance calculation occurs.

        • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • Defines the seed for the random generator that is used to pick training data. By default, it is randomly generated. Set it to a specific value to use the same training data each time you start a job (assuming other related parameters such as source and analyzed_fields are the same).

        • Advanced configuration option. Machine learning uses loss guided tree growing, which means that the decision trees grow where the regularized loss decreases most quickly. This soft limit combines with the soft_tree_depth_tolerance to penalize trees that exceed the specified depth; the regularized loss increases quickly beyond this depth. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.

        • Advanced configuration option. This option controls how quickly the regularized loss increases when the tree depth exceeds soft_tree_depth_limit. By default, this value is calculated during hyperparameter optimization. It must be greater than or equal to 0.01.

        • The loss function used during regression. Available options are mse (mean squared error), msle (mean squared logarithmic error), huber (Pseudo-Huber loss).

        • A positive number that is used as a parameter to the loss_function.

    • Hide analyzed_fields attributes Show analyzed_fields attributes object
      • includes array[string]

        An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

      • excludes array[string]

        An array of strings that defines the fields that will be included in the analysis.

    • create_time number Required
    • dest object Required
      Hide dest attributes Show dest attributes object
      • index string Required
      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • id string Required
    • max_num_threads number Required
    • model_memory_limit string Required
    • source object Required
      Hide source attributes Show source attributes object
      • index string | array[string] Required
      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
              Hide * attribute Show * attribute object
              • type string Required

                Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

          • fetch_fields array[object]

            For type lookup

            Hide fetch_fields attributes Show fetch_fields attributes object
            • field string Required

              Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

            • format string
          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • _source object
        Hide _source attributes Show _source attributes object
        • includes array[string]

          An array of strings that defines the fields that will be excluded from the analysis. You do not need to add fields with unsupported data types to excludes, these fields are excluded from the analysis automatically.

        • excludes array[string]

          An array of strings that defines the fields that will be included in the analysis.

      • query object

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {}}.

        Query DSL
    • version string Required
POST /_ml/data_frame/analytics/{id}/_update
curl \
 --request POST 'http://api.example.com/_ml/data_frame/analytics/{id}/_update' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"description":"string","model_memory_limit":"string","max_num_threads":42.0,"allow_lazy_start":true}'

Get trained model configuration info Added in 7.10.0

GET /_ml/trained_models/{model_id}

Path parameters

  • model_id string | array[string] Required

    The unique identifier of the trained model or a model alias.

    You can get information for multiple trained models in a single API request by using a comma-separated list of model IDs or a wildcard expression.

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no models that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, it returns an empty array when there are no matches and the subset of results when there are partial matches.

  • Specifies whether the included model definition should be returned as a JSON map (true) or in a custom compressed format (false).

  • Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.

  • from number

    Skips the specified number of models.

  • include string

    A comma delimited string of optional fields to include in the response body.

    Supported values include:

    • definition: Includes the model definition.
    • feature_importance_baseline: Includes the baseline for feature importance values.
    • hyperparameters: Includes the information about hyperparameters used to train the model. This information consists of the value, the absolute and relative importance of the hyperparameter as well as an indicator of whether it was specified by the user or tuned during hyperparameter optimization.
    • total_feature_importance: Includes the total feature importance for the training data set. The baseline and total feature importance values are returned in the metadata field in the response body.
    • definition_status: Includes the model definition status.

    Values are definition, feature_importance_baseline, hyperparameters, total_feature_importance, or definition_status.

  • size number

    Specifies the maximum number of models to obtain.

  • tags string | array[string]

    A comma delimited string of tags. A trained model can have many tags, or none. When supplied, only trained models that contain all the supplied tags are returned.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • trained_model_configs array[object] Required

      An array of trained model resources, which are sorted by the model_id value in ascending order.

      Hide trained_model_configs attributes Show trained_model_configs attributes object
      • model_id string Required
      • Values are tree_ensemble, lang_ident, or pytorch.

      • tags array[string] Required

        A comma delimited string of tags. A trained model can have many tags, or none.

      • version string
      • Information on the creator of the trained model.

      • create_time string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • Any field map described in the inference configuration takes precedence.

        Hide default_field_map attribute Show default_field_map attribute object
        • * string Additional properties
      • The free-text description of the trained model.

      • The estimated heap usage in bytes to keep the trained model in memory.

      • The estimated number of operations to use the trained model.

      • True if the full model definition is present.

      • Inference configuration provided when storing the model config

        Hide inference_config attributes Show inference_config attributes object
        • Hide regression attributes Show regression attributes object
        • Hide classification attributes Show classification attributes object
          • Specifies the number of top class predictions to return. Defaults to 0.

          • Specifies the maximum number of feature importance values per document.

          • Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided 1.0 is transformed to true and 0.0 to false.

          • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • Specifies the field to which the top classes are written. Defaults to top_classes.

        • Hide text_classification attributes Show text_classification attributes object
          • Specifies the number of top class predictions to return. Defaults to 0.

          • Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
          • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • Classification labels to apply other than the stored labels. Must have the same deminsions as the default configured labels

          • Hide vocabulary attribute Show vocabulary attribute object
        • Hide zero_shot_classification attributes Show zero_shot_classification attributes object
          • Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
          • Hypothesis template used when tokenizing labels for prediction

          • classification_labels array[string] Required

            The zero shot classification labels indicating entailment, neutral, and contradiction Must contain exactly and only entailment, neutral, and contradiction

          • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • Indicates if more than one true label exists.

          • labels array[string]

            The labels to predict.

        • Hide fill_mask attributes Show fill_mask attributes object
          • The string/token which will be removed from incoming documents and replaced with the inference prediction(s). In a response, this field contains the mask token for the specified model/tokenizer. Each model and tokenizer has a predefined mask token which cannot be changed. Thus, it is recommended not to set this value in requests. However, if this field is present in a request, its value must match the predefined value for that model/tokenizer, otherwise the request will fail.

          • Specifies the number of top class predictions to return. Defaults to 0.

          • Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
          • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • vocabulary object Required
            Hide vocabulary attribute Show vocabulary attribute object
        • Hide learning_to_rank attributes Show learning_to_rank attributes object
        • ner object
          Hide ner attributes Show ner attributes object
        • Hide pass_through attributes Show pass_through attributes object
        • Hide text_embedding attributes Show text_embedding attributes object
        • Hide text_expansion attributes Show text_expansion attributes object
        • Hide question_answering attributes Show question_answering attributes object
      • input object Required
        Hide input attribute Show input attribute object
        • field_names array[string] Required

          An array of input field names for the model.

      • The license level of the trained model.

      • metadata object
        Hide metadata attributes Show metadata attributes object
        • model_aliases array[string]
        • An object that contains the baseline for feature importance values. For regression analysis, it is a single value. For classification analysis, there is a value for each class.

          Hide feature_importance_baseline attribute Show feature_importance_baseline attribute object
          • * string Additional properties
        • hyperparameters array[object]

          List of the available hyperparameters optimized during the fine_parameter_tuning phase as well as specified by the user.

          Hide hyperparameters attributes Show hyperparameters attributes object
          • A positive number showing how much the parameter influences the variation of the loss function. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.

          • name string Required
          • A number between 0 and 1 showing the proportion of influence on the variation of the loss function among all tuned hyperparameters. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.

          • supplied boolean Required

            Indicates if the hyperparameter is specified by the user (true) or optimized (false).

          • value number Required

            The value of the hyperparameter, either optimized or specified by the user.

        • An array of the total feature importance for each feature used from the training data set. This array of objects is returned if data frame analytics trained the model and the request includes total_feature_importance in the include request parameter.

          Hide total_feature_importance attributes Show total_feature_importance attributes object
          • feature_name string Required
          • importance array[object] Required

            A collection of feature importance statistics related to the training data set for this particular feature.

          • classes array[object] Required

            If the trained model is a classification model, feature importance statistics are gathered per target class value.

      • Hide model_package attributes Show model_package attributes object
      • location object
        Hide location attribute Show location attribute object
        • index object Required
          Hide index attribute Show index attribute object
      • Hide prefix_strings attributes Show prefix_strings attributes object
        • ingest string

          String prepended to input at ingest

GET /_ml/trained_models/{model_id}
curl \
 --request GET 'http://api.example.com/_ml/trained_models/{model_id}' \
 --header "Authorization: $API_KEY"

Create a trained model Added in 7.10.0

PUT /_ml/trained_models/{model_id}

Enable you to supply a trained model that is not created by data frame analytics.

Path parameters

  • model_id string Required

    The unique identifier of the trained model.

Query parameters

  • If set to true and a compressed_definition is provided, the request defers definition decompression and skips relevant validations.

  • Whether to wait for all child operations (e.g. model download) to complete.

application/json

Body Required

  • The compressed (GZipped and Base64 encoded) inference definition of the model. If compressed_definition is specified, then definition cannot be specified.

  • Hide definition attributes Show definition attributes object
  • A human-readable description of the inference trained model.

  • Inference configuration provided when storing the model config

    Hide inference_config attributes Show inference_config attributes object
    • Hide regression attributes Show regression attributes object
    • Hide classification attributes Show classification attributes object
      • Specifies the number of top class predictions to return. Defaults to 0.

      • Specifies the maximum number of feature importance values per document.

      • Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided 1.0 is transformed to true and 0.0 to false.

      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • Specifies the field to which the top classes are written. Defaults to top_classes.

    • Hide text_classification attributes Show text_classification attributes object
      • Specifies the number of top class predictions to return. Defaults to 0.

      • Tokenization options stored in inference configuration

        Hide tokenization attributes Show tokenization attributes object
        • bert object
          Hide bert attributes Show bert attributes object
        • bert_ja object
          Hide bert_ja attributes Show bert_ja attributes object
        • mpnet object
          Hide mpnet attributes Show mpnet attributes object
        • roberta object
          Hide roberta attributes Show roberta attributes object
        • Hide xlm_roberta attributes Show xlm_roberta attributes object
      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • Classification labels to apply other than the stored labels. Must have the same deminsions as the default configured labels

      • Hide vocabulary attribute Show vocabulary attribute object
    • Hide zero_shot_classification attributes Show zero_shot_classification attributes object
      • Tokenization options stored in inference configuration

        Hide tokenization attributes Show tokenization attributes object
        • bert object
          Hide bert attributes Show bert attributes object
        • bert_ja object
          Hide bert_ja attributes Show bert_ja attributes object
        • mpnet object
          Hide mpnet attributes Show mpnet attributes object
        • roberta object
          Hide roberta attributes Show roberta attributes object
        • Hide xlm_roberta attributes Show xlm_roberta attributes object
      • Hypothesis template used when tokenizing labels for prediction

      • classification_labels array[string] Required

        The zero shot classification labels indicating entailment, neutral, and contradiction Must contain exactly and only entailment, neutral, and contradiction

      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • Indicates if more than one true label exists.

      • labels array[string]

        The labels to predict.

    • Hide fill_mask attributes Show fill_mask attributes object
      • The string/token which will be removed from incoming documents and replaced with the inference prediction(s). In a response, this field contains the mask token for the specified model/tokenizer. Each model and tokenizer has a predefined mask token which cannot be changed. Thus, it is recommended not to set this value in requests. However, if this field is present in a request, its value must match the predefined value for that model/tokenizer, otherwise the request will fail.

      • Specifies the number of top class predictions to return. Defaults to 0.

      • Tokenization options stored in inference configuration

        Hide tokenization attributes Show tokenization attributes object
        • bert object
          Hide bert attributes Show bert attributes object
        • bert_ja object
          Hide bert_ja attributes Show bert_ja attributes object
        • mpnet object
          Hide mpnet attributes Show mpnet attributes object
        • roberta object
          Hide roberta attributes Show roberta attributes object
        • Hide xlm_roberta attributes Show xlm_roberta attributes object
      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • vocabulary object Required
        Hide vocabulary attribute Show vocabulary attribute object
    • Hide learning_to_rank attributes Show learning_to_rank attributes object
    • ner object
      Hide ner attributes Show ner attributes object
      • Tokenization options stored in inference configuration

        Hide tokenization attributes Show tokenization attributes object
        • bert object
          Hide bert attributes Show bert attributes object
        • bert_ja object
          Hide bert_ja attributes Show bert_ja attributes object
        • mpnet object
          Hide mpnet attributes Show mpnet attributes object
        • roberta object
          Hide roberta attributes Show roberta attributes object
        • Hide xlm_roberta attributes Show xlm_roberta attributes object
      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • The token classification labels. Must be IOB formatted tags

      • Hide vocabulary attribute Show vocabulary attribute object
    • Hide pass_through attributes Show pass_through attributes object
      • Tokenization options stored in inference configuration

        Hide tokenization attributes Show tokenization attributes object
        • bert object
          Hide bert attributes Show bert attributes object
        • bert_ja object
          Hide bert_ja attributes Show bert_ja attributes object
        • mpnet object
          Hide mpnet attributes Show mpnet attributes object
        • roberta object
          Hide roberta attributes Show roberta attributes object
        • Hide xlm_roberta attributes Show xlm_roberta attributes object
      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • Hide vocabulary attribute Show vocabulary attribute object
    • Hide text_embedding attributes Show text_embedding attributes object
      • The number of dimensions in the embedding output

      • Tokenization options stored in inference configuration

        Hide tokenization attributes Show tokenization attributes object
        • bert object
          Hide bert attributes Show bert attributes object
        • bert_ja object
          Hide bert_ja attributes Show bert_ja attributes object
        • mpnet object
          Hide mpnet attributes Show mpnet attributes object
        • roberta object
          Hide roberta attributes Show roberta attributes object
        • Hide xlm_roberta attributes Show xlm_roberta attributes object
      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • vocabulary object Required
        Hide vocabulary attribute Show vocabulary attribute object
    • Hide text_expansion attributes Show text_expansion attributes object
      • Tokenization options stored in inference configuration

        Hide tokenization attributes Show tokenization attributes object
        • bert object
          Hide bert attributes Show bert attributes object
        • bert_ja object
          Hide bert_ja attributes Show bert_ja attributes object
        • mpnet object
          Hide mpnet attributes Show mpnet attributes object
        • roberta object
          Hide roberta attributes Show roberta attributes object
        • Hide xlm_roberta attributes Show xlm_roberta attributes object
      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • vocabulary object Required
        Hide vocabulary attribute Show vocabulary attribute object
    • Hide question_answering attributes Show question_answering attributes object
      • Specifies the number of top class predictions to return. Defaults to 0.

      • Tokenization options stored in inference configuration

        Hide tokenization attributes Show tokenization attributes object
        • bert object
          Hide bert attributes Show bert attributes object
        • bert_ja object
          Hide bert_ja attributes Show bert_ja attributes object
        • mpnet object
          Hide mpnet attributes Show mpnet attributes object
        • roberta object
          Hide roberta attributes Show roberta attributes object
        • Hide xlm_roberta attributes Show xlm_roberta attributes object
      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • The maximum answer length to consider

  • input object
    Hide input attribute Show input attribute object
  • metadata object

    An object map that contains metadata about the model.

  • Values are tree_ensemble, lang_ident, or pytorch.

  • The estimated memory usage in bytes to keep the trained model in memory. This property is supported only if defer_definition_decompression is true or the model definition is not supplied.

  • The platform architecture (if applicable) of the trained mode. If the model only works on one platform, because it is heavily optimized for a particular processor architecture and OS combination, then this field specifies which. The format of the string must match the platform identifiers used by Elasticsearch, so one of, linux-x86_64, linux-aarch64, darwin-x86_64, darwin-aarch64, or windows-x86_64. For portable models (those that work independent of processor architecture or OS features), leave this field unset.

  • tags array[string]

    An array of tags to organize the model.

  • Hide prefix_strings attributes Show prefix_strings attributes object
    • ingest string

      String prepended to input at ingest

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • model_id string Required
    • Values are tree_ensemble, lang_ident, or pytorch.

    • tags array[string] Required

      A comma delimited string of tags. A trained model can have many tags, or none.

    • version string
    • Information on the creator of the trained model.

    • create_time string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • Any field map described in the inference configuration takes precedence.

      Hide default_field_map attribute Show default_field_map attribute object
      • * string Additional properties
    • The free-text description of the trained model.

    • The estimated heap usage in bytes to keep the trained model in memory.

    • The estimated number of operations to use the trained model.

    • True if the full model definition is present.

    • Inference configuration provided when storing the model config

      Hide inference_config attributes Show inference_config attributes object
      • Hide regression attributes Show regression attributes object
      • Hide classification attributes Show classification attributes object
        • Specifies the number of top class predictions to return. Defaults to 0.

        • Specifies the maximum number of feature importance values per document.

        • Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided 1.0 is transformed to true and 0.0 to false.

        • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • Specifies the field to which the top classes are written. Defaults to top_classes.

      • Hide text_classification attributes Show text_classification attributes object
        • Specifies the number of top class predictions to return. Defaults to 0.

        • Tokenization options stored in inference configuration

          Hide tokenization attributes Show tokenization attributes object
          • bert object
            Hide bert attributes Show bert attributes object
          • bert_ja object
            Hide bert_ja attributes Show bert_ja attributes object
          • mpnet object
            Hide mpnet attributes Show mpnet attributes object
          • roberta object
            Hide roberta attributes Show roberta attributes object
          • Hide xlm_roberta attributes Show xlm_roberta attributes object
        • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • Classification labels to apply other than the stored labels. Must have the same deminsions as the default configured labels

        • Hide vocabulary attribute Show vocabulary attribute object
      • Hide zero_shot_classification attributes Show zero_shot_classification attributes object
        • Tokenization options stored in inference configuration

          Hide tokenization attributes Show tokenization attributes object
          • bert object
            Hide bert attributes Show bert attributes object
          • bert_ja object
            Hide bert_ja attributes Show bert_ja attributes object
          • mpnet object
            Hide mpnet attributes Show mpnet attributes object
          • roberta object
            Hide roberta attributes Show roberta attributes object
          • Hide xlm_roberta attributes Show xlm_roberta attributes object
        • Hypothesis template used when tokenizing labels for prediction

        • classification_labels array[string] Required

          The zero shot classification labels indicating entailment, neutral, and contradiction Must contain exactly and only entailment, neutral, and contradiction

        • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • Indicates if more than one true label exists.

        • labels array[string]

          The labels to predict.

      • Hide fill_mask attributes Show fill_mask attributes object
        • The string/token which will be removed from incoming documents and replaced with the inference prediction(s). In a response, this field contains the mask token for the specified model/tokenizer. Each model and tokenizer has a predefined mask token which cannot be changed. Thus, it is recommended not to set this value in requests. However, if this field is present in a request, its value must match the predefined value for that model/tokenizer, otherwise the request will fail.

        • Specifies the number of top class predictions to return. Defaults to 0.

        • Tokenization options stored in inference configuration

          Hide tokenization attributes Show tokenization attributes object
          • bert object
            Hide bert attributes Show bert attributes object
          • bert_ja object
            Hide bert_ja attributes Show bert_ja attributes object
          • mpnet object
            Hide mpnet attributes Show mpnet attributes object
          • roberta object
            Hide roberta attributes Show roberta attributes object
          • Hide xlm_roberta attributes Show xlm_roberta attributes object
        • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • vocabulary object Required
          Hide vocabulary attribute Show vocabulary attribute object
      • Hide learning_to_rank attributes Show learning_to_rank attributes object
      • ner object
        Hide ner attributes Show ner attributes object
        • Tokenization options stored in inference configuration

          Hide tokenization attributes Show tokenization attributes object
          • bert object
            Hide bert attributes Show bert attributes object
          • bert_ja object
            Hide bert_ja attributes Show bert_ja attributes object
          • mpnet object
            Hide mpnet attributes Show mpnet attributes object
          • roberta object
            Hide roberta attributes Show roberta attributes object
          • Hide xlm_roberta attributes Show xlm_roberta attributes object
        • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • The token classification labels. Must be IOB formatted tags

        • Hide vocabulary attribute Show vocabulary attribute object
      • Hide pass_through attributes Show pass_through attributes object
        • Tokenization options stored in inference configuration

          Hide tokenization attributes Show tokenization attributes object
          • bert object
            Hide bert attributes Show bert attributes object
          • bert_ja object
            Hide bert_ja attributes Show bert_ja attributes object
          • mpnet object
            Hide mpnet attributes Show mpnet attributes object
          • roberta object
            Hide roberta attributes Show roberta attributes object
          • Hide xlm_roberta attributes Show xlm_roberta attributes object
        • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • Hide vocabulary attribute Show vocabulary attribute object
      • Hide text_embedding attributes Show text_embedding attributes object
        • The number of dimensions in the embedding output

        • Tokenization options stored in inference configuration

          Hide tokenization attributes Show tokenization attributes object
          • bert object
            Hide bert attributes Show bert attributes object
          • bert_ja object
            Hide bert_ja attributes Show bert_ja attributes object
          • mpnet object
            Hide mpnet attributes Show mpnet attributes object
          • roberta object
            Hide roberta attributes Show roberta attributes object
          • Hide xlm_roberta attributes Show xlm_roberta attributes object
        • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • vocabulary object Required
          Hide vocabulary attribute Show vocabulary attribute object
      • Hide text_expansion attributes Show text_expansion attributes object
        • Tokenization options stored in inference configuration

          Hide tokenization attributes Show tokenization attributes object
          • bert object
            Hide bert attributes Show bert attributes object
          • bert_ja object
            Hide bert_ja attributes Show bert_ja attributes object
          • mpnet object
            Hide mpnet attributes Show mpnet attributes object
          • roberta object
            Hide roberta attributes Show roberta attributes object
          • Hide xlm_roberta attributes Show xlm_roberta attributes object
        • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • vocabulary object Required
          Hide vocabulary attribute Show vocabulary attribute object
      • Hide question_answering attributes Show question_answering attributes object
        • Specifies the number of top class predictions to return. Defaults to 0.

        • Tokenization options stored in inference configuration

          Hide tokenization attributes Show tokenization attributes object
          • bert object
            Hide bert attributes Show bert attributes object
          • bert_ja object
            Hide bert_ja attributes Show bert_ja attributes object
          • mpnet object
            Hide mpnet attributes Show mpnet attributes object
          • roberta object
            Hide roberta attributes Show roberta attributes object
          • Hide xlm_roberta attributes Show xlm_roberta attributes object
        • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

        • The maximum answer length to consider

    • input object Required
      Hide input attribute Show input attribute object
      • field_names array[string] Required

        An array of input field names for the model.

    • The license level of the trained model.

    • metadata object
      Hide metadata attributes Show metadata attributes object
      • model_aliases array[string]
      • An object that contains the baseline for feature importance values. For regression analysis, it is a single value. For classification analysis, there is a value for each class.

        Hide feature_importance_baseline attribute Show feature_importance_baseline attribute object
        • * string Additional properties
      • hyperparameters array[object]

        List of the available hyperparameters optimized during the fine_parameter_tuning phase as well as specified by the user.

        Hide hyperparameters attributes Show hyperparameters attributes object
        • A positive number showing how much the parameter influences the variation of the loss function. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.

        • name string Required
        • A number between 0 and 1 showing the proportion of influence on the variation of the loss function among all tuned hyperparameters. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.

        • supplied boolean Required

          Indicates if the hyperparameter is specified by the user (true) or optimized (false).

        • value number Required

          The value of the hyperparameter, either optimized or specified by the user.

      • An array of the total feature importance for each feature used from the training data set. This array of objects is returned if data frame analytics trained the model and the request includes total_feature_importance in the include request parameter.

        Hide total_feature_importance attributes Show total_feature_importance attributes object
        • feature_name string Required
        • importance array[object] Required

          A collection of feature importance statistics related to the training data set for this particular feature.

          Hide importance attributes Show importance attributes object
          • mean_magnitude number Required

            The average magnitude of this feature across all the training data. This value is the average of the absolute values of the importance for this feature.

          • max number Required

            The maximum importance value across all the training data for this feature.

          • min number Required

            The minimum importance value across all the training data for this feature.

        • classes array[object] Required

          If the trained model is a classification model, feature importance statistics are gathered per target class value.

          Hide classes attributes Show classes attributes object
          • class_name string Required
          • importance array[object] Required

            A collection of feature importance statistics related to the training data set for this particular feature.

    • Hide model_package attributes Show model_package attributes object
    • location object
      Hide location attribute Show location attribute object
      • index object Required
        Hide index attribute Show index attribute object
    • Hide prefix_strings attributes Show prefix_strings attributes object
      • ingest string

        String prepended to input at ingest

PUT /_ml/trained_models/{model_id}
curl \
 --request PUT 'http://api.example.com/_ml/trained_models/{model_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"compressed_definition":"string","definition":{"preprocessors":[{"frequency_encoding":{"field":"string","feature_name":"string","frequency_map":{"additionalProperty1":42.0,"additionalProperty2":42.0}},"one_hot_encoding":{"field":"string","hot_map":{"additionalProperty1":"string","additionalProperty2":"string"}},"target_mean_encoding":{"field":"string","feature_name":"string","target_map":{"additionalProperty1":42.0,"additionalProperty2":42.0},"default_value":42.0}}],"trained_model":{"tree":{"classification_labels":["string"],"feature_names":["string"],"target_type":"string","tree_structure":[{"decision_type":"string","default_left":true,"leaf_value":42.0,"left_child":42.0,"node_index":42.0,"right_child":42.0,"split_feature":42.0,"split_gain":42.0,"threshold":42.0}]},"tree_node":{"decision_type":"string","default_left":true,"leaf_value":42.0,"left_child":42.0,"node_index":42.0,"right_child":42.0,"split_feature":42.0,"split_gain":42.0,"threshold":42.0},"ensemble":{"aggregate_output":{"logistic_regression":{"weights":42.0},"weighted_sum":{"weights":42.0},"weighted_mode":{"weights":42.0},"exponent":{"weights":42.0}},"classification_labels":["string"],"feature_names":["string"],"target_type":"string","trained_models":[{}]}}},"description":"string","inference_config":{"regression":{"results_field":"string","num_top_feature_importance_values":42.0},"classification":{"num_top_classes":42.0,"num_top_feature_importance_values":42.0,"prediction_field_type":"string","results_field":"string","top_classes_results_field":"string"},"text_classification":{"num_top_classes":42.0,"tokenization":{"bert":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"bert_ja":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"mpnet":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true,"add_prefix_space":true},"xlm_roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true}},"results_field":"string","classification_labels":["string"],"vocabulary":{"index":"string"}},"zero_shot_classification":{"tokenization":{"bert":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"bert_ja":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"mpnet":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true,"add_prefix_space":true},"xlm_roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true}},"hypothesis_template":"string","classification_labels":["string"],"results_field":"string","multi_label":true,"labels":["string"]},"fill_mask":{"mask_token":"string","num_top_classes":42.0,"tokenization":{"bert":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"bert_ja":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"mpnet":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true,"add_prefix_space":true},"xlm_roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true}},"results_field":"string","vocabulary":{"index":"string"}},"learning_to_rank":{"default_params":{"additionalProperty1":{},"additionalProperty2":{}},"feature_extractors":[{}],"num_top_feature_importance_values":42.0},"ner":{"tokenization":{"bert":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"bert_ja":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"mpnet":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true,"add_prefix_space":true},"xlm_roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true}},"results_field":"string","classification_labels":["string"],"vocabulary":{"index":"string"}},"pass_through":{"tokenization":{"bert":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"bert_ja":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"mpnet":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true,"add_prefix_space":true},"xlm_roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true}},"results_field":"string","vocabulary":{"index":"string"}},"text_embedding":{"embedding_size":42.0,"tokenization":{"bert":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"bert_ja":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"mpnet":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true,"add_prefix_space":true},"xlm_roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true}},"results_field":"string","vocabulary":{"index":"string"}},"text_expansion":{"tokenization":{"bert":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"bert_ja":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"mpnet":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true,"add_prefix_space":true},"xlm_roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true}},"results_field":"string","vocabulary":{"index":"string"}},"question_answering":{"num_top_classes":42.0,"tokenization":{"bert":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"bert_ja":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"mpnet":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true},"roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true,"add_prefix_space":true},"xlm_roberta":{"do_lower_case":true,"max_sequence_length":42.0,"span":42.0,"truncate":"first","with_special_tokens":true}},"results_field":"string","max_answer_length":42.0}},"input":{"field_names":"string"},"metadata":{},"model_type":"tree_ensemble","model_size_bytes":42.0,"platform_architecture":"string","tags":["string"],"prefix_strings":{"ingest":"string","search":"string"}}'

Delete an unreferenced trained model Added in 7.10.0

DELETE /_ml/trained_models/{model_id}

The request deletes a trained inference model that is not referenced by an ingest pipeline.

Path parameters

  • model_id string Required

    The unique identifier of the trained model.

Query parameters

  • force boolean

    Forcefully deletes a trained model that is referenced by ingest pipelines or has a started deployment.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/trained_models/{model_id}
curl \
 --request DELETE 'http://api.example.com/_ml/trained_models/{model_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting an existing trained inference model.
{
  "acknowledged": true
}

Create or update a trained model alias Added in 7.13.0

PUT /_ml/trained_models/{model_id}/model_aliases/{model_alias}

A trained model alias is a logical name used to reference a single trained model. You can use aliases instead of trained model identifiers to make it easier to reference your models. For example, you can use aliases in inference aggregations and processors. An alias must be unique and refer to only a single trained model. However, you can have multiple aliases for each trained model. If you use this API to update an alias such that it references a different trained model ID and the model uses a different type of data frame analytics, an error occurs. For example, this situation occurs if you have a trained model for regression analysis and a trained model for classification analysis; you cannot reassign an alias from one type of trained model to another. If you use this API to update an alias and there are very few input fields in common between the old and new trained models for the model alias, the API returns a warning.

Path parameters

  • model_id string Required

    The identifier for the trained model that the alias refers to.

  • model_alias string Required

    The alias to create or update. This value cannot end in numbers.

Query parameters

  • reassign boolean

    Specifies whether the alias gets reassigned to the specified trained model if it is already assigned to a different model. If the alias is already assigned and this parameter is false, the API returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_ml/trained_models/{model_id}/model_aliases/{model_alias}
curl \
 --request PUT 'http://api.example.com/_ml/trained_models/{model_id}/model_aliases/{model_alias}' \
 --header "Authorization: $API_KEY"

Delete a trained model alias Added in 7.13.0

DELETE /_ml/trained_models/{model_id}/model_aliases/{model_alias}

This API deletes an existing model alias that refers to a trained model. If the model alias is missing or refers to a model other than the one identified by the model_id, this API returns an error.

Path parameters

  • model_id string Required

    The trained model ID to which the model alias refers.

  • model_alias string Required

    The model alias to delete.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/trained_models/{model_id}/model_aliases/{model_alias}
curl \
 --request DELETE 'http://api.example.com/_ml/trained_models/{model_id}/model_aliases/{model_alias}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting a trained model alias.
{
  "acknowledged": true
}

Get trained model configuration info Added in 7.10.0

GET /_ml/trained_models

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no models that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, it returns an empty array when there are no matches and the subset of results when there are partial matches.

  • Specifies whether the included model definition should be returned as a JSON map (true) or in a custom compressed format (false).

  • Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.

  • from number

    Skips the specified number of models.

  • include string

    A comma delimited string of optional fields to include in the response body.

    Supported values include:

    • definition: Includes the model definition.
    • feature_importance_baseline: Includes the baseline for feature importance values.
    • hyperparameters: Includes the information about hyperparameters used to train the model. This information consists of the value, the absolute and relative importance of the hyperparameter as well as an indicator of whether it was specified by the user or tuned during hyperparameter optimization.
    • total_feature_importance: Includes the total feature importance for the training data set. The baseline and total feature importance values are returned in the metadata field in the response body.
    • definition_status: Includes the model definition status.

    Values are definition, feature_importance_baseline, hyperparameters, total_feature_importance, or definition_status.

  • size number

    Specifies the maximum number of models to obtain.

  • tags string | array[string]

    A comma delimited string of tags. A trained model can have many tags, or none. When supplied, only trained models that contain all the supplied tags are returned.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • trained_model_configs array[object] Required

      An array of trained model resources, which are sorted by the model_id value in ascending order.

      Hide trained_model_configs attributes Show trained_model_configs attributes object
      • model_id string Required
      • Values are tree_ensemble, lang_ident, or pytorch.

      • tags array[string] Required

        A comma delimited string of tags. A trained model can have many tags, or none.

      • version string
      • Information on the creator of the trained model.

      • create_time string | number

        A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

      • Any field map described in the inference configuration takes precedence.

        Hide default_field_map attribute Show default_field_map attribute object
        • * string Additional properties
      • The free-text description of the trained model.

      • The estimated heap usage in bytes to keep the trained model in memory.

      • The estimated number of operations to use the trained model.

      • True if the full model definition is present.

      • Inference configuration provided when storing the model config

        Hide inference_config attributes Show inference_config attributes object
        • Hide regression attributes Show regression attributes object
        • Hide classification attributes Show classification attributes object
          • Specifies the number of top class predictions to return. Defaults to 0.

          • Specifies the maximum number of feature importance values per document.

          • Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided 1.0 is transformed to true and 0.0 to false.

          • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • Specifies the field to which the top classes are written. Defaults to top_classes.

        • Hide text_classification attributes Show text_classification attributes object
          • Specifies the number of top class predictions to return. Defaults to 0.

          • Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
          • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • Classification labels to apply other than the stored labels. Must have the same deminsions as the default configured labels

          • Hide vocabulary attribute Show vocabulary attribute object
        • Hide zero_shot_classification attributes Show zero_shot_classification attributes object
          • Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
          • Hypothesis template used when tokenizing labels for prediction

          • classification_labels array[string] Required

            The zero shot classification labels indicating entailment, neutral, and contradiction Must contain exactly and only entailment, neutral, and contradiction

          • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • Indicates if more than one true label exists.

          • labels array[string]

            The labels to predict.

        • Hide fill_mask attributes Show fill_mask attributes object
          • The string/token which will be removed from incoming documents and replaced with the inference prediction(s). In a response, this field contains the mask token for the specified model/tokenizer. Each model and tokenizer has a predefined mask token which cannot be changed. Thus, it is recommended not to set this value in requests. However, if this field is present in a request, its value must match the predefined value for that model/tokenizer, otherwise the request will fail.

          • Specifies the number of top class predictions to return. Defaults to 0.

          • Tokenization options stored in inference configuration

            Hide tokenization attributes Show tokenization attributes object
          • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

          • vocabulary object Required
            Hide vocabulary attribute Show vocabulary attribute object
        • Hide learning_to_rank attributes Show learning_to_rank attributes object
        • ner object
          Hide ner attributes Show ner attributes object
        • Hide pass_through attributes Show pass_through attributes object
        • Hide text_embedding attributes Show text_embedding attributes object
        • Hide text_expansion attributes Show text_expansion attributes object
        • Hide question_answering attributes Show question_answering attributes object
      • input object Required
        Hide input attribute Show input attribute object
        • field_names array[string] Required

          An array of input field names for the model.

      • The license level of the trained model.

      • metadata object
        Hide metadata attributes Show metadata attributes object
        • model_aliases array[string]
        • An object that contains the baseline for feature importance values. For regression analysis, it is a single value. For classification analysis, there is a value for each class.

          Hide feature_importance_baseline attribute Show feature_importance_baseline attribute object
          • * string Additional properties
        • hyperparameters array[object]

          List of the available hyperparameters optimized during the fine_parameter_tuning phase as well as specified by the user.

          Hide hyperparameters attributes Show hyperparameters attributes object
          • A positive number showing how much the parameter influences the variation of the loss function. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.

          • name string Required
          • A number between 0 and 1 showing the proportion of influence on the variation of the loss function among all tuned hyperparameters. For hyperparameters with values that are not specified by the user but tuned during hyperparameter optimization.

          • supplied boolean Required

            Indicates if the hyperparameter is specified by the user (true) or optimized (false).

          • value number Required

            The value of the hyperparameter, either optimized or specified by the user.

        • An array of the total feature importance for each feature used from the training data set. This array of objects is returned if data frame analytics trained the model and the request includes total_feature_importance in the include request parameter.

          Hide total_feature_importance attributes Show total_feature_importance attributes object
          • feature_name string Required
          • importance array[object] Required

            A collection of feature importance statistics related to the training data set for this particular feature.

          • classes array[object] Required

            If the trained model is a classification model, feature importance statistics are gathered per target class value.

      • Hide model_package attributes Show model_package attributes object
      • location object
        Hide location attribute Show location attribute object
        • index object Required
          Hide index attribute Show index attribute object
      • Hide prefix_strings attributes Show prefix_strings attributes object
        • ingest string

          String prepended to input at ingest

GET /_ml/trained_models
curl \
 --request GET 'http://api.example.com/_ml/trained_models' \
 --header "Authorization: $API_KEY"

Get trained models usage info Added in 7.10.0

GET /_ml/trained_models/{model_id}/_stats

You can get usage information for multiple trained models in a single API request by using a comma-separated list of model IDs or a wildcard expression.

Path parameters

  • model_id string | array[string] Required

    The unique identifier of the trained model or a model alias. It can be a comma-separated list or a wildcard expression.

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no models that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, it returns an empty array when there are no matches and the subset of results when there are partial matches.

  • from number

    Skips the specified number of models.

  • size number

    Specifies the maximum number of models to obtain.

Responses

GET /_ml/trained_models/{model_id}/_stats
curl \
 --request GET 'http://api.example.com/_ml/trained_models/{model_id}/_stats' \
 --header "Authorization: $API_KEY"

Get trained models usage info Added in 7.10.0

GET /_ml/trained_models/_stats

You can get usage information for multiple trained models in a single API request by using a comma-separated list of model IDs or a wildcard expression.

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no models that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, it returns an empty array when there are no matches and the subset of results when there are partial matches.

  • from number

    Skips the specified number of models.

  • size number

    Specifies the maximum number of models to obtain.

Responses

GET /_ml/trained_models/_stats
curl \
 --request GET 'http://api.example.com/_ml/trained_models/_stats' \
 --header "Authorization: $API_KEY"

Evaluate a trained model Added in 8.3.0

POST /_ml/trained_models/{model_id}/_infer

Path parameters

  • model_id string Required

    The unique identifier of the trained model.

Query parameters

  • timeout string

    Controls the amount of time to wait for inference results.

application/json

Body Required

  • docs array[object] Required

    An array of objects to pass to the model for inference. The objects should contain a fields matching your configured trained model input. Typically, for NLP models, the field name is text_field. Currently, for NLP models, only a single value is allowed.

    Hide docs attribute Show docs attribute object
    • * object Additional properties
  • Hide inference_config attributes Show inference_config attributes object
    • Hide regression attributes Show regression attributes object
    • Hide classification attributes Show classification attributes object
      • Specifies the number of top class predictions to return. Defaults to 0.

      • Specifies the maximum number of feature importance values per document.

      • Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided 1.0 is transformed to true and 0.0 to false.

      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • Specifies the field to which the top classes are written. Defaults to top_classes.

    • Hide text_classification attributes Show text_classification attributes object
      • Specifies the number of top class predictions to return. Defaults to 0.

      • Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • Classification labels to apply other than the stored labels. Must have the same deminsions as the default configured labels

    • Hide zero_shot_classification attributes Show zero_shot_classification attributes object
      • Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • Update the configured multi label option. Indicates if more than one true label exists. Defaults to the configured value.

      • labels array[string] Required

        The labels to predict.

    • Hide fill_mask attributes Show fill_mask attributes object
      • Specifies the number of top class predictions to return. Defaults to 0.

      • Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

    • ner object
      Hide ner attributes Show ner attributes object
      • Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

    • Hide pass_through attributes Show pass_through attributes object
      • Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

    • Hide text_embedding attributes Show text_embedding attributes object
      • Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

    • Hide text_expansion attributes Show text_expansion attributes object
      • Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

    • Hide question_answering attributes Show question_answering attributes object
      • question string Required

        The question to answer given the inference context

      • Specifies the number of top class predictions to return. Defaults to 0.

      • Hide tokenization attributes Show tokenization attributes object
        • truncate string

          Values are first, second, or none.

        • span number

          Span options to apply

      • The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.

      • The maximum answer length to consider for extraction

Responses

POST /_ml/trained_models/{model_id}/_infer
curl \
 --request POST 'http://api.example.com/_ml/trained_models/{model_id}/_infer' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"docs":[{"additionalProperty1":{},"additionalProperty2":{}}],"inference_config":{"regression":{"results_field":"string","num_top_feature_importance_values":42.0},"classification":{"num_top_classes":42.0,"num_top_feature_importance_values":42.0,"prediction_field_type":"string","results_field":"string","top_classes_results_field":"string"},"text_classification":{"num_top_classes":42.0,"tokenization":{"truncate":"first","span":42.0},"results_field":"string","classification_labels":["string"]},"zero_shot_classification":{"tokenization":{"truncate":"first","span":42.0},"results_field":"string","multi_label":true,"labels":["string"]},"fill_mask":{"num_top_classes":42.0,"tokenization":{"truncate":"first","span":42.0},"results_field":"string"},"ner":{"tokenization":{"truncate":"first","span":42.0},"results_field":"string"},"pass_through":{"tokenization":{"truncate":"first","span":42.0},"results_field":"string"},"text_embedding":{"tokenization":{"truncate":"first","span":42.0},"results_field":"string"},"text_expansion":{"tokenization":{"truncate":"first","span":42.0},"results_field":"string"},"question_answering":{"question":"string","num_top_classes":42.0,"tokenization":{"truncate":"first","span":42.0},"results_field":"string","max_answer_length":42.0}}}'

Create part of a trained model definition Added in 8.0.0

PUT /_ml/trained_models/{model_id}/definition/{part}

Path parameters

  • model_id string Required

    The unique identifier of the trained model.

  • part number Required

    The definition part number. When the definition is loaded for inference the definition parts are streamed in the order of their part number. The first part must be 0 and the final part must be total_parts - 1.

application/json

Body Required

  • definition string Required

    The definition part for the model. Must be a base64 encoded string.

  • The total uncompressed definition length in bytes. Not base64 encoded.

  • total_parts number Required

    The total number of parts that will be uploaded. Must be greater than 0.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_ml/trained_models/{model_id}/definition/{part}
curl \
 --request PUT 'http://api.example.com/_ml/trained_models/{model_id}/definition/{part}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"definition":"string","total_definition_length":42.0,"total_parts":42.0}'

Create a trained model vocabulary Added in 8.0.0

PUT /_ml/trained_models/{model_id}/vocabulary

This API is supported only for natural language processing (NLP) models. The vocabulary is stored in the index as described in inference_config.*.vocabulary of the trained model definition.

Path parameters

  • model_id string Required

    The unique identifier of the trained model.

application/json

Body Required

  • vocabulary array[string] Required

    The model vocabulary, which must not be empty.

  • merges array[string]

    The optional model merges if required by the tokenizer.

  • scores array[number]

    The optional vocabulary value scores if required by the tokenizer.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_ml/trained_models/{model_id}/vocabulary
curl \
 --request PUT 'http://api.example.com/_ml/trained_models/{model_id}/vocabulary' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"vocabulary":["string"],"merges":["string"],"scores":[42.0]}'

Start a trained model deployment Added in 8.0.0

POST /_ml/trained_models/{model_id}/deployment/_start

It allocates the model to every machine learning node.

Path parameters

  • model_id string Required

    The unique identifier of the trained model. Currently, only PyTorch models are supported.

Query parameters

  • cache_size number | string

    The inference cache size (in memory outside the JVM heap) per node for the model. The default value is the same size as the model_size_bytes. To disable the cache, 0b can be provided.

  • The number of model allocations on each node where the model is deployed. All allocations on a node share the same copy of the model in memory but use a separate set of threads to evaluate the model. Increasing this value generally increases the throughput. If this setting is greater than the number of hardware threads it will automatically be changed to a value less than the number of hardware threads. If adaptive_allocations is enabled, do not set this value, because it’s automatically set.

  • priority string

    The deployment priority.

    Values are normal or low.

  • Specifies the number of inference requests that are allowed in the queue. After the number of requests exceeds this value, new requests are rejected with a 429 error.

  • Sets the number of threads used by each model allocation during inference. This generally increases the inference speed. The inference process is a compute-bound process; any number greater than the number of available hardware threads on the machine does not increase the inference speed. If this setting is greater than the number of hardware threads it will automatically be changed to a value less than the number of hardware threads.

  • timeout string

    Specifies the amount of time to wait for the model to deploy.

  • wait_for string

    Specifies the allocation status to wait for before returning.

    Supported values include:

    • started: The trained model is started on at least one node.
    • starting: Trained model deployment is starting but it is not yet deployed on any nodes.
    • fully_allocated: Trained model deployment has started on all valid nodes.

    Values are started, starting, or fully_allocated.

application/json

Body

  • Hide adaptive_allocations attributes Show adaptive_allocations attributes object
    • enabled boolean Required

      If true, adaptive_allocations is enabled

    • Specifies the minimum number of allocations to scale to. If set, it must be greater than or equal to 0. If not defined, the deployment scales to 0.

    • Specifies the maximum number of allocations to scale to. If set, it must be greater than or equal to min_number_of_allocations.

Responses

POST /_ml/trained_models/{model_id}/deployment/_start
curl \
 --request POST 'http://api.example.com/_ml/trained_models/{model_id}/deployment/_start' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"adaptive_allocations":{"enabled":true,"min_number_of_allocations":42.0,"max_number_of_allocations":42.0}}'

Stop a trained model deployment Added in 8.0.0

POST /_ml/trained_models/{model_id}/deployment/_stop

Path parameters

  • model_id string Required

    The unique identifier of the trained model.

Query parameters

  • Specifies what to do when the request: contains wildcard expressions and there are no deployments that match; contains the _all string or no identifiers and there are no matches; or contains wildcard expressions and there are only partial matches. By default, it returns an empty array when there are no matches and the subset of results when there are partial matches. If false, the request returns a 404 status code when there are no matches or only partial matches.

  • force boolean

    Forcefully stops the deployment, even if it is used by ingest pipelines. You can't use these pipelines until you restart the model deployment.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
POST /_ml/trained_models/{model_id}/deployment/_stop
curl \
 --request POST 'http://api.example.com/_ml/trained_models/{model_id}/deployment/_stop' \
 --header "Authorization: $API_KEY"

Update a trained model deployment Added in 8.6.0

POST /_ml/trained_models/{model_id}/deployment/_update

Path parameters

  • model_id string Required

    The unique identifier of the trained model. Currently, only PyTorch models are supported.

Query parameters

  • The number of model allocations on each node where the model is deployed. All allocations on a node share the same copy of the model in memory but use a separate set of threads to evaluate the model. Increasing this value generally increases the throughput. If this setting is greater than the number of hardware threads it will automatically be changed to a value less than the number of hardware threads.

application/json

Body

  • The number of model allocations on each node where the model is deployed. All allocations on a node share the same copy of the model in memory but use a separate set of threads to evaluate the model. Increasing this value generally increases the throughput. If this setting is greater than the number of hardware threads it will automatically be changed to a value less than the number of hardware threads. If adaptive_allocations is enabled, do not set this value, because it’s automatically set.

  • Hide adaptive_allocations attributes Show adaptive_allocations attributes object
    • enabled boolean Required

      If true, adaptive_allocations is enabled

    • Specifies the minimum number of allocations to scale to. If set, it must be greater than or equal to 0. If not defined, the deployment scales to 0.

    • Specifies the maximum number of allocations to scale to. If set, it must be greater than or equal to min_number_of_allocations.

Responses

POST /_ml/trained_models/{model_id}/deployment/_update
curl \
 --request POST 'http://api.example.com/_ml/trained_models/{model_id}/deployment/_update' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"number_of_allocations":42.0,"adaptive_allocations":{"enabled":true,"min_number_of_allocations":42.0,"max_number_of_allocations":42.0}}'

Query rules

Query rules enable you to configure per-query rules that are applied at query time to queries that match the specific rule. Query rules are organized into rulesets, collections of query rules that are matched against incoming queries. Query rules are applied using the rule query. If a query matches one or more rules in the ruleset, the query is re-written to apply the rules before searching. This allows pinning documents for only queries that match a specific term.

Learn more about the rule query

Get a query rule Added in 8.15.0

GET /_query_rules/{ruleset_id}/_rule/{rule_id}

Get details about a query rule within a query ruleset.

External documentation

Path parameters

  • ruleset_id string Required

    The unique identifier of the query ruleset containing the rule to retrieve

  • rule_id string Required

    The unique identifier of the query rule within the specified ruleset to retrieve

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • rule_id string Required
    • type string Required

      Values are pinned or exclude.

    • criteria object | array[object] Required

      The criteria that must be met for the rule to be applied. If multiple criteria are specified for a rule, all criteria must be met for the rule to be applied.

      One of:
      Hide attributes Show attributes
      • type string Required

        Values are global, exact, exact_fuzzy, fuzzy, prefix, suffix, contains, lt, lte, gt, gte, or always.

      • metadata string

        The metadata field to match against. This metadata will be used to match against match_criteria sent in the rule. It is required for all criteria types except always.

      • values array[object]

        The values to match against the metadata field. Only one value must match for the criteria to be met. It is required for all criteria types except always.

    • actions object Required
      Hide actions attributes Show actions attributes object
      • ids array[string]

        The unique document IDs of the documents to apply the rule to. Only one of ids or docs may be specified and at least one must be specified.

      • docs array[object]

        The documents to apply the rule to. Only one of ids or docs may be specified and at least one must be specified. There is a maximum value of 100 documents in a rule. You can specify the following attributes for each document:

        • _index: The index of the document to pin.
        • _id: The unique document ID.
        Hide docs attributes Show docs attributes object
    • priority number
GET /_query_rules/{ruleset_id}/_rule/{rule_id}
curl \
 --request GET 'http://api.example.com/_query_rules/{ruleset_id}/_rule/{rule_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _query_rules/my-ruleset/_rule/my-rule1`.
{
  "rule_id": "my-rule1",
  "type": "pinned",
  "criteria": [
    {
      "type": "contains",
      "metadata": "query_string",
      "values": [
        "pugs",
        "puggles"
      ]
    }
  ],
  "actions": {
    "ids": [
      "id1",
      "id2"
    ]
  }
}

Create or update a query rule Added in 8.15.0

PUT /_query_rules/{ruleset_id}/_rule/{rule_id}

Create or update a query rule within a query ruleset.

IMPORTANT: Due to limitations within pinned queries, you can only pin documents using ids or docs, but cannot use both in single rule. It is advised to use one or the other in query rulesets, to avoid errors. Additionally, pinned queries have a maximum limit of 100 pinned hits. If multiple matching rules pin more than 100 documents, only the first 100 documents are pinned in the order they are specified in the ruleset.

Path parameters

  • ruleset_id string Required

    The unique identifier of the query ruleset containing the rule to be created or updated.

  • rule_id string Required

    The unique identifier of the query rule within the specified ruleset to be created or updated.

application/json

Body Required

  • type string Required

    Values are pinned or exclude.

  • criteria object | array[object] Required

    The criteria that must be met for the rule to be applied. If multiple criteria are specified for a rule, all criteria must be met for the rule to be applied.

    One of:
    Hide attributes Show attributes
    • type string Required

      Values are global, exact, exact_fuzzy, fuzzy, prefix, suffix, contains, lt, lte, gt, gte, or always.

    • metadata string

      The metadata field to match against. This metadata will be used to match against match_criteria sent in the rule. It is required for all criteria types except always.

    • values array[object]

      The values to match against the metadata field. Only one value must match for the criteria to be met. It is required for all criteria types except always.

  • actions object Required
    Hide actions attributes Show actions attributes object
    • ids array[string]

      The unique document IDs of the documents to apply the rule to. Only one of ids or docs may be specified and at least one must be specified.

    • docs array[object]

      The documents to apply the rule to. Only one of ids or docs may be specified and at least one must be specified. There is a maximum value of 100 documents in a rule. You can specify the following attributes for each document:

      • _index: The index of the document to pin.
      • _id: The unique document ID.
      Hide docs attributes Show docs attributes object
  • priority number

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_query_rules/{ruleset_id}/_rule/{rule_id}
curl \
 --request PUT 'http://api.example.com/_query_rules/{ruleset_id}/_rule/{rule_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"match_criteria\": {\n    \"query_string\": \"puggles\"\n  }\n}"'
Request example
Run `POST _query_rules/my-ruleset/_test` to test a ruleset. Provide the match criteria that you want to test against.
{
  "match_criteria": {
    "query_string": "puggles"
  }
}

Delete a query rule Added in 8.15.0

DELETE /_query_rules/{ruleset_id}/_rule/{rule_id}

Delete a query rule within a query ruleset. This is a destructive action that is only recoverable by re-adding the same rule with the create or update query rule API.

Path parameters

  • ruleset_id string Required

    The unique identifier of the query ruleset containing the rule to delete

  • rule_id string Required

    The unique identifier of the query rule within the specified ruleset to delete

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_query_rules/{ruleset_id}/_rule/{rule_id}
curl \
 --request DELETE 'http://api.example.com/_query_rules/{ruleset_id}/_rule/{rule_id}' \
 --header "Authorization: $API_KEY"

Get a query ruleset Added in 8.10.0

GET /_query_rules/{ruleset_id}

Get details about a query ruleset.

Path parameters

  • ruleset_id string Required

    The unique identifier of the query ruleset

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • ruleset_id string Required
    • rules array[object] Required

      Rules associated with the query ruleset.

      Hide rules attributes Show rules attributes object
      • rule_id string Required
      • type string Required

        Values are pinned or exclude.

      • criteria object | array[object] Required

        The criteria that must be met for the rule to be applied. If multiple criteria are specified for a rule, all criteria must be met for the rule to be applied.

        One of:
        Hide attributes Show attributes
        • type string Required

          Values are global, exact, exact_fuzzy, fuzzy, prefix, suffix, contains, lt, lte, gt, gte, or always.

        • metadata string

          The metadata field to match against. This metadata will be used to match against match_criteria sent in the rule. It is required for all criteria types except always.

        • values array[object]

          The values to match against the metadata field. Only one value must match for the criteria to be met. It is required for all criteria types except always.

      • actions object Required
        Hide actions attributes Show actions attributes object
        • ids array[string]

          The unique document IDs of the documents to apply the rule to. Only one of ids or docs may be specified and at least one must be specified.

        • docs array[object]

          The documents to apply the rule to. Only one of ids or docs may be specified and at least one must be specified. There is a maximum value of 100 documents in a rule. You can specify the following attributes for each document:

          • _index: The index of the document to pin.
          • _id: The unique document ID.
          Hide docs attributes Show docs attributes object
      • priority number
GET /_query_rules/{ruleset_id}
curl \
 --request GET 'http://api.example.com/_query_rules/{ruleset_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _query_rules/my-ruleset/`.
{
    "ruleset_id": "my-ruleset",
    "rules": [
        {
            "rule_id": "my-rule1",
            "type": "pinned",
            "criteria": [
                {
                    "type": "contains",
                    "metadata": "query_string",
                    "values": [ "pugs", "puggles" ]
                }
            ],
            "actions": {
                "ids": [
                    "id1",
                    "id2"
                ]
            }
        },
        {
            "rule_id": "my-rule2",
            "type": "pinned",
            "criteria": [
                {
                    "type": "fuzzy",
                    "metadata": "query_string",
                    "values": [ "rescue dogs" ]
                }
            ],
            "actions": {
                "docs": [
                    {
                        "_index": "index1",
                        "_id": "id3"
                    },
                    {
                        "_index": "index2",
                        "_id": "id4"
                    }
                ]
            }
        }
    ]
}

Create or update a query ruleset Added in 8.10.0

PUT /_query_rules/{ruleset_id}

There is a limit of 100 rules per ruleset. This limit can be increased by using the xpack.applications.rules.max_rules_per_ruleset cluster setting.

IMPORTANT: Due to limitations within pinned queries, you can only select documents using ids or docs, but cannot use both in single rule. It is advised to use one or the other in query rulesets, to avoid errors. Additionally, pinned queries have a maximum limit of 100 pinned hits. If multiple matching rules pin more than 100 documents, only the first 100 documents are pinned in the order they are specified in the ruleset.

External documentation

Path parameters

  • ruleset_id string Required

    The unique identifier of the query ruleset to be created or updated.

application/json

Body Required

  • rules object | array[object]

    One of:
    Hide attributes Show attributes
    • rule_id string Required
    • type string Required

      Values are pinned or exclude.

    • criteria object | array[object] Required

      The criteria that must be met for the rule to be applied. If multiple criteria are specified for a rule, all criteria must be met for the rule to be applied.

      One of:
      Hide attributes Show attributes
      • type string Required

        Values are global, exact, exact_fuzzy, fuzzy, prefix, suffix, contains, lt, lte, gt, gte, or always.

      • metadata string

        The metadata field to match against. This metadata will be used to match against match_criteria sent in the rule. It is required for all criteria types except always.

      • values array[object]

        The values to match against the metadata field. Only one value must match for the criteria to be met. It is required for all criteria types except always.

    • actions object Required
      Hide actions attributes Show actions attributes object
      • ids array[string]

        The unique document IDs of the documents to apply the rule to. Only one of ids or docs may be specified and at least one must be specified.

      • docs array[object]

        The documents to apply the rule to. Only one of ids or docs may be specified and at least one must be specified. There is a maximum value of 100 documents in a rule. You can specify the following attributes for each document:

        • _index: The index of the document to pin.
        • _id: The unique document ID.
        Hide docs attributes Show docs attributes object
    • priority number

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_query_rules/{ruleset_id}
curl \
 --request PUT 'http://api.example.com/_query_rules/{ruleset_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"rules\": [\n        {\n            \"rule_id\": \"my-rule1\",\n            \"type\": \"pinned\",\n            \"criteria\": [\n                {\n                    \"type\": \"contains\",\n                    \"metadata\": \"user_query\",\n                    \"values\": [ \"pugs\", \"puggles\" ]\n                },\n                {\n                    \"type\": \"exact\",\n                    \"metadata\": \"user_country\",\n                    \"values\": [ \"us\" ]\n                }\n            ],\n            \"actions\": {\n                \"ids\": [\n                    \"id1\",\n                    \"id2\"\n                ]\n            }\n        },\n        {\n            \"rule_id\": \"my-rule2\",\n            \"type\": \"pinned\",\n            \"criteria\": [\n                {\n                    \"type\": \"fuzzy\",\n                    \"metadata\": \"user_query\",\n                    \"values\": [ \"rescue dogs\" ]\n                }\n            ],\n            \"actions\": {\n                \"docs\": [\n                    {\n                        \"_index\": \"index1\",\n                        \"_id\": \"id3\"\n                    },\n                    {\n                        \"_index\": \"index2\",\n                        \"_id\": \"id4\"\n                    }\n                ]\n            }\n        }\n    ]\n}"'
Request example
Run `PUT _query_rules/my-ruleset` to create a new query ruleset. Two rules are associated with `my-ruleset`. `my-rule1` will pin documents with IDs `id1` and `id2` when `user_query` contains `pugs` or `puggles` and `user_country` exactly matches `us`. `my-rule2` will exclude documents from different specified indices with IDs `id3` and `id4` when the `query_string` fuzzily matches `rescue dogs`.
{
    "rules": [
        {
            "rule_id": "my-rule1",
            "type": "pinned",
            "criteria": [
                {
                    "type": "contains",
                    "metadata": "user_query",
                    "values": [ "pugs", "puggles" ]
                },
                {
                    "type": "exact",
                    "metadata": "user_country",
                    "values": [ "us" ]
                }
            ],
            "actions": {
                "ids": [
                    "id1",
                    "id2"
                ]
            }
        },
        {
            "rule_id": "my-rule2",
            "type": "pinned",
            "criteria": [
                {
                    "type": "fuzzy",
                    "metadata": "user_query",
                    "values": [ "rescue dogs" ]
                }
            ],
            "actions": {
                "docs": [
                    {
                        "_index": "index1",
                        "_id": "id3"
                    },
                    {
                        "_index": "index2",
                        "_id": "id4"
                    }
                ]
            }
        }
    ]
}

Delete a query ruleset Added in 8.10.0

DELETE /_query_rules/{ruleset_id}

Remove a query ruleset and its associated data. This is a destructive action that is not recoverable.

Path parameters

  • ruleset_id string Required

    The unique identifier of the query ruleset to delete

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_query_rules/{ruleset_id}
curl \
 --request DELETE 'http://api.example.com/_query_rules/{ruleset_id}' \
 --header "Authorization: $API_KEY"

Get all query rulesets Added in 8.10.0

GET /_query_rules

Get summarized information about the query rulesets.

Query parameters

  • from number

    The offset from the first result to fetch.

  • size number

    The maximum number of results to retrieve.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • results array[object] Required
      Hide results attributes Show results attributes object
      • ruleset_id string Required
      • rule_total_count number Required

        The number of rules associated with the ruleset.

      • A map of criteria type (for example, exact) to the number of rules of that type.

        NOTE: The counts in rule_criteria_types_counts may be larger than the value of rule_total_count because a rule may have multiple criteria.

        Hide rule_criteria_types_counts attribute Show rule_criteria_types_counts attribute object
        • * number Additional properties
      • rule_type_counts object Required

        A map of rule type (for example, pinned) to the number of rules of that type.

        Hide rule_type_counts attribute Show rule_type_counts attribute object
        • * number Additional properties
GET /_query_rules
curl \
 --request GET 'http://api.example.com/_query_rules' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _query_rules/?from=0&size=3`.
{
    "count": 3,
    "results": [
        {
            "ruleset_id": "ruleset-1",
            "rule_total_count": 1,
            "rule_criteria_types_counts": {
                "exact": 1
            }
        },
        {
            "ruleset_id": "ruleset-2",
            "rule_total_count": 2,
            "rule_criteria_types_counts": {
                "exact": 1,
                "fuzzy": 1
            }
        },
        {
            "ruleset_id": "ruleset-3",
            "rule_total_count": 3,
            "rule_criteria_types_counts": {
                "exact": 1,
                "fuzzy": 2
            }
        }
    ]
}

Test a query ruleset Added in 8.10.0

POST /_query_rules/{ruleset_id}/_test

Evaluate match criteria against a query ruleset to identify the rules that would match that criteria.

Path parameters

  • ruleset_id string Required

    The unique identifier of the query ruleset to be created or updated

application/json

Body Required

  • match_criteria object Required

    The match criteria to apply to rules in the given query ruleset. Match criteria should match the keys defined in the criteria.metadata field of the rule.

    Hide match_criteria attribute Show match_criteria attribute object
    • * object Additional properties

Responses

POST /_query_rules/{ruleset_id}/_test
curl \
 --request POST 'http://api.example.com/_query_rules/{ruleset_id}/_test' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"rules\": [\n        {\n            \"rule_id\": \"my-rule1\",\n            \"type\": \"pinned\",\n            \"criteria\": [\n                {\n                    \"type\": \"contains\",\n                    \"metadata\": \"user_query\",\n                    \"values\": [ \"pugs\", \"puggles\" ]\n                },\n                {\n                    \"type\": \"exact\",\n                    \"metadata\": \"user_country\",\n                    \"values\": [ \"us\" ]\n                }\n            ],\n            \"actions\": {\n                \"ids\": [\n                    \"id1\",\n                    \"id2\"\n                ]\n            }\n        },\n        {\n            \"rule_id\": \"my-rule2\",\n            \"type\": \"pinned\",\n            \"criteria\": [\n                {\n                    \"type\": \"fuzzy\",\n                    \"metadata\": \"user_query\",\n                    \"values\": [ \"rescue dogs\" ]\n                }\n            ],\n            \"actions\": {\n                \"docs\": [\n                    {\n                        \"_index\": \"index1\",\n                        \"_id\": \"id3\"\n                    },\n                    {\n                        \"_index\": \"index2\",\n                        \"_id\": \"id4\"\n                    }\n                ]\n            }\n        }\n    ]\n}"'
Request example
Run `PUT _query_rules/my-ruleset` to create a new query ruleset. Two rules are associated with `my-ruleset`. `my-rule1` will pin documents with IDs `id1` and `id2` when `user_query` contains `pugs` or `puggles` and `user_country` exactly matches `us`. `my-rule2` will exclude documents from different specified indices with IDs `id3` and `id4` when the `query_string` fuzzily matches `rescue dogs`.
{
    "rules": [
        {
            "rule_id": "my-rule1",
            "type": "pinned",
            "criteria": [
                {
                    "type": "contains",
                    "metadata": "user_query",
                    "values": [ "pugs", "puggles" ]
                },
                {
                    "type": "exact",
                    "metadata": "user_country",
                    "values": [ "us" ]
                }
            ],
            "actions": {
                "ids": [
                    "id1",
                    "id2"
                ]
            }
        },
        {
            "rule_id": "my-rule2",
            "type": "pinned",
            "criteria": [
                {
                    "type": "fuzzy",
                    "metadata": "user_query",
                    "values": [ "rescue dogs" ]
                }
            ],
            "actions": {
                "docs": [
                    {
                        "_index": "index1",
                        "_id": "id3"
                    },
                    {
                        "_index": "index2",
                        "_id": "id4"
                    }
                ]
            }
        }
    ]
}
Response examples (200)
A successful response from `POST _query_rules/my-ruleset/_test`.
{
  "total_matched_rules": 1,
  "matched_rules": [
    {
      "ruleset_id": "my-ruleset",
      "rule_id": "my-rule1"
    }
  ]
}

Script

Use the script support APIs to get a list of supported script contexts and languages. Use the stored script APIs to manage stored scripts and search templates.

External documentation

Get a script or search template

GET /_scripts/{id}

Retrieves a stored script or search template.

Path parameters

  • id string Required

    The identifier for the stored script or search template.

Query parameters

  • The period to wait for the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • _id string Required
    • found boolean Required
    • script object
      Hide script attributes Show script attributes object
      • lang string Required

        Any of:

        Values are painless, expression, mustache, or java.

      • options object
        Hide options attribute Show options attribute object
        • * string Additional properties
      • source string | object Required

        One of:
GET /_scripts/{id}
curl \
 --request GET 'http://api.example.com/_scripts/{id}' \
 --header "Authorization: $API_KEY"

Create or update a script or search template

PUT /_scripts/{id}

Creates or updates a stored script or search template.

External documentation

Path parameters

  • id string Required

    The identifier for the stored script or search template. It must be unique within the cluster.

Query parameters

  • context string

    The context in which the script or search template should run. To prevent errors, the API immediately compiles the script or template in this context. If you specify both this and the <context> path parameter, the API uses the request path parameter.

  • The period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

  • timeout string

    The period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

application/json

Body Required

  • script object Required
    Hide script attributes Show script attributes object
    • lang string Required

      Any of:

      Values are painless, expression, mustache, or java.

    • options object
      Hide options attribute Show options attribute object
      • * string Additional properties
    • source string | object Required

      One of:

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_scripts/{id}
curl \
 --request PUT 'http://api.example.com/_scripts/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"script\": {\n    \"lang\": \"mustache\",\n    \"source\": {\n      \"query\": {\n        \"match\": {\n          \"message\": \"{{query_string}}\"\n        }\n      },\n      \"from\": \"{{from}}\",\n      \"size\": \"{{size}}\"\n    }\n  }\n}"'
Request examples
Run `PUT _scripts/my-search-template` to create a search template.
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "message": "{{query_string}}"
        }
      },
      "from": "{{from}}",
      "size": "{{size}}"
    }
  }
}
Run `PUT _scripts/my-stored-script` to create a stored script.
{
  "script": {
    "lang": "painless",
    "source": "Math.log(_score * 2) + params['my_modifier']"
  }
}

Create or update a script or search template

POST /_scripts/{id}

Creates or updates a stored script or search template.

External documentation

Path parameters

  • id string Required

    The identifier for the stored script or search template. It must be unique within the cluster.

Query parameters

  • context string

    The context in which the script or search template should run. To prevent errors, the API immediately compiles the script or template in this context. If you specify both this and the <context> path parameter, the API uses the request path parameter.

  • The period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

  • timeout string

    The period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

application/json

Body Required

  • script object Required
    Hide script attributes Show script attributes object
    • lang string Required

      Any of:

      Values are painless, expression, mustache, or java.

    • options object
      Hide options attribute Show options attribute object
      • * string Additional properties
    • source string | object Required

      One of:

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_scripts/{id}
curl \
 --request POST 'http://api.example.com/_scripts/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"script\": {\n    \"lang\": \"mustache\",\n    \"source\": {\n      \"query\": {\n        \"match\": {\n          \"message\": \"{{query_string}}\"\n        }\n      },\n      \"from\": \"{{from}}\",\n      \"size\": \"{{size}}\"\n    }\n  }\n}"'
Request examples
Run `PUT _scripts/my-search-template` to create a search template.
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "message": "{{query_string}}"
        }
      },
      "from": "{{from}}",
      "size": "{{size}}"
    }
  }
}
Run `PUT _scripts/my-stored-script` to create a stored script.
{
  "script": {
    "lang": "painless",
    "source": "Math.log(_score * 2) + params['my_modifier']"
  }
}

Delete a script or search template

DELETE /_scripts/{id}

Deletes a stored script or search template.

Path parameters

  • id string Required

    The identifier for the stored script or search template.

Query parameters

  • The period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

  • timeout string

    The period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_scripts/{id}
curl \
 --request DELETE 'http://api.example.com/_scripts/{id}' \
 --header "Authorization: $API_KEY"

Create or update a script or search template

PUT /_scripts/{id}/{context}

Creates or updates a stored script or search template.

External documentation

Path parameters

  • id string Required

    The identifier for the stored script or search template. It must be unique within the cluster.

  • context string Required

    The context in which the script or search template should run. To prevent errors, the API immediately compiles the script or template in this context.

Query parameters

  • context string

    The context in which the script or search template should run. To prevent errors, the API immediately compiles the script or template in this context. If you specify both this and the <context> path parameter, the API uses the request path parameter.

  • The period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

  • timeout string

    The period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

application/json

Body Required

  • script object Required
    Hide script attributes Show script attributes object
    • lang string Required

      Any of:

      Values are painless, expression, mustache, or java.

    • options object
      Hide options attribute Show options attribute object
      • * string Additional properties
    • source string | object Required

      One of:

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_scripts/{id}/{context}
curl \
 --request PUT 'http://api.example.com/_scripts/{id}/{context}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"script\": {\n    \"lang\": \"mustache\",\n    \"source\": {\n      \"query\": {\n        \"match\": {\n          \"message\": \"{{query_string}}\"\n        }\n      },\n      \"from\": \"{{from}}\",\n      \"size\": \"{{size}}\"\n    }\n  }\n}"'
Request examples
Run `PUT _scripts/my-search-template` to create a search template.
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "message": "{{query_string}}"
        }
      },
      "from": "{{from}}",
      "size": "{{size}}"
    }
  }
}
Run `PUT _scripts/my-stored-script` to create a stored script.
{
  "script": {
    "lang": "painless",
    "source": "Math.log(_score * 2) + params['my_modifier']"
  }
}

Create or update a script or search template

POST /_scripts/{id}/{context}

Creates or updates a stored script or search template.

External documentation

Path parameters

  • id string Required

    The identifier for the stored script or search template. It must be unique within the cluster.

  • context string Required

    The context in which the script or search template should run. To prevent errors, the API immediately compiles the script or template in this context.

Query parameters

  • context string

    The context in which the script or search template should run. To prevent errors, the API immediately compiles the script or template in this context. If you specify both this and the <context> path parameter, the API uses the request path parameter.

  • The period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

  • timeout string

    The period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

application/json

Body Required

  • script object Required
    Hide script attributes Show script attributes object
    • lang string Required

      Any of:

      Values are painless, expression, mustache, or java.

    • options object
      Hide options attribute Show options attribute object
      • * string Additional properties
    • source string | object Required

      One of:

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_scripts/{id}/{context}
curl \
 --request POST 'http://api.example.com/_scripts/{id}/{context}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"script\": {\n    \"lang\": \"mustache\",\n    \"source\": {\n      \"query\": {\n        \"match\": {\n          \"message\": \"{{query_string}}\"\n        }\n      },\n      \"from\": \"{{from}}\",\n      \"size\": \"{{size}}\"\n    }\n  }\n}"'
Request examples
Run `PUT _scripts/my-search-template` to create a search template.
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "message": "{{query_string}}"
        }
      },
      "from": "{{from}}",
      "size": "{{size}}"
    }
  }
}
Run `PUT _scripts/my-stored-script` to create a stored script.
{
  "script": {
    "lang": "painless",
    "source": "Math.log(_score * 2) + params['my_modifier']"
  }
}

Run a script Technical preview

GET /_scripts/painless/_execute

Runs a script and returns a result. Use this API to build and test scripts, such as when defining a script for a runtime field. This API requires very few dependencies and is especially useful if you don't have permissions to write documents on a cluster.

The API uses several contexts, which control how scripts are run, what variables are available at runtime, and what the return type is.

Each context requires a script, but additional parameters depend on the context you're using for that script.

application/json

Body

  • context string

    Values are painless_test, filter, score, boolean_field, date_field, double_field, geo_point_field, ip_field, keyword_field, long_field, or composite_field.

  • Hide context_setup attributes Show context_setup attributes object
    • document object Required

      Document that's temporarily indexed in-memory and accessible from the script.

    • index string Required
    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
  • script object
    Hide script attributes Show script attributes object
    • source string | object

      One of:
    • id string
    • params object

      Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

      Hide params attribute Show params attribute object
      • * object Additional properties
    • lang string

      Any of:

      Values are painless, expression, mustache, or java.

    • options object
      Hide options attribute Show options attribute object
      • * string Additional properties

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
GET /_scripts/painless/_execute
curl \
 --request GET 'http://api.example.com/_scripts/painless/_execute' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"script\": {\n    \"source\": \"params.count / params.total\",\n    \"params\": {\n      \"count\": 100.0,\n      \"total\": 1000.0\n    }\n  }\n}"'
Run `POST /_scripts/painless/_execute`. The `painless_test` context is the default context. It runs scripts without additional parameters. The only variable that is available is `params`, which can be used to access user defined values. The result of the script is always converted to a string.
{
  "script": {
    "source": "params.count / params.total",
    "params": {
      "count": 100.0,
      "total": 1000.0
    }
  }
}
Run `POST /_scripts/painless/_execute` with a `filter` context. It treats scripts as if they were run inside a script query. For testing purposes, a document must be provided so that it will be temporarily indexed in-memory and is accessible from the script. More precisely, the `_source`, stored fields, and doc values of such a document are available to the script being tested.
{
  "script": {
    "source": "doc['field'].value.length() <= params.max_length",
    "params": {
      "max_length": 4
    }
  },
  "context": "filter",
  "context_setup": {
    "index": "my-index-000001",
    "document": {
      "field": "four"
    }
  }
}
Run `POST /_scripts/painless/_execute` with a `score` context. It treats scripts as if they were run inside a `script_score` function in a `function_score` query.
{
  "script": {
    "source": "doc['rank'].value / params.max_rank",
    "params": {
      "max_rank": 5.0
    }
  },
  "context": "score",
  "context_setup": {
    "index": "my-index-000001",
    "document": {
      "rank": 4
    }
  }
}
Response examples (200)
A successful response from `POST /_scripts/painless/_execute` with a `painless_test` context.
{
  "result": "0.1"
}
A successful response from `POST /_scripts/painless/_execute` with a `filter` context.
{
  "result": true
}
A successful response from `POST /_scripts/painless/_execute` with a `score` context.
{
  "result": 0.8
}

Run a script Technical preview

POST /_scripts/painless/_execute

Runs a script and returns a result. Use this API to build and test scripts, such as when defining a script for a runtime field. This API requires very few dependencies and is especially useful if you don't have permissions to write documents on a cluster.

The API uses several contexts, which control how scripts are run, what variables are available at runtime, and what the return type is.

Each context requires a script, but additional parameters depend on the context you're using for that script.

application/json

Body

  • context string

    Values are painless_test, filter, score, boolean_field, date_field, double_field, geo_point_field, ip_field, keyword_field, long_field, or composite_field.

  • Hide context_setup attributes Show context_setup attributes object
    • document object Required

      Document that's temporarily indexed in-memory and accessible from the script.

    • index string Required
    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
  • script object
    Hide script attributes Show script attributes object
    • source string | object

      One of:
    • id string
    • params object

      Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

      Hide params attribute Show params attribute object
      • * object Additional properties
    • lang string

      Any of:

      Values are painless, expression, mustache, or java.

    • options object
      Hide options attribute Show options attribute object
      • * string Additional properties

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
POST /_scripts/painless/_execute
curl \
 --request POST 'http://api.example.com/_scripts/painless/_execute' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"script\": {\n    \"source\": \"params.count / params.total\",\n    \"params\": {\n      \"count\": 100.0,\n      \"total\": 1000.0\n    }\n  }\n}"'
Run `POST /_scripts/painless/_execute`. The `painless_test` context is the default context. It runs scripts without additional parameters. The only variable that is available is `params`, which can be used to access user defined values. The result of the script is always converted to a string.
{
  "script": {
    "source": "params.count / params.total",
    "params": {
      "count": 100.0,
      "total": 1000.0
    }
  }
}
Run `POST /_scripts/painless/_execute` with a `filter` context. It treats scripts as if they were run inside a script query. For testing purposes, a document must be provided so that it will be temporarily indexed in-memory and is accessible from the script. More precisely, the `_source`, stored fields, and doc values of such a document are available to the script being tested.
{
  "script": {
    "source": "doc['field'].value.length() <= params.max_length",
    "params": {
      "max_length": 4
    }
  },
  "context": "filter",
  "context_setup": {
    "index": "my-index-000001",
    "document": {
      "field": "four"
    }
  }
}
Run `POST /_scripts/painless/_execute` with a `score` context. It treats scripts as if they were run inside a `script_score` function in a `function_score` query.
{
  "script": {
    "source": "doc['rank'].value / params.max_rank",
    "params": {
      "max_rank": 5.0
    }
  },
  "context": "score",
  "context_setup": {
    "index": "my-index-000001",
    "document": {
      "rank": 4
    }
  }
}
Response examples (200)
A successful response from `POST /_scripts/painless/_execute` with a `painless_test` context.
{
  "result": "0.1"
}
A successful response from `POST /_scripts/painless/_execute` with a `filter` context.
{
  "result": true
}
A successful response from `POST /_scripts/painless/_execute` with a `score` context.
{
  "result": 0.8
}

Get async search results Added in 7.7.0

GET /_async_search/{id}

Retrieve the results of a previously submitted asynchronous search request. If the Elasticsearch security features are enabled, access to the results of a specific async search is restricted to the user or API key that submitted it.

Path parameters

  • id string Required

    A unique identifier for the async search.

Query parameters

  • The length of time that the async search should be available in the cluster. When not specified, the keep_alive set with the corresponding submit async request will be used. Otherwise, it is possible to override the value and extend the validity of the request. When this period expires, the search, if still running, is cancelled. If the search is completed, its saved results are deleted.

  • typed_keys boolean

    Specify whether aggregation and suggester names should be prefixed by their respective types in the response

  • Specifies to wait for the search to be completed up until the provided timeout. Final results will be returned if available before the timeout expires, otherwise the currently available results will be returned once the timeout expires. By default no timeout is set meaning that the currently available results will be returned without any additional wait.

Responses

GET /_async_search/{id}
curl \
 --request GET 'http://api.example.com/_async_search/{id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A succesful response from `GET /_async_search/FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=`.
{
  "id" : "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
  "is_partial" : false, 
  "is_running" : false, 
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986, 
  "completion_time_in_millis" : 1583945903130, 
  "response" : {
    "took" : 12144,
    "timed_out" : false,
    "num_reduce_phases" : 46, 
    "_shards" : {
      "total" : 562,
      "successful" : 188, 
      "skipped" : 0,
      "failed" : 0
    },
    "hits" : {
      "total" : {
        "value" : 456433,
        "relation" : "eq"
      },
      "max_score" : null,
      "hits" : [ ]
    },
    "aggregations" : { 
      "sale_date" :  {
        "buckets" : []
      }
    }
  }
}

Delete an async search Added in 7.7.0

DELETE /_async_search/{id}

If the asynchronous search is still running, it is cancelled. Otherwise, the saved search results are deleted. If the Elasticsearch security features are enabled, the deletion of a specific async search is restricted to: the authenticated user that submitted the original search request; users that have the cancel_task cluster privilege.

Path parameters

  • id string Required

    A unique identifier for the async search.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_async_search/{id}
curl \
 --request DELETE 'http://api.example.com/_async_search/{id}' \
 --header "Authorization: $API_KEY"

Get the async search status Added in 7.11.0

GET /_async_search/status/{id}

Get the status of a previously submitted async search request given its identifier, without retrieving search results. If the Elasticsearch security features are enabled, the access to the status of a specific async search is restricted to:

  • The user or API key that submitted the original async search request.
  • Users that have the monitor cluster privilege or greater privileges.

Path parameters

  • id string Required

    A unique identifier for the async search.

Query parameters

  • The length of time that the async search needs to be available. Ongoing async searches and any saved search results are deleted after this period.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string
    • is_partial boolean Required

      When the query is no longer running, this property indicates whether the search failed or was successfully completed on all shards. While the query is running, is_partial is always set to true.

    • is_running boolean Required

      Indicates whether the search is still running or has completed.


      If the search failed after some shards returned their results or the node that is coordinating the async search dies, results may be partial even though is_running is false.

    • expiration_time string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • Time unit for milliseconds

    • start_time string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • Time unit for milliseconds

    • completion_time string | number

      A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

    • Time unit for milliseconds

    • _shards object Required
      Hide _shards attributes Show _shards attributes object
    • Hide _clusters attributes Show _clusters attributes object
    • If the async search completed, this field shows the status code of the search. For example, 200 indicates that the async search was successfully completed. 503 indicates that the async search was completed with an error.

GET /_async_search/status/{id}
curl \
 --request GET 'http://api.example.com/_async_search/status/{id}' \
 --header "Authorization: $API_KEY"
A succesful response from `GET /_async_search/status/FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=`, which retrieves the status of a previously submitted async search without the results.
{
  "id" : "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
  "is_running" : true,
  "is_partial" : true,
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986,
  "_shards" : {
      "total" : 562,
      "successful" : 188, 
      "skipped" : 0,
      "failed" : 0
  }
}
A succesful response from `GET /_async_search/status/FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=` for an async search that has completed. The status response has an additional `completion_status` field that shows the status code of the completed async search.
{
  "id" : "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
  "is_running" : false,
  "is_partial" : false,
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986,
  "_shards" : {
      "total" : 562,
      "successful" : 562,
      "skipped" : 0,
      "failed" : 0
  },
"completion_status" : 200 
}
A response from `GET /_async_search/status/FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=` for an async search that has completed with an error. The status response has an additional `completion_status` field that shows the status code of the completed async search.
{
  "id" : "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
  "is_running" : false,
  "is_partial" : true,
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986,
  "_shards" : {
      "total" : 562,
      "successful" : 450,
      "skipped" : 0,
      "failed" : 112
  },
"completion_status" : 503 
}

Run an async search Added in 7.7.0

POST /_async_search

When the primary sort of the results is an indexed field, shards get sorted based on minimum and maximum value that they hold for that field. Partial results become available following the sort criteria that was requested.

Warning: Asynchronous search does not support scroll or search requests that include only the suggest section.

By default, Elasticsearch does not allow you to store an async search response larger than 10Mb and an attempt to do this results in an error. The maximum allowed size for a stored async search response can be set by changing the search.max_async_search_response_size cluster level setting.

Query parameters

  • Blocks and waits until the search is completed up to a certain timeout. When the async search completes within the timeout, the response won’t include the ID as the results are not stored in the cluster.

  • Specifies how long the async search needs to be available. Ongoing async searches and any saved search results are deleted after this period.

  • If true, results are stored for later retrieval when the search completes within the wait_for_completion_timeout.

  • Whether to ignore if a wildcard indices expression resolves into no concrete indices. (This includes _all string or when no indices have been specified)

  • Indicate if an error should be returned if there is a partial search failure or timeout

  • analyzer string

    The analyzer to use for the query string

  • Specify whether wildcard and prefix queries should be analyzed (default: false)

  • Affects how often partial results become available, which happens whenever shard results are reduced. A partial reduction is performed every time the coordinating node has received a certain number of new shard responses (5 by default).

  • The default value is the only supported value.

  • The default operator for query string query (AND or OR)

    Values are and, AND, or, or OR.

  • df string

    The field to use as default where no field prefix is given in the query string

  • docvalue_fields string | array[string]

    A comma-separated list of fields to return as the docvalue representation of a field for each hit

  • expand_wildcards string | array[string]

    Whether to expand wildcard expression to concrete indices that are open, closed or both.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • explain boolean

    Specify whether to return detailed information about score computation as part of a hit

  • Whether specified concrete, expanded or aliased indices should be ignored when throttled

  • Whether specified concrete indices should be ignored when unavailable (missing or closed)

  • lenient boolean

    Specify whether format-based query failures (such as providing text to a numeric field) should be ignored

  • The number of concurrent shard requests per node this search executes concurrently. This value should be used to limit the impact of the search on the cluster in order to limit the number of concurrent shard requests

  • Specify the node or shard the operation should be performed on (default: random)

  • Specify if request cache should be used for this request or not, defaults to true

  • routing string

    A comma-separated list of specific routing values

  • Search operation type

    Supported values include:

    • query_then_fetch: Documents are scored using local term and document frequencies for the shard. This is usually faster but less accurate.
    • dfs_query_then_fetch: Documents are scored using global term and document frequencies across all shards. This is usually slower but more accurate.

    Values are query_then_fetch or dfs_query_then_fetch.

  • stats array[string]

    Specific 'tag' of the request for logging and statistical purposes

  • stored_fields string | array[string]

    A comma-separated list of stored fields to return as part of a hit

  • Specifies which field to use for suggestions.

  • Specify suggest mode

    Supported values include:

    • missing: Only generate suggestions for terms that are not in the shard.
    • popular: Only suggest terms that occur in more docs on the shard than the original term.
    • always: Suggest any matching suggestions based on terms in the suggest text.

    Values are missing, popular, or always.

  • How many suggestions to return in response

  • The source text for which the suggestions should be returned.

  • The maximum number of documents to collect for each shard, upon reaching which the query execution will terminate early.

  • timeout string

    Explicit operation timeout

  • track_total_hits boolean | number

    Indicate if the number of documents that match the query should be tracked. A number can also be specified, to accurately track the total hit count up to the number.

  • Whether to calculate and return scores even if they are not used for sorting

  • typed_keys boolean

    Specify whether aggregation and suggester names should be prefixed by their respective types in the response

  • Indicates whether hits.total should be rendered as an integer or an object in the rest search response

  • version boolean

    Specify whether to return document version as part of a hit

  • _source boolean | string | array[string]

    True or false to return the _source field or not, or a list of fields to return

  • _source_excludes string | array[string]

    A list of fields to exclude from the returned _source field

  • _source_includes string | array[string]

    A list of fields to extract and return from the _source field

  • Specify whether to return sequence number and primary term of the last modification of each hit

  • q string

    Query in the Lucene query string syntax

  • size number

    Number of hits to return (default: 10)

  • from number

    Starting offset (default: 0)

  • sort string | array[string]

    A comma-separated list of : pairs

application/json

Body

  • collapse object
    External documentation
  • explain boolean

    If true, returns detailed information about score computation as part of a hit.

  • ext object

    Configuration of search extensions defined by Elasticsearch plugins.

    Hide ext attribute Show ext attribute object
    • * object Additional properties
  • from number

    Starting document offset. By default, you cannot page through more than 10,000 hits using the from and size parameters. To page through more hits, use the search_after parameter.

  • Hide highlight attributes Show highlight attributes object
    • A string that contains each boundary character.

    • How far to scan for boundary characters.

    • Values are chars, sentence, or word.

    • Controls which locale is used to search for sentence and word boundaries. This parameter takes a form of a language tag, for example: "en-US", "fr-FR", "ja-JP".

    • force_source boolean Deprecated
    • Values are simple or span.

    • The size of the highlighted fragment in characters.

    • An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • If set to a non-negative value, highlighting stops at this defined maximum limit. The rest of the text is not processed, thus not highlighted and no error is returned The max_analyzed_offset query setting does not override the index.highlight.max_analyzed_offset setting, which prevails when it’s set to lower value than the query setting.

    • The amount of text you want to return from the beginning of the field if there are no matching fragments to highlight.

    • The maximum number of fragments to return. If the number of fragments is set to 0, no fragments are returned. Instead, the entire field contents are highlighted and returned. This can be handy when you need to highlight short texts such as a title or address, but fragmentation is not required. If number_of_fragments is 0, fragment_size is ignored.

    • options object
      Hide options attribute Show options attribute object
      • * object Additional properties
    • order string

      Value is score.

    • Controls the number of matching phrases in a document that are considered. Prevents the fvh highlighter from analyzing too many phrases and consuming too much memory. When using matched_fields, phrase_limit phrases per matched field are considered. Raising the limit increases query time and consumes more memory. Only supported by the fvh highlighter.

    • post_tags array[string]

      Use in conjunction with pre_tags to define the HTML tags to use for the highlighted text. By default, highlighted text is wrapped in <em> and </em> tags.

    • pre_tags array[string]

      Use in conjunction with post_tags to define the HTML tags to use for the highlighted text. By default, highlighted text is wrapped in <em> and </em> tags.

    • By default, only fields that contains a query match are highlighted. Set to false to highlight all fields.

    • Value is styled.

    • encoder string

      Values are default or html.

    • fields object Required
  • track_total_hits boolean | number

    Number of hits matching the query to count accurately. If true, the exact number of hits is returned at the cost of some performance. If false, the response does not include the total number of hits matching the query. Defaults to 10,000 hits.

  • indices_boost array[object]

    Boosts the _score of documents from specified indices.

    Hide indices_boost attribute Show indices_boost attribute object
    • * number Additional properties
  • docvalue_fields array[object]

    Array of wildcard (*) patterns. The request returns doc values for field names matching these patterns in the hits.fields property of the response.

    Hide docvalue_fields attributes Show docvalue_fields attributes object
    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • format string

      The format in which the values are returned.

  • knn object | array[object]

    Defines the approximate kNN search to run.

    One of:
    Hide attributes Show attributes
    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • query_vector array[number]
    • Hide query_vector_builder attribute Show query_vector_builder attribute object
    • k number

      The final number of nearest neighbors to return as top hits

    • The number of nearest neighbor candidates to consider per shard

    • boost number

      Boost value to apply to kNN scores

    • filter object | array[object]

      Filters for the kNN search query

      One of:

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • The minimum similarity for a vector to be considered a match

    • Hide inner_hits attributes Show inner_hits attributes object
      • name string
      • size number

        The maximum number of hits to return per inner_hits.

      • from number

        Inner hit starting document offset.

      • collapse object
        External documentation
      • docvalue_fields array[object]
        Hide docvalue_fields attributes Show docvalue_fields attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • format string

          The format in which the values are returned.

      • explain boolean
      • Hide highlight attributes Show highlight attributes object
        • A string that contains each boundary character.

        • How far to scan for boundary characters.

        • Values are chars, sentence, or word.

        • Controls which locale is used to search for sentence and word boundaries. This parameter takes a form of a language tag, for example: "en-US", "fr-FR", "ja-JP".

        • force_source boolean Deprecated
        • Values are simple or span.

        • The size of the highlighted fragment in characters.

        • An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

          External documentation
        • If set to a non-negative value, highlighting stops at this defined maximum limit. The rest of the text is not processed, thus not highlighted and no error is returned The max_analyzed_offset query setting does not override the index.highlight.max_analyzed_offset setting, which prevails when it’s set to lower value than the query setting.

        • The amount of text you want to return from the beginning of the field if there are no matching fragments to highlight.

        • The maximum number of fragments to return. If the number of fragments is set to 0, no fragments are returned. Instead, the entire field contents are highlighted and returned. This can be handy when you need to highlight short texts such as a title or address, but fragmentation is not required. If number_of_fragments is 0, fragment_size is ignored.

        • options object
          Hide options attribute Show options attribute object
          • * object Additional properties
        • order string

          Value is score.

        • Controls the number of matching phrases in a document that are considered. Prevents the fvh highlighter from analyzing too many phrases and consuming too much memory. When using matched_fields, phrase_limit phrases per matched field are considered. Raising the limit increases query time and consumes more memory. Only supported by the fvh highlighter.

        • post_tags array[string]

          Use in conjunction with pre_tags to define the HTML tags to use for the highlighted text. By default, highlighted text is wrapped in <em> and </em> tags.

        • pre_tags array[string]

          Use in conjunction with post_tags to define the HTML tags to use for the highlighted text. By default, highlighted text is wrapped in <em> and </em> tags.

        • By default, only fields that contains a query match are highlighted. Set to false to highlight all fields.

        • Value is styled.

        • encoder string

          Values are default or html.

        • fields object Required
      • Hide script_fields attribute Show script_fields attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • script object Required
            Hide script attributes Show script attributes object
            • id string
            • params object

              Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            • options object
      • fields array[string]

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • sort string | object | array[string | object]

        One of:

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • _source boolean | object

        Defines how to fetch a source. Fetching can be disabled entirely, or the source can be filtered.

        One of:
      • stored_fields string | array[string]
      • version boolean
    • Hide rescore_vector attribute Show rescore_vector attribute object
      • oversample number Required

        Applies the specified oversample factor to k on the approximate kNN search

  • Minimum _score for matching documents. Documents with a lower _score are not included in search results and results collected by aggregations.

  • An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • profile boolean
  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • rescore object | array[object]

    One of:
    Hide attributes Show attributes
    • query object
      Hide query attributes Show query attributes object
    • Hide learning_to_rank attributes Show learning_to_rank attributes object
      • model_id string Required

        The unique identifier of the trained model uploaded to Elasticsearch

      • params object

        Named parameters to be passed to the query templates used for feature

        Hide params attribute Show params attribute object
        • * object Additional properties
  • Retrieve a script evaluation (based on different fields) for each hit.

    Hide script_fields attribute Show script_fields attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • script object Required
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
  • search_after array[number | string | boolean | null]

    A field value.

  • size number

    The number of hits to return. By default, you cannot page through more than 10,000 hits using the from and size parameters. To page through more hits, use the search_after parameter.

  • slice object
    Hide slice attributes Show slice attributes object
    • field string

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • id string Required
    • max number Required
  • sort string | object | array[string | object]

    One of:

    Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • _source boolean | object

    Defines how to fetch a source. Fetching can be disabled entirely, or the source can be filtered.

    One of:
  • fields array[object]

    Array of wildcard (*) patterns. The request returns values for field names matching these patterns in the hits.fields property of the response.

    Hide fields attributes Show fields attributes object
    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • format string

      The format in which the values are returned.

  • suggest object
    Hide suggest attribute Show suggest attribute object
    • text string

      Global suggest text, to avoid repetition when the same text is used in several suggesters

  • Maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting. Defaults to 0, which does not terminate query execution early.

  • timeout string

    Specifies the period of time to wait for a response from each shard. If no response is received before the timeout expires, the request fails and returns an error. Defaults to no timeout.

  • If true, calculate and return document scores, even if the scores are not used for sorting.

  • version boolean

    If true, returns document version as part of a hit.

  • If true, returns sequence number and primary term of the last modification of each hit. See Optimistic concurrency control.

  • stored_fields string | array[string]
  • pit object
    Hide pit attributes Show pit attributes object
    • id string Required
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Hide runtime_mappings attribute Show runtime_mappings attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • fields object

        For type composite

        Hide fields attribute Show fields attribute object
        • * object Additional properties
          Hide * attribute Show * attribute object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • fetch_fields array[object]

        For type lookup

        Hide fetch_fields attributes Show fetch_fields attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • format string
      • format string

        A custom format for date type runtime fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • script object
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
      • type string Required

        Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

  • stats array[string]

    Stats groups to associate with the search. Each group maintains a statistics aggregation for its associated searches. You can retrieve these stats using the indices stats API.

Responses

POST /_async_search
curl \
 --request POST 'http://api.example.com/_async_search' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"sort\": [\n    { \"date\": { \"order\": \"asc\" } }\n  ],\n  \"aggs\": {\n    \"sale_date\": {\n      \"date_histogram\": {\n        \"field\": \"date\",\n        \"calendar_interval\": \"1d\"\n      }\n    }\n  }\n}"'
Request example
Perform a search request asynchronously with `POST /sales*/_async_search?size=0`. It accepts the same parameters and request body as the search API.
{
  "sort": [
    { "date": { "order": "asc" } }
  ],
  "aggs": {
    "sale_date": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "1d"
      }
    }
  }
}
Response examples (200)
A successful response when performing search asynchronously.
{
  "id" : "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
  "is_partial" : true,
  "is_running" : true,
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986,
  "response" : {
    "took" : 1122,
    "timed_out" : false,
    "num_reduce_phases" : 0,
    "_shards" : {
      "total" : 562,
      "successful" : 3,
      "skipped" : 0,
      "failed" : 0
    },
    "hits" : {
      "total" : {
        "value" : 157483,
        "relation" : "gte"
      },
      "max_score" : null,
      "hits" : [ ]
    }
  }
}

Run an async search Added in 7.7.0

POST /{index}/_async_search

When the primary sort of the results is an indexed field, shards get sorted based on minimum and maximum value that they hold for that field. Partial results become available following the sort criteria that was requested.

Warning: Asynchronous search does not support scroll or search requests that include only the suggest section.

By default, Elasticsearch does not allow you to store an async search response larger than 10Mb and an attempt to do this results in an error. The maximum allowed size for a stored async search response can be set by changing the search.max_async_search_response_size cluster level setting.

Path parameters

  • index string | array[string] Required

    A comma-separated list of index names to search; use _all or empty string to perform the operation on all indices

Query parameters

  • Blocks and waits until the search is completed up to a certain timeout. When the async search completes within the timeout, the response won’t include the ID as the results are not stored in the cluster.

  • Specifies how long the async search needs to be available. Ongoing async searches and any saved search results are deleted after this period.

  • If true, results are stored for later retrieval when the search completes within the wait_for_completion_timeout.

  • Whether to ignore if a wildcard indices expression resolves into no concrete indices. (This includes _all string or when no indices have been specified)

  • Indicate if an error should be returned if there is a partial search failure or timeout

  • analyzer string

    The analyzer to use for the query string

  • Specify whether wildcard and prefix queries should be analyzed (default: false)

  • Affects how often partial results become available, which happens whenever shard results are reduced. A partial reduction is performed every time the coordinating node has received a certain number of new shard responses (5 by default).

  • The default value is the only supported value.

  • The default operator for query string query (AND or OR)

    Values are and, AND, or, or OR.

  • df string

    The field to use as default where no field prefix is given in the query string

  • docvalue_fields string | array[string]

    A comma-separated list of fields to return as the docvalue representation of a field for each hit

  • expand_wildcards string | array[string]

    Whether to expand wildcard expression to concrete indices that are open, closed or both.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • explain boolean

    Specify whether to return detailed information about score computation as part of a hit

  • Whether specified concrete, expanded or aliased indices should be ignored when throttled

  • Whether specified concrete indices should be ignored when unavailable (missing or closed)

  • lenient boolean

    Specify whether format-based query failures (such as providing text to a numeric field) should be ignored

  • The number of concurrent shard requests per node this search executes concurrently. This value should be used to limit the impact of the search on the cluster in order to limit the number of concurrent shard requests

  • Specify the node or shard the operation should be performed on (default: random)

  • Specify if request cache should be used for this request or not, defaults to true

  • routing string

    A comma-separated list of specific routing values

  • Search operation type

    Supported values include:

    • query_then_fetch: Documents are scored using local term and document frequencies for the shard. This is usually faster but less accurate.
    • dfs_query_then_fetch: Documents are scored using global term and document frequencies across all shards. This is usually slower but more accurate.

    Values are query_then_fetch or dfs_query_then_fetch.

  • stats array[string]

    Specific 'tag' of the request for logging and statistical purposes

  • stored_fields string | array[string]

    A comma-separated list of stored fields to return as part of a hit

  • Specifies which field to use for suggestions.

  • Specify suggest mode

    Supported values include:

    • missing: Only generate suggestions for terms that are not in the shard.
    • popular: Only suggest terms that occur in more docs on the shard than the original term.
    • always: Suggest any matching suggestions based on terms in the suggest text.

    Values are missing, popular, or always.

  • How many suggestions to return in response

  • The source text for which the suggestions should be returned.

  • The maximum number of documents to collect for each shard, upon reaching which the query execution will terminate early.

  • timeout string

    Explicit operation timeout

  • track_total_hits boolean | number

    Indicate if the number of documents that match the query should be tracked. A number can also be specified, to accurately track the total hit count up to the number.

  • Whether to calculate and return scores even if they are not used for sorting

  • typed_keys boolean

    Specify whether aggregation and suggester names should be prefixed by their respective types in the response

  • Indicates whether hits.total should be rendered as an integer or an object in the rest search response

  • version boolean

    Specify whether to return document version as part of a hit

  • _source boolean | string | array[string]

    True or false to return the _source field or not, or a list of fields to return

  • _source_excludes string | array[string]

    A list of fields to exclude from the returned _source field

  • _source_includes string | array[string]

    A list of fields to extract and return from the _source field

  • Specify whether to return sequence number and primary term of the last modification of each hit

  • q string

    Query in the Lucene query string syntax

  • size number

    Number of hits to return (default: 10)

  • from number

    Starting offset (default: 0)

  • sort string | array[string]

    A comma-separated list of : pairs

application/json

Body

  • collapse object
    External documentation
  • explain boolean

    If true, returns detailed information about score computation as part of a hit.

  • ext object

    Configuration of search extensions defined by Elasticsearch plugins.

    Hide ext attribute Show ext attribute object
    • * object Additional properties
  • from number

    Starting document offset. By default, you cannot page through more than 10,000 hits using the from and size parameters. To page through more hits, use the search_after parameter.

  • Hide highlight attributes Show highlight attributes object
    • A string that contains each boundary character.

    • How far to scan for boundary characters.

    • Values are chars, sentence, or word.

    • Controls which locale is used to search for sentence and word boundaries. This parameter takes a form of a language tag, for example: "en-US", "fr-FR", "ja-JP".

    • force_source boolean Deprecated
    • Values are simple or span.

    • The size of the highlighted fragment in characters.

    • An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • If set to a non-negative value, highlighting stops at this defined maximum limit. The rest of the text is not processed, thus not highlighted and no error is returned The max_analyzed_offset query setting does not override the index.highlight.max_analyzed_offset setting, which prevails when it’s set to lower value than the query setting.

    • The amount of text you want to return from the beginning of the field if there are no matching fragments to highlight.

    • The maximum number of fragments to return. If the number of fragments is set to 0, no fragments are returned. Instead, the entire field contents are highlighted and returned. This can be handy when you need to highlight short texts such as a title or address, but fragmentation is not required. If number_of_fragments is 0, fragment_size is ignored.

    • options object
      Hide options attribute Show options attribute object
      • * object Additional properties
    • order string

      Value is score.

    • Controls the number of matching phrases in a document that are considered. Prevents the fvh highlighter from analyzing too many phrases and consuming too much memory. When using matched_fields, phrase_limit phrases per matched field are considered. Raising the limit increases query time and consumes more memory. Only supported by the fvh highlighter.

    • post_tags array[string]

      Use in conjunction with pre_tags to define the HTML tags to use for the highlighted text. By default, highlighted text is wrapped in <em> and </em> tags.

    • pre_tags array[string]

      Use in conjunction with post_tags to define the HTML tags to use for the highlighted text. By default, highlighted text is wrapped in <em> and </em> tags.

    • By default, only fields that contains a query match are highlighted. Set to false to highlight all fields.

    • Value is styled.

    • encoder string

      Values are default or html.

    • fields object Required
  • track_total_hits boolean | number

    Number of hits matching the query to count accurately. If true, the exact number of hits is returned at the cost of some performance. If false, the response does not include the total number of hits matching the query. Defaults to 10,000 hits.

  • indices_boost array[object]

    Boosts the _score of documents from specified indices.

    Hide indices_boost attribute Show indices_boost attribute object
    • * number Additional properties
  • docvalue_fields array[object]

    Array of wildcard (*) patterns. The request returns doc values for field names matching these patterns in the hits.fields property of the response.

    Hide docvalue_fields attributes Show docvalue_fields attributes object
    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • format string

      The format in which the values are returned.

  • knn object | array[object]

    Defines the approximate kNN search to run.

    One of:
    Hide attributes Show attributes
    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • query_vector array[number]
    • Hide query_vector_builder attribute Show query_vector_builder attribute object
    • k number

      The final number of nearest neighbors to return as top hits

    • The number of nearest neighbor candidates to consider per shard

    • boost number

      Boost value to apply to kNN scores

    • filter object | array[object]

      Filters for the kNN search query

      One of:

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • The minimum similarity for a vector to be considered a match

    • Hide inner_hits attributes Show inner_hits attributes object
      • name string
      • size number

        The maximum number of hits to return per inner_hits.

      • from number

        Inner hit starting document offset.

      • collapse object
        External documentation
      • docvalue_fields array[object]
        Hide docvalue_fields attributes Show docvalue_fields attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • format string

          The format in which the values are returned.

      • explain boolean
      • Hide highlight attributes Show highlight attributes object
        • A string that contains each boundary character.

        • How far to scan for boundary characters.

        • Values are chars, sentence, or word.

        • Controls which locale is used to search for sentence and word boundaries. This parameter takes a form of a language tag, for example: "en-US", "fr-FR", "ja-JP".

        • force_source boolean Deprecated
        • Values are simple or span.

        • The size of the highlighted fragment in characters.

        • An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

          External documentation
        • If set to a non-negative value, highlighting stops at this defined maximum limit. The rest of the text is not processed, thus not highlighted and no error is returned The max_analyzed_offset query setting does not override the index.highlight.max_analyzed_offset setting, which prevails when it’s set to lower value than the query setting.

        • The amount of text you want to return from the beginning of the field if there are no matching fragments to highlight.

        • The maximum number of fragments to return. If the number of fragments is set to 0, no fragments are returned. Instead, the entire field contents are highlighted and returned. This can be handy when you need to highlight short texts such as a title or address, but fragmentation is not required. If number_of_fragments is 0, fragment_size is ignored.

        • options object
          Hide options attribute Show options attribute object
          • * object Additional properties
        • order string

          Value is score.

        • Controls the number of matching phrases in a document that are considered. Prevents the fvh highlighter from analyzing too many phrases and consuming too much memory. When using matched_fields, phrase_limit phrases per matched field are considered. Raising the limit increases query time and consumes more memory. Only supported by the fvh highlighter.

        • post_tags array[string]

          Use in conjunction with pre_tags to define the HTML tags to use for the highlighted text. By default, highlighted text is wrapped in <em> and </em> tags.

        • pre_tags array[string]

          Use in conjunction with post_tags to define the HTML tags to use for the highlighted text. By default, highlighted text is wrapped in <em> and </em> tags.

        • By default, only fields that contains a query match are highlighted. Set to false to highlight all fields.

        • Value is styled.

        • encoder string

          Values are default or html.

        • fields object Required
      • Hide script_fields attribute Show script_fields attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • script object Required
            Hide script attributes Show script attributes object
            • id string
            • params object

              Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            • options object
      • fields array[string]

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • sort string | object | array[string | object]

        One of:

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • _source boolean | object

        Defines how to fetch a source. Fetching can be disabled entirely, or the source can be filtered.

        One of:
      • stored_fields string | array[string]
      • version boolean
    • Hide rescore_vector attribute Show rescore_vector attribute object
      • oversample number Required

        Applies the specified oversample factor to k on the approximate kNN search

  • Minimum _score for matching documents. Documents with a lower _score are not included in search results and results collected by aggregations.

  • An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • profile boolean
  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • rescore object | array[object]

    One of:
    Hide attributes Show attributes
    • query object
      Hide query attributes Show query attributes object
    • Hide learning_to_rank attributes Show learning_to_rank attributes object
      • model_id string Required

        The unique identifier of the trained model uploaded to Elasticsearch

      • params object

        Named parameters to be passed to the query templates used for feature

        Hide params attribute Show params attribute object
        • * object Additional properties
  • Retrieve a script evaluation (based on different fields) for each hit.

    Hide script_fields attribute Show script_fields attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • script object Required
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
  • search_after array[number | string | boolean | null]

    A field value.

  • size number

    The number of hits to return. By default, you cannot page through more than 10,000 hits using the from and size parameters. To page through more hits, use the search_after parameter.

  • slice object
    Hide slice attributes Show slice attributes object
    • field string

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • id string Required
    • max number Required
  • sort string | object | array[string | object]

    One of:

    Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • _source boolean | object

    Defines how to fetch a source. Fetching can be disabled entirely, or the source can be filtered.

    One of:
  • fields array[object]

    Array of wildcard (*) patterns. The request returns values for field names matching these patterns in the hits.fields property of the response.

    Hide fields attributes Show fields attributes object
    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • format string

      The format in which the values are returned.

  • suggest object
    Hide suggest attribute Show suggest attribute object
    • text string

      Global suggest text, to avoid repetition when the same text is used in several suggesters

  • Maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting. Defaults to 0, which does not terminate query execution early.

  • timeout string

    Specifies the period of time to wait for a response from each shard. If no response is received before the timeout expires, the request fails and returns an error. Defaults to no timeout.

  • If true, calculate and return document scores, even if the scores are not used for sorting.

  • version boolean

    If true, returns document version as part of a hit.

  • If true, returns sequence number and primary term of the last modification of each hit. See Optimistic concurrency control.

  • stored_fields string | array[string]
  • pit object
    Hide pit attributes Show pit attributes object
    • id string Required
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • Hide runtime_mappings attribute Show runtime_mappings attribute object
    • * object Additional properties
      Hide * attributes Show * attributes object
      • fields object

        For type composite

        Hide fields attribute Show fields attribute object
        • * object Additional properties
          Hide * attribute Show * attribute object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • fetch_fields array[object]

        For type lookup

        Hide fetch_fields attributes Show fetch_fields attributes object
        • field string Required

          Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

        • format string
      • format string

        A custom format for date type runtime fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • script object
        Hide script attributes Show script attributes object
        • source string | object

          One of:
        • id string
        • params object

          Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

          Hide params attribute Show params attribute object
          • * object Additional properties
        • lang string

          Any of:

          Values are painless, expression, mustache, or java.

        • options object
          Hide options attribute Show options attribute object
          • * string Additional properties
      • type string Required

        Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

  • stats array[string]

    Stats groups to associate with the search. Each group maintains a statistics aggregation for its associated searches. You can retrieve these stats using the indices stats API.

Responses

POST /{index}/_async_search
curl \
 --request POST 'http://api.example.com/{index}/_async_search' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"sort\": [\n    { \"date\": { \"order\": \"asc\" } }\n  ],\n  \"aggs\": {\n    \"sale_date\": {\n      \"date_histogram\": {\n        \"field\": \"date\",\n        \"calendar_interval\": \"1d\"\n      }\n    }\n  }\n}"'
Request example
Perform a search request asynchronously with `POST /sales*/_async_search?size=0`. It accepts the same parameters and request body as the search API.
{
  "sort": [
    { "date": { "order": "asc" } }
  ],
  "aggs": {
    "sale_date": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "1d"
      }
    }
  }
}
Response examples (200)
A successful response when performing search asynchronously.
{
  "id" : "FmRldE8zREVEUzA2ZVpUeGs2ejJFUFEaMkZ5QTVrSTZSaVN3WlNFVmtlWHJsdzoxMDc=",
  "is_partial" : true,
  "is_running" : true,
  "start_time_in_millis" : 1583945890986,
  "expiration_time_in_millis" : 1584377890986,
  "response" : {
    "took" : 1122,
    "timed_out" : false,
    "num_reduce_phases" : 0,
    "_shards" : {
      "total" : 562,
      "successful" : 3,
      "skipped" : 0,
      "failed" : 0
    },
    "hits" : {
      "total" : {
        "value" : 157483,
        "relation" : "gte"
      },
      "max_score" : null,
      "hits" : [ ]
    }
  }
}

Run a scrolling search

GET /_search/scroll

IMPORTANT: The scroll API is no longer recommend for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT).

The scroll API gets large sets of results from a single scrolling search request. To get the necessary scroll ID, submit a search API request that includes an argument for the scroll query parameter. The scroll parameter indicates how long Elasticsearch should retain the search context for the request. The search response returns a scroll ID in the _scroll_id response body parameter. You can then use the scroll ID with the scroll API to retrieve the next batch of results for the request. If the Elasticsearch security features are enabled, the access to the results of a specific scroll ID is restricted to the user or API key that submitted the search.

You can also use the scroll API to specify a new scroll parameter that extends or shortens the retention period for the search context.

IMPORTANT: Results from a scrolling search reflect the state of the index at the time of the initial search request. Subsequent indexing or document changes only affect later search and scroll requests.

External documentation

Query parameters

  • scroll string

    The period to retain the search context for scrolling.

  • scroll_id string Deprecated

    The scroll ID for scrolled search

  • If true, the API response’s hit.total property is returned as an integer. If false, the API response’s hit.total property is returned as an object.

application/json

Body

  • scroll string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • scroll_id string Required

Responses

GET /_search/scroll
curl \
 --request GET 'http://api.example.com/_search/scroll' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"scroll_id\" : \"DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ==\"\n}"'
Request example
Run `GET /_search/scroll` to get the next batch of results for a scrolling search.
{
  "scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}

Run a scrolling search

POST /_search/scroll

IMPORTANT: The scroll API is no longer recommend for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT).

The scroll API gets large sets of results from a single scrolling search request. To get the necessary scroll ID, submit a search API request that includes an argument for the scroll query parameter. The scroll parameter indicates how long Elasticsearch should retain the search context for the request. The search response returns a scroll ID in the _scroll_id response body parameter. You can then use the scroll ID with the scroll API to retrieve the next batch of results for the request. If the Elasticsearch security features are enabled, the access to the results of a specific scroll ID is restricted to the user or API key that submitted the search.

You can also use the scroll API to specify a new scroll parameter that extends or shortens the retention period for the search context.

IMPORTANT: Results from a scrolling search reflect the state of the index at the time of the initial search request. Subsequent indexing or document changes only affect later search and scroll requests.

External documentation

Query parameters

  • scroll string

    The period to retain the search context for scrolling.

  • scroll_id string Deprecated

    The scroll ID for scrolled search

  • If true, the API response’s hit.total property is returned as an integer. If false, the API response’s hit.total property is returned as an object.

application/json

Body

  • scroll string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • scroll_id string Required

Responses

POST /_search/scroll
curl \
 --request POST 'http://api.example.com/_search/scroll' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"scroll_id\" : \"DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ==\"\n}"'
Request example
Run `GET /_search/scroll` to get the next batch of results for a scrolling search.
{
  "scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}

Clear a scrolling search

DELETE /_search/scroll

Clear the search context and results for a scrolling search.

External documentation
application/json

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • succeeded boolean Required

      If true, the request succeeded. This does not indicate whether any scrolling search requests were cleared.

    • num_freed number Required

      The number of scrolling search requests cleared.

DELETE /_search/scroll
curl \
 --request DELETE 'http://api.example.com/_search/scroll' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"scroll_id\": \"DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ==\"\n}"'
Request example
Run `DELETE /_search/scroll` to clear the search context and results for a scrolling search.
{
  "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}

Run a scrolling search

GET /_search/scroll/{scroll_id}

IMPORTANT: The scroll API is no longer recommend for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT).

The scroll API gets large sets of results from a single scrolling search request. To get the necessary scroll ID, submit a search API request that includes an argument for the scroll query parameter. The scroll parameter indicates how long Elasticsearch should retain the search context for the request. The search response returns a scroll ID in the _scroll_id response body parameter. You can then use the scroll ID with the scroll API to retrieve the next batch of results for the request. If the Elasticsearch security features are enabled, the access to the results of a specific scroll ID is restricted to the user or API key that submitted the search.

You can also use the scroll API to specify a new scroll parameter that extends or shortens the retention period for the search context.

IMPORTANT: Results from a scrolling search reflect the state of the index at the time of the initial search request. Subsequent indexing or document changes only affect later search and scroll requests.

External documentation

Path parameters

Query parameters

  • scroll string

    The period to retain the search context for scrolling.

  • scroll_id string Deprecated

    The scroll ID for scrolled search

  • If true, the API response’s hit.total property is returned as an integer. If false, the API response’s hit.total property is returned as an object.

application/json

Body

  • scroll string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • scroll_id string Required

Responses

GET /_search/scroll/{scroll_id}
curl \
 --request GET 'http://api.example.com/_search/scroll/{scroll_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"scroll_id\" : \"DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ==\"\n}"'
Request example
Run `GET /_search/scroll` to get the next batch of results for a scrolling search.
{
  "scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}

Run a scrolling search

POST /_search/scroll/{scroll_id}

IMPORTANT: The scroll API is no longer recommend for deep pagination. If you need to preserve the index state while paging through more than 10,000 hits, use the search_after parameter with a point in time (PIT).

The scroll API gets large sets of results from a single scrolling search request. To get the necessary scroll ID, submit a search API request that includes an argument for the scroll query parameter. The scroll parameter indicates how long Elasticsearch should retain the search context for the request. The search response returns a scroll ID in the _scroll_id response body parameter. You can then use the scroll ID with the scroll API to retrieve the next batch of results for the request. If the Elasticsearch security features are enabled, the access to the results of a specific scroll ID is restricted to the user or API key that submitted the search.

You can also use the scroll API to specify a new scroll parameter that extends or shortens the retention period for the search context.

IMPORTANT: Results from a scrolling search reflect the state of the index at the time of the initial search request. Subsequent indexing or document changes only affect later search and scroll requests.

External documentation

Path parameters

Query parameters

  • scroll string

    The period to retain the search context for scrolling.

  • scroll_id string Deprecated

    The scroll ID for scrolled search

  • If true, the API response’s hit.total property is returned as an integer. If false, the API response’s hit.total property is returned as an object.

application/json

Body

  • scroll string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

  • scroll_id string Required

Responses

POST /_search/scroll/{scroll_id}
curl \
 --request POST 'http://api.example.com/_search/scroll/{scroll_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"scroll_id\" : \"DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ==\"\n}"'
Request example
Run `GET /_search/scroll` to get the next batch of results for a scrolling search.
{
  "scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}

Clear a scrolling search

DELETE /_search/scroll/{scroll_id}

Clear the search context and results for a scrolling search.

External documentation

Path parameters

  • scroll_id string | array[string] Required Deprecated

    A comma-separated list of scroll IDs to clear. To clear all scroll IDs, use _all. IMPORTANT: Scroll IDs can be long. It is recommended to specify scroll IDs in the request body parameter.

application/json

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • succeeded boolean Required

      If true, the request succeeded. This does not indicate whether any scrolling search requests were cleared.

    • num_freed number Required

      The number of scrolling search requests cleared.

DELETE /_search/scroll/{scroll_id}
curl \
 --request DELETE 'http://api.example.com/_search/scroll/{scroll_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"scroll_id\": \"DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ==\"\n}"'
Request example
Run `DELETE /_search/scroll` to clear the search context and results for a scrolling search.
{
  "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}

Close a point in time Added in 7.10.0

DELETE /_pit

A point in time must be opened explicitly before being used in search requests. The keep_alive parameter tells Elasticsearch how long it should persist. A point in time is automatically closed when the keep_alive period has elapsed. However, keeping points in time has a cost; close them as soon as they are no longer required for search requests.

application/json

Body

  • id string Required

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • succeeded boolean Required

      If true, all search contexts associated with the point-in-time ID were successfully closed.

    • num_freed number Required

      The number of search contexts that were successfully closed.

DELETE /_pit
curl \
 --request DELETE 'http://api.example.com/_pit' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"id\": \"46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIAAAAAAAAAAAwBYgACBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==\"\n}"'
Request example
Run `DELETE /_pit` to close a point-in-time.
{
  "id": "46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIAAAAAAAAAAAwBYgACBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA=="
}
Response examples (200)
A successful response from `DELETE /_pit`.
{
  "succeeded": true, 
  "num_freed": 3     
}

Count search results

GET /_count

Get the number of documents matching a query.

The query can be provided either by using a simple query string as a parameter, or by defining Query DSL within the request body. The query is optional. When no query is provided, the API uses match_all to count all the documents.

The count API supports multi-target syntax. You can run a single count API search across multiple data streams and indices.

The operation is broadcast across all shards. For each shard ID group, a replica is chosen and the search is run against it. This means that replicas increase the scalability of the count.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • analyzer string

    The analyzer to use for the query string. This parameter can be used only when the q query string parameter is specified.

  • If true, wildcard and prefix queries are analyzed. This parameter can be used only when the q query string parameter is specified.

  • The default operator for query string query: AND or OR. This parameter can be used only when the q query string parameter is specified.

    Values are and, AND, or, or OR.

  • df string

    The field to use as a default when no field prefix is given in the query string. This parameter can be used only when the q query string parameter is specified.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • ignore_throttled boolean Deprecated

    If true, concrete, expanded, or aliased indices are ignored when frozen.

  • If false, the request returns an error if it targets a missing or closed index.

  • lenient boolean

    If true, format-based query failures (such as providing text to a numeric field) in the query string will be ignored. This parameter can be used only when the q query string parameter is specified.

  • The minimum _score value that documents must have to be included in the result.

  • The node or shard the operation should be performed on. By default, it is random.

  • routing string

    A custom value used to route operations to a specific shard.

  • The maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting.

    IMPORTANT: Use with caution. Elasticsearch applies this parameter to each shard handling the request. When possible, let Elasticsearch perform early termination automatically. Avoid specifying this parameter for requests that target data streams with backing indices across multiple data tiers.

  • q string

    The query in Lucene query string syntax. This parameter cannot be used with a request body.

application/json

Body

Responses

GET /_count
curl \
 --request GET 'http://api.example.com/_count' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"query\" : {\n    \"term\" : { \"user.id\" : \"kimchy\" }\n  }\n}"'
Request example
Run `GET /my-index-000001/_count?q=user:kimchy`. Alternatively, run `GET /my-index-000001/_count` with the same query in the request body. Both requests count the number of documents in `my-index-000001` with a `user.id` of `kimchy`.
{
  "query" : {
    "term" : { "user.id" : "kimchy" }
  }
}
Response examples (200)
A successful response from `GET /my-index-000001/_count?q=user:kimchy`.
{
  "count": 1,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  }
}

Count search results

POST /_count

Get the number of documents matching a query.

The query can be provided either by using a simple query string as a parameter, or by defining Query DSL within the request body. The query is optional. When no query is provided, the API uses match_all to count all the documents.

The count API supports multi-target syntax. You can run a single count API search across multiple data streams and indices.

The operation is broadcast across all shards. For each shard ID group, a replica is chosen and the search is run against it. This means that replicas increase the scalability of the count.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • analyzer string

    The analyzer to use for the query string. This parameter can be used only when the q query string parameter is specified.

  • If true, wildcard and prefix queries are analyzed. This parameter can be used only when the q query string parameter is specified.

  • The default operator for query string query: AND or OR. This parameter can be used only when the q query string parameter is specified.

    Values are and, AND, or, or OR.

  • df string

    The field to use as a default when no field prefix is given in the query string. This parameter can be used only when the q query string parameter is specified.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • ignore_throttled boolean Deprecated

    If true, concrete, expanded, or aliased indices are ignored when frozen.

  • If false, the request returns an error if it targets a missing or closed index.

  • lenient boolean

    If true, format-based query failures (such as providing text to a numeric field) in the query string will be ignored. This parameter can be used only when the q query string parameter is specified.

  • The minimum _score value that documents must have to be included in the result.

  • The node or shard the operation should be performed on. By default, it is random.

  • routing string

    A custom value used to route operations to a specific shard.

  • The maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting.

    IMPORTANT: Use with caution. Elasticsearch applies this parameter to each shard handling the request. When possible, let Elasticsearch perform early termination automatically. Avoid specifying this parameter for requests that target data streams with backing indices across multiple data tiers.

  • q string

    The query in Lucene query string syntax. This parameter cannot be used with a request body.

application/json

Body

Responses

POST /_count
curl \
 --request POST 'http://api.example.com/_count' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"query\" : {\n    \"term\" : { \"user.id\" : \"kimchy\" }\n  }\n}"'
Request example
Run `GET /my-index-000001/_count?q=user:kimchy`. Alternatively, run `GET /my-index-000001/_count` with the same query in the request body. Both requests count the number of documents in `my-index-000001` with a `user.id` of `kimchy`.
{
  "query" : {
    "term" : { "user.id" : "kimchy" }
  }
}
Response examples (200)
A successful response from `GET /my-index-000001/_count?q=user:kimchy`.
{
  "count": 1,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  }
}