Elasticsearch API

Base URL
http://api.example.com

Elasticsearch provides REST APIs that are used by the UI components and can be called directly to configure and access Elasticsearch features.

Documentation source and versions

This documentation is derived from the main branch of the elasticsearch-specification repository. It is provided under license Attribution-NonCommercial-NoDerivatives 4.0 International. This documentation contains work-in-progress information for future Elastic Stack releases.

Last update on May 6, 2025.

This API is provided under license Apache 2.0.
























Create a behavioral analytics collection Deprecated Technical preview

PUT /_application/analytics/{name}

Path parameters

  • name string Required

    The name of the analytics collection to be created or updated.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

    • name string Required
PUT /_application/analytics/{name}
curl \
 --request PUT 'http://api.example.com/_application/analytics/{name}' \
 --header "Authorization: $API_KEY"

Delete a behavioral analytics collection Deprecated Technical preview

DELETE /_application/analytics/{name}

The associated data stream is also deleted.

Path parameters

  • name string Required

    The name of the analytics collection to be deleted

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_application/analytics/{name}
curl \
 --request DELETE 'http://api.example.com/_application/analytics/{name}' \
 --header "Authorization: $API_KEY"

Get behavioral analytics collections Deprecated Technical preview

GET /_application/analytics

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attribute Show * attribute object
      • event_data_stream object Required
        Hide event_data_stream attribute Show event_data_stream attribute object
GET /_application/analytics
curl \
 --request GET 'http://api.example.com/_application/analytics' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _application/analytics/my*`
{
  "my_analytics_collection": {
      "event_data_stream": {
          "name": "behavioral_analytics-events-my_analytics_collection"
      }
  },
  "my_analytics_collection2": {
      "event_data_stream": {
          "name": "behavioral_analytics-events-my_analytics_collection2"
      }
  }
}

Create a behavioral analytics collection event Deprecated Technical preview

POST /_application/analytics/{collection_name}/event/{event_type} External documentation

Path parameters

  • collection_name string Required

    The name of the behavioral analytics collection.

  • event_type string Required

    The analytics event type.

    Values are page_view, search, or search_click.

Query parameters

  • debug boolean

    Whether the response type has to include more details

application/json

Body Required

object object

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
POST /_application/analytics/{collection_name}/event/{event_type}
curl \
 --request POST 'http://api.example.com/_application/analytics/{collection_name}/event/{event_type}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"session\": {\n    \"id\": \"1797ca95-91c9-4e2e-b1bd-9c38e6f386a9\"\n  },\n  \"user\": {\n    \"id\": \"5f26f01a-bbee-4202-9298-81261067abbd\"\n  },\n  \"search\":{\n    \"query\": \"search term\",\n    \"results\": {\n      \"items\": [\n        {\n          \"document\": {\n            \"id\": \"123\",\n            \"index\": \"products\"\n          }\n        }\n      ],\n      \"total_results\": 10\n    },\n    \"sort\": {\n      \"name\": \"relevance\"\n    },\n    \"search_application\": \"website\"\n  },\n  \"document\":{\n    \"id\": \"123\",\n    \"index\": \"products\"\n  }\n}"'
Request example
Run `POST _application/analytics/my_analytics_collection/event/search_click` to send a `search_click` event to an analytics collection called `my_analytics_collection`.
{
  "session": {
    "id": "1797ca95-91c9-4e2e-b1bd-9c38e6f386a9"
  },
  "user": {
    "id": "5f26f01a-bbee-4202-9298-81261067abbd"
  },
  "search":{
    "query": "search term",
    "results": {
      "items": [
        {
          "document": {
            "id": "123",
            "index": "products"
          }
        }
      ],
      "total_results": 10
    },
    "sort": {
      "name": "relevance"
    },
    "search_application": "website"
  },
  "document":{
    "id": "123",
    "index": "products"
  }
}









Get shard allocation information

GET /_cat/allocation

Get a snapshot of the number of shards allocated to each data node and their disk space.

IMPORTANT: CAT APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications.

Query parameters

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • local boolean

    If true, the request computes the list of selected nodes from the local cluster state. If false the list of selected nodes are computed from the cluster state of the master node. In both cases the coordinating node will send requests for further information to each selected node.

  • Period to wait for a connection to the master node.

Responses

GET /_cat/allocation
curl \
 --request GET 'http://api.example.com/_cat/allocation' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_cat/allocation?v=true&format=json`. It shows a single shard is allocated to the one node available.
[
  {
    "shards": "1",
    "shards.undesired": "0",
    "write_load.forecast": "0.0",
    "disk.indices.forecast": "260b",
    "disk.indices": "260b",
    "disk.used": "47.3gb",
    "disk.avail": "43.4gb",
    "disk.total": "100.7gb",
    "disk.percent": "46",
    "host": "127.0.0.1",
    "ip": "127.0.0.1",
    "node": "CSUXak2",
    "node.role": "himrst"
  }
]




























































Get datafeeds Added in 7.7.0

GET /_cat/ml/datafeeds/{datafeed_id}

Get configuration and usage information about datafeeds. This API returns a maximum of 10,000 datafeeds. If the Elasticsearch security features are enabled, you must have monitor_ml, monitor, manage_ml, or manage cluster privileges to use this API.

IMPORTANT: CAT APIs are only intended for human consumption using the Kibana console or command line. They are not intended for use by applications. For application consumption, use the get datafeed statistics API.

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed.

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no datafeeds that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

  • h string | array[string]

    Comma-separated list of column names to display.

    Supported values include:

    • ae (or assignment_explanation): For started datafeeds only, contains messages relating to the selection of a node.
    • bc (or buckets.count, bucketsCount): The number of buckets processed.
    • id: A numerical character string that uniquely identifies the datafeed.
    • na (or node.address, nodeAddress): For started datafeeds only, the network address of the node where the datafeed is started.
    • ne (or node.ephemeral_id, nodeEphemeralId): For started datafeeds only, the ephemeral ID of the node where the datafeed is started.
    • ni (or node.id, nodeId): For started datafeeds only, the unique identifier of the node where the datafeed is started.
    • nn (or node.name, nodeName): For started datafeeds only, the name of the node where the datafeed is started.
    • sba (or search.bucket_avg, searchBucketAvg): The average search time per bucket, in milliseconds.
    • sc (or search.count, searchCount): The number of searches run by the datafeed.
    • seah (or search.exp_avg_hour, searchExpAvgHour): The exponential average search time per hour, in milliseconds.
    • st (or search.time, searchTime): The total time the datafeed spent searching, in milliseconds.
    • s (or state): The status of the datafeed: starting, started, stopping, or stopped. If starting, the datafeed has been requested to start but has not yet started. If started, the datafeed is actively receiving data. If stopping, the datafeed has been requested to stop gracefully and is completing its final action. If stopped, the datafeed is stopped and will not receive data until it is re-started.
  • s string | array[string]

    Comma-separated list of column names or column aliases used to sort the response.

    Supported values include:

    • ae (or assignment_explanation): For started datafeeds only, contains messages relating to the selection of a node.
    • bc (or buckets.count, bucketsCount): The number of buckets processed.
    • id: A numerical character string that uniquely identifies the datafeed.
    • na (or node.address, nodeAddress): For started datafeeds only, the network address of the node where the datafeed is started.
    • ne (or node.ephemeral_id, nodeEphemeralId): For started datafeeds only, the ephemeral ID of the node where the datafeed is started.
    • ni (or node.id, nodeId): For started datafeeds only, the unique identifier of the node where the datafeed is started.
    • nn (or node.name, nodeName): For started datafeeds only, the name of the node where the datafeed is started.
    • sba (or search.bucket_avg, searchBucketAvg): The average search time per bucket, in milliseconds.
    • sc (or search.count, searchCount): The number of searches run by the datafeed.
    • seah (or search.exp_avg_hour, searchExpAvgHour): The exponential average search time per hour, in milliseconds.
    • st (or search.time, searchTime): The total time the datafeed spent searching, in milliseconds.
    • s (or state): The status of the datafeed: starting, started, stopping, or stopped. If starting, the datafeed has been requested to start but has not yet started. If started, the datafeed is actively receiving data. If stopping, the datafeed has been requested to stop gracefully and is completing its final action. If stopped, the datafeed is stopped and will not receive data until it is re-started.
  • time string

    The unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string

      The datafeed identifier.

    • state string

      Values are started, stopped, starting, or stopping.

    • For started datafeeds only, contains messages relating to the selection of a node.

    • The number of buckets processed.

    • The number of searches run by the datafeed.

    • The total time the datafeed spent searching, in milliseconds.

    • The average search time per bucket, in milliseconds.

    • The exponential average search time per hour, in milliseconds.

    • node.id string

      The unique identifier of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • The name of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • The ephemeral identifier of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

    • The network address of the assigned node. For started datafeeds only, this information pertains to the node upon which the datafeed is started.

GET /_cat/ml/datafeeds/{datafeed_id}
curl \
 --request GET 'http://api.example.com/_cat/ml/datafeeds/{datafeed_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _cat/ml/datafeeds?v=true&format=json`.
[
  {
    "id": "datafeed-high_sum_total_sales",
    "state": "stopped",
    "buckets.count": "743",
    "search.count": "7"
  },
  {
    "id": "datafeed-low_request_rate",
    "state": "stopped",
    "buckets.count": "1457",
    "search.count": "3"
  },
  {
    "id": "datafeed-response_code_rates",
    "state": "stopped",
    "buckets.count": "1460",
    "search.count": "18"
  },
  {
    "id": "datafeed-url_scanning",
    "state": "stopped",
    "buckets.count": "1460",
    "search.count": "18"
  }
]
























Get pending task information

GET /_cat/pending_tasks

Get information about cluster-level changes that have not yet taken effect. IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the pending cluster tasks API.

Query parameters

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • local boolean

    If true, the request computes the list of selected nodes from the local cluster state. If false the list of selected nodes are computed from the cluster state of the master node. In both cases the coordinating node will send requests for further information to each selected node.

  • Period to wait for a connection to the master node.

  • time string

    Unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
GET /_cat/pending_tasks
curl \
 --request GET 'http://api.example.com/_cat/pending_tasks' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_cat/pending_tasks?v=trueh=insertOrder,timeInQueue,priority,source&format=json`.
[
  { "insertOrder": "1685", "timeInQueue": "855ms", "priority": "HIGH", "source": "update-mapping [foo][t]"},
    { "insertOrder": "1686", "timeInQueue": "843ms", "priority": "HIGH", "source": "update-mapping [foo][t]"},
    { "insertOrder": "1693", "timeInQueue": "753ms", "priority": "HIGH", "source": "refresh-mapping [foo][[t]]"},
    { "insertOrder": "1688", "timeInQueue": "816ms", "priority": "HIGH", "source": "update-mapping [foo][t]"},
    { "insertOrder": "1689", "timeInQueue": "802ms", "priority": "HIGH", "source": "update-mapping [foo][t]"},
    { "insertOrder": "1690", "timeInQueue": "787ms", "priority": "HIGH", "source": "update-mapping [foo][t]"},
    { "insertOrder": "1691", "timeInQueue": "773ms", "priority": "HIGH", "source": "update-mapping [foo][t]"}
]








Get shard recovery information

GET /_cat/recovery/{index}

Get information about ongoing and completed shard recoveries. Shard recovery is the process of initializing a shard copy, such as restoring a primary shard from a snapshot or syncing a replica shard from a primary shard. When a shard recovery completes, the recovered shard is available for search and indexing. For data streams, the API returns information about the stream’s backing indices. IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the index recovery API.

Path parameters

  • index string | array[string] Required

    A comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • If true, the response only includes ongoing shard recoveries.

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • detailed boolean

    If true, the response includes detailed information about shard recoveries.

  • index string | array[string]

    Comma-separated list or wildcard expression of index names to limit the returned information

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • time string

    Unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

GET /_cat/recovery/{index}
curl \
 --request GET 'http://api.example.com/_cat/recovery/{index}' \
 --header "Authorization: $API_KEY"
A successful response from `GET _cat/recovery?v=true&format=json`. In this example, the source and target nodes are the same because the recovery type is `store`, meaning they were read from local storage on node start.
[
  {
    "index": "my-index-000001 ",
    "shard": "0",
    "time": "13ms",
    "type": "store",
    "stage": "done",
    "source_host": "n/a",
    "source_node": "n/a",
    "target_host": "127.0.0.1",
    "target_node": "node-0",
    "repository": "n/a",
    "snapshot": "n/a",
    "files": "0",
    "files_recovered": "0",
    "files_percent": "100.0%",
    "files_total": "13",
    "bytes": "0b",
    "bytes_recovered": "0b",
    "bytes_percent": "100.0%",
    "bytes_total": "9928b",
    "translog_ops": "0",
    "translog_ops_recovered": "0",
    "translog_ops_percent": "100.0%"
  }
]
A successful response from `GET _cat/recovery?v=true&h=i,s,t,ty,st,shost,thost,f,fp,b,bp&format=json`. You can retrieve information about an ongoing recovery for example when you increase the replica count of an index and bring another node online to host the replicas. In this example, the recovery type is `peer`, meaning the shard recovered from another node. The `files` and `bytes` are real-time measurements.
[
  {
    "i": "my-index-000001",
    "s": "0",
    "t": "1252ms",
    "ty": "peer",
    "st": "done",
    "shost": "192.168.1.1",
    "thost": "192.168.1.1",
    "f": "0",
    "fp": "100.0%",
    "b": "0b",
    "bp": "100.0%",
  }
]
A successful response from `GET _cat/recovery?v=true&h=i,s,t,ty,st,rep,snap,f,fp,b,bp&format=json`. You can restore backups of an index using the snapshot and restore API. You can use the cat recovery API to get information about a snapshot recovery.
[
  {
    "i": "my-index-000001",
    "s": "0",
    "t": "1978ms",
    "ty": "snapshot",
    "st": "done",
    "rep": "my-repo",
    "snap": "snap-1",
    "f": "79",
    "fp": "8.0%",
    "b": "12086",
    "bp": "9.0%"
  }
]

Get snapshot repository information Added in 2.1.0

GET /_cat/repositories

Get a list of snapshot repositories for a cluster. IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the get snapshot repository API.

Query parameters

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • local boolean

    If true, the request computes the list of selected nodes from the local cluster state. If false the list of selected nodes are computed from the cluster state of the master node. In both cases the coordinating node will send requests for further information to each selected node.

  • Period to wait for a connection to the master node.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string

      The unique repository identifier.

    • type string

      The repository type.

GET /_cat/repositories
curl \
 --request GET 'http://api.example.com/_cat/repositories' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_cat/repositories?v=true&format=json`.
[
  {
    "id": "repo1",
    "type": "fs"
  },
  {
    "id": "repo2",
    "type": "s3"
  }
]




Get segment information

GET /_cat/segments/{index}

Get low-level information about the Lucene segments in index shards. For data streams, the API returns information about the backing indices. IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the index segments API.

Path parameters

  • index string | array[string] Required

    A comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • bytes string

    The unit used to display byte values.

    Values are b, kb, mb, gb, tb, or pb.

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • local boolean

    If true, the request computes the list of selected nodes from the local cluster state. If false the list of selected nodes are computed from the cluster state of the master node. In both cases the coordinating node will send requests for further information to each selected node.

  • Period to wait for a connection to the master node.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • index string
    • shard string

      The shard name.

    • prirep string

      The shard type: primary or replica.

    • ip string

      The IP address of the node where it lives.

    • id string
    • segment string

      The segment name, which is derived from the segment generation and used internally to create file names in the directory of the shard.

    • The segment generation number. Elasticsearch increments this generation number for each segment written then uses this number to derive the segment name.

    • The number of documents in the segment. This excludes deleted documents and counts any nested documents separately from their parents. It also excludes documents which were indexed recently and do not yet belong to a segment.

    • The number of deleted documents in the segment, which might be higher or lower than the number of delete operations you have performed. This number excludes deletes that were performed recently and do not yet belong to a segment. Deleted documents are cleaned up by the automatic merge process if it makes sense to do so. Also, Elasticsearch creates extra deleted documents to internally track the recent history of operations on a shard.

    • If true, the segment is synced to disk. Segments that are synced can survive a hard reboot. If false, the data from uncommitted segments is also stored in the transaction log so that Elasticsearch is able to replay changes on the next start.

    • If true, the segment is searchable. If false, the segment has most likely been written to disk but needs a refresh to be searchable.

    • version string
    • compound string

      If true, the segment is stored in a compound file. This means Lucene merged all files from the segment in a single file to save file descriptors.

GET /_cat/segments/{index}
curl \
 --request GET 'http://api.example.com/_cat/segments/{index}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_cat/segments?v=true&format=json`.
[
  {
    "index": "test",
    "shard": "0",
    "prirep": "p",
    "ip": "127.0.0.1",
    "segment": "_0",
    "generation": "0",
    "docs.count": "1",
    "docs.deleted": "0",
    "size": "3kb",
    "size.memory": "0",
    "committed": "false",
    "searchable": "true",
    "version": "9.12.0",
    "compound": "true"
  },
  {
    "index": "test1",
    "shard": "0",
    "prirep": "p",
    "ip": "127.0.0.1",
    "segment": "_0",
    "generation": "0",
    "docs.count": "1",
    "docs.deleted": "0",
    "size": "3kb",
    "size.memory": "0",
    "committed": "false",
    "searchable": "true",
    "version": "9.12.0",
    "compound": "true"
  }
]












Get snapshot information Added in 2.1.0

GET /_cat/snapshots/{repository}

Get information about the snapshots stored in one or more repositories. A snapshot is a backup of an index or running Elasticsearch cluster. IMPORTANT: cat APIs are only intended for human consumption using the command line or Kibana console. They are not intended for use by applications. For application consumption, use the get snapshot API.

Path parameters

  • repository string | array[string] Required

    A comma-separated list of snapshot repositories used to limit the request. Accepts wildcard expressions. _all returns all repositories. If any repository fails during the request, Elasticsearch returns an error.

Query parameters

  • If true, the response does not include information from unavailable snapshots.

  • h string | array[string]

    List of columns to appear in the response. Supports simple wildcards.

  • s string | array[string]

    List of columns that determine how the table should be sorted. Sorting defaults to ascending and can be changed by setting :asc or :desc as a suffix to the column name.

  • Period to wait for a connection to the master node.

  • time string

    Unit used to display time values.

    Values are nanos, micros, ms, s, m, h, or d.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • id string

      The unique identifier for the snapshot.

    • The repository name.

    • status string

      The state of the snapshot process. Returned values include: FAILED: The snapshot process failed. INCOMPATIBLE: The snapshot process is incompatible with the current cluster version. IN_PROGRESS: The snapshot process started but has not completed. PARTIAL: The snapshot process completed with a partial success. SUCCESS: The snapshot process completed with a full success.

    • start_epoch number | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • start_time string | object

      A time of day, expressed either as hh:mm, noon, midnight, or an hour/minutes structure.

      One of:
    • end_epoch number | string

      Some APIs will return values such as numbers also as a string (notably epoch timestamps). This behavior is used to capture this behavior while keeping the semantics of the field type.

      Depending on the target language, code generators can keep the union or remove it and leniently parse strings to the target type.

    • end_time string

      Time of day, expressed as HH:MM:SS

    • duration string

      A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • indices string

      The number of indices in the snapshot.

    • The number of successful shards in the snapshot.

    • The number of failed shards in the snapshot.

    • The total number of shards in the snapshot.

    • reason string

      The reason for any snapshot failures.

GET /_cat/snapshots/{repository}
curl \
 --request GET 'http://api.example.com/_cat/snapshots/{repository}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_cat/snapshots/repo1?v=true&s=id&format=json`.
[
  {
    "id": "snap1",
    "repository": "repo1",
    "status": "FAILED",
    "start_epoch": "1445616705",
    "start_time": "18:11:45",
    "end_epoch": "1445616978",
    "end_time": "18:16:18",
    "duration": "4.6m",
    "indices": "1",
    "successful_shards": "4",
    "failed_shards": "1",
    "total_shards": "5"
  },
  {
    "id": "snap2",
    "repository": "repo1",
    "status": "SUCCESS",
    "start_epoch": "1445634298",
    "start_time": "23:04:58",
    "end_epoch": "1445634672",
    "end_time": "23:11:12",
    "duration": "6.2m",
    "indices": "2",
    "successful_shards": "10",
    "failed_shards": "0",
    "total_shards": "10"
  }
]









































Clear cluster voting config exclusions Added in 7.0.0

DELETE /_cluster/voting_config_exclusions

Remove master-eligible nodes from the voting configuration exclusion list.

External documentation

Query parameters

  • Period to wait for a connection to the master node.

  • Specifies whether to wait for all excluded nodes to be removed from the cluster before clearing the voting configuration exclusions list. Defaults to true, meaning that all excluded nodes must be removed from the cluster before this API takes any action. If set to false then the voting configuration exclusions list is cleared even if some excluded nodes are still in the cluster.

Responses

DELETE /_cluster/voting_config_exclusions
curl \
 --request DELETE 'http://api.example.com/_cluster/voting_config_exclusions' \
 --header "Authorization: $API_KEY"
























































Clear the archived repositories metering Technical preview

DELETE /_nodes/{node_id}/_repositories_metering/{max_archive_version}

Clear the archived repositories metering information in the cluster.

Path parameters

  • node_id string | array[string] Required

    Comma-separated list of node IDs or names used to limit returned information.

  • max_archive_version number Required

    Specifies the maximum archive_version to be cleared from the archive.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • _nodes object
      Hide _nodes attributes Show _nodes attributes object
      • failures array[object]
        Hide failures attributes Show failures attributes object
      • total number Required

        Total number of nodes selected by the request.

      • successful number Required

        Number of nodes that responded successfully to the request.

      • failed number Required

        Number of nodes that rejected the request or failed to respond. If this value is not 0, a reason for the rejection or failure is included in the response.

    • cluster_name string Required
    • nodes object Required

      Contains repositories metering information for the nodes selected by the request.

      Hide nodes attribute Show nodes attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • repository_name string Required
        • repository_type string Required

          Repository type.

        • repository_location object Required
          Hide repository_location attributes Show repository_location attributes object
        • Time unit for milliseconds

        • Time unit for milliseconds

        • archived boolean Required

          A flag that tells whether or not this object has been archived. When a repository is closed or updated the repository metering information is archived and kept for a certain period of time. This allows retrieving the repository metering information of previous repository instantiations.

        • request_counts object Required
          Hide request_counts attributes Show request_counts attributes object
          • Number of Get Blob Properties requests (Azure)

          • GetBlob number

            Number of Get Blob requests (Azure)

          • Number of List Blobs requests (Azure)

          • PutBlob number

            Number of Put Blob requests (Azure)

          • PutBlock number

            Number of Put Block (Azure)

          • Number of Put Block List requests

          • Number of get object requests (GCP, S3)

          • Number of list objects requests (GCP, S3)

          • Number of insert object requests, including simple, multipart and resumable uploads. Resumable uploads can perform multiple http requests to insert a single object but they are considered as a single request since they are billed as an individual operation. (GCP)

          • Number of PutObject requests (S3)

          • Number of Multipart requests, including CreateMultipartUpload, UploadPart and CompleteMultipartUpload requests (S3)

DELETE /_nodes/{node_id}/_repositories_metering/{max_archive_version}
curl \
 --request DELETE 'http://api.example.com/_nodes/{node_id}/_repositories_metering/{max_archive_version}' \
 --header "Authorization: $API_KEY"




































































Path parameters

  • metric string | array[string] Required

    Limits the information returned to the specific metrics. A comma-separated list of the following options: _all, rest_actions.

Query parameters

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • _nodes object
      Hide _nodes attributes Show _nodes attributes object
      • failures array[object]
        Hide failures attributes Show failures attributes object
      • total number Required

        Total number of nodes selected by the request.

      • successful number Required

        Number of nodes that responded successfully to the request.

      • failed number Required

        Number of nodes that rejected the request or failed to respond. If this value is not 0, a reason for the rejection or failure is included in the response.

    • cluster_name string Required
    • nodes object Required
      Hide nodes attribute Show nodes attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • rest_actions object Required
          Hide rest_actions attribute Show rest_actions attribute object
          • * number Additional properties
        • Time unit for milliseconds

        • Time unit for milliseconds

        • aggregations object Required
          Hide aggregations attribute Show aggregations attribute object
          • * object Additional properties
GET /_nodes/usage/{metric}
curl \
 --request GET 'http://api.example.com/_nodes/usage/{metric}' \
 --header "Authorization: $API_KEY"














Check in a connector Technical preview

PUT /_connector/{connector_id}/_check_in

Update the last_seen field in the connector and set it to the current timestamp.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be checked in

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_check_in
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_check_in' \
 --header "Authorization: $API_KEY"
Response examples (200)
{
    "result": "updated"
}
























Cancel a connector sync job Beta

PUT /_connector/_sync_job/{connector_sync_job_id}/_cancel

Cancel a connector sync job, which sets the status to cancelling and updates cancellation_requested_at to the current time. The connector service is then responsible for setting the status of connector sync jobs to cancelled.

Path parameters

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/_sync_job/{connector_sync_job_id}/_cancel
curl \
 --request PUT 'http://api.example.com/_connector/_sync_job/{connector_sync_job_id}/_cancel' \
 --header "Authorization: $API_KEY"




Claim a connector sync job Technical preview

PUT /_connector/_sync_job/{connector_sync_job_id}/_claim

This action updates the job status to in_progress and sets the last_seen and started_at timestamps to the current time. Additionally, it can set the sync_cursor property for the sync job.

This API is not intended for direct connector management by users. It supports the implementation of services that utilize the connector protocol to communicate with Elasticsearch.

To sync data using self-managed connectors, you need to deploy the Elastic connector service on your own infrastructure. This service runs automatically on Elastic Cloud for Elastic managed connectors.

Path parameters

application/json

Body Required

  • The cursor object from the last incremental sync job. This should reference the sync_cursor field in the connector state for which the job runs.

  • worker_hostname string Required

    The host name of the current system that will run the job.

Responses

PUT /_connector/_sync_job/{connector_sync_job_id}/_claim
curl \
 --request PUT 'http://api.example.com/_connector/_sync_job/{connector_sync_job_id}/_claim' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"sync_cursor":{},"worker_hostname":"string"}'




Delete a connector sync job Beta

DELETE /_connector/_sync_job/{connector_sync_job_id}

Remove a connector sync job and its associated data. This is a destructive action that is not recoverable.

Path parameters

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_connector/_sync_job/{connector_sync_job_id}
curl \
 --request DELETE 'http://api.example.com/_connector/_sync_job/{connector_sync_job_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
{
  "acknowledged": true
}

Set a connector sync job error Technical preview

PUT /_connector/_sync_job/{connector_sync_job_id}/_error

Set the error field for a connector sync job and set its status to error.

To sync data using self-managed connectors, you need to deploy the Elastic connector service on your own infrastructure. This service runs automatically on Elastic Cloud for Elastic managed connectors.

Path parameters

application/json

Body Required

  • error string Required

    The error for the connector sync job error field.

Responses

PUT /_connector/_sync_job/{connector_sync_job_id}/_error
curl \
 --request PUT 'http://api.example.com/_connector/_sync_job/{connector_sync_job_id}/_error' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"error\": \"some-error\"\n}"'
Request example
{
    "error": "some-error"
}
























Update the connector error field Technical preview

PUT /_connector/{connector_id}/_error

Set the error field for the connector. If the error provided in the request body is non-null, the connector’s status is updated to error. Otherwise, if the error is reset to null, the connector status is updated to connected.

Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

  • error string | null

    One of:

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_error
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_error' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"error\": \"Houston, we have a problem!\"\n}"'
Request example
{
    "error": "Houston, we have a problem!"
}
Response examples (200)
{
  "result": "updated"
}
















Path parameters

  • connector_id string Required

    The unique identifier of the connector to be updated

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • result string Required

      Values are created, updated, deleted, not_found, or noop.

PUT /_connector/{connector_id}/_name
curl \
 --request PUT 'http://api.example.com/_connector/{connector_id}/_name' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"name\": \"Custom connector\",\n    \"description\": \"This is my customized connector\"\n}"'
Request example
{
    "name": "Custom connector",
    "description": "This is my customized connector"
}
Response examples (200)
{
  "result": "updated"
}





















Get auto-follow patterns Added in 6.5.0

GET /_ccr/auto_follow/{name}

Get cross-cluster replication auto-follow patterns.

External documentation

Path parameters

  • name string Required

    The auto-follow pattern collection that you want to retrieve. If you do not specify a name, the API returns information for all collections.

Query parameters

  • The period to wait for a connection to the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

Responses

GET /_ccr/auto_follow/{name}
curl \
 --request GET 'http://api.example.com/_ccr/auto_follow/{name}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_ccr/auto_follow/my_auto_follow_pattern`, which gets auto-follow patterns.
{
  "patterns": [
    {
      "name": "my_auto_follow_pattern",
      "pattern": {
        "active": true,
        "remote_cluster" : "remote_cluster",
        "leader_index_patterns" :
        [
          "leader_index*"
        ],
        "leader_index_exclusion_patterns":
        [
          "leader_index_001"
        ],
        "follow_index_pattern" : "{{leader_index}}-follower"
      }
    }
  ]
}




















Forget a follower Added in 6.7.0

POST /{index}/_ccr/forget_follower

Remove the cross-cluster replication follower retention leases from the leader.

A following index takes out retention leases on its leader index. These leases are used to increase the likelihood that the shards of the leader index retain the history of operations that the shards of the following index need to run replication. When a follower index is converted to a regular index by the unfollow API (either by directly calling the API or by index lifecycle management tasks), these leases are removed. However, removal of the leases can fail, for example when the remote cluster containing the leader index is unavailable. While the leases will eventually expire on their own, their extended existence can cause the leader index to hold more history than necessary and prevent index lifecycle management from performing some operations on the leader index. This API exists to enable manually removing the leases when the unfollow API is unable to do so.

NOTE: This API does not stop replication by a following index. If you use this API with a follower index that is still actively following, the following index will add back retention leases on the leader. The only purpose of this API is to handle the case of failure to remove the following retention leases after the unfollow API is invoked.

External documentation

Path parameters

  • index string Required

    the name of the leader index for which specified follower retention leases should be removed

Query parameters

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Responses

POST /{index}/_ccr/forget_follower
curl \
 --request POST 'http://api.example.com/{index}/_ccr/forget_follower' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"follower_cluster\" : \"\u003cfollower_cluster\u003e\",\n  \"follower_index\" : \"\u003cfollower_index\u003e\",\n  \"follower_index_uuid\" : \"\u003cfollower_index_uuid\u003e\",\n  \"leader_remote_cluster\" : \"\u003cleader_remote_cluster\u003e\"\n}"'
Request example
Run `POST /<leader_index>/_ccr/forget_follower`.
{
  "follower_cluster" : "<follower_cluster>",
  "follower_index" : "<follower_index>",
  "follower_index_uuid" : "<follower_index_uuid>",
  "leader_remote_cluster" : "<leader_remote_cluster>"
}
Response examples (200)
A successful response for removing the follower retention leases from the leader index.
{
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0,
    "failures" : [ ]
  }
}

Get auto-follow patterns Added in 6.5.0

GET /_ccr/auto_follow

Get cross-cluster replication auto-follow patterns.

External documentation

Query parameters

  • The period to wait for a connection to the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

Responses

GET /_ccr/auto_follow
curl \
 --request GET 'http://api.example.com/_ccr/auto_follow' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET /_ccr/auto_follow/my_auto_follow_pattern`, which gets auto-follow patterns.
{
  "patterns": [
    {
      "name": "my_auto_follow_pattern",
      "pattern": {
        "active": true,
        "remote_cluster" : "remote_cluster",
        "leader_index_patterns" :
        [
          "leader_index*"
        ],
        "leader_index_exclusion_patterns":
        [
          "leader_index_001"
        ],
        "follow_index_pattern" : "{{leader_index}}-follower"
      }
    }
  ]
}

Pause an auto-follow pattern Added in 7.5.0

POST /_ccr/auto_follow/{name}/pause

Pause a cross-cluster replication auto-follow pattern. When the API returns, the auto-follow pattern is inactive. New indices that are created on the remote cluster and match the auto-follow patterns are ignored.

You can resume auto-following with the resume auto-follow pattern API. When it resumes, the auto-follow pattern is active again and automatically configures follower indices for newly created indices on the remote cluster that match its patterns. Remote indices that were created while the pattern was paused will also be followed, unless they have been deleted or closed in the interim.

External documentation

Path parameters

  • name string Required

    The name of the auto-follow pattern to pause.

Query parameters

  • The period to wait for a connection to the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_ccr/auto_follow/{name}/pause
curl \
 --request POST 'http://api.example.com/_ccr/auto_follow/{name}/pause' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `POST /_ccr/auto_follow/my_auto_follow_pattern/pause`, which pauses an auto-follow pattern.
{
  "acknowledged" : true
}




Resume an auto-follow pattern Added in 7.5.0

POST /_ccr/auto_follow/{name}/resume

Resume a cross-cluster replication auto-follow pattern that was paused. The auto-follow pattern will resume configuring following indices for newly created indices that match its patterns on the remote cluster. Remote indices created while the pattern was paused will also be followed unless they have been deleted or closed in the interim.

External documentation

Path parameters

  • name string Required

    The name of the auto-follow pattern to resume.

Query parameters

  • The period to wait for a connection to the master node. If the master node is not available before the timeout expires, the request fails and returns an error. It can also be set to -1 to indicate that the request should never timeout.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_ccr/auto_follow/{name}/resume
curl \
 --request POST 'http://api.example.com/_ccr/auto_follow/{name}/resume' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response `POST /_ccr/auto_follow/my_auto_follow_pattern/resume`, which resumes an auto-follow pattern.
{
  "acknowledged" : true
}
































Get data stream lifecycles Added in 8.11.0

GET /_data_stream/{name}/_lifecycle

Get the data stream lifecycle configuration of one or more data streams.

Path parameters

  • name string | array[string] Required

    Comma-separated list of data streams to limit the request. Supports wildcards (*). To target all data streams, omit this parameter or use * or _all.

Query parameters

  • expand_wildcards string | array[string]

    Type of data stream that wildcard patterns can match. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, return all default settings in the response.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • data_streams array[object] Required
      Hide data_streams attributes Show data_streams attributes object
      • name string Required
      • Hide lifecycle attributes Show lifecycle attributes object
GET /_data_stream/{name}/_lifecycle
curl \
 --request GET 'http://api.example.com/_data_stream/{name}/_lifecycle' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `GET _lifecycle/stats?human&pretty`.
{
  "data_streams": [
    {
      "name": "my-data-stream-1",
      "lifecycle": {
        "enabled": true,
        "data_retention": "7d"
      }
    },
    {
      "name": "my-data-stream-2",
      "lifecycle": {
        "enabled": true,
        "data_retention": "7d"
      }
    }
  ]
}




Downsample an index Technical preview

POST /{index}/_downsample/{target_index}

Aggregate a time series (TSDS) index and store pre-computed statistical summaries (min, max, sum, value_count and avg) for each metric field grouped by a configured time interval. For example, a TSDS index that contains metrics sampled every 10 seconds can be downsampled to an hourly index. All documents within an hour interval are summarized and stored as a single document in the downsample index.

NOTE: Only indices in a time series data stream are supported. Neither field nor document level security can be defined on the source index. The source index must be read only (index.blocks.write: true).

Path parameters

  • index string Required

    Name of the time series index to downsample.

  • target_index string Required

    Name of the index to create.

application/json

Body Required

  • fixed_interval string Required

    A date histogram interval. Similar to Duration with additional units: w (week), M (month), q (quarter) and y (year)

Responses

POST /{index}/_downsample/{target_index}
curl \
 --request POST 'http://api.example.com/{index}/_downsample/{target_index}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"fixed_interval\": \"1d\"\n}"'
Request example
{
  "fixed_interval": "1d"
}




















Promote a data stream Added in 7.9.0

POST /_data_stream/_promote/{name}

Promote a data stream from a replicated data stream managed by cross-cluster replication (CCR) to a regular data stream.

With CCR auto following, a data stream from a remote cluster can be replicated to the local cluster. These data streams can't be rolled over in the local cluster. These replicated data streams roll over only if the upstream data stream rolls over. In the event that the remote cluster is no longer available, the data stream in the local cluster can be promoted to a regular data stream, which allows these data streams to be rolled over in the local cluster.

NOTE: When promoting a data stream, ensure the local cluster has a data stream enabled index template that matches the data stream. If this is missing, the data stream will not be able to roll over until a matching index template is created. This will affect the lifecycle management of the data stream and interfere with the data stream size and retention.

Path parameters

  • name string Required

    The name of the data stream

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

POST /_data_stream/_promote/{name}
curl \
 --request POST 'http://api.example.com/_data_stream/_promote/{name}' \
 --header "Authorization: $API_KEY"








Bulk index or delete documents

PUT /{index}/_bulk

Perform multiple index, create, delete, and update actions in a single request. This reduces overhead and can greatly increase indexing speed.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To use the create action, you must have the create_doc, create, index, or write index privilege. Data streams support only the create action.
  • To use the index action, you must have the create, index, or write index privilege.
  • To use the delete action, you must have the delete or write index privilege.
  • To use the update action, you must have the index or write index privilege.
  • To automatically create a data stream or index with a bulk API request, you must have the auto_configure, create_index, or manage index privilege.
  • To make the result of a bulk operation visible to search using the refresh parameter, you must have the maintenance or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:

action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n

The index and create actions expect a source on the next line and have the same semantics as the op_type parameter in the standard index API. A create action fails if a document with the same ID already exists in the target An index action adds or replaces a document as necessary.

NOTE: Data streams support only the create action. To update or delete a document in a data stream, you must target the backing index containing the document.

An update action expects that the partial doc, upsert, and script and its options are specified on the next line.

A delete action does not expect a source on the next line and has the same semantics as the standard delete API.

NOTE: The final line of data must end with a newline character (\n). Each newline character may be preceded by a carriage return (\r). When sending NDJSON data to the _bulk endpoint, use a Content-Type header of application/json or application/x-ndjson. Because this format uses literal newline characters (\n) as delimiters, make sure that the JSON actions and sources are not pretty printed.

If you provide a target in the request path, it is used for any actions that don't explicitly specify an _index argument.

A note on the format: the idea here is to make processing as fast as possible. As some of the actions are redirected to other shards on other nodes, only action_meta_data is parsed on the receiving node side.

Client libraries using this protocol should try and strive to do something similar on the client side, and reduce buffering as much as possible.

There is no "correct" number of actions to perform in a single bulk request. Experiment with different settings to find the optimal size for your particular workload. Note that Elasticsearch limits the maximum size of a HTTP request to 100mb by default so clients must ensure that no request exceeds this size. It is not possible to index a single document that exceeds the size limit, so you must pre-process any such documents into smaller pieces before sending them to Elasticsearch. For instance, split documents into pages or chapters before indexing them, or store raw binary data in a system outside Elasticsearch and replace the raw data with a link to the external system in the documents that you send to Elasticsearch.

Client suppport for bulk requests

Some of the officially supported clients provide helpers to assist with bulk requests and reindexing:

  • Go: Check out esutil.BulkIndexer
  • Perl: Check out Search::Elasticsearch::Client::5_0::Bulk and Search::Elasticsearch::Client::5_0::Scroll
  • Python: Check out elasticsearch.helpers.*
  • JavaScript: Check out client.helpers.*
  • .NET: Check out BulkAllObservable
  • PHP: Check out bulk indexing.

Submitting bulk requests with cURL

If you're providing text file input to curl, you must use the --data-binary flag instead of plain -d. The latter doesn't preserve newlines. For example:

$ cat requests
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
$ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --data-binary "@requests"; echo
{"took":7, "errors": false, "items":[{"index":{"_index":"test","_id":"1","_version":1,"result":"created","forced_refresh":false}}]}

Optimistic concurrency control

Each index and delete action within a bulk API call may include the if_seq_no and if_primary_term parameters in their respective action and meta data lines. The if_seq_no and if_primary_term parameters control how operations are run, based on the last modification to existing documents. See Optimistic concurrency control for more details.

Versioning

Each bulk item can include the version value using the version field. It automatically follows the behavior of the index or delete operation based on the _version mapping. It also support the version_type.

Routing

Each bulk item can include the routing value using the routing field. It automatically follows the behavior of the index or delete operation based on the _routing mapping.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Wait for active shards

When making bulk calls, you can set the wait_for_active_shards parameter to require a minimum number of shard copies to be active before starting to process the bulk request.

Refresh

Control when the changes made by this request are visible to search.

NOTE: Only the shards that receive the bulk request will be affected by refresh. Imagine a _bulk?refresh=wait_for request with three documents in it that happen to be routed to different shards in an index with five shards. The request will only wait for those three shards to refresh. The other two shards that make up the index do not participate in the _bulk request at all.

Path parameters

  • index string Required

    The name of the data stream, index, or index alias to perform bulk actions on.

Query parameters

  • True or false if to include the document source in the error message in case of parsing errors.

  • If true, the response will include the ingest pipelines that were run for each index or create.

  • pipeline string

    The pipeline identifier to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, wait for a refresh to make this operation visible to search. If false, do nothing with refreshes. Valid values: true, false, wait_for.

    Values are true, false, or wait_for.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or contains a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter. If the _source parameter is false, this parameter is ignored.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • timeout string

    The period each action waits for the following operations: automatic index creation, dynamic mapping updates, and waiting for active shards. The default is 1m (one minute), which guarantees Elasticsearch waits for at least the timeout before failing. The actual wait time could be longer, particularly when multiple waits occur.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default is 1, which waits for each primary shard to be active.

  • If true, the request's actions must target an index alias.

  • If true, the request's actions must target a data stream (existing or to be created).

application/json

Body object Required

One of:
  • index object
    Hide index attributes Show index attributes object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • create object
    Hide create attributes Show create attributes object
    • _id string
    • _index string
    • routing string
    • version number
    • Values are internal, external, external_gte, or force.

    • A map from the full name of fields to the name of dynamic templates. It defaults to an empty map. If a name matches a dynamic template, that template will be applied regardless of other match predicates defined in the template. If a field is already defined in the mapping, then this parameter won't be used.

      Hide dynamic_templates attribute Show dynamic_templates attribute object
      • * string Additional properties
    • pipeline string

      The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

    • If true, the request's actions must target an index alias.

  • update object
    Hide update attributes Show update attributes object
  • delete object
    Hide delete attributes Show delete attributes object

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • errors boolean Required

      If true, one or more of the operations in the bulk request did not complete successfully.

    • items array[object] Required

      The result of each operation in the bulk request, in the order they were submitted.

      Hide items attribute Show items attribute object
    • took number Required

      The length of time, in milliseconds, it took to process the bulk request.

PUT /{index}/_bulk
curl \
 --request PUT 'http://api.example.com/{index}/_bulk' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{ \"index\" : { \"_index\" : \"test\", \"_id\" : \"1\" } }\n{ \"field1\" : \"value1\" }\n{ \"delete\" : { \"_index\" : \"test\", \"_id\" : \"2\" } }\n{ \"create\" : { \"_index\" : \"test\", \"_id\" : \"3\" } }\n{ \"field1\" : \"value3\" }\n{ \"update\" : {\"_id\" : \"1\", \"_index\" : \"test\"} }\n{ \"doc\" : {\"field2\" : \"value2\"} }"'
Run `POST _bulk` to perform multiple operations.
{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "delete" : { "_index" : "test", "_id" : "2" } }
{ "create" : { "_index" : "test", "_id" : "3" } }
{ "field1" : "value3" }
{ "update" : {"_id" : "1", "_index" : "test"} }
{ "doc" : {"field2" : "value2"} }
When you run `POST _bulk` and use the `update` action, you can use `retry_on_conflict` as a field in the action itself (not in the extra payload line) to specify how many times an update should be retried in the case of a version conflict.
{ "update" : {"_id" : "1", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"} }
{ "update" : { "_id" : "0", "_index" : "index1", "retry_on_conflict" : 3} }
{ "script" : { "source": "ctx._source.counter += params.param1", "lang" : "painless", "params" : {"param1" : 1}}, "upsert" : {"counter" : 1}}
{ "update" : {"_id" : "2", "_index" : "index1", "retry_on_conflict" : 3} }
{ "doc" : {"field" : "value"}, "doc_as_upsert" : true }
{ "update" : {"_id" : "3", "_index" : "index1", "_source" : true} }
{ "doc" : {"field" : "value"} }
{ "update" : {"_id" : "4", "_index" : "index1"} }
{ "doc" : {"field" : "value"}, "_source": true}
To return only information about failed operations, run `POST /_bulk?filter_path=items.*.error`.
{ "update": {"_id": "5", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "update": {"_id": "6", "_index": "index1"} }
{ "doc": {"my_field": "foo"} }
{ "create": {"_id": "7", "_index": "index1"} }
{ "my_field": "foo" }
Run `POST /_bulk` to perform a bulk request that consists of index and create actions with the `dynamic_templates` parameter. The bulk request creates two new fields `work_location` and `home_location` with type `geo_point` according to the `dynamic_templates` parameter. However, the `raw_location` field is created using default dynamic mapping rules, as a text field in that case since it is supplied as a string in the JSON document.
{ "index" : { "_index" : "my_index", "_id" : "1", "dynamic_templates": {"work_location": "geo_point"}} }
{ "field" : "value1", "work_location": "41.12,-71.34", "raw_location": "41.12,-71.34"}
{ "create" : { "_index" : "my_index", "_id" : "2", "dynamic_templates": {"home_location": "geo_point"}} }
{ "field" : "value2", "home_location": "41.12,-71.34"}
Response examples (200)
{
   "took": 30,
   "errors": false,
   "items": [
      {
         "index": {
            "_index": "test",
            "_id": "1",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 0,
            "_primary_term": 1
         }
      },
      {
         "delete": {
            "_index": "test",
            "_id": "2",
            "_version": 1,
            "result": "not_found",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 404,
            "_seq_no" : 1,
            "_primary_term" : 2
         }
      },
      {
         "create": {
            "_index": "test",
            "_id": "3",
            "_version": 1,
            "result": "created",
            "_shards": {
               "total": 2,
               "successful": 1,
               "failed": 0
            },
            "status": 201,
            "_seq_no" : 2,
            "_primary_term" : 3
         }
      },
      {
         "update": {
            "_index": "test",
            "_id": "1",
            "_version": 2,
            "result": "updated",
            "_shards": {
                "total": 2,
                "successful": 1,
                "failed": 0
            },
            "status": 200,
            "_seq_no" : 3,
            "_primary_term" : 4
         }
      }
   ]
}
If you run `POST /_bulk` with operations that update non-existent documents, the operations cannot complete successfully. The API returns a response with an `errors` property value `true`. The response also includes an error object for any failed operations. The error object contains additional information about the failure, such as the error type and reason.
{
  "took": 486,
  "errors": true,
  "items": [
    {
      "update": {
        "_index": "index1",
        "_id": "5",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "_index": "index1",
        "_id": "6",
        "status": 404,
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "create": {
        "_index": "index1",
        "_id": "7",
        "_version": 1,
        "result": "created",
        "_shards": {
          "total": 2,
          "successful": 1,
          "failed": 0
        },
        "_seq_no": 0,
        "_primary_term": 1,
        "status": 201
      }
    }
  ]
}
An example response from `POST /_bulk?filter_path=items.*.error`, which returns only information about failed operations.
{
  "items": [
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[5]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    },
    {
      "update": {
        "error": {
          "type": "document_missing_exception",
          "reason": "[6]: document missing",
          "index_uuid": "aAsFqTI0Tc2W0LCWgPNrOA",
          "shard": "0",
          "index": "index1"
        }
      }
    }
  ]
}




Create a new document in the index Added in 5.0.0

PUT /{index}/_create/{id}

You can index a new JSON document with the /<target>/_doc/ or /<target>/_create/<_id> APIs Using _create guarantees that the document is indexed only if it does not already exist. It returns a 409 response when a document with a same ID already exists in the index. To update an existing document, you must use the /<target>/_doc/ API.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To add a document using the PUT /<target>/_create/<_id> or POST /<target>/_create/<_id> request formats, you must have the create_doc, create, index, or write index privilege.
  • To automatically create a data stream or index with this API request, you must have the auto_configure, create_index, or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

Automatically create data streams and indices

If the request's target doesn't exist and matches an index template with a data_stream definition, the index operation automatically creates the data stream.

If the target doesn't exist and doesn't match a data stream template, the operation automatically creates the index and applies any matching index templates.

NOTE: Elasticsearch includes several built-in index templates. To avoid naming collisions with these templates, refer to index pattern documentation.

If no mapping exists, the index operation creates a dynamic mapping. By default, new fields and objects are automatically added to the mapping if needed.

Automatic index creation is controlled by the action.auto_create_index setting. If it is true, any index can be created automatically. You can modify this setting to explicitly allow or block automatic creation of indices that match specified patterns or set it to false to turn off automatic index creation entirely. Specify a comma-separated list of patterns you want to allow or prefix each pattern with + or - to indicate whether it should be allowed or blocked. When a list is specified, the default behaviour is to disallow.

NOTE: The action.auto_create_index setting affects the automatic creation of indices only. It does not affect the creation of data streams.

Routing

By default, shard placement — or routing — is controlled by using a hash of the document's ID value. For more explicit control, the value fed into the hash function used by the router can be directly specified on a per-operation basis using the routing parameter.

When setting up explicit mapping, you can also use the _routing field to direct the index operation to extract the routing value from the document itself. This does come at the (very minimal) cost of an additional document parsing pass. If the _routing mapping is defined and set to be required, the index operation will fail if no routing value is provided or extracted.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Distributed

The index operation is directed to the primary shard based on its route and performed on the actual node containing this shard. After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.

Active shards

To improve the resiliency of writes to the system, indexing operations can be configured to wait for a certain number of active shard copies before proceeding with the operation. If the requisite number of active shard copies are not available, then the write operation must wait and retry, until either the requisite shard copies have started or a timeout occurs. By default, write operations only wait for the primary shards to be active before proceeding (that is to say wait_for_active_shards is 1). This default can be overridden in the index settings dynamically by setting index.write.wait_for_active_shards. To alter this behavior per operation, use the wait_for_active_shards request parameter.

Valid values are all or any positive integer up to the total number of configured copies per shard in the index (which is number_of_replicas+1). Specifying a negative value or a number greater than the number of shard copies will throw an error.

For example, suppose you have a cluster of three nodes, A, B, and C and you create an index index with the number of replicas set to 3 (resulting in 4 shard copies, one more copy than there are nodes). If you attempt an indexing operation, by default the operation will only ensure the primary copy of each shard is available before proceeding. This means that even if B and C went down and A hosted the primary shard copies, the indexing operation would still proceed with only one copy of the data. If wait_for_active_shards is set on the request to 3 (and all three nodes are up), the indexing operation will require 3 active shard copies before proceeding. This requirement should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard. However, if you set wait_for_active_shards to all (or to 4, which is the same in this situation), the indexing operation will not proceed as you do not have all 4 copies of each shard active in the index. The operation will timeout unless a new node is brought up in the cluster to host the fourth copy of the shard.

It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation starts. After the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary. The _shards section of the API response reveals the number of shard copies on which replication succeeded and failed.

External documentation

Path parameters

  • index string Required

    The name of the data stream or index to target. If the target doesn't exist and matches the name or wildcard (*) pattern of an index template with a data_stream definition, this request creates the data stream. If the target doesn't exist and doesn’t match a data stream template, this request creates the index.

  • id string Required

    A unique identifier for the document. To automatically generate a document ID, use the POST /<target>/_doc/ request format.

Query parameters

  • Only perform the operation if the document has this primary term.

  • Only perform the operation if the document has this sequence number.

  • True or false if to include the document source in the error message in case of parsing errors.

  • op_type string

    Set to create to only index the document if it does not already exist (put if absent). If a document with the specified _id already exists, the indexing operation will fail. The behavior is the same as using the <index>/_create endpoint. If a document ID is specified, this paramater defaults to index. Otherwise, it defaults to create. If the request targets a data stream, an op_type of create is required.

    Supported values include:

    • index: Overwrite any documents that already exist.
    • create: Only index documents that do not already exist.

    Values are index or create.

  • pipeline string

    The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, it waits for a refresh to make this operation visible to search. If false, it does nothing with refreshes.

    Values are true, false, or wait_for.

  • If true, the destination must be an index alias.

  • If true, the request's actions must target a data stream (existing or to be created).

  • routing string

    A custom value that is used to route operations to a specific shard.

  • timeout string

    The period the request waits for the following operations: automatic index creation, dynamic mapping updates, waiting for active shards. Elasticsearch waits for at least the specified timeout period before failing. The actual wait time could be longer, particularly when multiple waits occur.

    This parameter is useful for situations where the primary shard assigned to perform the operation might not be available when the operation runs. Some reasons for this might be that the primary shard is currently recovering from a gateway or undergoing relocation. By default, the operation will wait on the primary shard to become available for at least 1 minute before failing and responding with an error. The actual wait time could be longer, particularly when multiple waits occur.

  • version number

    The explicit version number for concurrency control. It must be a non-negative long number.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. You can set it to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value of 1 means it waits for each primary shard to be active.

application/json

Body Required

object object

Responses

PUT /{index}/_create/{id}
curl \
 --request PUT 'http://api.example.com/{index}/_create/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"@timestamp\": \"2099-11-15T13:12:00\",\n  \"message\": \"GET /search HTTP/1.1 200 1070000\",\n  \"user\": {\n    \"id\": \"kimchy\"\n  }\n}"'
Request example
Run `PUT my-index-000001/_create/1` to index a document into the `my-index-000001` index if no document with that ID exists.
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}

Create a new document in the index Added in 5.0.0

POST /{index}/_create/{id}

You can index a new JSON document with the /<target>/_doc/ or /<target>/_create/<_id> APIs Using _create guarantees that the document is indexed only if it does not already exist. It returns a 409 response when a document with a same ID already exists in the index. To update an existing document, you must use the /<target>/_doc/ API.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias:

  • To add a document using the PUT /<target>/_create/<_id> or POST /<target>/_create/<_id> request formats, you must have the create_doc, create, index, or write index privilege.
  • To automatically create a data stream or index with this API request, you must have the auto_configure, create_index, or manage index privilege.

Automatic data stream creation requires a matching index template with data stream enabled.

Automatically create data streams and indices

If the request's target doesn't exist and matches an index template with a data_stream definition, the index operation automatically creates the data stream.

If the target doesn't exist and doesn't match a data stream template, the operation automatically creates the index and applies any matching index templates.

NOTE: Elasticsearch includes several built-in index templates. To avoid naming collisions with these templates, refer to index pattern documentation.

If no mapping exists, the index operation creates a dynamic mapping. By default, new fields and objects are automatically added to the mapping if needed.

Automatic index creation is controlled by the action.auto_create_index setting. If it is true, any index can be created automatically. You can modify this setting to explicitly allow or block automatic creation of indices that match specified patterns or set it to false to turn off automatic index creation entirely. Specify a comma-separated list of patterns you want to allow or prefix each pattern with + or - to indicate whether it should be allowed or blocked. When a list is specified, the default behaviour is to disallow.

NOTE: The action.auto_create_index setting affects the automatic creation of indices only. It does not affect the creation of data streams.

Routing

By default, shard placement — or routing — is controlled by using a hash of the document's ID value. For more explicit control, the value fed into the hash function used by the router can be directly specified on a per-operation basis using the routing parameter.

When setting up explicit mapping, you can also use the _routing field to direct the index operation to extract the routing value from the document itself. This does come at the (very minimal) cost of an additional document parsing pass. If the _routing mapping is defined and set to be required, the index operation will fail if no routing value is provided or extracted.

NOTE: Data streams do not support custom routing unless they were created with the allow_custom_routing setting enabled in the template.

Distributed

The index operation is directed to the primary shard based on its route and performed on the actual node containing this shard. After the primary shard completes the operation, if needed, the update is distributed to applicable replicas.

Active shards

To improve the resiliency of writes to the system, indexing operations can be configured to wait for a certain number of active shard copies before proceeding with the operation. If the requisite number of active shard copies are not available, then the write operation must wait and retry, until either the requisite shard copies have started or a timeout occurs. By default, write operations only wait for the primary shards to be active before proceeding (that is to say wait_for_active_shards is 1). This default can be overridden in the index settings dynamically by setting index.write.wait_for_active_shards. To alter this behavior per operation, use the wait_for_active_shards request parameter.

Valid values are all or any positive integer up to the total number of configured copies per shard in the index (which is number_of_replicas+1). Specifying a negative value or a number greater than the number of shard copies will throw an error.

For example, suppose you have a cluster of three nodes, A, B, and C and you create an index index with the number of replicas set to 3 (resulting in 4 shard copies, one more copy than there are nodes). If you attempt an indexing operation, by default the operation will only ensure the primary copy of each shard is available before proceeding. This means that even if B and C went down and A hosted the primary shard copies, the indexing operation would still proceed with only one copy of the data. If wait_for_active_shards is set on the request to 3 (and all three nodes are up), the indexing operation will require 3 active shard copies before proceeding. This requirement should be met because there are 3 active nodes in the cluster, each one holding a copy of the shard. However, if you set wait_for_active_shards to all (or to 4, which is the same in this situation), the indexing operation will not proceed as you do not have all 4 copies of each shard active in the index. The operation will timeout unless a new node is brought up in the cluster to host the fourth copy of the shard.

It is important to note that this setting greatly reduces the chances of the write operation not writing to the requisite number of shard copies, but it does not completely eliminate the possibility, because this check occurs before the write operation starts. After the write operation is underway, it is still possible for replication to fail on any number of shard copies but still succeed on the primary. The _shards section of the API response reveals the number of shard copies on which replication succeeded and failed.

External documentation

Path parameters

  • index string Required

    The name of the data stream or index to target. If the target doesn't exist and matches the name or wildcard (*) pattern of an index template with a data_stream definition, this request creates the data stream. If the target doesn't exist and doesn’t match a data stream template, this request creates the index.

  • id string Required

    A unique identifier for the document. To automatically generate a document ID, use the POST /<target>/_doc/ request format.

Query parameters

  • Only perform the operation if the document has this primary term.

  • Only perform the operation if the document has this sequence number.

  • True or false if to include the document source in the error message in case of parsing errors.

  • op_type string

    Set to create to only index the document if it does not already exist (put if absent). If a document with the specified _id already exists, the indexing operation will fail. The behavior is the same as using the <index>/_create endpoint. If a document ID is specified, this paramater defaults to index. Otherwise, it defaults to create. If the request targets a data stream, an op_type of create is required.

    Supported values include:

    • index: Overwrite any documents that already exist.
    • create: Only index documents that do not already exist.

    Values are index or create.

  • pipeline string

    The ID of the pipeline to use to preprocess incoming documents. If the index has a default ingest pipeline specified, setting the value to _none turns off the default ingest pipeline for this request. If a final pipeline is configured, it will always run regardless of the value of this parameter.

  • refresh string

    If true, Elasticsearch refreshes the affected shards to make this operation visible to search. If wait_for, it waits for a refresh to make this operation visible to search. If false, it does nothing with refreshes.

    Values are true, false, or wait_for.

  • If true, the destination must be an index alias.

  • If true, the request's actions must target a data stream (existing or to be created).

  • routing string

    A custom value that is used to route operations to a specific shard.

  • timeout string

    The period the request waits for the following operations: automatic index creation, dynamic mapping updates, waiting for active shards. Elasticsearch waits for at least the specified timeout period before failing. The actual wait time could be longer, particularly when multiple waits occur.

    This parameter is useful for situations where the primary shard assigned to perform the operation might not be available when the operation runs. Some reasons for this might be that the primary shard is currently recovering from a gateway or undergoing relocation. By default, the operation will wait on the primary shard to become available for at least 1 minute before failing and responding with an error. The actual wait time could be longer, particularly when multiple waits occur.

  • version number

    The explicit version number for concurrency control. It must be a non-negative long number.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. You can set it to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The default value of 1 means it waits for each primary shard to be active.

application/json

Body Required

object object

Responses

POST /{index}/_create/{id}
curl \
 --request POST 'http://api.example.com/{index}/_create/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"@timestamp\": \"2099-11-15T13:12:00\",\n  \"message\": \"GET /search HTTP/1.1 200 1070000\",\n  \"user\": {\n    \"id\": \"kimchy\"\n  }\n}"'
Request example
Run `PUT my-index-000001/_create/1` to index a document into the `my-index-000001` index if no document with that ID exists.
{
  "@timestamp": "2099-11-15T13:12:00",
  "message": "GET /search HTTP/1.1 200 1070000",
  "user": {
    "id": "kimchy"
  }
}




















Delete documents Added in 5.0.0

POST /{index}/_delete_by_query

Deletes documents that match the specified query.

If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or alias:

  • read
  • delete or write

You can specify the query criteria in the request URI or the request body using the same syntax as the search API. When you submit a delete by query request, Elasticsearch gets a snapshot of the data stream or index when it begins processing the request and deletes matching documents using internal versioning. If a document changes between the time that the snapshot is taken and the delete operation is processed, it results in a version conflict and the delete operation fails.

NOTE: Documents with a version equal to 0 cannot be deleted using delete by query because internal versioning does not support 0 as a valid version number.

While processing a delete by query request, Elasticsearch performs multiple search requests sequentially to find all of the matching documents to delete. A bulk delete request is performed for each batch of matching documents. If a search or bulk request is rejected, the requests are retried up to 10 times, with exponential back off. If the maximum retry limit is reached, processing halts and all failed requests are returned in the response. Any delete requests that completed successfully still stick, they are not rolled back.

You can opt to count version conflicts instead of halting and returning by setting conflicts to proceed. Note that if you opt to count version conflicts the operation could attempt to delete more documents from the source than max_docs until it has successfully deleted max_docs documents, or it has gone through every document in the source query.

Throttling delete requests

To control the rate at which delete by query issues batches of delete operations, you can set requests_per_second to any positive decimal number. This pads each batch with a wait time to throttle the rate. Set requests_per_second to -1 to disable throttling.

Throttling uses a wait time between batches so that the internal scroll requests can be given a timeout that takes the request padding into account. The padding time is the difference between the batch size divided by the requests_per_second and the time spent writing. By default the batch size is 1000, so if requests_per_second is set to 500:

target_time = 1000 / 500 per second = 2 seconds
wait_time = target_time - write_time = 2 seconds - .5 seconds = 1.5 seconds

Since the batch is issued as a single _bulk request, large batch sizes cause Elasticsearch to create many requests and wait before starting the next set. This is "bursty" instead of "smooth".

Slicing

Delete by query supports sliced scroll to parallelize the delete process. This can improve efficiency and provide a convenient way to break the request down into smaller parts.

Setting slices to auto lets Elasticsearch choose the number of slices to use. This setting will use one slice per shard, up to a certain limit. If there are multiple source data streams or indices, it will choose the number of slices based on the index or backing index with the smallest number of shards. Adding slices to the delete by query operation creates sub-requests which means it has some quirks:

  • You can see these requests in the tasks APIs. These sub-requests are "child" tasks of the task for the request with slices.
  • Fetching the status of the task for the request with slices only contains the status of completed slices.
  • These sub-requests are individually addressable for things like cancellation and rethrottling.
  • Rethrottling the request with slices will rethrottle the unfinished sub-request proportionally.
  • Canceling the request with slices will cancel each sub-request.
  • Due to the nature of slices each sub-request won't get a perfectly even portion of the documents. All documents will be addressed, but some slices may be larger than others. Expect larger slices to have a more even distribution.
  • Parameters like requests_per_second and max_docs on a request with slices are distributed proportionally to each sub-request. Combine that with the earlier point about distribution being uneven and you should conclude that using max_docs with slices might not result in exactly max_docs documents being deleted.
  • Each sub-request gets a slightly different snapshot of the source data stream or index though these are all taken at approximately the same time.

If you're slicing manually or otherwise tuning automatic slicing, keep in mind that:

  • Query performance is most efficient when the number of slices is equal to the number of shards in the index or backing index. If that number is large (for example, 500), choose a lower number as too many slices hurts performance. Setting slices higher than the number of shards generally does not improve efficiency and adds overhead.
  • Delete performance scales linearly across available resources with the number of slices.

Whether query or delete performance dominates the runtime depends on the documents being reindexed and cluster resources.

Cancel a delete by query operation

Any delete by query can be canceled using the task cancel API. For example:

POST _tasks/r1A2WoRbTwKZ516z6NEs5A:36619/_cancel

The task ID can be found by using the get tasks API.

Cancellation should happen quickly but might take a few seconds. The get task status API will continue to list the delete by query task until this task checks that it has been cancelled and terminates itself.

Path parameters

  • index string | array[string] Required

    A comma-separated list of data streams, indices, and aliases to search. It supports wildcards (*). To search all data streams or indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • analyzer string

    Analyzer to use for the query string. This parameter can be used only when the q query string parameter is specified.

  • If true, wildcard and prefix queries are analyzed. This parameter can be used only when the q query string parameter is specified.

  • What to do if delete by query hits version conflicts: abort or proceed.

    Supported values include:

    • abort: Stop reindexing if there are conflicts.
    • proceed: Continue reindexing even if there are conflicts.

    Values are abort or proceed.

  • The default operator for query string query: AND or OR. This parameter can be used only when the q query string parameter is specified.

    Values are and, AND, or, or OR.

  • df string

    The field to use as default where no field prefix is given in the query string. This parameter can be used only when the q query string parameter is specified.

  • expand_wildcards string | array[string]

    The type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. It supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • from number

    Skips the specified number of documents.

  • If false, the request returns an error if it targets a missing or closed index.

  • lenient boolean

    If true, format-based query failures (such as providing text to a numeric field) in the query string will be ignored. This parameter can be used only when the q query string parameter is specified.

  • max_docs number

    The maximum number of documents to process. Defaults to all documents. When set to a value less then or equal to scroll_size, a scroll will not be used to retrieve the results for the operation.

  • The node or shard the operation should be performed on. It is random by default.

  • refresh boolean

    If true, Elasticsearch refreshes all shards involved in the delete by query after the request completes. This is different than the delete API's refresh parameter, which causes just the shard that received the delete request to be refreshed. Unlike the delete API, it does not support wait_for.

  • If true, the request cache is used for this request. Defaults to the index-level setting.

  • The throttle for this request in sub-requests per second.

  • routing string

    A custom value used to route operations to a specific shard.

  • q string

    A query in the Lucene query string syntax.

  • scroll string

    The period to retain the search context for scrolling.

  • The size of the scroll request that powers the operation.

  • The explicit timeout for each search request. It defaults to no timeout.

  • The type of the search operation. Available options include query_then_fetch and dfs_query_then_fetch.

    Supported values include:

    • query_then_fetch: Documents are scored using local term and document frequencies for the shard. This is usually faster but less accurate.
    • dfs_query_then_fetch: Documents are scored using global term and document frequencies across all shards. This is usually slower but more accurate.

    Values are query_then_fetch or dfs_query_then_fetch.

  • slices number | string

    The number of slices this task should be divided into.

  • sort array[string]

    A comma-separated list of <field>:<direction> pairs.

  • stats array[string]

    The specific tag of the request for logging and statistical purposes.

  • The maximum number of documents to collect for each shard. If a query reaches this limit, Elasticsearch terminates the query early. Elasticsearch collects documents before sorting.

    Use with caution. Elasticsearch applies this parameter to each shard handling the request. When possible, let Elasticsearch perform early termination automatically. Avoid specifying this parameter for requests that target data streams with backing indices across multiple data tiers.

  • timeout string

    The period each deletion request waits for active shards.

  • version boolean

    If true, returns the document version as part of a hit.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1). The timeout value controls how long each write request waits for unavailable shards to become available.

  • If true, the request blocks until the operation is complete. If false, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to cancel or get the status of the task. Elasticsearch creates a record of this task as a document at .tasks/task/${taskId}. When you are done with a task, you should delete the task document so Elasticsearch can reclaim the space.

application/json

Body Required

  • max_docs number

    The maximum number of documents to delete.

  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • slice object
    Hide slice attributes Show slice attributes object
    • field string

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • id string Required
    • max number Required

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • batches number

      The number of scroll responses pulled back by the delete by query.

    • deleted number

      The number of documents that were successfully deleted.

    • failures array[object]

      An array of failures if there were any unrecoverable errors during the process. If this array is not empty, the request ended abnormally because of those failures. Delete by query is implemented using batches and any failures cause the entire process to end but all failures in the current batch are collected into the array. You can use the conflicts option to prevent reindex from ending on version conflicts.

      Hide failures attributes Show failures attributes object
    • noops number

      This field is always equal to zero for delete by query. It exists only so that delete by query, update by query, and reindex APIs return responses with the same structure.

    • The number of requests per second effectively run during the delete by query.

    • retries object
      Hide retries attributes Show retries attributes object
      • bulk number Required

        The number of bulk actions retried.

    • slice_id number
    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Time unit for milliseconds

    • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • Time unit for milliseconds

    • timed_out boolean

      If true, some requests run during the delete by query operation timed out.

    • took number

      Time unit for milliseconds

    • total number

      The number of documents that were successfully processed.

    • The number of version conflicts that the delete by query hit.

POST /{index}/_delete_by_query
curl \
 --request POST 'http://api.example.com/{index}/_delete_by_query' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"query\": {\n    \"match_all\": {}\n  }\n}"'
Run `POST /my-index-000001,my-index-000002/_delete_by_query` to delete all documents from multiple data streams or indices.
{
  "query": {
    "match_all": {}
  }
}
Run `POST my-index-000001/_delete_by_query` to delete a document by using a unique attribute.
{
  "query": {
    "term": {
      "user.id": "kimchy"
    }
  },
  "max_docs": 1
}
Run `POST my-index-000001/_delete_by_query` to slice a delete by query manually. Provide a slice ID and total number of slices.
{
  "slice": {
    "id": 0,
    "max": 2
  },
  "query": {
    "range": {
      "http.response.bytes": {
        "lt": 2000000
      }
    }
  }
}
Run `POST my-index-000001/_delete_by_query?refresh&slices=5` to let delete by query automatically parallelize using sliced scroll to slice on `_id`. The `slices` query parameter value specifies the number of slices to use.
{
  "query": {
    "range": {
      "http.response.bytes": {
        "lt": 2000000
      }
    }
  }
}
Response examples (200)
A successful response from `POST /my-index-000001/_delete_by_query`.
{
  "took" : 147,
  "timed_out": false,
  "total": 119,
  "deleted": 119,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1.0,
  "throttled_until_millis": 0,
  "failures" : [ ]
}








Check for a document source Added in 5.4.0

HEAD /{index}/_source/{id}

Check whether a document source exists in an index. For example:

HEAD my-index-000001/_source/1

A document's source is not available if it is disabled in the mapping.

External documentation

Path parameters

  • index string Required

    A comma-separated list of data streams, indices, and aliases. It supports wildcards (*).

  • id string Required

    A unique identifier for the document.

Query parameters

  • The node or shard the operation should be performed on. By default, the operation is randomized between the shard replicas.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes the relevant shards before retrieving the document. Setting it to true should be done after careful thought and verification that this does not cause a heavy load on the system (and slow down indexing).

  • routing string

    A custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    Indicates whether to return the _source field (true or false) or lists the fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude in the response.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response.

  • version number

    The version number for concurrency control. It must match the current version of the document for the request to succeed.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

Responses

HEAD /{index}/_source/{id}
curl \
 --request HEAD 'http://api.example.com/{index}/_source/{id}' \
 --header "Authorization: $API_KEY"




Get multiple documents Added in 1.3.0

GET /_mget

Get multiple JSON documents by ID from one or more indices. If you specify an index in the request URI, you only need to specify the document IDs in the request body. To ensure fast responses, this multi get (mget) API responds with partial results if one or more shards fail.

Filter source fields

By default, the _source field is returned for every document (if stored). Use the _source and _source_include or source_exclude attributes to filter what fields are returned for a particular document. You can include the _source, _source_includes, and _source_excludes query parameters in the request URI to specify the defaults to use when there are no per-document instructions.

Get stored fields

Use the stored_fields attribute to specify the set of stored fields you want to retrieve. Any requested fields that are not stored are ignored. You can include the stored_fields query parameter in the request URI to specify the defaults to use when there are no per-document instructions.

Query parameters

  • Specifies the node or shard the operation should be performed on. Random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes relevant shards before retrieving documents.

  • routing string

    Custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    True or false to return the _source field or not, or a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • stored_fields string | array[string]

    If true, retrieves the document fields stored in the index rather than the document _source.

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • docs array[object] Required

      The response includes a docs array that contains the documents in the order specified in the request. The structure of the returned documents is similar to that returned by the get API. If there is a failure getting a particular document, the error is included in place of the document.

      One of:
      Hide attributes Show attributes
      • _index string Required
      • fields object

        If the stored_fields parameter is set to true and found is true, it contains the document fields stored in the index.

        Hide fields attribute Show fields attribute object
        • * object Additional properties
      • _ignored array[string]
      • found boolean Required

        Indicates whether the document exists.

      • _id string Required
      • The primary term assigned to the document for the indexing operation.

      • _routing string

        The explicit routing, if set.

      • _seq_no number
      • _source object

        If found is true, it contains the document data formatted in JSON. If the _source parameter is set to false or the stored_fields parameter is set to true, it is excluded.

      • _version number
GET /_mget
curl \
 --request GET 'http://api.example.com/_mget' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"docs\": [\n    {\n      \"_id\": \"1\"\n    },\n    {\n      \"_id\": \"2\"\n    }\n  ]\n}"'
Run `GET /my-index-000001/_mget`. When you specify an index in the request URI, only the document IDs are required in the request body.
{
  "docs": [
    {
      "_id": "1"
    },
    {
      "_id": "2"
    }
  ]
}
Run `GET /_mget`. This request sets `_source` to `false` for document 1 to exclude the source entirely. It retrieves `field3` and `field4` from document 2. It retrieves the `user` field from document 3 but filters out the `user.location` field.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "_source": false
    },
    {
      "_index": "test",
      "_id": "2",
      "_source": [ "field3", "field4" ]
    },
    {
      "_index": "test",
      "_id": "3",
      "_source": {
        "include": [ "user" ],
        "exclude": [ "user.location" ]
      }
    }
  ]
}
Run `GET /_mget`. This request retrieves `field1` and `field2` from document 1 and `field3` and `field4` from document 2.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "stored_fields": [ "field1", "field2" ]
    },
    {
      "_index": "test",
      "_id": "2",
      "stored_fields": [ "field3", "field4" ]
    }
  ]
}
Run `GET /_mget?routing=key1`. If routing is used during indexing, you need to specify the routing value to retrieve documents. This request fetches `test/_doc/2` from the shard corresponding to routing key `key1`. It fetches `test/_doc/1` from the shard corresponding to routing key `key2`.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "routing": "key2"
    },
    {
      "_index": "test",
      "_id": "2"
    }
  ]
}

Get multiple documents Added in 1.3.0

POST /_mget

Get multiple JSON documents by ID from one or more indices. If you specify an index in the request URI, you only need to specify the document IDs in the request body. To ensure fast responses, this multi get (mget) API responds with partial results if one or more shards fail.

Filter source fields

By default, the _source field is returned for every document (if stored). Use the _source and _source_include or source_exclude attributes to filter what fields are returned for a particular document. You can include the _source, _source_includes, and _source_excludes query parameters in the request URI to specify the defaults to use when there are no per-document instructions.

Get stored fields

Use the stored_fields attribute to specify the set of stored fields you want to retrieve. Any requested fields that are not stored are ignored. You can include the stored_fields query parameter in the request URI to specify the defaults to use when there are no per-document instructions.

Query parameters

  • Specifies the node or shard the operation should be performed on. Random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • refresh boolean

    If true, the request refreshes relevant shards before retrieving documents.

  • routing string

    Custom value used to route operations to a specific shard.

  • _source boolean | string | array[string]

    True or false to return the _source field or not, or a list of fields to return.

  • _source_excludes string | array[string]

    A comma-separated list of source fields to exclude from the response. You can also use this parameter to exclude fields from the subset specified in _source_includes query parameter.

  • _source_includes string | array[string]

    A comma-separated list of source fields to include in the response. If this parameter is specified, only these source fields are returned. You can exclude fields from this subset using the _source_excludes query parameter. If the _source parameter is false, this parameter is ignored.

  • stored_fields string | array[string]

    If true, retrieves the document fields stored in the index rather than the document _source.

application/json

Body Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • docs array[object] Required

      The response includes a docs array that contains the documents in the order specified in the request. The structure of the returned documents is similar to that returned by the get API. If there is a failure getting a particular document, the error is included in place of the document.

      One of:
      Hide attributes Show attributes
      • _index string Required
      • fields object

        If the stored_fields parameter is set to true and found is true, it contains the document fields stored in the index.

        Hide fields attribute Show fields attribute object
        • * object Additional properties
      • _ignored array[string]
      • found boolean Required

        Indicates whether the document exists.

      • _id string Required
      • The primary term assigned to the document for the indexing operation.

      • _routing string

        The explicit routing, if set.

      • _seq_no number
      • _source object

        If found is true, it contains the document data formatted in JSON. If the _source parameter is set to false or the stored_fields parameter is set to true, it is excluded.

      • _version number
POST /_mget
curl \
 --request POST 'http://api.example.com/_mget' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"docs\": [\n    {\n      \"_id\": \"1\"\n    },\n    {\n      \"_id\": \"2\"\n    }\n  ]\n}"'
Run `GET /my-index-000001/_mget`. When you specify an index in the request URI, only the document IDs are required in the request body.
{
  "docs": [
    {
      "_id": "1"
    },
    {
      "_id": "2"
    }
  ]
}
Run `GET /_mget`. This request sets `_source` to `false` for document 1 to exclude the source entirely. It retrieves `field3` and `field4` from document 2. It retrieves the `user` field from document 3 but filters out the `user.location` field.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "_source": false
    },
    {
      "_index": "test",
      "_id": "2",
      "_source": [ "field3", "field4" ]
    },
    {
      "_index": "test",
      "_id": "3",
      "_source": {
        "include": [ "user" ],
        "exclude": [ "user.location" ]
      }
    }
  ]
}
Run `GET /_mget`. This request retrieves `field1` and `field2` from document 1 and `field3` and `field4` from document 2.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "stored_fields": [ "field1", "field2" ]
    },
    {
      "_index": "test",
      "_id": "2",
      "stored_fields": [ "field3", "field4" ]
    }
  ]
}
Run `GET /_mget?routing=key1`. If routing is used during indexing, you need to specify the routing value to retrieve documents. This request fetches `test/_doc/2` from the shard corresponding to routing key `key1`. It fetches `test/_doc/1` from the shard corresponding to routing key `key2`.
{
  "docs": [
    {
      "_index": "test",
      "_id": "1",
      "routing": "key2"
    },
    {
      "_index": "test",
      "_id": "2"
    }
  ]
}




















Get multiple term vectors

POST /{index}/_mtermvectors

Get multiple term vectors with a single request. You can specify existing documents by index and ID or provide artificial documents in the body of the request. You can specify the index in the request body or request URI. The response contains a docs array with all the fetched termvectors. Each element has the structure provided by the termvectors API.

Artificial documents

You can also use mtermvectors to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified _index.

Path parameters

  • index string Required

    The name of the index that contains the documents.

Query parameters

  • ids array[string]

    A comma-separated list of documents ids. You must define ids as parameter or set "ids" or "docs" in the request body

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.

  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value used to route operations to a specific shard.

  • If true, the response includes term frequency and document frequency.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • docs array[object]

    An array of existing or artificial documents.

    Hide docs attributes Show docs attributes object
    • _id string
    • _index string
    • doc object

      An artificial document (a document not present in the index) for which you want to retrieve term vectors.

    • fields string | array[string]
    • If true, the response includes the document count, sum of document frequencies, and sum of total term frequencies.

    • filter object
      Hide filter attributes Show filter attributes object
      • Ignore words which occur in more than this many docs. Defaults to unbounded.

      • The maximum number of terms that must be returned per field.

      • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

      • The maximum word length above which words will be ignored. Defaults to unbounded.

      • Ignore terms which do not occur in at least this many docs.

      • Ignore words with less than this frequency in the source doc.

      • The minimum word length below which words will be ignored.

    • offsets boolean

      If true, the response includes term offsets.

    • payloads boolean

      If true, the response includes term payloads.

    • positions boolean

      If true, the response includes term positions.

    • routing string
    • If true, the response includes term frequency and document frequency.

    • version number
    • Values are internal, external, external_gte, or force.

  • ids array[string]

    A simplified syntax to specify documents by their ID if they're in the same index.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
POST /{index}/_mtermvectors
curl \
 --request POST 'http://api.example.com/{index}/_mtermvectors' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"docs\": [\n      {\n        \"_id\": \"2\",\n        \"fields\": [\n            \"message\"\n        ],\n        \"term_statistics\": true\n      },\n      {\n        \"_id\": \"1\"\n      }\n  ]\n}"'
Run `POST /my-index-000001/_mtermvectors`. When you specify an index in the request URI, the index does not need to be specified for each documents in the request body.
{
  "docs": [
      {
        "_id": "2",
        "fields": [
            "message"
        ],
        "term_statistics": true
      },
      {
        "_id": "1"
      }
  ]
}
Run `POST /my-index-000001/_mtermvectors`. If all requested documents are in same index and the parameters are the same, you can use a simplified syntax.
{
  "ids": [ "1", "2" ],
  "fields": [
    "message"
  ],
  "term_statistics": true
}
Run `POST /_mtermvectors` to generate term vectors for artificial documents provided in the body of the request. The mapping used is determined by the specified `_index`.
{
  "docs": [
      {
        "_index": "my-index-000001",
        "doc" : {
            "message" : "test test test"
        }
      },
      {
        "_index": "my-index-000001",
        "doc" : {
          "message" : "Another test ..."
        }
      }
  ]
}








Get term vector information

GET /{index}/_termvectors/{id}

Get information and statistics about terms in the fields of a particular document.

You can retrieve term vectors for documents stored in the index or for artificial documents passed in the body of the request. You can specify the fields you are interested in through the fields parameter or by adding the fields to the request body. For example:

GET /my-index-000001/_termvectors/1?fields=message

Fields can be specified using wildcards, similar to the multi match query.

Term vectors are real-time by default, not near real-time. This can be changed by setting realtime parameter to false.

You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields but term statistics are excluded.

Term information

  • term frequency in the field (always returned)
  • term positions (positions: true)
  • start and end offsets (offsets: true)
  • term payloads (payloads: true), as base64 encoded bytes

If the requested information wasn't stored in the index, it will be computed on the fly if possible. Additionally, term vectors could be computed for documents not even existing in the index, but instead provided by the user.


Start and end offsets assume UTF-16 encoding is being used. If you want to use these offsets in order to get the original text that produced this token, you should make sure that the string you are taking a sub-string of is also encoded using UTF-16.

Behaviour

The term and field statistics are not accurate. Deleted documents are not taken into account. The information is only retrieved for the shard the requested document resides in. The term and field statistics are therefore only useful as relative measures whereas the absolute numbers have no meaning in this context. By default, when requesting term vectors of artificial documents, a shard to get the statistics from is randomly selected. Use routing only to hit a particular shard.

Path parameters

  • index string Required

    The name of the index that contains the document.

  • id string Required

    A unique identifier for the document.

Query parameters

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • doc object

    An artificial document (a document not present in the index) for which you want to retrieve term vectors.

  • filter object
    Hide filter attributes Show filter attributes object
    • Ignore words which occur in more than this many docs. Defaults to unbounded.

    • The maximum number of terms that must be returned per field.

    • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

    • The maximum word length above which words will be ignored. Defaults to unbounded.

    • Ignore terms which do not occur in at least this many docs.

    • Ignore words with less than this frequency in the source doc.

    • The minimum word length below which words will be ignored.

  • Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.

    Hide per_field_analyzer attribute Show per_field_analyzer attribute object
    • * string Additional properties
  • fields string | array[string]
  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • routing string
  • version number
  • Values are internal, external, external_gte, or force.

Responses

GET /{index}/_termvectors/{id}
curl \
 --request GET 'http://api.example.com/{index}/_termvectors/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"fields\" : [\"text\"],\n  \"offsets\" : true,\n  \"payloads\" : true,\n  \"positions\" : true,\n  \"term_statistics\" : true,\n  \"field_statistics\" : true\n}"'
Run `GET /my-index-000001/_termvectors/1` to return all information and statistics for field `text` in document 1.
{
  "fields" : ["text"],
  "offsets" : true,
  "payloads" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors/1` to set per-field analyzers. A different analyzer than the one at the field may be provided by using the `per_field_analyzer` parameter.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  },
  "fields": ["fullname"],
  "per_field_analyzer" : {
    "fullname": "keyword"
  }
}
Run `GET /imdb/_termvectors` to filter the terms returned based on their tf-idf scores. It returns the three most "interesting" keywords from the artificial document having the given "plot" field value. Notice that the keyword "Tony" or any stop words are not part of the response, as their tf-idf must be too low.
{
  "doc": {
    "plot": "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil."
  },
  "term_statistics": true,
  "field_statistics": true,
  "positions": false,
  "offsets": false,
  "filter": {
    "max_num_terms": 3,
    "min_term_freq": 1,
    "min_doc_freq": 1
  }
}
Run `GET /my-index-000001/_termvectors/1`. Term vectors which are not explicitly stored in the index are automatically computed on the fly. This request returns all information and statistics for the fields in document 1, even though the terms haven't been explicitly stored in the index. Note that for the field text, the terms are not regenerated.
{
  "fields" : ["text", "some_field_without_term_vectors"],
  "offsets" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors`. Term vectors can be generated for artificial documents, that is for documents not present in the index. If dynamic mapping is turned on (default), the document fields not in the original mapping will be dynamically created.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  }
}
Response examples (200)
A successful response from `GET /my-index-000001/_termvectors/1`.
{
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "found": true,
  "took": 6,
  "term_vectors": {
    "text": {
      "field_statistics": {
        "sum_doc_freq": 4,
        "doc_count": 2,
        "sum_ttf": 6
      },
      "terms": {
        "test": {
          "doc_freq": 2,
          "ttf": 4,
          "term_freq": 3,
          "tokens": [
            {
              "position": 0,
              "start_offset": 0,
              "end_offset": 4,
              "payload": "d29yZA=="
            },
            {
              "position": 1,
              "start_offset": 5,
              "end_offset": 9,
              "payload": "d29yZA=="
            },
            {
              "position": 2,
              "start_offset": 10,
              "end_offset": 14,
              "payload": "d29yZA=="
            }
          ]
        }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with `per_field_analyzer` in the request body.
{
  "_index": "my-index-000001",
  "_version": 0,
  "found": true,
  "took": 6,
  "term_vectors": {
    "fullname": {
      "field_statistics": {
          "sum_doc_freq": 2,
          "doc_count": 4,
          "sum_ttf": 4
      },
      "terms": {
          "John Doe": {
            "term_freq": 1,
            "tokens": [
                {
                  "position": 0,
                  "start_offset": 0,
                  "end_offset": 8
                }
            ]
          }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with a `filter` in the request body.
{
  "_index": "imdb",
  "_version": 0,
  "found": true,
  "term_vectors": {
      "plot": {
        "field_statistics": {
            "sum_doc_freq": 3384269,
            "doc_count": 176214,
            "sum_ttf": 3753460
        },
        "terms": {
            "armored": {
              "doc_freq": 27,
              "ttf": 27,
              "term_freq": 1,
              "score": 9.74725
            },
            "industrialist": {
              "doc_freq": 88,
              "ttf": 88,
              "term_freq": 1,
              "score": 8.590818
            },
            "stark": {
              "doc_freq": 44,
              "ttf": 47,
              "term_freq": 1,
              "score": 9.272792
            }
        }
      }
  }
}

Get term vector information

POST /{index}/_termvectors/{id}

Get information and statistics about terms in the fields of a particular document.

You can retrieve term vectors for documents stored in the index or for artificial documents passed in the body of the request. You can specify the fields you are interested in through the fields parameter or by adding the fields to the request body. For example:

GET /my-index-000001/_termvectors/1?fields=message

Fields can be specified using wildcards, similar to the multi match query.

Term vectors are real-time by default, not near real-time. This can be changed by setting realtime parameter to false.

You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields but term statistics are excluded.

Term information

  • term frequency in the field (always returned)
  • term positions (positions: true)
  • start and end offsets (offsets: true)
  • term payloads (payloads: true), as base64 encoded bytes

If the requested information wasn't stored in the index, it will be computed on the fly if possible. Additionally, term vectors could be computed for documents not even existing in the index, but instead provided by the user.


Start and end offsets assume UTF-16 encoding is being used. If you want to use these offsets in order to get the original text that produced this token, you should make sure that the string you are taking a sub-string of is also encoded using UTF-16.

Behaviour

The term and field statistics are not accurate. Deleted documents are not taken into account. The information is only retrieved for the shard the requested document resides in. The term and field statistics are therefore only useful as relative measures whereas the absolute numbers have no meaning in this context. By default, when requesting term vectors of artificial documents, a shard to get the statistics from is randomly selected. Use routing only to hit a particular shard.

Path parameters

  • index string Required

    The name of the index that contains the document.

  • id string Required

    A unique identifier for the document.

Query parameters

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • doc object

    An artificial document (a document not present in the index) for which you want to retrieve term vectors.

  • filter object
    Hide filter attributes Show filter attributes object
    • Ignore words which occur in more than this many docs. Defaults to unbounded.

    • The maximum number of terms that must be returned per field.

    • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

    • The maximum word length above which words will be ignored. Defaults to unbounded.

    • Ignore terms which do not occur in at least this many docs.

    • Ignore words with less than this frequency in the source doc.

    • The minimum word length below which words will be ignored.

  • Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.

    Hide per_field_analyzer attribute Show per_field_analyzer attribute object
    • * string Additional properties
  • fields string | array[string]
  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • routing string
  • version number
  • Values are internal, external, external_gte, or force.

Responses

POST /{index}/_termvectors/{id}
curl \
 --request POST 'http://api.example.com/{index}/_termvectors/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"fields\" : [\"text\"],\n  \"offsets\" : true,\n  \"payloads\" : true,\n  \"positions\" : true,\n  \"term_statistics\" : true,\n  \"field_statistics\" : true\n}"'
Run `GET /my-index-000001/_termvectors/1` to return all information and statistics for field `text` in document 1.
{
  "fields" : ["text"],
  "offsets" : true,
  "payloads" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors/1` to set per-field analyzers. A different analyzer than the one at the field may be provided by using the `per_field_analyzer` parameter.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  },
  "fields": ["fullname"],
  "per_field_analyzer" : {
    "fullname": "keyword"
  }
}
Run `GET /imdb/_termvectors` to filter the terms returned based on their tf-idf scores. It returns the three most "interesting" keywords from the artificial document having the given "plot" field value. Notice that the keyword "Tony" or any stop words are not part of the response, as their tf-idf must be too low.
{
  "doc": {
    "plot": "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil."
  },
  "term_statistics": true,
  "field_statistics": true,
  "positions": false,
  "offsets": false,
  "filter": {
    "max_num_terms": 3,
    "min_term_freq": 1,
    "min_doc_freq": 1
  }
}
Run `GET /my-index-000001/_termvectors/1`. Term vectors which are not explicitly stored in the index are automatically computed on the fly. This request returns all information and statistics for the fields in document 1, even though the terms haven't been explicitly stored in the index. Note that for the field text, the terms are not regenerated.
{
  "fields" : ["text", "some_field_without_term_vectors"],
  "offsets" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors`. Term vectors can be generated for artificial documents, that is for documents not present in the index. If dynamic mapping is turned on (default), the document fields not in the original mapping will be dynamically created.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  }
}
Response examples (200)
A successful response from `GET /my-index-000001/_termvectors/1`.
{
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "found": true,
  "took": 6,
  "term_vectors": {
    "text": {
      "field_statistics": {
        "sum_doc_freq": 4,
        "doc_count": 2,
        "sum_ttf": 6
      },
      "terms": {
        "test": {
          "doc_freq": 2,
          "ttf": 4,
          "term_freq": 3,
          "tokens": [
            {
              "position": 0,
              "start_offset": 0,
              "end_offset": 4,
              "payload": "d29yZA=="
            },
            {
              "position": 1,
              "start_offset": 5,
              "end_offset": 9,
              "payload": "d29yZA=="
            },
            {
              "position": 2,
              "start_offset": 10,
              "end_offset": 14,
              "payload": "d29yZA=="
            }
          ]
        }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with `per_field_analyzer` in the request body.
{
  "_index": "my-index-000001",
  "_version": 0,
  "found": true,
  "took": 6,
  "term_vectors": {
    "fullname": {
      "field_statistics": {
          "sum_doc_freq": 2,
          "doc_count": 4,
          "sum_ttf": 4
      },
      "terms": {
          "John Doe": {
            "term_freq": 1,
            "tokens": [
                {
                  "position": 0,
                  "start_offset": 0,
                  "end_offset": 8
                }
            ]
          }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with a `filter` in the request body.
{
  "_index": "imdb",
  "_version": 0,
  "found": true,
  "term_vectors": {
      "plot": {
        "field_statistics": {
            "sum_doc_freq": 3384269,
            "doc_count": 176214,
            "sum_ttf": 3753460
        },
        "terms": {
            "armored": {
              "doc_freq": 27,
              "ttf": 27,
              "term_freq": 1,
              "score": 9.74725
            },
            "industrialist": {
              "doc_freq": 88,
              "ttf": 88,
              "term_freq": 1,
              "score": 8.590818
            },
            "stark": {
              "doc_freq": 44,
              "ttf": 47,
              "term_freq": 1,
              "score": 9.272792
            }
        }
      }
  }
}

Get term vector information

GET /{index}/_termvectors

Get information and statistics about terms in the fields of a particular document.

You can retrieve term vectors for documents stored in the index or for artificial documents passed in the body of the request. You can specify the fields you are interested in through the fields parameter or by adding the fields to the request body. For example:

GET /my-index-000001/_termvectors/1?fields=message

Fields can be specified using wildcards, similar to the multi match query.

Term vectors are real-time by default, not near real-time. This can be changed by setting realtime parameter to false.

You can request three types of values: term information, term statistics, and field statistics. By default, all term information and field statistics are returned for all fields but term statistics are excluded.

Term information

  • term frequency in the field (always returned)
  • term positions (positions: true)
  • start and end offsets (offsets: true)
  • term payloads (payloads: true), as base64 encoded bytes

If the requested information wasn't stored in the index, it will be computed on the fly if possible. Additionally, term vectors could be computed for documents not even existing in the index, but instead provided by the user.


Start and end offsets assume UTF-16 encoding is being used. If you want to use these offsets in order to get the original text that produced this token, you should make sure that the string you are taking a sub-string of is also encoded using UTF-16.

Behaviour

The term and field statistics are not accurate. Deleted documents are not taken into account. The information is only retrieved for the shard the requested document resides in. The term and field statistics are therefore only useful as relative measures whereas the absolute numbers have no meaning in this context. By default, when requesting term vectors of artificial documents, a shard to get the statistics from is randomly selected. Use routing only to hit a particular shard.

Path parameters

  • index string Required

    The name of the index that contains the document.

Query parameters

  • fields string | array[string]

    A comma-separated list or wildcard expressions of fields to include in the statistics. It is used as the default list unless a specific field list is provided in the completion_fields or fielddata_fields parameters.

  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • The node or shard the operation should be performed on. It is random by default.

  • realtime boolean

    If true, the request is real-time as opposed to near-real-time.

  • routing string

    A custom value that is used to route operations to a specific shard.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • version number

    If true, returns the document version as part of a hit.

  • The version type.

    Supported values include:

    • internal: Use internal versioning that starts at 1 and increments with each update or delete.
    • external: Only index the document if the specified version is strictly higher than the version of the stored document or if there is no existing document.
    • external_gte: Only index the document if the specified version is equal or higher than the version of the stored document or if there is no existing document. NOTE: The external_gte version type is meant for special use cases and should be used with care. If used incorrectly, it can result in loss of data.
    • force: This option is deprecated because it can cause primary and replica shards to diverge.

    Values are internal, external, external_gte, or force.

application/json

Body

  • doc object

    An artificial document (a document not present in the index) for which you want to retrieve term vectors.

  • filter object
    Hide filter attributes Show filter attributes object
    • Ignore words which occur in more than this many docs. Defaults to unbounded.

    • The maximum number of terms that must be returned per field.

    • Ignore words with more than this frequency in the source doc. It defaults to unbounded.

    • The maximum word length above which words will be ignored. Defaults to unbounded.

    • Ignore terms which do not occur in at least this many docs.

    • Ignore words with less than this frequency in the source doc.

    • The minimum word length below which words will be ignored.

  • Override the default per-field analyzer. This is useful in order to generate term vectors in any fashion, especially when using artificial documents. When providing an analyzer for a field that already stores term vectors, the term vectors will be regenerated.

    Hide per_field_analyzer attribute Show per_field_analyzer attribute object
    • * string Additional properties
  • fields string | array[string]
  • If true, the response includes:

    • The document count (how many documents contain this field).
    • The sum of document frequencies (the sum of document frequencies for all terms in this field).
    • The sum of total term frequencies (the sum of total term frequencies of each term in this field).
  • offsets boolean

    If true, the response includes term offsets.

  • payloads boolean

    If true, the response includes term payloads.

  • positions boolean

    If true, the response includes term positions.

  • If true, the response includes:

    • The total term frequency (how often a term occurs in all documents).
    • The document frequency (the number of documents containing the current term).

    By default these values are not returned since term statistics can have a serious performance impact.

  • routing string
  • version number
  • Values are internal, external, external_gte, or force.

Responses

GET /{index}/_termvectors
curl \
 --request GET 'http://api.example.com/{index}/_termvectors' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"fields\" : [\"text\"],\n  \"offsets\" : true,\n  \"payloads\" : true,\n  \"positions\" : true,\n  \"term_statistics\" : true,\n  \"field_statistics\" : true\n}"'
Run `GET /my-index-000001/_termvectors/1` to return all information and statistics for field `text` in document 1.
{
  "fields" : ["text"],
  "offsets" : true,
  "payloads" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors/1` to set per-field analyzers. A different analyzer than the one at the field may be provided by using the `per_field_analyzer` parameter.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  },
  "fields": ["fullname"],
  "per_field_analyzer" : {
    "fullname": "keyword"
  }
}
Run `GET /imdb/_termvectors` to filter the terms returned based on their tf-idf scores. It returns the three most "interesting" keywords from the artificial document having the given "plot" field value. Notice that the keyword "Tony" or any stop words are not part of the response, as their tf-idf must be too low.
{
  "doc": {
    "plot": "When wealthy industrialist Tony Stark is forced to build an armored suit after a life-threatening incident, he ultimately decides to use its technology to fight against evil."
  },
  "term_statistics": true,
  "field_statistics": true,
  "positions": false,
  "offsets": false,
  "filter": {
    "max_num_terms": 3,
    "min_term_freq": 1,
    "min_doc_freq": 1
  }
}
Run `GET /my-index-000001/_termvectors/1`. Term vectors which are not explicitly stored in the index are automatically computed on the fly. This request returns all information and statistics for the fields in document 1, even though the terms haven't been explicitly stored in the index. Note that for the field text, the terms are not regenerated.
{
  "fields" : ["text", "some_field_without_term_vectors"],
  "offsets" : true,
  "positions" : true,
  "term_statistics" : true,
  "field_statistics" : true
}
Run `GET /my-index-000001/_termvectors`. Term vectors can be generated for artificial documents, that is for documents not present in the index. If dynamic mapping is turned on (default), the document fields not in the original mapping will be dynamically created.
{
  "doc" : {
    "fullname" : "John Doe",
    "text" : "test test test"
  }
}
Response examples (200)
A successful response from `GET /my-index-000001/_termvectors/1`.
{
  "_index": "my-index-000001",
  "_id": "1",
  "_version": 1,
  "found": true,
  "took": 6,
  "term_vectors": {
    "text": {
      "field_statistics": {
        "sum_doc_freq": 4,
        "doc_count": 2,
        "sum_ttf": 6
      },
      "terms": {
        "test": {
          "doc_freq": 2,
          "ttf": 4,
          "term_freq": 3,
          "tokens": [
            {
              "position": 0,
              "start_offset": 0,
              "end_offset": 4,
              "payload": "d29yZA=="
            },
            {
              "position": 1,
              "start_offset": 5,
              "end_offset": 9,
              "payload": "d29yZA=="
            },
            {
              "position": 2,
              "start_offset": 10,
              "end_offset": 14,
              "payload": "d29yZA=="
            }
          ]
        }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with `per_field_analyzer` in the request body.
{
  "_index": "my-index-000001",
  "_version": 0,
  "found": true,
  "took": 6,
  "term_vectors": {
    "fullname": {
      "field_statistics": {
          "sum_doc_freq": 2,
          "doc_count": 4,
          "sum_ttf": 4
      },
      "terms": {
          "John Doe": {
            "term_freq": 1,
            "tokens": [
                {
                  "position": 0,
                  "start_offset": 0,
                  "end_offset": 8
                }
            ]
          }
      }
    }
  }
}
A successful response from `GET /my-index-000001/_termvectors` with a `filter` in the request body.
{
  "_index": "imdb",
  "_version": 0,
  "found": true,
  "term_vectors": {
      "plot": {
        "field_statistics": {
            "sum_doc_freq": 3384269,
            "doc_count": 176214,
            "sum_ttf": 3753460
        },
        "terms": {
            "armored": {
              "doc_freq": 27,
              "ttf": 27,
              "term_freq": 1,
              "score": 9.74725
            },
            "industrialist": {
              "doc_freq": 88,
              "ttf": 88,
              "term_freq": 1,
              "score": 8.590818
            },
            "stark": {
              "doc_freq": 44,
              "ttf": 47,
              "term_freq": 1,
              "score": 9.272792
            }
        }
      }
  }
}

























Delete an enrich policy Added in 7.5.0

DELETE /_enrich/policy/{name}

Deletes an existing enrich policy and its enrich index.

Path parameters

  • name string Required

    Enrich policy to delete.

Query parameters

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_enrich/policy/{name}
curl \
 --request DELETE 'http://api.example.com/_enrich/policy/{name}' \
 --header "Authorization: $API_KEY"




Get an enrich policy Added in 7.5.0

GET /_enrich/policy

Returns information about an enrich policy.

Query parameters

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • policies array[object] Required
      Hide policies attribute Show policies attribute object
GET /_enrich/policy
curl \
 --request GET 'http://api.example.com/_enrich/policy' \
 --header "Authorization: $API_KEY"









Delete an async EQL search Added in 7.9.0

DELETE /_eql/search/{id}

Delete an async EQL search or a stored synchronous EQL search. The API also deletes results for the search.

Path parameters

  • id string Required

    Identifier for the search to delete. A search ID is provided in the EQL search API's response for an async search. A search ID is also provided if the request’s keep_on_completion parameter is true.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_eql/search/{id}
curl \
 --request DELETE 'http://api.example.com/_eql/search/{id}' \
 --header "Authorization: $API_KEY"












ES|QL

The Elasticsearch Query Language (ES|QL) provides a powerful way to filter, transform, and analyze data stored in Elasticsearch, and in the future in other runtimes.

Learn more about ES|QL




Get async ES|QL query results Added in 8.13.0

GET /_query/async/{id}

Get the current status and available results or stored results for an ES|QL asynchronous query. If the Elasticsearch security features are enabled, only the user who first submitted the ES|QL query can retrieve the results using this API.

External documentation

Path parameters

  • id string Required

    The unique identifier of the query. A query ID is provided in the ES|QL async query API response for a query that does not complete in the designated time. A query ID is also provided when the request was submitted with the keep_on_completion parameter set to true.

Query parameters

  • Indicates whether columns that are entirely null will be removed from the columns and values portion of the results. If true, the response will include an extra section under the name all_columns which has the name of all the columns.

  • The period for which the query and its results are stored in the cluster. When this period expires, the query and its results are deleted, even if the query is still ongoing.

  • The period to wait for the request to finish. By default, the request waits for complete query results. If the request completes during the period specified in this parameter, complete query results are returned. Otherwise, the response returns an is_running value of true and no results.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Time unit for milliseconds

    • is_partial boolean
    • all_columns array[object]
      Hide all_columns attributes Show all_columns attributes object
    • columns array[object] Required
      Hide columns attributes Show columns attributes object
    • values array[array] Required

      A field value.

      A field value.

    • Hide _clusters attributes Show _clusters attributes object
    • profile object

      Profiling information. Present if profile was true in the request. The contents of this field are currently unstable.

    • id string
    • is_running boolean Required
GET /_query/async/{id}
curl \
 --request GET 'http://api.example.com/_query/async/{id}' \
 --header "Authorization: $API_KEY"

























































Graph explore

The graph explore API enables you to extract and summarize information about the documents and terms in an Elasticsearch data stream or index.

Get started with Graph

Explore graph analytics

GET /{index}/_graph/explore

Extract and summarize information about the documents and terms in an Elasticsearch data stream or index. The easiest way to understand the behavior of this API is to use the Graph UI to explore connections. An initial request to the _explore API contains a seed query that identifies the documents of interest and specifies the fields that define the vertices and connections you want to include in the graph. Subsequent requests enable you to spider out from one more vertices of interest. You can exclude vertices that have already been returned.

External documentation

Path parameters

  • index string | array[string] Required

    Name of the index.

Query parameters

  • routing string

    Custom value used to route operations to a specific shard.

  • timeout string

    Specifies the period of time to wait for a response from each shard. If no response is received before the timeout expires, the request fails and returns an error. Defaults to no timeout.

application/json

Body

  • Hide connections attributes Show connections attributes object
    • query object

      An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

      External documentation
    • vertices array[object] Required

      Contains the fields you are interested in.

      Hide vertices attributes Show vertices attributes object
      • exclude array[string]

        Prevents the specified terms from being included in the results.

      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • include array[object]

        Identifies the terms of interest that form the starting points from which you want to spider out.

        Hide include attributes Show include attributes object
      • Specifies how many documents must contain a pair of terms before it is considered to be a useful connection. This setting acts as a certainty threshold.

      • Controls how many documents on a particular shard have to contain a pair of terms before the connection is returned for global consideration.

      • size number

        Specifies the maximum number of vertex terms returned for each field.

  • controls object
    Hide controls attributes Show controls attributes object
    • Hide sample_diversity attributes Show sample_diversity attributes object
      • field string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • max_docs_per_value number Required
    • Each hop considers a sample of the best-matching documents on each shard. Using samples improves the speed of execution and keeps exploration focused on meaningfully-connected terms. Very small values (less than 50) might not provide sufficient weight-of-evidence to identify significant connections between terms. Very large sample sizes can dilute the quality of the results and increase execution times.

    • timeout string

      A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

    • use_significance boolean Required

      Filters associated terms so only those that are significantly associated with your query are included.

  • query object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • vertices array[object]

    Specifies one or more fields that contain the terms you want to include in the graph as vertices.

    Hide vertices attributes Show vertices attributes object
    • exclude array[string]

      Prevents the specified terms from being included in the results.

    • field string Required

      Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

    • include array[object]

      Identifies the terms of interest that form the starting points from which you want to spider out.

      Hide include attributes Show include attributes object
    • Specifies how many documents must contain a pair of terms before it is considered to be a useful connection. This setting acts as a certainty threshold.

    • Controls how many documents on a particular shard have to contain a pair of terms before the connection is returned for global consideration.

    • size number

      Specifies the maximum number of vertex terms returned for each field.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
GET /{index}/_graph/explore
curl \
 --request GET 'http://api.example.com/{index}/_graph/explore' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"query\": {\n    \"match\": {\n      \"query.raw\": \"midi\"\n    }\n  },\n  \"vertices\": [\n    {\n      \"field\": \"product\"\n    }\n  ],\n  \"connections\": {\n    \"vertices\": [\n      {\n        \"field\": \"query.raw\"\n      }\n    ]\n  }\n}"'
Request example
Run `POST clicklogs/_graph/explore` for a basic exploration An initial graph explore query typically begins with a query to identify strongly related terms. Seed the exploration with a query. This example is searching `clicklogs` for people who searched for the term `midi`.Identify the vertices to include in the graph. This example is looking for product codes that are significantly associated with searches for `midi`. Find the connections. This example is looking for other search terms that led people to click on the products that are associated with searches for `midi`.
{
  "query": {
    "match": {
      "query.raw": "midi"
    }
  },
  "vertices": [
    {
      "field": "product"
    }
  ],
  "connections": {
    "vertices": [
      {
        "field": "query.raw"
      }
    ]
  }
}





















Check component templates Added in 7.8.0

HEAD /_component_template/{name}

Returns information about whether a particular component template exists.

Path parameters

  • name string | array[string] Required

    Comma-separated list of component template names used to limit the request. Wildcard (*) expressions are supported.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • local boolean

    If true, the request retrieves information from the local node only. Defaults to false, which means information is retrieved from the master node.

Responses

HEAD /_component_template/{name}
curl \
 --request HEAD 'http://api.example.com/_component_template/{name}' \
 --header "Authorization: $API_KEY"

Get component templates Added in 7.8.0

GET /_component_template

Get information about component templates.

Query parameters

  • If true, returns settings in flat format.

  • Return all default configurations for the component template (default: false)

  • local boolean

    If true, the request retrieves information from the local node only. If false, information is retrieved from the master node.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

GET /_component_template
curl \
 --request GET 'http://api.example.com/_component_template' \
 --header "Authorization: $API_KEY"




















Get tokens from text analysis

POST /_analyze

The analyze API performs analysis on a text string and returns the resulting tokens.

Generating excessive amount of tokens may cause a node to run out of memory. The index.analyze.max_token_count setting enables you to limit the number of tokens that can be produced. If more than this limit of tokens gets generated, an error occurs. The _analyze endpoint without a specified index will always use 10000 as its limit.

External documentation

Query parameters

  • index string

    Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer.

application/json

Body

Responses

POST /_analyze
curl \
 --request POST 'http://api.example.com/_analyze' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"analyzer\": \"standard\",\n  \"text\": \"this is a test\"\n}"'
You can apply any of the built-in analyzers to the text string without specifying an index.
{
  "analyzer": "standard",
  "text": "this is a test"
}
If the text parameter is provided as array of strings, it is analyzed as a multi-value field.
{
  "analyzer": "standard",
  "text": [
    "this is a test",
    "the second text"
  ]
}
You can test a custom transient analyzer built from tokenizers, token filters, and char filters. Token filters use the filter parameter.
{
  "tokenizer": "keyword",
  "filter": [
    "lowercase"
  ],
  "char_filter": [
    "html_strip"
  ],
  "text": "this is a <b>test</b>"
}
Custom tokenizers, token filters, and character filters can be specified in the request body.
{
  "tokenizer": "whitespace",
  "filter": [
    "lowercase",
    {
      "type": "stop",
      "stopwords": [
        "a",
        "is",
        "this"
      ]
    }
  ],
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` to run an analysis on the text using the default index analyzer associated with the `analyze_sample` index. Alternatively, the analyzer can be derived based on a field mapping.
{
  "field": "obj1.field1",
  "text": "this is a test"
}
Run `GET /analyze_sample/_analyze` and supply a normalizer for a keyword field if there is a normalizer associated with the specified index.
{
  "normalizer": "my_normalizer",
  "text": "BaR"
}
If you want to get more advanced details, set `explain` to `true`. It will output all token attributes for each token. You can filter token attributes you want to output by setting the `attributes` option. NOTE: The format of the additional detail information is labelled as experimental in Lucene and it may change in the future.
{
  "tokenizer": "standard",
  "filter": [
    "snowball"
  ],
  "text": "detailed output",
  "explain": true,
  "attributes": [
    "keyword"
  ]
}
Response examples (200)
A successful response for an analysis with `explain` set to `true`.
{
  "detail": {
    "custom_analyzer": true,
    "charfilters": [],
    "tokenizer": {
      "name": "standard",
      "tokens": [
        {
          "token": "detailed",
          "start_offset": 0,
          "end_offset": 8,
          "type": "<ALPHANUM>",
          "position": 0
        },
        {
          "token": "output",
          "start_offset": 9,
          "end_offset": 15,
          "type": "<ALPHANUM>",
          "position": 1
        }
      ]
    },
    "tokenfilters": [
      {
        "name": "snowball",
        "tokens": [
          {
            "token": "detail",
            "start_offset": 0,
            "end_offset": 8,
            "type": "<ALPHANUM>",
            "position": 0,
            "keyword": false
          },
          {
            "token": "output",
            "start_offset": 9,
            "end_offset": 15,
            "type": "<ALPHANUM>",
            "position": 1,
            "keyword": false
          }
        ]
      }
    ]
  }
}




















Clone an index Added in 7.4.0

POST /{index}/_clone/{target}

Clone an existing index into a new index. Each original primary shard is cloned into a new primary shard in the new index.

IMPORTANT: Elasticsearch does not apply index templates to the resulting index. The API also does not copy index metadata from the original index. Index metadata includes aliases, index lifecycle management phase definitions, and cross-cluster replication (CCR) follower information. For example, if you clone a CCR follower index, the resulting clone will not be a follower index.

The clone API copies most index settings from the source index to the resulting index, with the exception of index.number_of_replicas and index.auto_expand_replicas. To set the number of replicas in the resulting index, configure these settings in the clone request.

Cloning works as follows:

  • First, it creates a new target index with the same definition as the source index.
  • Then it hard-links segments from the source index into the target index. If the file system does not support hard-linking, all segments are copied into the new index, which is a much more time consuming process.
  • Finally, it recovers the target index as though it were a closed index which had just been re-opened.

IMPORTANT: Indices can only be cloned if they meet the following requirements:

  • The index must be marked as read-only and have a cluster health status of green.
  • The target index must not exist.
  • The source index must have the same number of primary shards as the target index.
  • The node handling the clone process must have sufficient free disk space to accommodate a second copy of the existing index.

The current write index on a data stream cannot be cloned. In order to clone the current write index, the data stream must first be rolled over so that a new write index is created and then the previous write index can be cloned.

NOTE: Mappings cannot be specified in the _clone request. The mappings of the source index will be used for the target index.

Monitor the cloning process

The cloning process can be monitored with the cat recovery API or the cluster health API can be used to wait until all primary shards have been allocated by setting the wait_for_status parameter to yellow.

The _clone API returns as soon as the target index has been added to the cluster state, before any shards have been allocated. At this point, all shards are in the state unassigned. If, for any reason, the target index can't be allocated, its primary shard will remain unassigned until it can be allocated on that node.

Once the primary shard is allocated, it moves to state initializing, and the clone process begins. When the clone operation completes, the shard will become active. At that point, Elasticsearch will try to allocate any replicas and may decide to relocate the primary shard to another node.

Wait for active shards

Because the clone operation creates a new index to clone the shards to, the wait for active shards setting on index creation applies to the clone index action as well.

Path parameters

  • index string Required

    Name of the source index to clone.

  • target string Required

    Name of the target index to create.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

  • wait_for_active_shards number | string

    The number of shard copies that must be active before proceeding with the operation. Set to all or any positive integer up to the total number of shards in the index (number_of_replicas+1).

application/json

Body

  • aliases object

    Aliases for the resulting index.

    Hide aliases attribute Show aliases attribute object
  • settings object

    Configuration options for the target index.

    Hide settings attribute Show settings attribute object
    • * object Additional properties

Responses

POST /{index}/_clone/{target}
curl \
 --request POST 'http://api.example.com/{index}/_clone/{target}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"settings\": {\n    \"index.number_of_shards\": 5\n  },\n  \"aliases\": {\n    \"my_search_indices\": {}\n  }\n}"'
Request example
Clone `my_source_index` into a new index called `my_target_index` with `POST /my_source_index/_clone/my_target_index`. The API accepts `settings` and `aliases` parameters for the target index.
{
  "settings": {
    "index.number_of_shards": 5
  },
  "aliases": {
    "my_search_indices": {}
  }
}












Delete indices

DELETE /{index}

Deleting an index deletes its documents, shards, and metadata. It does not delete related Kibana components, such as data views, visualizations, or dashboards.

You cannot delete the current write index of a data stream. To delete the index, you must roll over the data stream so a new write index is created. You can then use the delete index API to delete the previous write index.

Path parameters

  • index string | array[string] Required

    Comma-separated list of indices to delete. You cannot specify index aliases. By default, this parameter does not support wildcards (*) or _all. To use wildcards or _all, set the action.destructive_requires_name cluster setting to false.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
DELETE /{index}
curl \
 --request DELETE 'http://api.example.com/{index}' \
 --header "Authorization: $API_KEY"

Check indices

HEAD /{index}

Check if one or more indices, index aliases, or data streams exist.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases. Supports wildcards (*).

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, returns settings in flat format.

  • If false, the request returns an error if it targets a missing or closed index.

  • If true, return all default settings in the response.

  • local boolean

    If true, the request retrieves information from the local node only.

Responses

HEAD /{index}
curl \
 --request HEAD 'http://api.example.com/{index}' \
 --header "Authorization: $API_KEY"




Create or update an alias

PUT /{index}/_alias/{name}

Adds a data stream or index to an alias.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices to add. Supports wildcards (*). Wildcard patterns that match both data streams and indices return an error.

  • name string Required

    Alias to update. If the alias doesn’t exist, the request creates it. Index alias names support date math.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body

  • filter object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • If true, sets the write index or data stream for the alias. If an alias points to multiple indices or data streams and is_write_index isn’t set, the alias rejects write requests. If an index alias points to one index and is_write_index isn’t set, the index automatically acts as the write index. Data stream aliases don’t automatically set a write data stream, even if the alias points to one data stream.

  • routing string

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /{index}/_alias/{name}
curl \
 --request PUT 'http://api.example.com/{index}/_alias/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"actions\": [\n    {\n      \"add\": {\n        \"index\": \"my-data-stream\",\n        \"alias\": \"my-alias\"\n      }\n    }\n  ]\n}"'
Request example
{
  "actions": [
    {
      "add": {
        "index": "my-data-stream",
        "alias": "my-alias"
      }
    }
  ]
}
















Create or update an alias

POST /{index}/_aliases/{name}

Adds a data stream or index to an alias.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams or indices to add. Supports wildcards (*). Wildcard patterns that match both data streams and indices return an error.

  • name string Required

    Alias to update. If the alias doesn’t exist, the request creates it. Index alias names support date math.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

application/json

Body

  • filter object

    An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

    External documentation
  • If true, sets the write index or data stream for the alias. If an alias points to multiple indices or data streams and is_write_index isn’t set, the alias rejects write requests. If an index alias points to one index and is_write_index isn’t set, the index automatically acts as the write index. Data stream aliases don’t automatically set a write data stream, even if the alias points to one data stream.

  • routing string

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /{index}/_aliases/{name}
curl \
 --request POST 'http://api.example.com/{index}/_aliases/{name}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"actions\": [\n    {\n      \"add\": {\n        \"index\": \"my-data-stream\",\n        \"alias\": \"my-alias\"\n      }\n    }\n  ]\n}"'
Request example
{
  "actions": [
    {
      "add": {
        "index": "my-data-stream",
        "alias": "my-alias"
      }
    }
  ]
}




Delete data stream lifecycles Added in 8.11.0

DELETE /_data_stream/{name}/_lifecycle

Removes the data stream lifecycle from a data stream, rendering it not managed by the data stream lifecycle.

Path parameters

  • name string | array[string] Required

    A comma-separated list of data streams of which the data stream lifecycle will be deleted; use * to get all data streams

Query parameters

  • expand_wildcards string | array[string]

    Whether wildcard expressions should get expanded to open or closed indices (default: open)

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • Specify timeout for connection to master

  • timeout string

    Explicit timestamp for the document

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_data_stream/{name}/_lifecycle
curl \
 --request DELETE 'http://api.example.com/_data_stream/{name}/_lifecycle' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response for deleting a data stream lifecycle.
{
  "acknowledged": true
}








































Analyze the index disk usage Technical preview

POST /{index}/_disk_usage

Analyze the disk usage of each field of an index or data stream. This API might not support indices created in previous Elasticsearch versions. The result of a small index can be inaccurate as some parts of an index might not be analyzed by the API.

NOTE: The total size of fields of the analyzed shards of the index in the response is usually smaller than the index store_size value because some small metadata files are ignored and some parts of data files might not be scanned by the API. Since stored fields are stored together in a compressed format, the sizes of stored fields are also estimates and can be inaccurate. The stored size of the _id field is likely underestimated while the _source field is overestimated.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases used to limit the request. It’s recommended to execute this API with a single index (or the latest backing index of a data stream) as the API consumes resources significantly.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • flush boolean

    If true, the API performs a flush before analysis. If false, the response may not include uncommitted data.

  • If true, missing or closed indices are not included in the response.

  • Analyzing field disk usage is resource-intensive. To use the API, this parameter must be set to true.

Responses

POST /{index}/_disk_usage
curl \
 --request POST 'http://api.example.com/{index}/_disk_usage' \
 --header "Authorization: $API_KEY"




















Flush data streams or indices

GET /{index}/_flush

Flushing a data stream or index is the process of making sure that any data that is currently only stored in the transaction log is also permanently stored in the Lucene index. When restarting, Elasticsearch replays any unflushed operations from the transaction log into the Lucene index to bring it back into the state that it was in before the restart. Elasticsearch automatically triggers flushes as needed, using heuristics that trade off the size of the unflushed transaction log against the cost of performing each flush.

After each operation has been flushed it is permanently stored in the Lucene index. This may mean that there is no need to maintain an additional copy of it in the transaction log. The transaction log is made up of multiple files, called generations, and Elasticsearch will delete any generation files when they are no longer needed, freeing up disk space.

It is also possible to trigger a flush on one or more indices using the flush API, although it is rare for users to need to call this API directly. If you call the flush API after indexing some documents then a successful response indicates that Elasticsearch has flushed all the documents that were indexed before the flush API was called.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases to flush. Supports wildcards (*). To flush all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • force boolean

    If true, the request forces a flush even if there are no changes to commit to the index.

  • If false, the request returns an error if it targets a missing or closed index.

  • If true, the flush operation blocks until execution when another flush operation is running. If false, Elasticsearch returns an error if you request a flush when another flush operation is running.

Responses

GET /{index}/_flush
curl \
 --request GET 'http://api.example.com/{index}/_flush' \
 --header "Authorization: $API_KEY"




Force a merge Added in 2.1.0

POST /_forcemerge

Perform the force merge operation on the shards of one or more indices. For data streams, the API forces a merge on the shards of the stream's backing indices.

Merging reduces the number of segments in each shard by merging some of them together and also frees up the space used by deleted documents. Merging normally happens automatically, but sometimes it is useful to trigger a merge manually.

WARNING: We recommend force merging only a read-only index (meaning the index is no longer receiving writes). When documents are updated or deleted, the old version is not immediately removed but instead soft-deleted and marked with a "tombstone". These soft-deleted documents are automatically cleaned up during regular segment merges. But force merge can cause very large (greater than 5 GB) segments to be produced, which are not eligible for regular merges. So the number of soft-deleted documents can then grow rapidly, resulting in higher disk usage and worse search performance. If you regularly force merge an index receiving writes, this can also make snapshots more expensive, since the new documents can't be backed up incrementally.

Blocks during a force merge

Calls to this API block until the merge is complete (unless request contains wait_for_completion=false). If the client connection is lost before completion then the force merge process will continue in the background. Any new requests to force merge the same indices will also block until the ongoing force merge is complete.

Running force merge asynchronously

If the request contains wait_for_completion=false, Elasticsearch performs some preflight checks, launches the request, and returns a task you can use to get the status of the task. However, you can not cancel this task as the force merge task is not cancelable. Elasticsearch creates a record of this task as a document at _tasks/<task_id>. When you are done with a task, you should delete the task document so Elasticsearch can reclaim the space.

Force merging multiple indices

You can force merge multiple indices with a single request by targeting:

  • One or more data streams that contain multiple backing indices
  • Multiple indices
  • One or more aliases
  • All data streams and indices in a cluster

Each targeted shard is force-merged separately using the force_merge threadpool. By default each node only has a single force_merge thread which means that the shards on that node are force-merged one at a time. If you expand the force_merge threadpool on a node then it will force merge its shards in parallel

Force merge makes the storage for the shard being merged temporarily increase, as it may require free space up to triple its size in case max_num_segments parameter is set to 1, to rewrite all segments into a new one.

Data streams and time-based indices

Force-merging is useful for managing a data stream's older backing indices and other time-based indices, particularly after a rollover. In these cases, each index only receives indexing traffic for a certain period of time. Once an index receive no more writes, its shards can be force-merged to a single segment. This can be a good idea because single-segment shards can sometimes use simpler and more efficient data structures to perform searches. For example:

POST /.ds-my-data-stream-2099.03.07-000001/_forcemerge?max_num_segments=1
External documentation

Query parameters

  • Whether to ignore if a wildcard indices expression resolves into no concrete indices. (This includes _all string or when no indices have been specified)

  • expand_wildcards string | array[string]

    Whether to expand wildcard expression to concrete indices that are open, closed or both.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • flush boolean

    Specify whether the index should be flushed after performing the operation (default: true)

  • Whether specified concrete indices should be ignored when unavailable (missing or closed)

  • The number of segments the index should be merged into (default: dynamic)

  • Specify whether the operation should only expunge deleted documents

  • Should the request wait until the force merge is completed.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • _shards object
      Hide _shards attributes Show _shards attributes object
    • task string

      task contains a task id returned when wait_for_completion=false, you can use the task_id to get the status of the task at _tasks/

POST /_forcemerge
curl \
 --request POST 'http://api.example.com/_forcemerge' \
 --header "Authorization: $API_KEY"




Get aliases

GET /_alias

Retrieves information for one or more data stream or index aliases.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • * object Additional properties
      Hide * attribute Show * attribute object
      • aliases object Required
        Hide aliases attribute Show aliases attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • filter object

            An Elasticsearch Query DSL (Domain Specific Language) object that defines a query.

            External documentation
          • Value used to route indexing operations to a specific shard. If specified, this overwrites the routing value for indexing operations.

          • If true, the index is the write index for the alias.

          • routing string

            Value used to route indexing and search operations to a specific shard.

          • Value used to route search operations to a specific shard. If specified, this overwrites the routing value for search operations.

          • is_hidden boolean

            If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

GET /_alias
curl \
 --request GET 'http://api.example.com/_alias' \
 --header "Authorization: $API_KEY"




















































































Refresh an index

POST /{index}/_refresh

A refresh makes recent operations performed on one or more indices available for search. For data streams, the API runs the refresh operation on the stream’s backing indices.

By default, Elasticsearch periodically refreshes indices every second, but only on indices that have received one search request or more in the last 30 seconds. You can change this default interval with the index.refresh_interval setting.

Refresh requests are synchronous and do not return a response until the refresh operation completes.

Refreshes are resource-intensive. To ensure good cluster performance, it's recommended to wait for Elasticsearch's periodic refresh rather than performing an explicit refresh when possible.

If your application workflow indexes documents and then runs a search to retrieve the indexed document, it's recommended to use the index API's refresh=wait_for query parameter option. This option ensures the indexing operation waits for a periodic refresh before running the search.

Path parameters

  • index string | array[string] Required

    Comma-separated list of data streams, indices, and aliases used to limit the request. Supports wildcards (*). To target all data streams and indices, omit this parameter or use * or _all.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

Responses

POST /{index}/_refresh
curl \
 --request POST 'http://api.example.com/{index}/_refresh' \
 --header "Authorization: $API_KEY"




























Get index segments

GET /_segments

Get low-level information about the Lucene segments in index shards. For data streams, the API returns information about the stream's backing indices.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden. Valid values are: all, open, closed, hidden, none.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If false, the request returns an error if it targets a missing or closed index.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
GET /_segments
curl \
 --request GET 'http://api.example.com/_segments' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response for creating a new index for a data stream.
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "old_index": ".ds-my-data-stream-2099.05.06-000001",
  "new_index": ".ds-my-data-stream-2099.05.07-000002",
  "rolled_over": true,
  "dry_run": false,
  "lazy": false,
  "conditions": {
    "[max_age: 7d]": false,
    "[max_docs: 1000]": true,
    "[max_primary_shard_size: 50gb]": false,
    "[max_primary_shard_docs: 2000]": false
  }
}








Get index shard stores

GET /{index}/_shard_stores

Get store information about replica shards in one or more indices. For data streams, the API retrieves store information for the stream's backing indices.

The index shard stores API returns the following information:

  • The node on which each replica shard exists.
  • The allocation ID for each replica shard.
  • A unique ID for each replica shard.
  • Any errors encountered while opening the shard index or from an earlier failure.

By default, the API returns store information only for primary shards that are unassigned or have one or more unassigned replica shards.

Path parameters

  • index string | array[string] Required

    List of data streams, indices, and aliases used to limit the request.

Query parameters

  • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices.

  • expand_wildcards string | array[string]

    Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams.

    Supported values include:

    • all: Match any data stream or index, including hidden ones.
    • open: Match open, non-hidden indices. Also matches any non-hidden data stream.
    • closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed.
    • hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both.
    • none: Wildcard expressions are not accepted.
  • If true, missing or closed indices are not included in the response.

  • status string | array[string]

    List of shard health statuses used to limit the request.

    Supported values include:

    • green: The primary shard and all replica shards are assigned.
    • yellow: One or more replica shards are unassigned.
    • red: The primary shard is unassigned.
    • all: Return all shards, regardless of health status.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • indices object Required
      Hide indices attribute Show indices attribute object
      • * object Additional properties
        Hide * attribute Show * attribute object
        • shards object Required
          Hide shards attribute Show shards attribute object
          • * object Additional properties
            Hide * attribute Show * attribute object
GET /{index}/_shard_stores
curl \
 --request GET 'http://api.example.com/{index}/_shard_stores' \
 --header "Authorization: $API_KEY"
Response examples (200)
An abbreviated response from `GET /_shard_stores?status=green`.
{
  "indices": {
    "my-index-000001": {
      "shards": {
        "0": {
          "stores": [
            {
              "sPa3OgxLSYGvQ4oPs-Tajw": {
                "name": "node_t0",
                "ephemeral_id": "9NlXRFGCT1m8tkvYCMK-8A",
                "transport_address": "local[1]",
                "external_id": "node_t0",
                "attributes": {},
                "roles": [],
                "version": "8.10.0",
                "min_index_version": 7000099,
                "max_index_version": 8100099
              },
              "allocation_id": "2iNySv_OQVePRX-yaRH_lQ",
              "allocation": "primary",
              "store_exception": {}
            }
          ]
        }
      }
    }
  }
}








































































Delete a lifecycle policy Added in 6.6.0

DELETE /_ilm/policy/{policy}

You cannot delete policies that are currently in use. If the policy is being used to manage any indices, the request fails and returns an error.

Path parameters

  • policy string Required

    Identifier for the policy.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ilm/policy/{policy}
curl \
 --request DELETE 'http://api.example.com/_ilm/policy/{policy}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting a lifecycle policy.
{
  "acknowledged": true
}

Explain the lifecycle state Added in 6.6.0

GET /{index}/_ilm/explain

Get the current lifecycle status for one or more indices. For data streams, the API retrieves the current lifecycle status for the stream's backing indices.

The response indicates when the index entered each lifecycle state, provides the definition of the running phase, and information about any failures.

Path parameters

  • index string Required

    Comma-separated list of data streams, indices, and aliases to target. Supports wildcards (*). To target all data streams and indices, use * or _all.

Query parameters

  • Filters the returned indices to only indices that are managed by ILM and are in an error state, either due to an encountering an error while executing the policy, or attempting to use a policy that does not exist.

  • Filters the returned indices to only indices that are managed by ILM.

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
GET /{index}/_ilm/explain
curl \
 --request GET 'http://api.example.com/{index}/_ilm/explain' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when retrieving the current ILM status for an index.
{
  "indices": {
    "my-index-000001": {
      "index": "my-index-000001",
      "index_creation_date_millis": 1538475653281,
      "index_creation_date": "2018-10-15T13:45:21.981Z",
      "time_since_index_creation": "15s",
      "managed": true,
      "policy": "my_policy",
      "lifecycle_date_millis": 1538475653281,
      "lifecycle_date": "2018-10-15T13:45:21.981Z",
      "age": "15s",
      "phase": "new",
      "phase_time_millis": 1538475653317,
      "phase_time": "2018-10-15T13:45:22.577Z",
      "action": "complete"
      "action_time_millis": 1538475653317,
      "action_time": "2018-10-15T13:45:22.577Z",
      "step": "complete",
      "step_time_millis": 1538475653317,
      "step_time": "2018-10-15T13:45:22.577Z"
    }
  }
}




Get the ILM status Added in 6.6.0

GET /_ilm/status

Get the current index lifecycle management status.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
GET /_ilm/status
curl \
 --request GET 'http://api.example.com/_ilm/status' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when retrieving the current ILM status.
{
  "operation_mode": "RUNNING"
}




Move to a lifecycle step Added in 6.6.0

POST /_ilm/move/{index}

Manually move an index into a specific step in the lifecycle policy and run that step.

WARNING: This operation can result in the loss of data. Manually moving an index into a specific step runs that step even if it has already been performed. This is a potentially destructive action and this should be considered an expert level API.

You must specify both the current step and the step to be executed in the body of the request. The request will fail if the current step does not match the step currently running for the index This is to prevent the index from being moved from an unexpected step into the next step.

When specifying the target (next_step) to which the index will be moved, either the name or both the action and name fields are optional. If only the phase is specified, the index will move to the first step of the first action in the target phase. If the phase and action are specified, the index will move to the first step of the specified action in the specified phase. Only actions specified in the ILM policy are considered valid. An index cannot move to a step that is not part of its policy.

Path parameters

  • index string Required

    The name of the index whose lifecycle step is to change

application/json

Body

  • current_step object Required
    Hide current_step attributes Show current_step attributes object
    • action string

      The optional action to which the index will be moved.

    • name string

      The optional step name to which the index will be moved.

    • phase string Required
  • next_step object Required
    Hide next_step attributes Show next_step attributes object
    • action string

      The optional action to which the index will be moved.

    • name string

      The optional step name to which the index will be moved.

    • phase string Required

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_ilm/move/{index}
curl \
 --request POST 'http://api.example.com/_ilm/move/{index}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"current_step\": {\n    \"phase\": \"new\",\n    \"action\": \"complete\",\n    \"name\": \"complete\"\n  },\n  \"next_step\": {\n    \"phase\": \"warm\",\n    \"action\": \"forcemerge\",\n    \"name\": \"forcemerge\"\n  }\n}"'
Request examples
Run `POST _ilm/move/my-index-000001` to move `my-index-000001` from the initial step to the `forcemerge` step.
{
  "current_step": {
    "phase": "new",
    "action": "complete",
    "name": "complete"
  },
  "next_step": {
    "phase": "warm",
    "action": "forcemerge",
    "name": "forcemerge"
  }
}
Run `POST _ilm/move/my-index-000001` to move `my-index-000001` from the end of hot phase into the start of warm.
{
  "current_step": {
    "phase": "hot",
    "action": "complete",
    "name": "complete"
  },
  "next_step": {
    "phase": "warm"
  }
}
Response examples (200)
A successful response when running a specific step in a lifecycle policy.
{
  "acknowledged": true
}





























Create an inference endpoint Added in 8.11.0

PUT /_inference/{inference_id}

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Mistral, Azure OpenAI, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

Path parameters

application/json

Body Required

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    The service type

  • service_settings object Required

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"chunking_settings":{"max_chunk_size":42.0,"overlap":42.0,"sentence_overlap":42.0,"strategy":"string"},"service":"string","service_settings":{},"task_settings":{}}'

Perform inference on the service Added in 8.11.0

POST /_inference/{inference_id}

This API enables you to use machine learning models to perform specific tasks on data that you provide as an input. It returns a response with the results of the tasks. The inference endpoint you use can perform one specific task that has been defined when the endpoint was created with the create inference API.

For details about using this API with a service, such as Amazon Bedrock, Anthropic, or HuggingFace, refer to the service-specific documentation.


The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

Path parameters

  • inference_id string Required

    The unique identifier for the inference endpoint.

Query parameters

  • timeout string

    The amount of time to wait for the inference request to complete.

application/json

Body

  • query string

    The query input, which is required only for the rerank task. It is not required for other tasks.

  • input string | array[string] Required

    The text on which you want to perform the inference task. It can be a single string or an array.


    Inference endpoints for the completion task type currently only support a single string as input.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide text_embedding_bytes attribute Show text_embedding_bytes attribute object
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding_bits array[object]
      Hide text_embedding_bits attribute Show text_embedding_bits attribute object
      • embedding array[number] Required

        Text Embedding results containing bytes are represented as Dense Vectors of bytes.

    • text_embedding array[object]
      Hide text_embedding attribute Show text_embedding attribute object
      • embedding array[number] Required

        Text Embedding results are represented as Dense Vectors of floats.

    • sparse_embedding array[object]
      Hide sparse_embedding attribute Show sparse_embedding attribute object
      • embedding object Required

        Sparse Embedding tokens are represented as a dictionary of string to double.

        Hide embedding attribute Show embedding attribute object
        • * number Additional properties
    • completion array[object]
      Hide completion attribute Show completion attribute object
    • rerank array[object]
      Hide rerank attributes Show rerank attributes object
POST /_inference/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"query":"string","input":"string","task_settings":{}}'
























Create an AlibabaCloud AI Search inference endpoint Added in 8.16.0

PUT /_inference/{task_type}/{alibabacloud_inference_id}

Create an inference endpoint to perform an inference task with the alibabacloud-ai-search service.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Values are completion, rerank, space_embedding, or text_embedding.

  • The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is alibabacloud-ai-search.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key for the AlibabaCloud AI Search API.

    • host string Required

      The name of the host address used for the inference task. You can find the host address in the API keys section of the documentation.

      External documentation
    • Hide rate_limit attribute Show rate_limit attribute object
    • service_id string Required

      The name of the model service to use for the inference task. The following service IDs are available for the completion task:

      • ops-qwen-turbo
      • qwen-turbo
      • qwen-plus
      • qwen-max ÷ qwen-max-longcontext

      The following service ID is available for the rerank task:

      • ops-bge-reranker-larger

      The following service ID is available for the sparse_embedding task:

      • ops-text-sparse-embedding-001

      The following service IDs are available for the text_embedding task:

      ops-text-embedding-001 ops-text-embedding-zh-001 ops-text-embedding-en-001 ops-text-embedding-002

    • workspace string Required

      The name of the workspace used for the inference task.

  • Hide task_settings attributes Show task_settings attributes object
    • For a sparse_embedding or text_embedding task, specify the type of input passed to the model. Valid values are:

      • ingest for storing document embeddings in a vector database.
      • search for storing embeddings of search queries run against a vector database to find relevant documents.
    • For a sparse_embedding task, it affects whether the token name will be returned in the response. It defaults to false, which means only the token ID will be returned in the response.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{alibabacloud_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{alibabacloud_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"alibabacloud-ai-search\",\n    \"service_settings\": {\n        \"host\" : \"default-j01.platform-cn-shanghai.opensearch.aliyuncs.com\",\n        \"api_key\": \"AlibabaCloud-API-Key\",\n        \"service_id\": \"ops-qwen-turbo\",\n        \"workspace\" : \"default\"\n    }\n}"'
Run `PUT _inference/completion/alibabacloud_ai_search_completion` to create an inference endpoint that performs a completion task.
{
    "service": "alibabacloud-ai-search",
    "service_settings": {
        "host" : "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
        "api_key": "AlibabaCloud-API-Key",
        "service_id": "ops-qwen-turbo",
        "workspace" : "default"
    }
}
Run `PUT _inference/rerank/alibabacloud_ai_search_rerank` to create an inference endpoint that performs a rerank task.
{
    "service": "alibabacloud-ai-search",
    "service_settings": {
        "api_key": "AlibabaCloud-API-Key",
        "service_id": "ops-bge-reranker-larger",
        "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
        "workspace": "default"
    }
}
Run `PUT _inference/sparse_embedding/alibabacloud_ai_search_sparse` to create an inference endpoint that performs perform a sparse embedding task.
{
    "service": "alibabacloud-ai-search",
    "service_settings": {
        "api_key": "AlibabaCloud-API-Key",
        "service_id": "ops-text-sparse-embedding-001",
        "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
        "workspace": "default"
    }
}
Run `PUT _inference/text_embedding/alibabacloud_ai_search_embeddings` to create an inference endpoint that performs a text embedding task.
{
    "service": "alibabacloud-ai-search",
    "service_settings": {
        "api_key": "AlibabaCloud-API-Key",
        "service_id": "ops-text-embedding-001",
        "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
        "workspace": "default"
    }
}




Create an Anthropic inference endpoint Added in 8.16.0

PUT /_inference/{task_type}/{anthropic_inference_id}

Create an inference endpoint to perform an inference task with the anthropic service.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The task type. The only valid task type for the model to perform is completion.

    Value is completion.

  • anthropic_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is anthropic.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key for the Anthropic API.

    • model_id string Required

      The name of the model to use for the inference task. Refer to the Anthropic documentation for the list of supported models.

    • Hide rate_limit attribute Show rate_limit attribute object
  • Hide task_settings attributes Show task_settings attributes object
    • max_tokens number Required

      For a completion task, it is the maximum number of tokens to generate before stopping.

    • For a completion task, it is the amount of randomness injected into the response. For more details about the supported range, refer to Anthropic documentation.

      External documentation
    • top_k number

      For a completion task, it specifies to only sample from the top K options for each subsequent token. It is recommended for advanced use cases only. You usually only need to use temperature.

    • top_p number

      For a completion task, it specifies to use Anthropic's nucleus sampling. In nucleus sampling, Anthropic computes the cumulative distribution over all the options for each subsequent token in decreasing probability order and cuts it off once it reaches the specified probability. You should either alter temperature or top_p, but not both. It is recommended for advanced use cases only. You usually only need to use temperature.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{anthropic_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{anthropic_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"anthropic\",\n    \"service_settings\": {\n        \"api_key\": \"Anthropic-Api-Key\",\n        \"model_id\": \"Model-ID\"\n    },\n    \"task_settings\": {\n        \"max_tokens\": 1024\n    }\n}"'
Request example
Run `PUT _inference/completion/anthropic_completion` to create an inference endpoint that performs a completion task.
{
    "service": "anthropic",
    "service_settings": {
        "api_key": "Anthropic-Api-Key",
        "model_id": "Model-ID"
    },
    "task_settings": {
        "max_tokens": 1024
    }
}




Create an Azure OpenAI inference endpoint Added in 8.14.0

PUT /_inference/{task_type}/{azureopenai_inference_id}

Create an inference endpoint to perform an inference task with the azureopenai service.

The list of chat completion models that you can choose from in your Azure OpenAI deployment include:

The list of embeddings models that you can choose from in your deployment can be found in the Azure models documentation.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform. NOTE: The chat_completion task type only supports streaming and only through the _stream API.

    Values are completion or text_embedding.

  • The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is azureopenai.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string

      A valid API key for your Azure OpenAI account. You must specify either api_key or entra_id. If you do not provide either or you provide both, you will receive an error when you try to create your model.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • api_version string Required

      The Azure API version ID to use. It is recommended to use the latest supported non-preview version.

    • deployment_id string Required

      The deployment name of your deployed models. Your Azure OpenAI deployments can be found though the Azure OpenAI Studio portal that is linked to your subscription.

      External documentation
    • entra_id string

      A valid Microsoft Entra token. You must specify either api_key or entra_id. If you do not provide either or you provide both, you will receive an error when you try to create your model.

      External documentation
    • Hide rate_limit attribute Show rate_limit attribute object
    • resource_name string Required

      The name of your Azure OpenAI resource. You can find this from the list of resources in the Azure Portal for your subscription.

      External documentation
  • Hide task_settings attribute Show task_settings attribute object
    • user string

      For a completion or text_embedding task, specify the user issuing the request. This information can be used for abuse detection.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{azureopenai_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{azureopenai_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"azureopenai\",\n    \"service_settings\": {\n        \"api_key\": \"Api-Key\",\n        \"resource_name\": \"Resource-name\",\n        \"deployment_id\": \"Deployment-id\",\n        \"api_version\": \"2024-02-01\"\n    }\n}"'
Request examples
Run `PUT _inference/text_embedding/azure_openai_embeddings` to create an inference endpoint that performs a `text_embedding` task. You do not specify a model, as it is defined already in the Azure OpenAI deployment.
{
    "service": "azureopenai",
    "service_settings": {
        "api_key": "Api-Key",
        "resource_name": "Resource-name",
        "deployment_id": "Deployment-id",
        "api_version": "2024-02-01"
    }
}
Run `PUT _inference/completion/azure_openai_completion` to create an inference endpoint that performs a `completion` task.
{
    "service": "azureopenai",
    "service_settings": {
        "api_key": "Api-Key",
        "resource_name": "Resource-name",
        "deployment_id": "Deployment-id",
        "api_version": "2024-02-01"
    }
}

Create a Cohere inference endpoint Added in 8.13.0

PUT /_inference/{task_type}/{cohere_inference_id}

Create an inference endpoint to perform an inference task with the cohere service.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Values are completion, rerank, or text_embedding.

  • cohere_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is cohere.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key for your Cohere account. You can find or create your Cohere API keys on the Cohere API key settings page.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • Values are byte, float, or int8.

    • model_id string

      For a completion, rerank, or text_embedding task, the name of the model to use for the inference task.

      The default value for a text embedding task is embed-english-v2.0.

    • Hide rate_limit attribute Show rate_limit attribute object
    • Values are cosine, dot_product, or l2_norm.

  • Hide task_settings attributes Show task_settings attributes object
    • Values are classification, clustering, ingest, or search.

    • For a rerank task, return doc text within the results.

    • top_n number

      For a rerank task, the number of most relevant documents to return. It defaults to the number of the documents. If this inference endpoint is used in a text_similarity_reranker retriever query and top_n is set, it must be greater than or equal to rank_window_size in the query.

    • truncate string

      Values are END, NONE, or START.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{cohere_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{cohere_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"cohere\",\n    \"service_settings\": {\n        \"api_key\": \"Cohere-Api-key\",\n        \"model_id\": \"embed-english-light-v3.0\",\n        \"embedding_type\": \"byte\"\n    }\n}"'
Request examples
Run `PUT _inference/text_embedding/cohere-embeddings` to create an inference endpoint that performs a text embedding task.
{
    "service": "cohere",
    "service_settings": {
        "api_key": "Cohere-Api-key",
        "model_id": "embed-english-light-v3.0",
        "embedding_type": "byte"
    }
}
Run `PUT _inference/rerank/cohere-rerank` to create an inference endpoint that performs a rerank task.
{
    "service": "cohere",
    "service_settings": {
        "api_key": "Cohere-API-key",
        "model_id": "rerank-english-v3.0"
    },
    "task_settings": {
        "top_n": 10,
        "return_documents": true
    }
}




Create an ELSER inference endpoint Deprecated Added in 8.11.0

PUT /_inference/{task_type}/{elser_inference_id}

Create an inference endpoint to perform an inference task with the elser service. You can also deploy ELSER by using the Elasticsearch inference integration.


Your Elasticsearch deployment contains a preconfigured ELSER inference endpoint, you only need to create the enpoint using the API if you want to customize the settings.

The API request will automatically download and deploy the ELSER model if it isn't already downloaded.


You might see a 502 bad gateway error in the response when using the Kibana Console. This error usually just reflects a timeout, while the model downloads in the background. You can check the download progress in the Machine Learning UI. If using the Python client, you can set the timeout parameter to a higher value.

After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform.

    Value is sparse_embedding.

  • elser_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is elser.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • Hide adaptive_allocations attributes Show adaptive_allocations attributes object
      • enabled boolean

        Turn on adaptive_allocations.

      • The maximum number of allocations to scale to. If set, it must be greater than or equal to min_number_of_allocations.

      • The minimum number of allocations to scale to. If set, it must be greater than or equal to 0. If not defined, the deployment scales to 0.

    • num_allocations number Required

      The total number of allocations this model is assigned across machine learning nodes. Increasing this value generally increases the throughput. If adaptive allocations is enabled, do not set this value because it's automatically set.

    • num_threads number Required

      The number of threads used by each model allocation during inference. Increasing this value generally increases the speed per inference request. The inference process is a compute-bound process; threads_per_allocations must not exceed the number of available allocated processors per node. The value must be a power of 2. The maximum value is 32.


      If you want to optimize your ELSER endpoint for ingest, set the number of threads to 1. If you want to optimize your ELSER endpoint for search, set the number of threads to greater than 1.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{elser_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{elser_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"elser\",\n    \"service_settings\": {\n        \"num_allocations\": 1,\n        \"num_threads\": 1\n    }\n}"'
Request examples
Run `PUT _inference/sparse_embedding/my-elser-model` to create an inference endpoint that performs a `sparse_embedding` task. The request will automatically download the ELSER model if it isn't already downloaded and then deploy the model.
{
    "service": "elser",
    "service_settings": {
        "num_allocations": 1,
        "num_threads": 1
    }
}
Run `PUT _inference/sparse_embedding/my-elser-model` to create an inference endpoint that performs a `sparse_embedding` task with adaptive allocations. When adaptive allocations are enabled, the number of allocations of the model is set automatically based on the current load.
{
    "service": "elser",
    "service_settings": {
        "adaptive_allocations": {
            "enabled": true,
            "min_number_of_allocations": 3,
            "max_number_of_allocations": 10
        },
        "num_threads": 1
    }
}
Response examples (200)
A successful response when creating an ELSER inference endpoint.
{
  "inference_id": "my-elser-model",
  "task_type": "sparse_embedding",
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}




















Create an OpenAI inference endpoint Added in 8.12.0

PUT /_inference/{task_type}/{openai_inference_id}

Create an inference endpoint to perform an inference task with the openai service or openai compatible APIs.

When you create an inference endpoint, the associated machine learning model is automatically deployed if it is not already running. After creating the endpoint, wait for the model deployment to complete before using it. To verify the deployment status, use the get trained model statistics API. Look for "state": "fully_allocated" in the response and ensure that the "allocation_count" matches the "target_allocation_count". Avoid creating multiple endpoints for the same model unless required, as each endpoint consumes significant resources.

Path parameters

  • task_type string Required

    The type of the inference task that the model will perform. NOTE: The chat_completion task type only supports streaming and only through the _stream API.

    Values are chat_completion, completion, or text_embedding.

  • openai_inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    Value is openai.

  • service_settings object Required
    Hide service_settings attributes Show service_settings attributes object
    • api_key string Required

      A valid API key of your OpenAI account. You can find your OpenAI API keys in your OpenAI account under the API keys section.

      IMPORTANT: You need to provide the API key only once, during the inference model creation. The get inference endpoint API does not retrieve your API key. After creating the inference model, you cannot change the associated API key. If you want to use a different API key, delete the inference model and recreate it with the same name and the updated API key.

      External documentation
    • The number of dimensions the resulting output embeddings should have. It is supported only in text-embedding-3 and later models. If it is not set, the OpenAI defined default for the model is used.

    • model_id string Required

      The name of the model to use for the inference task. Refer to the OpenAI documentation for the list of available text embedding models.

      External documentation
    • The unique identifier for your organization. You can find the Organization ID in your OpenAI account under Settings > Organizations.

    • Hide rate_limit attribute Show rate_limit attribute object
    • url string

      The URL endpoint to use for the requests. It can be changed for testing purposes.

  • Hide task_settings attribute Show task_settings attribute object
    • user string

      For a completion or text_embedding task, specify the user issuing the request. This information can be used for abuse detection.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{openai_inference_id}
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{openai_inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n    \"service\": \"openai\",\n    \"service_settings\": {\n        \"api_key\": \"OpenAI-API-Key\",\n        \"model_id\": \"text-embedding-3-small\",\n        \"dimensions\": 128\n    }\n}"'
Request examples
Run `PUT _inference/text_embedding/openai-embeddings` to create an inference endpoint that performs a `text_embedding` task. The embeddings created by requests to this endpoint will have 128 dimensions.
{
    "service": "openai",
    "service_settings": {
        "api_key": "OpenAI-API-Key",
        "model_id": "text-embedding-3-small",
        "dimensions": 128
    }
}
Run `PUT _inference/completion/amazon_bedrock_completion` to create an inference endpoint to perform a completion task.
{
    "service": "amazonbedrock",
    "service_settings": {
        "access_key": "AWS-access-key",
        "secret_key": "AWS-secret-key",
        "region": "us-east-1",
        "provider": "amazontitan",
        "model": "amazon.titan-text-premier-v1:0"
    }
}








Path parameters

  • inference_id string Required

    The unique identifier for the inference endpoint.

Query parameters

  • timeout string

    The amount of time to wait for the inference request to complete.

application/json

Body

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
POST /_inference/rerank/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/rerank/{inference_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"input\": [\"luke\", \"like\", \"leia\", \"chewy\",\"r2d2\", \"star\", \"wars\"],\n  \"query\": \"star wars main character\"\n}"'
Request example
Run `POST _inference/rerank/cohere_rerank` to perform reranking on the example input.
{
  "input": ["luke", "like", "leia", "chewy","r2d2", "star", "wars"],
  "query": "star wars main character"
}
Response examples (200)
A successful response from `POST _inference/rerank/cohere_rerank`.
{
  "rerank": [
    {
      "index": "2",
      "relevance_score": "0.011597361",
      "text": "leia"
    },
    {
      "index": "0",
      "relevance_score": "0.006338922",
      "text": "luke"
    },
    {
      "index": "5",
      "relevance_score": "0.0016166499",
      "text": "star"
    },
    {
      "index": "4",
      "relevance_score": "0.0011695103",
      "text": "r2d2"
    },
    {
      "index": "1",
      "relevance_score": "5.614787E-4",
      "text": "like"
    },
    {
      "index": "6",
      "relevance_score": "3.7850367E-4",
      "text": "wars"
    },
    {
      "index": "3",
      "relevance_score": "1.2508839E-5",
      "text": "chewy"
    }
  ]
}




Perform streaming inference Added in 8.16.0

POST /_inference/completion/{inference_id}/_stream

Get real-time responses for completion tasks by delivering answers incrementally, reducing response times during computation. This API works only with the completion task type.

IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

This API requires the monitor_inference cluster privilege (the built-in inference_admin and inference_user roles grant this privilege). You must use a client that supports streaming.

Path parameters

  • inference_id string Required

    The unique identifier for the inference endpoint.

application/json

Body

Responses

POST /_inference/completion/{inference_id}/_stream
curl \
 --request POST 'http://api.example.com/_inference/completion/{inference_id}/_stream' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"input\": \"What is Elastic?\"\n}"'
Request example
Run `POST _inference/completion/openai-completion/_stream` to perform a completion on the example question with streaming.
{
  "input": "What is Elastic?"
}








Update an inference endpoint Added in 8.17.0

PUT /_inference/{task_type}/{inference_id}/_update

Modify task_settings, secrets (within service_settings), or num_allocations for an inference endpoint, depending on the specific endpoint service and task_type.

IMPORTANT: The inference APIs enable you to use certain services, such as built-in machine learning models (ELSER, E5), models uploaded through Eland, Cohere, OpenAI, Azure, Google AI Studio, Google Vertex AI, Anthropic, Watsonx.ai, or Hugging Face. For built-in models and models uploaded through Eland, the inference APIs offer an alternative way to use and manage trained models. However, if you do not plan to use the inference APIs to use these models or if you want to use non-NLP models, use the machine learning trained model APIs.

Path parameters

  • task_type string Required

    The type of inference task that the model performs.

    Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

  • inference_id string Required

    The unique identifier of the inference endpoint.

application/json

Body Required

  • Hide chunking_settings attributes Show chunking_settings attributes object
    • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

    • overlap number

      The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

    • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

    • strategy string

      The chunking strategy: sentence or word.

  • service string Required

    The service type

  • service_settings object Required

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • Hide chunking_settings attributes Show chunking_settings attributes object
      • The maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

      • overlap number

        The number of overlapping words for chunks. It is applicable only to a word chunking strategy. This value cannot be higher than half the max_chunk_size value.

      • The number of overlapping sentences for chunks. It is applicable only for a sentence chunking strategy. It can be either 1 or 0.

      • strategy string

        The chunking strategy: sentence or word.

    • service string Required

      The service type

    • service_settings object Required
    • inference_id string Required

      The inference Id

    • task_type string Required

      Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

PUT /_inference/{task_type}/{inference_id}/_update
curl \
 --request PUT 'http://api.example.com/_inference/{task_type}/{inference_id}/_update' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"chunking_settings":{"max_chunk_size":42.0,"overlap":42.0,"sentence_overlap":42.0,"strategy":"string"},"service":"string","service_settings":{},"task_settings":{}}'





Ingest

Ingest APIs enable you to manage tasks and resources related to ingest pipelines and processors.

Get GeoIP database configurations Added in 8.15.0

GET /_ingest/geoip/database/{id}

Get information about one or more IP geolocation database configurations.

Path parameters

  • id string | array[string] Required

    A comma-separated list of database configuration IDs to retrieve. Wildcard (*) expressions are supported. To get all database configurations, omit this parameter or use *.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • databases array[object] Required
      Hide databases attributes Show databases attributes object
      • id string Required
      • version number Required
      • Time unit for milliseconds

      • database object

        The configuration necessary to identify which IP geolocation provider to use to download a database, as well as any provider-specific configuration necessary for such downloading. At present, the only supported providers are maxmind and ipinfo, and the maxmind provider requires that an account_id (string) is configured. A provider (either maxmind or ipinfo) must be specified. The web and local providers can be returned as read only configurations.

        Hide database attributes Show database attributes object
GET /_ingest/geoip/database/{id}
curl \
 --request GET 'http://api.example.com/_ingest/geoip/database/{id}' \
 --header "Authorization: $API_KEY"








Path parameters

  • id string | array[string] Required

    Comma-separated list of database configuration IDs to retrieve. Wildcard (*) expressions are supported. To get all database configurations, omit this parameter or use *.

Query parameters

  • The period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error. A value of -1 indicates that the request should never time out.

Responses

GET /_ingest/ip_location/database/{id}
curl \
 --request GET 'http://api.example.com/_ingest/ip_location/database/{id}' \
 --header "Authorization: $API_KEY"

Path parameters

  • id string Required

    The database configuration identifier.

Query parameters

  • The period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error. A value of -1 indicates that the request should never time out.

  • timeout string

    The period to wait for a response from all relevant nodes in the cluster after updating the cluster metadata. If no response is received before the timeout expires, the cluster metadata update still applies but the response indicates that it was not completely acknowledged. A value of -1 indicates that the request should never time out.

application/json

Body Required

The configuration necessary to identify which IP geolocation provider to use to download a database, as well as any provider-specific configuration necessary for such downloading. At present, the only supported providers are maxmind and ipinfo, and the maxmind provider requires that an account_id (string) is configured. A provider (either maxmind or ipinfo) must be specified. The web and local providers can be returned as read only configurations.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

PUT /_ingest/ip_location/database/{id}
curl \
 --request PUT 'http://api.example.com/_ingest/ip_location/database/{id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"name":"string","maxmind":{"account_id":"string"},"ipinfo":{}}'

Path parameters

  • id string | array[string] Required

    A comma-separated list of IP location database configurations.

Query parameters

  • The period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error. A value of -1 indicates that the request should never time out.

  • timeout string

    The period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error. A value of -1 indicates that the request should never time out.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ingest/ip_location/database/{id}
curl \
 --request DELETE 'http://api.example.com/_ingest/ip_location/database/{id}' \
 --header "Authorization: $API_KEY"








Delete pipelines Added in 5.0.0

DELETE /_ingest/pipeline/{id}

Delete one or more ingest pipelines.

External documentation

Path parameters

  • id string Required

    Pipeline ID or wildcard expression of pipeline IDs used to limit the request. To delete all ingest pipelines in a cluster, use a value of *.

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ingest/pipeline/{id}
curl \
 --request DELETE 'http://api.example.com/_ingest/pipeline/{id}' \
 --header "Authorization: $API_KEY"

Get GeoIP statistics Added in 7.13.0

GET /_ingest/geoip/stats

Get download statistics for GeoIP2 databases that are used with the GeoIP processor.

External documentation

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • stats object Required
      Hide stats attributes Show stats attributes object
    • nodes object Required

      Downloaded GeoIP2 databases for each node.

      Hide nodes attribute Show nodes attribute object
      • * object Additional properties
        Hide * attributes Show * attributes object
        • databases array[object] Required

          Downloaded databases for the node.

          Hide databases attribute Show databases attribute object
        • files_in_temp array[string] Required

          Downloaded database files, including related license files. Elasticsearch stores these files in the node’s temporary directory: $ES_TMPDIR/geoip-databases/.

GET /_ingest/geoip/stats
curl \
 --request GET 'http://api.example.com/_ingest/geoip/stats' \
 --header "Authorization: $API_KEY"












Run a grok processor Added in 6.1.0

GET /_ingest/processor/grok

Extract structured fields out of a single text field within a document. You must choose which field to extract matched fields from, as well as the grok pattern you expect will match. A grok pattern is like a regular expression that supports aliased expressions that can be reused.

External documentation

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • patterns object Required
      Hide patterns attribute Show patterns attribute object
      • * string Additional properties
GET /_ingest/processor/grok
curl \
 --request GET 'http://api.example.com/_ingest/processor/grok' \
 --header "Authorization: $API_KEY"













































Delete the license

DELETE /_license

When the license expires, your subscription level reverts to Basic.

If the operator privileges feature is enabled, only operator users can use this API.

External documentation

Query parameters

  • The period to wait for a connection to the master node.

  • timeout string

    The period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_license
curl \
 --request DELETE 'http://api.example.com/_license' \
 --header "Authorization: $API_KEY"












Start a trial Added in 6.1.0

POST /_license/start_trial

Start a 30-day trial, which gives access to all subscription features.

NOTE: You are allowed to start a trial only if your cluster has not already activated a trial for the current major product version. For example, if you have already activated a trial for v8.0, you cannot start a new trial until v9.0. You can, however, request an extended trial at https://www.elastic.co/trialextension.

To check the status of your trial, use the get trial status API.

Query parameters

Responses

POST /_license/start_trial
curl \
 --request POST 'http://api.example.com/_license/start_trial' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response from `POST /_license/start_trial?acknowledge=true`.
{
  "trial_was_started": true,
  "acknowledged": true
}





















Get machine learning memory usage info Added in 8.2.0

GET /_ml/memory/{node_id}/_stats

Get information about how machine learning jobs and trained models are using memory, on each node, both within the JVM heap, and natively, outside of the JVM.

Path parameters

  • node_id string Required

    The names of particular nodes in the cluster to target. For example, nodeId1,nodeId2 or ml:true

Query parameters

  • Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

Responses

GET /_ml/memory/{node_id}/_stats
curl \
 --request GET 'http://api.example.com/_ml/memory/{node_id}/_stats' \
 --header "Authorization: $API_KEY"













Path parameters

  • calendar_id string Required

    A string that uniquely identifies a calendar. You can get information for multiple calendars by using a comma-separated list of ids or a wildcard expression. You can get information for all calendars by using _all or * or by omitting the calendar identifier.

Query parameters

  • from number

    Skips the specified number of calendars. This parameter is supported only when you omit the calendar identifier.

  • size number

    Specifies the maximum number of calendars to obtain. This parameter is supported only when you omit the calendar identifier.

application/json

Body

  • page object
    Hide page attributes Show page attributes object
    • from number

      Skips the specified number of items.

    • size number

      Specifies the maximum number of items to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • calendars array[object] Required
      Hide calendars attributes Show calendars attributes object
    • count number Required
GET /_ml/calendars/{calendar_id}
curl \
 --request GET 'http://api.example.com/_ml/calendars/{calendar_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"page":{"from":42.0,"size":42.0}}'
























Get datafeeds configuration info Added in 5.5.0

GET /_ml/datafeeds/{datafeed_id}

You can get information for multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can get information for all datafeeds by using _all, by specifying * as the <feed_id>, or by omitting the <feed_id>. This API returns a maximum of 10,000 datafeeds.

Path parameters

  • datafeed_id string | array[string] Required

    Identifier for the datafeed. It can be a datafeed identifier or a wildcard expression. If you do not specify one of these options, the API returns information about all datafeeds.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no datafeeds that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

  • Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • datafeeds array[object] Required
      Hide datafeeds attributes Show datafeeds attributes object
      • Hide authorization attributes Show authorization attributes object
        • api_key object
          Hide api_key attributes Show api_key attributes object
          • id string Required

            The identifier for the API key.

          • name string Required

            The name of the API key.

        • roles array[string]

          If a user ID was used for the most recent update to the datafeed, its roles at the time of the update are listed in the response.

        • If a service account was used for the most recent update to the datafeed, the account name is listed in the response.

      • Hide chunking_config attributes Show chunking_config attributes object
        • mode string Required

          Values are auto, manual, or off.

        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • datafeed_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • indices array[string] Required
      • indexes array[string]
      • job_id string Required
      • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

      • Hide script_fields attribute Show script_fields attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • script object Required
            Hide script attributes Show script attributes object
            • id string
            • params object

              Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            • options object
      • Hide delayed_data_check_config attributes Show delayed_data_check_config attributes object
        • A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • enabled boolean Required

          Specifies whether the datafeed periodically checks for delayed data.

      • Hide runtime_mappings attribute Show runtime_mappings attribute object
        • * object Additional properties
          Hide * attributes Show * attributes object
          • fields object

            For type composite

            Hide fields attribute Show fields attribute object
            • * object Additional properties
          • fetch_fields array[object]

            For type lookup

          • format string

            A custom format for date type runtime fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

          • script object
            Hide script attributes Show script attributes object
            • id string
            • params object

              Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.

            • options object
          • type string Required

            Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

      • Hide indices_options attributes Show indices_options attributes object
        • If false, the request returns an error if any wildcard expression, index alias, or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar.

        • expand_wildcards string | array[string]
        • If true, missing or closed indices are not included in the response.

        • If true, concrete, expanded or aliased indices are ignored when frozen.

      • query object Required

        The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. By default, this property has the following value: {"match_all": {"boost": 1}}.

        Query DSL
GET /_ml/datafeeds/{datafeed_id}
curl \
 --request GET 'http://api.example.com/_ml/datafeeds/{datafeed_id}' \
 --header "Authorization: $API_KEY"




Delete a datafeed Added in 5.4.0

DELETE /_ml/datafeeds/{datafeed_id}

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

Query parameters

  • force boolean

    Use to forcefully delete a started datafeed; this method is quicker than stopping and deleting the datafeed.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/datafeeds/{datafeed_id}
curl \
 --request DELETE 'http://api.example.com/_ml/datafeeds/{datafeed_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting a datafeed.
{
  "acknowledged": true
}

Delete expired ML data Added in 5.4.0

DELETE /_ml/_delete_expired_data/{job_id}

Delete all job results, model snapshots and forecast data that have exceeded their retention days period. Machine learning state documents that are not associated with any job are also deleted. You can limit the request to a single or set of anomaly detection jobs by using a job identifier, a group name, a comma-separated list of jobs, or a wildcard expression. You can delete expired data for all anomaly detection jobs by using _all, by specifying * as the <job_id>, or by omitting the <job_id>.

Path parameters

  • job_id string Required

    Identifier for an anomaly detection job. It can be a job identifier, a group name, or a wildcard expression.

Query parameters

  • The desired requests per second for the deletion processes. The default behavior is no throttling.

  • timeout string

    How long can the underlying delete processes run until they are canceled.

application/json

Body

  • The desired requests per second for the deletion processes. The default behavior is no throttling.

  • timeout string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
DELETE /_ml/_delete_expired_data/{job_id}
curl \
 --request DELETE 'http://api.example.com/_ml/_delete_expired_data/{job_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"requests_per_second":42.0,"timeout":"string"}'
Response examples (200)
A successful response when deleting expired and unused anomaly detection data.
{
  "deleted": true
}




Get filters Added in 5.5.0

GET /_ml/filters/{filter_id}

You can get a single filter or all filters.

Path parameters

  • filter_id string | array[string] Required

    A string that uniquely identifies a filter.

Query parameters

  • from number

    Skips the specified number of filters.

  • size number

    Specifies the maximum number of filters to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • filters array[object] Required
      Hide filters attributes Show filters attributes object
      • A description of the filter.

      • filter_id string Required
      • items array[string] Required

        An array of strings which is the filter item list.

GET /_ml/filters/{filter_id}
curl \
 --request GET 'http://api.example.com/_ml/filters/{filter_id}' \
 --header "Authorization: $API_KEY"

Create a filter Added in 5.4.0

PUT /_ml/filters/{filter_id}

A filter contains a list of strings. It can be used by one or more anomaly detection jobs. Specifically, filters are referenced in the custom_rules property of detector configuration objects.

Path parameters

  • filter_id string Required

    A string that uniquely identifies a filter.

application/json

Body Required

  • A description of the filter.

  • items array[string]

    The items of the filter. A wildcard * can be used at the beginning or the end of an item. Up to 10000 items are allowed in each filter.

Responses

PUT /_ml/filters/{filter_id}
curl \
 --request PUT 'http://api.example.com/_ml/filters/{filter_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"description":"string","items":["string"]}'

Delete a filter Added in 5.4.0

DELETE /_ml/filters/{filter_id}

If an anomaly detection job references the filter, you cannot delete the filter. You must update or delete the job before you can delete the filter.

Path parameters

  • filter_id string Required

    A string that uniquely identifies a filter.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/filters/{filter_id}
curl \
 --request DELETE 'http://api.example.com/_ml/filters/{filter_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting a filter.
{
  "acknowledged": true
}




Delete forecasts from a job Added in 6.5.0

DELETE /_ml/anomaly_detectors/{job_id}/_forecast

By default, forecasts are retained for 14 days. You can specify a different retention period with the expires_in parameter in the forecast jobs API. The delete forecast API enables you to delete one or more forecasts before they expire.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • Specifies whether an error occurs when there are no forecasts. In particular, if this parameter is set to false and there are no forecasts associated with the job, attempts to delete all forecasts return an error.

  • timeout string

    Specifies the period of time to wait for the completion of the delete operation. When this period of time elapses, the API fails and returns an error.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/anomaly_detectors/{job_id}/_forecast
curl \
 --request DELETE 'http://api.example.com/_ml/anomaly_detectors/{job_id}/_forecast' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting a forecast from an anomaly detection job.
{
  "acknowledged": true
}




















Get model snapshots info Added in 5.4.0

POST /_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

  • snapshot_id string Required

    A numerical character string that uniquely identifies the model snapshot. You can get information for multiple snapshots by using a comma-separated list or a wildcard expression. You can get all snapshots by using _all, by specifying * as the snapshot ID, or by omitting the snapshot ID.

Query parameters

  • desc boolean

    If true, the results are sorted in descending order.

  • end string | number

    Returns snapshots with timestamps earlier than this time.

  • from number

    Skips the specified number of snapshots.

  • size number

    Specifies the maximum number of snapshots to obtain.

  • sort string

    Specifies the sort field for the requested snapshots. By default, the snapshots are sorted by their timestamp.

  • start string | number

    Returns snapshots with timestamps after this time.

application/json

Body

  • desc boolean

    Refer to the description for the desc query parameter.

  • end string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • page object
    Hide page attributes Show page attributes object
    • from number

      Skips the specified number of items.

    • size number

      Specifies the maximum number of items to obtain.

  • sort string

    Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

  • start string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

Responses

POST /_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"desc":true,"":"string","page":{"from":42.0,"size":42.0},"sort":"string"}'

Delete a model snapshot Added in 5.4.0

DELETE /_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}

You cannot delete the active model snapshot. To delete that snapshot, first revert to a different one. To identify the active model snapshot, refer to the model_snapshot_id in the results from the get jobs API.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

  • snapshot_id string Required

    Identifier for the model snapshot.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
    • acknowledged boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

DELETE /_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}
curl \
 --request DELETE 'http://api.example.com/_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}' \
 --header "Authorization: $API_KEY"
Response examples (200)
A successful response when deleting an existing model snapshot.
{
  "acknowledged": true
}








































Get anomaly detection job results for categories Added in 5.4.0

GET /_ml/anomaly_detectors/{job_id}/results/categories/{category_id}

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

  • category_id string Required

    Identifier for the category, which is unique in the job. If you specify neither the category ID nor the partition_field_value, the API returns information about all categories. If you specify only the partition_field_value, it returns information about all categories for the specified partition.

Query parameters

  • from number

    Skips the specified number of categories.

  • Only return categories for the specified partition.

  • size number

    Specifies the maximum number of categories to obtain.

application/json

Body

  • page object
    Hide page attributes Show page attributes object
    • from number

      Skips the specified number of items.

    • size number

      Specifies the maximum number of items to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • categories array[object] Required
      Hide categories attributes Show categories attributes object
      • category_id number Required
      • examples array[string] Required

        A list of examples of actual values that matched the category.

      • job_id string Required
      • max_matching_length number Required
      • If per-partition categorization is enabled, this property identifies the field used to segment the categorization. It is not present when per-partition categorization is disabled.

      • If per-partition categorization is enabled, this property identifies the value of the partition_field_name for the category. It is not present when per-partition categorization is disabled.

      • regex string Required

        A regular expression that is used to search for values that match the category.

      • terms string Required

        A space separated list of the common tokens that are matched in values of the category.

      • The number of messages that have been matched by this category. This is only guaranteed to have the latest accurate count after a job _flush or _close

      • A list of category_id entries that this current category encompasses. Any new message that is processed by the categorizer will match against this category and not any of the categories in this list. This is only guaranteed to have the latest accurate list of categories after a job _flush or _close

      • p string
      • result_type string Required
      • mlcategory string Required
    • count number Required
GET /_ml/anomaly_detectors/{job_id}/results/categories/{category_id}
curl \
 --request GET 'http://api.example.com/_ml/anomaly_detectors/{job_id}/results/categories/{category_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"page":{"from":42.0,"size":42.0}}'

Get anomaly detection job results for categories Added in 5.4.0

POST /_ml/anomaly_detectors/{job_id}/results/categories/{category_id}

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

  • category_id string Required

    Identifier for the category, which is unique in the job. If you specify neither the category ID nor the partition_field_value, the API returns information about all categories. If you specify only the partition_field_value, it returns information about all categories for the specified partition.

Query parameters

  • from number

    Skips the specified number of categories.

  • Only return categories for the specified partition.

  • size number

    Specifies the maximum number of categories to obtain.

application/json

Body

  • page object
    Hide page attributes Show page attributes object
    • from number

      Skips the specified number of items.

    • size number

      Specifies the maximum number of items to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • categories array[object] Required
      Hide categories attributes Show categories attributes object
      • category_id number Required
      • examples array[string] Required

        A list of examples of actual values that matched the category.

      • job_id string Required
      • max_matching_length number Required
      • If per-partition categorization is enabled, this property identifies the field used to segment the categorization. It is not present when per-partition categorization is disabled.

      • If per-partition categorization is enabled, this property identifies the value of the partition_field_name for the category. It is not present when per-partition categorization is disabled.

      • regex string Required

        A regular expression that is used to search for values that match the category.

      • terms string Required

        A space separated list of the common tokens that are matched in values of the category.

      • The number of messages that have been matched by this category. This is only guaranteed to have the latest accurate count after a job _flush or _close

      • A list of category_id entries that this current category encompasses. Any new message that is processed by the categorizer will match against this category and not any of the categories in this list. This is only guaranteed to have the latest accurate list of categories after a job _flush or _close

      • p string
      • result_type string Required
      • mlcategory string Required
    • count number Required
POST /_ml/anomaly_detectors/{job_id}/results/categories/{category_id}
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/{job_id}/results/categories/{category_id}' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"page":{"from":42.0,"size":42.0}}'












Get datafeed stats Added in 5.5.0

GET /_ml/datafeeds/_stats

You can get statistics for multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can get statistics for all datafeeds by using _all, by specifying * as the <feed_id>, or by omitting the <feed_id>. If the datafeed is stopped, the only information you receive is the datafeed_id and the state. This API returns a maximum of 10,000 datafeeds.

Query parameters

  • Specifies what to do when the request:

    1. Contains wildcard expressions and there are no datafeeds that match.
    2. Contains the _all string or no identifiers and there are no matches.
    3. Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
GET /_ml/datafeeds/_stats
curl \
 --request GET 'http://api.example.com/_ml/datafeeds/_stats' \
 --header "Authorization: $API_KEY"








Get anomaly detection job results for influencers Added in 5.4.0

GET /_ml/anomaly_detectors/{job_id}/results/influencers

Influencers are the entities that have contributed to, or are to blame for, the anomalies. Influencer results are available only if an influencer_field_name is specified in the job configuration.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • desc boolean

    If true, the results are sorted in descending order.

  • end string | number

    Returns influencers with timestamps earlier than this time. The default value means it is unset and results are not limited to specific timestamps.

  • If true, the output excludes interim results. By default, interim results are included.

  • Returns influencers with anomaly scores greater than or equal to this value.

  • from number

    Skips the specified number of influencers.

  • size number

    Specifies the maximum number of influencers to obtain.

  • sort string

    Specifies the sort field for the requested influencers. By default, the influencers are sorted by the influencer_score value.

  • start string | number

    Returns influencers with timestamps after this time. The default value means it is unset and results are not limited to specific timestamps.

application/json

Body

  • page object
    Hide page attributes Show page attributes object
    • from number

      Skips the specified number of items.

    • size number

      Specifies the maximum number of items to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • influencers array[object] Required

      Array of influencer objects

      Hide influencers attributes Show influencers attributes object
      • Time unit for seconds

      • influencer_score number Required

        A normalized score between 0-100, which is based on the probability of the influencer in this bucket aggregated across detectors. Unlike initial_influencer_score, this value is updated by a re-normalization process as new data is analyzed.

      • influencer_field_name string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • influencer_field_value string Required

        The entity that influenced, contributed to, or was to blame for the anomaly.

      • A normalized score between 0-100, which is based on the probability of the influencer aggregated across detectors. This is the initial value that was calculated at the time the bucket was processed.

      • is_interim boolean Required

        If true, this is an interim result. In other words, the results are calculated based on partial input data.

      • job_id string Required
      • probability number Required

        The probability that the influencer has this behavior, in the range 0 to 1. This value can be held to a high precision of over 300 decimal places, so the influencer_score is provided as a human-readable and friendly interpretation of this value.

      • result_type string Required

        Internal. This value is always set to influencer.

      • Time unit for milliseconds

      • foo string

        Additional influencer properties are added, depending on the fields being analyzed. For example, if it’s analyzing user_name as an influencer, a field user_name is added to the result document. This information enables you to filter the anomaly results more easily.

GET /_ml/anomaly_detectors/{job_id}/results/influencers
curl \
 --request GET 'http://api.example.com/_ml/anomaly_detectors/{job_id}/results/influencers' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"page":{"from":42.0,"size":42.0}}'

Get anomaly detection job results for influencers Added in 5.4.0

POST /_ml/anomaly_detectors/{job_id}/results/influencers

Influencers are the entities that have contributed to, or are to blame for, the anomalies. Influencer results are available only if an influencer_field_name is specified in the job configuration.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • desc boolean

    If true, the results are sorted in descending order.

  • end string | number

    Returns influencers with timestamps earlier than this time. The default value means it is unset and results are not limited to specific timestamps.

  • If true, the output excludes interim results. By default, interim results are included.

  • Returns influencers with anomaly scores greater than or equal to this value.

  • from number

    Skips the specified number of influencers.

  • size number

    Specifies the maximum number of influencers to obtain.

  • sort string

    Specifies the sort field for the requested influencers. By default, the influencers are sorted by the influencer_score value.

  • start string | number

    Returns influencers with timestamps after this time. The default value means it is unset and results are not limited to specific timestamps.

application/json

Body

  • page object
    Hide page attributes Show page attributes object
    • from number

      Skips the specified number of items.

    • size number

      Specifies the maximum number of items to obtain.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • count number Required
    • influencers array[object] Required

      Array of influencer objects

      Hide influencers attributes Show influencers attributes object
      • Time unit for seconds

      • influencer_score number Required

        A normalized score between 0-100, which is based on the probability of the influencer in this bucket aggregated across detectors. Unlike initial_influencer_score, this value is updated by a re-normalization process as new data is analyzed.

      • influencer_field_name string Required

        Path to field or array of paths. Some API's support wildcards in the path to select multiple fields.

      • influencer_field_value string Required

        The entity that influenced, contributed to, or was to blame for the anomaly.

      • A normalized score between 0-100, which is based on the probability of the influencer aggregated across detectors. This is the initial value that was calculated at the time the bucket was processed.

      • is_interim boolean Required

        If true, this is an interim result. In other words, the results are calculated based on partial input data.

      • job_id string Required
      • probability number Required

        The probability that the influencer has this behavior, in the range 0 to 1. This value can be held to a high precision of over 300 decimal places, so the influencer_score is provided as a human-readable and friendly interpretation of this value.

      • result_type string Required

        Internal. This value is always set to influencer.

      • Time unit for milliseconds

      • foo string

        Additional influencer properties are added, depending on the fields being analyzed. For example, if it’s analyzing user_name as an influencer, a field user_name is added to the result document. This information enables you to filter the anomaly results more easily.

POST /_ml/anomaly_detectors/{job_id}/results/influencers
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/{job_id}/results/influencers' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"page":{"from":42.0,"size":42.0}}'












Get anomaly detection job model snapshot upgrade usage info Added in 7.16.0

GET /_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}/_upgrade/_stats

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

  • snapshot_id string Required

    A numerical character string that uniquely identifies the model snapshot. You can get information for multiple snapshots by using a comma-separated list or a wildcard expression. You can get all snapshots by using _all, by specifying * as the snapshot ID, or by omitting the snapshot ID.

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no jobs that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    The default value is true, which returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.

Responses

GET /_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}/_upgrade/_stats
curl \
 --request GET 'http://api.example.com/_ml/anomaly_detectors/{job_id}/model_snapshots/{snapshot_id}/_upgrade/_stats' \
 --header "Authorization: $API_KEY"
























Open anomaly detection jobs Added in 5.4.0

POST /_ml/anomaly_detectors/{job_id}/_open

An anomaly detection job must be opened to be ready to receive and analyze data. It can be opened and closed multiple times throughout its lifecycle. When you open a new job, it starts with an empty model. When you open an existing job, the most recent model state is automatically loaded. The job is ready to resume its analysis from where it left off, once new data is received.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job.

Query parameters

  • timeout string

    Controls the time to wait until a job has opened.

application/json

Body

  • timeout string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
POST /_ml/anomaly_detectors/{job_id}/_open
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/{job_id}/_open' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '"{\n  \"timeout\": \"35m\"\n}"'
Request example
A request to open anomaly detection jobs. The timeout specifies to wait 35 minutes for the job to open.
{
  "timeout": "35m"
}
Response examples (200)
A successful response when opening an anomaly detection job.
{
  "opened": true,
  "node": "node-1"
}

Send data to an anomaly detection job for analysis Deprecated Added in 5.4.0

POST /_ml/anomaly_detectors/{job_id}/_data

IMPORTANT: For each job, data can be accepted from only a single connection at a time. It is not currently possible to post data to multiple jobs using wildcards or a comma-separated list.

Path parameters

  • job_id string Required

    Identifier for the anomaly detection job. The job must have a state of open to receive and process the data.

Query parameters

  • reset_end string | number

    Specifies the end of the bucket resetting range.

  • reset_start string | number

    Specifies the start of the bucket resetting range.

application/json

Body Required

object object

Responses

POST /_ml/anomaly_detectors/{job_id}/_data
curl \
 --request POST 'http://api.example.com/_ml/anomaly_detectors/{job_id}/_data' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '[{}]'
























Start datafeeds Added in 5.5.0

POST /_ml/datafeeds/{datafeed_id}/_start

A datafeed must be started in order to retrieve data from Elasticsearch. A datafeed can be started and stopped multiple times throughout its lifecycle.

Before you can start a datafeed, the anomaly detection job must be open. Otherwise, an error occurs.

If you restart a stopped datafeed, it continues processing input data from the next millisecond after it was stopped. If new data was indexed for that exact millisecond between stopping and starting, it will be ignored.

When Elasticsearch security features are enabled, your datafeed remembers which roles the last user to create or update it had at the time of creation or update and runs the query using those same roles. If you provided secondary authorization headers when you created or updated the datafeed, those credentials are used instead.

Path parameters

  • datafeed_id string Required

    A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

Query parameters

  • end string | number

    The time that the datafeed should end, which can be specified by using one of the following formats:

    • ISO 8601 format with milliseconds, for example 2017-01-22T06:00:00.000Z
    • ISO 8601 format without milliseconds, for example 2017-01-22T06:00:00+00:00
    • Milliseconds since the epoch, for example 1485061200000

    Date-time arguments using either of the ISO 8601 formats must have a time zone designator, where Z is accepted as an abbreviation for UTC time. When a URL is expected (for example, in browsers), the + used in time zone designators must be encoded as %2B. The end time value is exclusive. If you do not specify an end time, the datafeed runs continuously.

  • start string | number

    The time that the datafeed should begin, which can be specified by using the same formats as the end parameter. This value is inclusive. If you do not specify a start time and the datafeed is associated with a new anomaly detection job, the analysis starts from the earliest time for which data is available. If you restart a stopped datafeed and specify a start value that is earlier than the timestamp of the latest processed record, the datafeed continues from 1 millisecond after the timestamp of the latest processed record.

  • timeout string

    Specifies the amount of time to wait until a datafeed starts.

application/json

Body

  • end string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • start string | number

    A date and time, either as a string whose format can depend on the context (defaulting to ISO 8601), or a number of milliseconds since the Epoch. Elasticsearch accepts both as input, but will generally output a string representation.

  • timeout string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

Responses

  • 200 application/json
    Hide response attributes Show response attributes object
    • node string | array[string] Required
    • started boolean Required

      For a successful response, this value is always true. On failure, an exception is returned instead.

POST /_ml/datafeeds/{datafeed_id}/_start
curl \
 --request POST 'http://api.example.com/_ml/datafeeds/{datafeed_id}/_start' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"":"string","timeout":"string"}'

Stop datafeeds Added in 5.4.0

POST /_ml/datafeeds/{datafeed_id}/_stop

A datafeed that is stopped ceases to retrieve data from Elasticsearch. A datafeed can be started and stopped multiple times throughout its lifecycle.

Path parameters

  • datafeed_id string Required

    Identifier for the datafeed. You can stop multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can close all datafeeds by using _all or by specifying * as the identifier.

Query parameters

  • Specifies what to do when the request:

    • Contains wildcard expressions and there are no datafeeds that match.
    • Contains the _all string or no identifiers and there are no matches.
    • Contains wildcard expressions and there are only partial matches.

    If true, the API returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.

  • force boolean

    If true, the datafeed is stopped forcefully.

  • timeout string

    Specifies the amount of time to wait until a datafeed stops.

application/json

Body

  • Refer to the description for the allow_no_match query parameter.

  • force boolean

    Refer to the description for the force query parameter.

  • timeout string

    A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

Responses

  • 200 application/json
    Hide response attribute Show response attribute object
POST /_ml/datafeeds/{datafeed_id}/_stop
curl \
 --request POST 'http://api.example.com/_ml/datafeeds/{datafeed_id}/_stop' \
 --header "Authorization: $API_KEY" \
 --header "Content-Type: application/json" \
 --data '{"allow_no_match":true,"force":true,"timeout":"string"}'