Elasticsearch Tutorial
In this brief tutorial, we will be explaining the basics of Elasticsearch and its features.
Audience
This tutorial is designed for software professionals who want to learn the basics of
Elasticsearch and its programming concepts in simple and easy steps. It describes the
components of Elasticsearch with suitable examples.
Prerequisites
You should have a basic understanding of Java, JSON, search engines, and web
technologies. Interaction with Elasticsearch happens through a RESTful API, so
familiarity with RESTful APIs is recommended.
All the content and graphics published in this e-book are the property of Tutorials Point (I)
Pvt. Ltd. The user of this e-book is prohibited to reuse, retain, copy, distribute or republish
any contents or a part of contents of this e-book in any manner without written consent
of the publisher.
We strive to update the contents of our website and tutorials as timely and as precisely as
possible, however, the contents may contain inaccuracies or errors. Tutorials Point (I) Pvt.
Ltd. provides no guarantee regarding the accuracy, timeliness or completeness of our
website or its contents including this tutorial. If you discover any errors on our website or
in this tutorial, please notify us at [email protected]
Elasticsearch
Table of Contents
About the Tutorial
Audience
Prerequisites
Wildcards ( * , + , - )
allow_no_indices
expand_wildcards
Versioning
Timeout
Multi-Index
Multi-Type
Index Settings
Analyze
Flush
Refresh
Node Stats
Nodes hot_threads
Match query
Analyzers
Tokenizers
Token Filters
Discovery
Gateway
HTTP
Indices
Node
Assertions
1. Elasticsearch Basic Concepts
Elasticsearch is a real-time, distributed, open source full-text search and analytics
engine. It is accessible through a RESTful web service interface and uses schema-less JSON
(JavaScript Object Notation) documents to store data. It is built on the Java programming
language, which enables Elasticsearch to run on different platforms. It enables users to
explore very large amounts of data at very high speed.
Elasticsearch is open source and available under the Apache license version 2.0.
networking application, and then there can be a specific type for user profile data,
another type for messaging data and another for comments data.
Shard: Indexes are horizontally subdivided into shards. This means each shard
contains all the properties of a document, but holds fewer JSON objects
than the whole index. The horizontal separation makes each shard an independent unit, which
can be stored on any node. A primary shard is the original horizontal part of an index,
and these primary shards are then replicated into replica shards.
Replicas: Elasticsearch allows a user to create replicas of their indexes and shards.
Replication not only helps in increasing the availability of data in case of failure,
but also improves the performance of searching by carrying out a parallel search
operation in these replicas.
Elasticsearch Advantages
Elasticsearch is developed in Java, which makes it compatible with almost every
platform.
Elasticsearch is real time; in other words, an added document becomes
searchable in this engine after about one second.
Elasticsearch is distributed, which makes it easy to scale and integrate in any big
organization.
Creating full backups is easy by using the concept of gateway, which is present
in Elasticsearch.
Elasticsearch supports almost every document type except those that do not
support text rendering.
Elasticsearch Disadvantages
Elasticsearch does not support multiple formats for request
and response data (only JSON is possible), unlike Apache Solr, where CSV, XML
and JSON formats are possible.
Elasticsearch can also run into split-brain situations, but only in rare cases.
Elasticsearch    RDBMS
Index            Database
Shard            Shard
Mapping          Table
Field            Field
2. Elasticsearch Installation
Step 1: Check the version of Java installed on your computer; it should be
Java 7 or later. You can check by doing the following:
$ echo $JAVA_HOME
For Red Hat and other Linux distributions, download the RPM file.
APT and Yum utilities can also be used to install Elasticsearch on many Linux
distributions.
Step 3: The installation process for Elasticsearch is very easy and is described below for
different operating systems:
Windows OS: Unzip the zip package and Elasticsearch is installed.
UNIX OS: Extract the tar file in any location and Elasticsearch is installed.
$ echo "deb http://packages.elastic.co/elasticsearch/2.x/debian stable main" | sudo tee -a /etc/apt/sources.list.d/elasticsearch-2.x.list
o Run update:
o Add the text below to a file with the .repo suffix in your
/etc/yum.repos.d/ directory, for example elasticsearch.repo:
[elasticsearch-2.x]
name=Elasticsearch repository for 2.x packages
baseurl=http://packages.elastic.co/elasticsearch/2.x/centos
gpgcheck=1
gpgkey=http://packages.elastic.co/GPG-KEY-elasticsearch
enabled=1
Step 4: Go to the Elasticsearch home directory and then into the bin folder. On
Windows, run the elasticsearch.bat file, either directly or from the command prompt;
on UNIX, run the elasticsearch file from a terminal.
In Windows:
> cd elasticsearch-2.1.0/bin
> elasticsearch
In Linux:
$ cd elasticsearch-2.1.0/bin
$ ./elasticsearch
Note: On Windows, you might get an error stating that JAVA_HOME is not set. In that case, set
it in the environment variables to C:\Program Files\Java\jre1.8.0_31 or to the location where
you installed Java.
Step 5: The default port for the Elasticsearch web interface is 9200. You can change it by
changing http.port in the elasticsearch.yml file present in the config directory. You can check whether
the server is up and running by browsing to http://localhost:9200. It will return a JSON
object that contains information about the installed Elasticsearch, in the following
way:
{
"name" : "Brain-Child",
"cluster_name" : "elasticsearch",
"version" : {
"number" : "2.1.0",
"build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87",
"build_timestamp" : "2015-11-18T22:40:03Z",
"build_snapshot" : false,
"lucene_version" : "5.3.1"
},
"tagline" : "You Know, for Search"
}
In the configure window of Fiddler2, you can send a request to the Elasticsearch address,
adding an index and, if you want, the type/mapping as well, using the HTTP POST
method, for example:
Address bar
http://localhost:9200/schools/school
Request body
You can add a JSON object, which will be stored in that index.
You can use the same approach for searching by just adding the _search keyword
at the end of the URL and sending a query in the request body, for example:
Address bar
POST http://localhost:9200/city/schools/_search
Request body
{
"query":{
"match_all":{}
}
}
This query will return everything from that index, which belongs to that
particular type.
You can delete a particular index or type by just putting the URL of the same in
address bar and hit it with HTTP DELETE method.
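The same search request can also be assembled in code instead of a REST client. The following is a minimal sketch using only Python's standard library; the helper name build_search_request and its default base URL are assumptions made for illustration, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_search_request(index, doc_type, base_url="http://localhost:9200"):
    """Build (but do not send) a match_all search request for one index/type."""
    url = "{}/{}/{}/_search".format(base_url, index, doc_type)
    body = json.dumps({"query": {"match_all": {}}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


request = build_search_request("city", "schools")
print(request.get_method(), request.full_url)
# Sending it needs a running node: urllib.request.urlopen(request)
```

Keeping the construction separate from the network call makes it easy to inspect the URL and body before anything is sent to the server.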
3. Elasticsearch Populate
In this section, we will add some index, mapping and data to Elasticsearch. This data will
be used in the examples explained in this tutorial.
Create Index
POST http://localhost:9200/schools
Request Body
It can contain index specific settings, but for now, it is empty for default settings.
Response
{"acknowledged": true}
POST http://localhost:9200/schools/_bulk
Request Body
{"index":{"_index":"schools", "_type":"school", "_id":"1"}}
{"name":"Central School", "description":"CBSE Affiliation", "street":"Nagan",
"city":"paprola", "state":"HP", "zip":"176115",
"location":[31.8955385,76.8380405], "fees":2000,
"tags":["Senior Secondary", "beautiful campus"],"rating":"3.5"}
"zip":"176114","location":[26.8535922,75.7923988],"fees":2500,"tags":["Well
equipped labs"],"rating":"4.5"}
Response
{"took":328,"errors":false,"items":[{"index":{"_index":"schools","_type":"schoo
l","_id":"1","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"stat
us":201}},{"index":{"_index":"schools","_type":"school","_id":"2","_version":1,
"_shards":{"total":2,"successful":1,"failed":0},"status":201}},{"index":{"_inde
x":"schools","_type":"school","_id":"3","_version":1,"_shards":{"total":2,"succ
essful":1,"failed":0},"status":201}}]}
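The bulk request body is newline-delimited: each document is preceded by an action line, and the body ends with a newline. The sketch below shows one way such a body could be generated; the function name build_bulk_body and the shortened sample document are illustrative assumptions, not part of the tutorial's data set.

```python
import json


def build_bulk_body(index, doc_type, docs_by_id):
    """Return a _bulk payload: one action line plus one source line per document."""
    lines = []
    for doc_id, source in docs_by_id.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk format is newline-delimited and must end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_body(
    "schools",
    "school",
    {"1": {"name": "Central School", "fees": 2000}},
)
print(body)
```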
Create Index
POST http://localhost:9200/schools_gov
Request Body
It can contain index-specific settings, but for now it is empty for default settings.
Response
{"acknowledged": true} (This means index is created)
Request Body
{"index":{"_index":"schools_gov", "_type":"school", "_id":"1"}}
{"name":"Model School", "description":"CBSE Affiliation",
"street":"silk city", "city":"Hyderabad", "state":"AP", "zip":"500030",
"location":[17.3903703,78.4752129], "fees":200,
"tags":["Senior Secondary", "beautiful campus"],"rating":"3"}
Response
{"took":179,"errors":false,"items":[{"index":{"_index":"schools_gov","_type":"s
chool","_id":"1","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"
status":201}},{"index":{"_index":"schools_gov","_type":"school","_id":"2","_ver
sion":1,"_shards":{"total":2,"successful":1,"failed":0},"status":201}}]}
4. Elasticsearch Migration between Versions
In any system or software, when we upgrade to a newer version, we need to follow a
few steps to maintain the application settings, configurations, data and other things. These
steps are required to keep the application stable on the new system and to maintain the integrity
of data (prevent data from getting corrupted).
Test the upgraded version in your non-production environments, such as the UAT, E2E,
SIT or DEV environment.
We can upgrade using a full cluster restart or a rolling upgrade. Rolling upgrade is for
new versions (2.x and newer). There is no service outage when you use
the rolling upgrade method for migration.
Take data backup before migration and follow the instructions to carry out the
backup process. The snapshot and restore module can be used to take backup. This
module can be used to take a snapshot of index or full cluster and can be stored in
remote repository.
PUT /_snapshot/backup1
{
"type": "fs",
"settings": {
... repository settings ...
}
}
We use a shared file system (type: fs) for backup; it needs to be registered on every master
and data node. We just need to add the path.repo setting, with the path of the backup
repository as its value.
After we add the repository path, we need to restart the nodes, and then registration can
be carried out by executing the following command:
PUT http://localhost:9200/_snapshot/backup1
{
"type": "fs",
"settings": {
"location": "/mount/backups/backup1",
"compress": true
}
}
Step 1: Disable shard allocation process and turn off the node.
PUT http://localhost:9200/_cluster/settings
{
"persistent": {
"cluster.routing.allocation.enable": "none"
}
}
PUT http://localhost:9200/_cluster/settings
{
"persistent": {
"cluster.routing.allocation.disable_allocation": true,
"cluster.routing.allocation.enable": "none"
}
}
POST http://localhost:9200/_flush/synced
On a Debian or Red Hat node: rpm or dpkg can be used to upgrade the node by
installing new packages. Do not overwrite config files.
On Windows (zip file) or UNIX (tar file): Extract the new version without
overwriting the config directory. You can copy the files from the old installation or
change path.conf or path.data.
Step 5: Start the nodes again, beginning with the master nodes (nodes with node.master
set to true and node.data set to false) in the cluster. Wait for some time for the
cluster to form. You can check by monitoring the logs or using the following request:
Step 6: Monitor the progress of cluster formation by using the GET _cat/health request
and wait for the yellow status in the response; the response will be something like this:
Step 7: Enable the shard allocation process, which was disabled in Step 1, by using the
following request:
PUT http://localhost:9200/_cluster/settings
{
"persistent": {
"cluster.routing.allocation.enable": "all"
}
}
PUT http://localhost:9200/_cluster/settings
{
"persistent": {
"cluster.routing.allocation.disable_allocation": false,
"cluster.routing.allocation.enable": "all"
}
}
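The requests in Step 1 and Step 7 differ only in the value of cluster.routing.allocation.enable. As a rough sketch, the two bodies could be generated by one helper; the function name allocation_settings_body is an assumption for illustration, and only the newer-style setting is covered.

```python
import json


def allocation_settings_body(enabled):
    """Return the JSON body that switches shard allocation on ("all") or off ("none")."""
    value = "all" if enabled else "none"
    return json.dumps({"persistent": {"cluster.routing.allocation.enable": value}})


# Body for Step 1 (before nodes are shut down) and Step 7 (after the cluster re-forms).
disable_body = allocation_settings_body(False)
enable_body = allocation_settings_body(True)
print(disable_body)
print(enable_body)
```

Either body would be sent with PUT to the _cluster/settings endpoint shown in the steps.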
Rolling Upgrades
It is the same as the full cluster restart, except for Step 3. Here, you stop one node and upgrade it.
After upgrading, restart the node and repeat these steps for all the nodes. After enabling the
shard allocation process, it can be monitored by the following request:
GET http://localhost:9200/_cat/recovery
5. Elasticsearch API Conventions
Elasticsearch provides a REST API, which is accessed by JSON over HTTP. Elasticsearch
uses the following conventions:
Multiple Indices
Most API operations, mainly search, can target one or more
indices. This helps the user to search in multiple places, or in all the available data,
by executing a query just once. Many different notations are used to perform operations
on multiple indices. We will discuss a few of them in this section.
Request Body
{
"query":{
"query_string":{
"query":"any_string"
}
}
}
Response
JSON objects from index1, index2, index3 having any_string in it.
Request Body
{
"query":{
"query_string":{
"query":"any_string"
}
}
}
Response
JSON objects from all indices and having any_string in it.
Wildcards ( * , + , - )
POST http://localhost:9200/school*/_search
Request Body
{
"query":{
"query_string":{
"query":"CBSE"
}
}
}
Response
JSON objects from all indices which start with school having CBSE in it.
Request Body
{
"query":{
"query_string":{
"query":"CBSE"
}
}
}
Response
JSON objects from all indices which start with school but not from schools_gov and
having CBSE in it.
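To make the include/exclude behaviour concrete, the sketch below models how a comma-separated index expression with wildcards might be resolved against a list of index names. It is a simplified client-side model written for illustration, not Elasticsearch's own resolver; the function name resolve_indices and the sample index list are assumptions.

```python
from fnmatch import fnmatchcase


def resolve_indices(expression, existing):
    """Resolve a comma-separated index expression against known index names."""
    selected = []
    for part in expression.split(","):
        exclude = part.startswith("-")
        pattern = part.lstrip("+-")
        matches = [name for name in existing if fnmatchcase(name, pattern)]
        if exclude:
            # A leading "-" removes matching indices from the selection so far.
            selected = [name for name in selected if name not in matches]
        else:
            # No prefix or a leading "+" adds matching indices.
            selected.extend(name for name in matches if name not in selected)
    return selected


indices = ["schools", "schools_gov", "city"]
print(resolve_indices("school*", indices))
print(resolve_indices("school*,-schools_gov", indices))
```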
POST http://localhost:9200/school*,book_shops/_search
Request Body
{
"query":{
"query_string":{
"query":"CBSE"
}
}
}
Response
{"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such
index","resource.type":"index_or_alias","resource.id":"book_shops","index":"boo
k_shops"}],"type":"index_not_found_exception","reason":"no such
index","resource.type":"index_or_alias","resource.id":"book_shops","index":"boo
k_shops"},"status":404}
POST http://localhost:9200/school*,book_shops/_search?ignore_unavailable=true
Request Body
{
"query":{
"query_string":{
"query":"CBSE"
}
}
}
allow_no_indices
A true value for this parameter prevents an error if a URL with a wildcard resolves to no indices.
POST http://localhost:9200/schools_pri*/_search?allow_no_indices=true
Request Body
{
"query":{
"match_all":{}
}
}
expand_wildcards
This parameter decides whether wildcards are expanded to open indices,
closed indices, or both. The value of this parameter can be open, closed, none or
all.
POST http://localhost:9200/schools/_close
Response
{"acknowledged":true}
POST http://localhost:9200/school*/_search?expand_wildcards=closed
Request Body
{
"query":{
"match_all":{}
}
}
Response
{"error":{"root_cause":[{"type":"index_closed_exception","reason":"closed","ind
ex":"schools"}],"type":"index_closed_exception","reason":"closed","index":"scho
ols"},"status":403}
<static_name{date_math_expr{date_format|time_zone}}>
http://localhost:9200/<accountdetail-{now-2d{YYYY.MM.dd|utc}}>/_search
static_name is the part of the expression that remains the same in every date math index, like
accountdetail. date_math_expr contains the mathematical expression that determines the
date and time dynamically, like now-2d. date_format contains the format in which the date
is written in the index name, like YYYY.MM.dd. If today's date is 30th December 2015, then
<accountdetail-{now-2d{YYYY.MM.dd}}> will return accountdetail-2015.12.28.
Expression                        Resolves to
<accountdetail-{now-d}>           accountdetail-2015.12.29
<accountdetail-{now-M}>           accountdetail-2015.11.30
<accountdetail-{now{YYYY.MM}}>    accountdetail-2015.12
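The day-based cases can be checked with a few lines of date arithmetic. This is only a sketch of the calculation for day offsets with the YYYY.MM.dd format; the helper name resolve_day_offset is hypothetical, and month offsets and time zones are deliberately left out.

```python
from datetime import date, timedelta


def resolve_day_offset(static_name, today, days_back):
    """Resolve <static_name{now-Nd{YYYY.MM.dd}}> for a given 'today'."""
    target = today - timedelta(days=days_back)
    return static_name + target.strftime("%Y.%m.%d")


today = date(2015, 12, 30)
print(resolve_day_offset("accountdetail-", today, 2))
print(resolve_day_offset("accountdetail-", today, 1))
```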
We will now see some of the common options available in Elasticsearch that can be used
to get the response in a specified format.
Pretty Results
We can get response in a well-formatted JSON object by just appending a URL query
parameter, i.e., pretty=true.
POST http://localhost:9200/schools/_search?pretty=true
Request Body
{
"query":{
"match_all":{}
}
}
Response
..
{
"_index" : "schools",
"_type" : "school",
"_id" : "1",
"_score" : 1.0,
"_source":{"name":"Central School", "description":"CBSE Affiliation",
"street":"Nagan", "city":"paprola", "state":"HP", "zip":"176115","location":
[31.8955385,76.8380405], "fees":2000,"tags":["Senior Secondary", "beautiful
campus"],"rating":"3.5"}
}
.
Response Filtering
We can restrict the response to fewer fields by listing them in the filter_path parameter. For
example,
POST http://localhost:9200/schools/_search?filter_path=hits.total
Request Body
{
"query":{
"match_all":{}
}
}
Response
{"hits":{"total":3}}
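The effect of a dotted path such as hits.total can be pictured as selecting one branch of the response object. The sketch below imitates that on the client side for a single path without wildcards; it illustrates the idea and is not how Elasticsearch implements filtering. The name filter_by_path and the sample response are assumptions.

```python
def filter_by_path(response, path):
    """Keep only the value at a dotted path, preserving the enclosing objects."""
    keys = path.split(".")
    value = response
    for key in keys:
        if not isinstance(value, dict) or key not in value:
            return {}
        value = value[key]
    # Rebuild the nesting from the innermost key outwards.
    for key in reversed(keys):
        value = {key: value}
    return value


sample = {"took": 3, "timed_out": False, "hits": {"total": 3, "max_score": 1.0, "hits": []}}
print(filter_by_path(sample, "hits.total"))
```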
6. Elasticsearch Document APIs
Elasticsearch provides single-document APIs and multi-document APIs, where the API call
targets a single document and multiple documents, respectively.
Index API
It adds or updates a JSON document in an index when a request is made to that
index with a specific mapping. For example, the request below adds the JSON
object to the index schools, under the school mapping.
POST http://localhost:9200/schools/school/4
Request Body
{
"name":"City School",
"description":"ICSE",
"street":"West End", "city":"Meerut", "state":"UP", "zip":"250002",
"location":[28.9926174,77.692485],
"fees":3500,
"tags":["fully computerized"],
"rating":"4.5"
}
Response
{"_index":"schools","_type":"school","_id":"4","_version":1,"_shards":{"total":
2,"successful":1,"failed":0},"created":true}
action.auto_create_index:false
index.mapper.dynamic:false
You can also restrict the auto creation of index, where only index name with specific
patterns are allowed by changing the value of the following parameter:
action.auto_create_index:+acc*,-bank*
Versioning
Elasticsearch also provides version control facility. We can use a version query parameter
to specify the version of a particular document. For example,
POST http://localhost:9200/schools/school/1?version=1
Request Body
{
"name":"Central School", "description":"CBSE Affiliation", "street":"Nagan",
"city":"paprola", "state":"HP", "zip":"176115",
"location":[31.8955385,76.8380405],"fees":2200,
"tags":["Senior Secondary", "beautiful campus"],
"rating":"3.3"
}
Response
{"_index":"schools","_type":"school","_id":"1","_version":2,"_shards":{"total":
2,"successful":1,"failed":0},"created":false}
There are two main types of versioning. Internal versioning is the default; the version
starts at 1 and increments with each update, deletes included. The version number
can also be set externally. To enable this functionality, we need to set version_type to external.
Versioning is a real-time process and is not affected by real-time search operations.
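The internal versioning check can be pictured with a small in-memory model: a write that names a version succeeds only if that version matches the stored one, and every successful write increments the version. The classes below are a toy model written for illustration, not Elasticsearch code; the names VersionedStore and VersionConflict are assumptions.

```python
class VersionConflict(Exception):
    """Raised when the supplied version does not match the stored one."""


class VersionedStore:
    def __init__(self):
        self._docs = {}  # doc_id -> (version, source)

    def index(self, doc_id, source, version=None):
        current = self._docs.get(doc_id, (0, None))[0]
        if version is not None and version != current:
            raise VersionConflict(
                "expected version {}, found {}".format(version, current)
            )
        self._docs[doc_id] = (current + 1, source)
        return current + 1


store = VersionedStore()
first = store.index("1", {"fees": 2000})              # new document, version 1
second = store.index("1", {"fees": 2200}, version=1)  # matches, becomes version 2
try:
    store.index("1", {"fees": 2300}, version=1)       # stale version is rejected
    conflict = False
except VersionConflict:
    conflict = True
print(first, second, conflict)
```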
Operation Type
The operation type is used to force a create operation, which helps to avoid overwriting
an existing document.
POST http://localhost:9200/tutorials/chapter/1?op_type=create
Request Body
{
"Text":"this is chapter one"
}
Response
{"_index":"tutorials","_type":"chapter","_id":"1","_version":1,"_shards":{"tota
l":2,"successful":1,"failed":0},"created":true}
Automatic ID generation
When an ID is not specified in an index operation, Elasticsearch automatically generates an id
for that document.
POST http://localhost:9200/tutorials/article/1?parent=1
Request Body
{
"Text":"This is article 1 of chapter 1"
}
Note: If you get an exception while executing this example, please recreate the index with
the following mappings.
{
"mappings": {
"chapter": {},
"article": {
"_parent": {
"type": "chapter"
}
}
}
}
Timeout
By default, the index operation waits for the primary shard to become available for up
to 1 minute before failing and responding with an error. This timeout value can be changed
explicitly by passing a value to the timeout parameter.
POST http://localhost:9200/tutorials/chapter/2?timeout=3m
Request Body
{
"Text":"This is chapter 2 waiting for primary shard for 3 minutes"
}
Get API
This API helps to extract a JSON object by performing a GET request for a particular
document. For example,
GET http://localhost:9200/schools/school/1
Response
{
"_index":"schools","_type":"school","_id":"1","_version":2,"found":true,"_sourc
e":{"name":"Central School", "description":"CBSE Affiliation",
"street":"Nagan", "city":"paprola", "state":"HP", "zip":"176115",
"location":[31.8955385,76.8380405], "fees":2200,
"tags":["Senior Secondary", "beautiful campus"],"rating":"3.3"}
}
This operation is real time and is not affected by the refresh rate of the index.
You can also specify the version; Elasticsearch will then fetch only that version of the
document.
You can also specify _all in the request, so that Elasticsearch searches for
that document id in every type and returns the first matched document.
You can also specify the fields you want in your result from that particular document.
GET http://localhost:9200/schools/school/1?fields=name,fees
Response
..
"fields":{
"name":["Central School"],
"fees":[2200]
}
..
You can also fetch only the source part in your result by adding the _source part to your
get request.
GET http://localhost:9200/schools/school/1/_source
Response
{
"name":"Central School","description":"CBSE Affiliation", "street":"Nagan",
"city":"paprola", "state":"HP", "zip":"176115",
"location":[31.8955385,76.8380405], "fees":2200,
"tags":["Senior Secondary","beautiful campus"], "rating":"3.3"
}
You can also refresh the shard before doing the get operation by setting the refresh parameter to
true.
Delete API
You can delete a particular index, mapping or document by sending an HTTP DELETE
request to Elasticsearch. For example,
DELETE http://localhost:9200/schools/school/4
Response
{
"found":true,
"_index":"schools","_type":"school",
"_id":"4","_version":2,
"_shards":{"total":2,"successful":1,"failed":0}
}
A routing parameter can be specified to delete the document for a particular user;
the operation fails if the document does not belong to that user.
In this operation, you can specify the refresh and timeout options, the same as in the GET API.
Update API
Script is used for performing this operation and versioning is used to make sure that no
updates have happened during the get and re-index. For example, update the fees of
school using script:
POST http://localhost:9200/schools_gov/school/1/_update
Request Body
{
"script":{
"inline": "ctx._source.fees+=inc",
"params":{
"inc": 500
}
}
}
Response
{"_index":"schools_gov","_type":"school","_id":"1","_version":2,"_shards":{"tot
al":2,"successful":1,"failed":0}}
Note: If you get a script exception, it is recommended to add the following lines to
elasticsearch.yml
script.inline: on
script.indexed: on
You can check the update by sending get request to the updated document.
GET http://localhost:9200/schools_gov/school/1
POST http://localhost:9200/_mget
Request Body
{
"docs":[{
"_index": "schools",
"_type": "school",
"_id": "1"
},
{
"_index":"schools_gev",
"_type":"school",
"_id": "2"
}]
}
Response
{"docs":[{"_index":"schools","_type":"school","_id":"1","_version":1,"found":tr
ue,"_source":{"name":"Central School","description":"CBSE
Affiliation","street":"Nagan","city":"paprola",
"state":"HP","zip":"176115","location":[31.8955385,76.8380405],
"fees":2000,"tags":["Senior Secondary","beautiful campus"],"rating":"3.5"}
},{"_index":"schools_gev","_type":"school","_id":"2","error":{"root_cause":[{"t
ype":"index_not_found_exception","reason":"no such
index","index":"schools_gev"}],"type":"index_not_found_exception","reason":"no
such index","index":"schools_gev"}}]}
Bulk API
This API is used to upload or delete JSON objects in bulk by making multiple
index/delete operations in a single request. We need to add the _bulk keyword to call this
API. An example of this API was already shown in the Populate chapter. All
other functionality is the same as in the GET API.
7. Elasticsearch Search APIs
This API is used to search content in Elasticsearch. A user can search either by sending a
GET request with a query string as a parameter, or by sending a query in the message body of a POST
request. Most of the search APIs are multi-index and multi-type.
Multi-Index
Elasticsearch allows us to search for the documents present in all the indices or in some
specific indices. For example, if we need to search all the documents with a name that
contains central.
GET http://localhost:9200/_search?q=name:central
Response
{"took":78,"timed_out":false,"_shards":{"total":10,"successful":10,"failed":0},
"hits":{"total":1,"max_score":0.19178301,"hits":[{"_index":"schools","_type":"s
chool","_id":"1","_score":0.19178301,"_source":{"name":"Central School",
"description":"CBSE Affiliation", "street":"Nagan", "city":"paprola",
"state":"HP", "zip":"176115","location":[31.8955385,76.8380405],
"fees":2000,"tags":["Senior Secondary", "beautiful campus"],"rating":"3.5"}}]}}
GET http://localhost:9200/schools,schools_gov/_search?q=name:model
Multi-Type
We can also search all the documents in an index across all types or in some specified
type. For example,
GET http://localhost:9200/schools/_search?q=tags:sports
Response
{"took":16,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"h
its":{"total":1,"max_score":0.5,"hits":[{"_index":"schools","_type":"school","_
id":"2","_score":0.5,"_source":{"name":"Saint Paul School", "description":"ICSE
Affiliation", "street":"Dawarka", "city":"Delhi", "state":"Delhi",
"zip":"110075", "location":[28.5733056,77.0122136], "fees":5000,"tags":["Good
Faculty", "Great Sports"],
"rating":"4.5"}}]}}
URI Search
Many parameters can be passed in a search operation using Uniform Resource Identifier:
Name Description
POST http://localhost:9200/schools/_search
Request Body
{
"query":{
"query_string":{
"query":"up"
}
}
}
Response
.
"_source":{"name":"City School", "description":"ICSE", "street":"West End",
"city":"Meerut", "state":"UP", "zip":"250002",
"location":[28.9926174,77.692485], "fees":3500, "tags":["Well equipped
labs"],"rating":"4.5"}}
.
8. Elasticsearch Aggregations
This framework collects all the data selected by the search query. It consists
of many building blocks, which help in building complex summaries of the data. The basic
structure of an aggregation is presented below:
"aggregations" : {
"<aggregation_name>" : {
"<aggregation_type>" : {
<aggregation_body>
}
[,"meta" : { [<meta_data_body>] } ]?
[,"aggregations" : { [<sub_aggregation>]+ } ]?
}
}
There are different types of aggregations, each with its own purpose:
Metrics Aggregations
These aggregations help in computing metrics from the field values of the aggregated
documents; sometimes, the values can be generated from scripts.
Avg Aggregation
This aggregation is used to get the average of any numeric field present in the aggregated
documents. For example,
POST http://localhost:9200/schools/_search
Request Body
{
"aggs":{
"avg_fees":{"avg":{"field":"fees"}}
}
}
Response
{"took":44,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"h
its":{"total":3,"max_score":1.0,"hits":[{"_index":"schools","_type":"school","_
id":"2","_score":1.0,"_source":{"name":"Saint Paul School", "description":"ICSE
Affiliation", "street":"Dawarka", "city":"Delhi", "state":"Delhi",
"zip":"110075", "location":[28.5733056,77.0122136], "fees":5000, "tags":["Good
Faculty", "Great Sports"], "rating":"4.5"}
},{"_index":"schools","_type":"school","_id":"1","_score":1.0,"_source":{"name"
:"Central School", "description":"CBSE Affiliation", "street":"Nagan",
"city":"paprola", "state":"HP",
"zip":"176115","location":[31.8955385,76.8380405], "fees":2200,"tags":["Senior
Secondary", "beautiful
campus"],"rating":"3.3"}},{"_index":"schools","_type":"school","_id":"3","_scor
e":1.0,"_source":{"name":"Crescent School", "description":"State Board
Affiliation", "street":"Tonk Road", "city":"Jaipur",
"state":"RJ","zip":"176114", "location":[26.8535922,75.7923988],
"fees":2500,"tags":["Well equipped labs"], "rating":"4.5"}
}]},"aggregations":{"avg_fees":{"value":3233.3333333333335}}}
If the value is not present in one or more aggregated documents, those documents are ignored
by default. You can add the missing parameter to the aggregation to treat a missing value as
a default value.
{
"aggs":{
"avg_fees":{"avg":{
"field":"fees",
"missing":0}
}
}
}
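The effect of the missing parameter can be illustrated locally. The following Python sketch mimics the behaviour described above without calling Elasticsearch; the fourth document, which has no fees field, is a hypothetical addition for the sake of the example.

```python
def average(documents, field, missing=None):
    """Average a numeric field; a local sketch, not an Elasticsearch call."""
    values = []
    for document in documents:
        if field in document:
            values.append(document[field])
        elif missing is not None:
            values.append(missing)   # substitute the default for an absent value
        # otherwise the document is ignored, as in the default behaviour
    return sum(values) / len(values) if values else None

schools = [{"fees": 5000}, {"fees": 2200}, {"fees": 2500}, {"name": "No fees listed"}]
avg_ignoring = average(schools, "fees")                 # absent value ignored
avg_with_default = average(schools, "fees", missing=0)  # absent value counted as 0
print(avg_ignoring, avg_with_default)
```

With the default behaviour, the average is taken over three documents; with missing set to 0, it is taken over four.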
Cardinality Aggregation
This aggregation gives the count of distinct values of a particular field. For example,
POST http://localhost:9200/schools*/_search
Request Body
{
"aggs":{
"distinct_name_count":{"cardinality":{"field":"name"}}
}
}
Response
{"name":"Government School", "description":"State Board
Afiliation","street":"Hinjewadi","city":"Pune","state":"MH","zip":"411057","loc
ation":[18.599752,73.6821995],"fees":500,"tags":["Great Sports"],"rating":"4"}
}, {"_index":"schools_gov", "_type": "school","_id":"1",
"_score":1.0,"_source":{"name":"Model School", "description":"CBSE
Affiliation", "street":"silk
city","city":"Hyderabad","state":"AP","zip":"500030","location":[17.3903703,78.
4752129],"fees":700,"tags":["Senior Secondary", "beautiful
campus"],"rating":"3"}}]}, "aggregations":{"distinct_name_count":{"value":3}}}
Note: The value of cardinality is 3 because there are three distinct values in name:
Government, School, and Model.
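The count in the note refers to distinct indexed terms, not to distinct whole names. The following Python sketch reproduces it for the two documents shown in the response above. It assumes that the name field is analyzed and approximates the analyzer by lowercasing and splitting on non-word characters; Elasticsearch itself computes cardinality with an approximate algorithm, so this is only an illustration.

```python
import re

def distinct_terms(values):
    """Collect the distinct lowercase word tokens across all values."""
    terms = set()
    for value in values:
        terms.update(re.findall(r"\w+", value.lower()))
    return terms

names = ["Government School", "Model School"]  # names from the response above
terms = distinct_terms(names)
print(len(terms), sorted(terms))
```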
POST http://localhost:9200/schools/school/_search
Request Body
{
"aggs" : {
"fees_stats" : { "extended_stats" : { "field" : "fees" } }
}
}
Response
"aggregations":{"fees_stats":{"count":3,"min":2200.0,"max":5000.0,"avg":3233.33
33333333335,"sum":9700.0,"sum_of_squares":3.609E7,"variance":1575555.555555556,
"std_deviation":1255.2113589175156,"std_deviation_bounds":{"upper":5743.7560511
68364,"lower":722.9106154983024}}}}
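The figures in this response can be checked by hand. The following Python sketch recomputes them from the three fees values of the schools index (5000, 2200 and 2500). It assumes a population variance and bounds of two standard deviations around the average, which is consistent with the numbers shown; the function name is illustrative.

```python
import math

def extended_stats(values, sigma=2.0):
    """Recompute extended_stats-style figures locally (illustrative sketch)."""
    count = len(values)
    total = sum(values)
    sum_of_squares = sum(v * v for v in values)
    avg = total / count
    variance = max(sum_of_squares / count - avg * avg, 0.0)  # population variance
    std_deviation = math.sqrt(variance)
    return {
        "count": count, "min": min(values), "max": max(values),
        "avg": avg, "sum": total, "sum_of_squares": sum_of_squares,
        "variance": variance, "std_deviation": std_deviation,
        "std_deviation_bounds": {
            "upper": avg + sigma * std_deviation,
            "lower": avg - sigma * std_deviation,
        },
    }

stats = extended_stats([5000.0, 2200.0, 2500.0])
print(stats)
```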
Max Aggregation
This aggregation finds the max value of a specific numeric field in aggregated documents.
For example,
POST http://localhost:9200/schools*/_search
Request Body
{
"aggs" : {
"max_fees" : { "max" : { "field" : "fees" } }
}
}
Response
"aggregations":{"max_fees":{"value":5000.0}}
Min Aggregation
This aggregation finds the min value of a specific numeric field in aggregated documents.
For example,
POST http://localhost:9200/schools*/_search
Request Body
{
"aggs" : {
"min_fees" : { "min" : { "field" : "fees" } }
}
}
Response
"aggregations":{"min_fees":{"value":500.0}}
Sum Aggregation
This aggregation calculates the sum of a specific numeric field in aggregated documents.
For example,
POST http://localhost:9200/schools*/_search
Request Body
{
"aggs" : {
"total_fees" : { "sum" : { "field" : "fees" } }
}
}
Response
"aggregations":{"total_fees":{"value":10900.0}}
Some other metrics aggregations are used in special cases, such as the geo bounds
aggregation and the geo centroid aggregation for geo location data.
Bucket Aggregations
These aggregations create buckets of documents. Each bucket has a criterion, which
determines whether a document belongs to that bucket or not. The bucket aggregations are
described below:
Children Aggregation
This bucket aggregation makes a collection of child documents, which are mapped to the parent
bucket. A type parameter is used to define the parent type. For example, if we have a brand
and its different models, then the model type will have the following _parent field:
{
"model" : {
"_parent" : {
"type" : "brand"
}
}
}
There are many other bucket aggregations, which are useful in other cases:
Filter Aggregation
Filters Aggregation
Global Aggregation
Histogram Aggregation
Missing Aggregation
Nested Aggregation
Range Aggregation
Sampler Aggregation
Terms Aggregation
Aggregation Metadata
You can attach some data to an aggregation at request time by using the meta tag; the same
data is returned with the aggregation in the response. For example,
POST http://localhost:9200/school*/report/_search
Request Body
{
"aggs" : {
"min_fees" : { "min" : { "field" : "fees" } ,
"meta" :{
"dsc" :"Lowest Fees"
}}
}
}
Response
9. Elasticsearch Index APIs
These APIs are responsible for managing all aspects of an index, such as settings, aliases,
mappings, and index templates.
Create Index
This API helps you to create an index. An index can be created automatically when a user
passes JSON objects to any index, or it can be created beforehand. To create an index,
you just need to send a POST request with settings, mappings, and aliases, or just a simple
request without a body. For example,
POST http://localhost:9200/colleges
Response
{"acknowledged":true}
POST http://localhost:9200/colleges
Request Body
{
"settings" : {
"index" : {
"number_of_shards" : 5,
"number_of_replicas" : 3
}
}
}
Response
{"acknowledged":true}
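The two settings above together determine how many shard copies the index needs. The following Python sketch shows the arithmetic, assuming that every primary shard gets number_of_replicas additional copies; the function name is illustrative.

```python
def total_shard_copies(number_of_shards, number_of_replicas):
    """Primary shards plus all of their replica copies."""
    return number_of_shards * (1 + number_of_replicas)

colleges_copies = total_shard_copies(5, 3)  # settings from the request above
schools_copies = total_shard_copies(5, 1)   # 5 shards with 1 replica each
print(colleges_copies, schools_copies)
```

For an index with 5 shards and 1 replica, such as schools, the result corresponds to the "total":10 value reported in the _shards section of its responses.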
Or with mapping
POST http://localhost:9200/colleges
Request Body
{
"settings" : {
"number_of_shards" : 3
},
"mappings" : {
"type1" : {
"_source" : { "enabled" : false },
"properties" : {
"college_name" : { "type" : "string" },
"college type" : {"type":"string"}
}
}
}
}
Response
{"acknowledged":true}
POST http://localhost:9200/colleges
Request Body
{
"aliases" : {
"alias_1" : {},
"alias_2" : {
"filter" : {
"term" : {"user" : "manu" }
},
"routing" : "manu"
}
}
}
Response
{"acknowledged":true}
Delete Index
This API helps you to delete any index. You just need to pass a delete request with the
URL of that particular Index. For example,
DELETE http://localhost:9200/colleges
Get Index
This API can be called by sending a GET request to one or more indices. It returns
information about the index.
GET http://localhost:9200/schools
Response
{"schools":{"aliases":{},"mappings":{"school":{"properties":{"city":{"type":"st
ring"},"description":{"type":"string"},"fees":{"type":"long"},"location":{"type
":"double"},"name":{"type":"string"},"rating":{"type":"string"},"state":{"type"
:"string"},"street":{"type":"string"},"tags":{"type":"string"},"zip":{"type":"s
tring"}}}},"settings":{"index":{"creation_date":"1454409831535","number_of_shar
ds":"5","number_of_replicas":"1","uuid":"iKdjTtXQSMCW4xZMhpsOVA","version":{"cr
eated":"2010199"}}},"warmers":{}}}
You can get the information of all the indices by using _all or *.
Index Exist
Existence of an index can be determined by just sending a get request to that index. If
the HTTP response is 200, it exists; if it is 404, it does not exist.
POST http://localhost:9200/schools/_close
Or
POST http://localhost:9200/schools/_open
Index Aliases
This API helps to give an alias to any index by using the _aliases keyword. A single alias can
be mapped to more than one index, and an alias cannot have the same name as an index. For example,
POST http://localhost:9200/_aliases
Request Body
{
"actions" : [
{ "add" : { "index" : "schools", "alias" : "schools_pri" } }
]
}
Response
{"acknowledged":true}
Then,
GET http://localhost:9200/schools_pri
Response
{"schools":{"aliases":{"schools_pri":{}},"
Index Settings
You can get the index settings by just appending _settings keyword at the end of URL. For
example,
GET http://localhost:9200/schools/_settings
Response
{"schools":{"settings":{"index":{"creation_date":"1454409831535","number_of_sha
rds":"5","number_of_replicas":"1","uuid":"iKdjTtXQSMCW4xZMhpsOVA","version":{"c
reated":"2010199"}}}}}
Analyze
This API analyzes a text and returns the tokens with their offset values and data types.
For example,
POST http://localhost:9200/_analyze
Request Body
{
"analyzer" : "standard",
"text" : "you are reading this at tutorials point"
}
Response
{"tokens":[{"token":"you","start_offset":0,"end_offset":3,"type":"<ALPHANUM>","
position":0},
{"token":"are","start_offset":4,"end_offset":7,"type":"<ALPHANUM>","position":1
},
{"token":"reading","start_offset":8,"end_offset":15,"type":"<ALPHANUM>","positi
on":2},
{"token":"this","start_offset":16,"end_offset":20,"type":"<ALPHANUM>","position
":3},
{"token":"at","start_offset":21,"end_offset":23,"type":"<ALPHANUM>","position":
4},
{"token":"tutorials","start_offset":24,"end_offset":33,"type":"<ALPHANUM>","pos
ition":5},
{"token":"point","start_offset":34,"end_offset":39,"type":"<ALPHANUM>","positio
n":6}]}
You can also analyze a text with any index, and then the text will be analyzed according
to the analyzer associated with that index.
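The token list returned for the standard analyzer above can be approximated locally. The following Python sketch splits the text on word boundaries, lowercases each token and records its offsets and position. It is a simplification: the real standard analyzer follows Unicode segmentation rules and assigns other token types (for example to numbers), which this sketch does not attempt.

```python
import re

def analyze(text):
    """Approximate the _analyze output for plain lowercase-able words."""
    tokens = []
    for position, match in enumerate(re.finditer(r"\w+", text)):
        tokens.append({
            "token": match.group().lower(),
            "start_offset": match.start(),  # index of the first character
            "end_offset": match.end(),      # index just past the last character
            "type": "<ALPHANUM>",           # simplification: one type for every token
            "position": position,
        })
    return {"tokens": tokens}

result = analyze("you are reading this at tutorials point")
for token in result["tokens"]:
    print(token)
```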
Index Templates
You can also create index templates with mappings, which can be applied to new indices.
For example,
POST http://localhost:9200/_template/template_a
Request Body
{
"template" : "tu*",
"settings" : {
"number_of_shards" : 3
},
"mappings" : {
"chapter" : {
"_source" : { "enabled" : false }
}
}
}
Any index that starts with tu will have the same settings as template_a.
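The pattern matching performed for the template can be sketched in Python. The function name template_applies is an assumption for illustration, and fnmatchcase also gives special meaning to ? and [ ], which goes beyond the simple * wildcard used in the example.

```python
from fnmatch import fnmatchcase

def template_applies(template_pattern, index_name):
    """True when the index name matches the template's wildcard pattern."""
    return fnmatchcase(index_name, template_pattern)

for name in ["tutorials", "tu_2016", "schools"]:
    print(name, template_applies("tu*", name))
```

Only the first two names start with tu, so only they would receive the settings and mappings of template_a when created.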
Index Stats
This API helps you to extract statistics about a particular index. You just need to send a
get request with the index URL and _stats keyword at the end.
GET http://localhost:9200/schools/_stats
Response
{"_shards":{"total":10,"successful":5,"failed":0},"_all":{"primaries":{"docs":{
"count":3,"deleted":0},"store":{"size_in_bytes":16653,"throttle_time_in_millis"
:0},
Flush
This API helps to clean the data from index memory and migrate it to index storage, and
it also clears the internal transaction log. For example,
GET http://localhost:9200/schools/_flush
Response
{"_shards":{"total":10,"successful":5,"failed":0}}
Refresh
Refresh is scheduled by default in Elasticsearch, but you can refresh one or more indices
explicitly by using _refresh. For example,
GET http://localhost:9200/schools/_refresh
Response
{"_shards":{"total":10,"successful":5,"failed":0}}
10. Elasticsearch Cluster APIs
This API is used for getting information about a cluster and its nodes and for making changes
to them. For calling this API, we need to specify the node name, address, or _local. For
example,
GET http://localhost:9200/_nodes/_local
Response
{"cluster_name":"elasticsearch","nodes":{"Vy3KxqcHQdm4cIM22U1ewA":{"name":"Red
Guardian","transport_address":"127.0.0.1:9300","host":"127.0.0.1","ip":"127.0.0
.1","version":"2.1.1","build":"40e2c53","http_address":"127.0.0.1:9200",
Or
GET http://localhost:9200/_nodes/127.0.0.1
Response
Same as in the above example.
Cluster Health
This API is used to get the status on the health of the cluster by appending health keyword.
For example,
GET http://localhost:9200/_cluster/health
Response
{"cluster_name":"elasticsearch","status":"yellow","timed_out":false,"number_of_
nodes":1,"number_of_data_nodes":1,"active_primary_shards":23,"active_shards":23
,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":23,"delayed_
unassigned_shards":0,"number_of_pending_tasks":0,"number_of_in_flight_fetch":0,
"task_max_waiting_in_queue_millis":0,"active_shards_percent_as_number":50.0}
Cluster State
This API is used to get state information about a cluster by appending the state keyword to the
URL. The state information contains the version, master node, other nodes, routing table,
metadata, and blocks. For example,
GET http://localhost:9200/_cluster/state
Response
{"cluster_name":"elasticsearch","version":27,"state_uuid":"B3P7uHGKQUGsSsiX2rGY
UQ","master_node":"Vy3KxqcHQdm4cIM22U1ewA",
Cluster Stats
This API helps to retrieve statistics about cluster by using stats keyword. This API returns
shard number, store size, memory usage, number of nodes, roles, OS, and file system.
For example,
GET http://localhost:9200/_cluster/stats
Response
{"timestamp":1454496710020,"cluster_name":"elasticsearch","status":"yellow","in
dices":{"count":5,"shards":{"total":23,"primaries":23,"replication":0.0,"
GET http://localhost:9200/_cluster/pending_tasks
Cluster Reroute
This API is used for moving a shard from one node to another, cancelling an allocation, or
allocating an unassigned shard. For example,
POST http://localhost:9200/_cluster/reroute
Request Body
{
"commands" : [ {
"move" :
{
"index" : "schools", "shard" : 2,
"from_node" : "nodea", "to_node" : "nodeb"
}
},
{
"allocate" : {
"index" : "test", "shard" : 1, "node" : "nodec"
}
}
]
}
Node Stats
This API is used to retrieve the statistics of one or more nodes of the cluster. Node stats are
almost the same as cluster stats. For example,
GET http://localhost:9200/_nodes/stats
Response
{"cluster_name":"elasticsearch","nodes":{"Vy3KxqcHQdm4cIM22U1ewA":{"timestamp":
1454497097572,"name":"Red
Guardian","transport_address":"127.0.0.1:9300","host":"127.0.0.1","ip":["127.0.
0.1:9300",
Nodes hot_threads
This API helps you to retrieve information about the current hot threads on each node in
cluster. For example,
GET http://localhost:9200/_nodes/hot_threads
Response
::: {Red Guardian} {Vy3KxqcHQdm4cIM22U1ewA} {127.0.0.1}{127.0.0.1:9300}Hot
threads at 2016-02-03T10:59:48.856Z, interval=500ms, busiestThreads=3,
ignoreIdleThreads=true:0.0% (0s out of 500ms) cpu usage by thread 'Attach
Listener'
unique snapshot
unique snapshot
11. Elasticsearch Query DSL
In Elasticsearch, searching is carried out by using a query based on JSON. A query is made
up of two kinds of clauses:
 Leaf Query Clauses: These clauses are match, term, or range, which look for a
specific value in a specific field.
 Compound Query Clauses: These clauses combine leaf queries and other compound queries.
Elasticsearch supports a large number of queries. A query starts with the query keyword
and then has conditions and filters inside in the form of a JSON object. The different types
of queries are described below:
POST http://localhost:9200/schools*/_search
Request Body
{
"query":{
"match_all":{}
}
}
Response
{"took":1,"timed_out":false,"_shards":{"total":10,"successful":10,"failed":0},"
hits":{"total":5,"max_score":1.0,"hits":[{"_index":"schools","_type":"school","
_id":"2","_score":1.0,"_source":{"name":"Saint Paul School",
"description":"ICSE Affiliation", "street":"Dawarka",
"city":"Delhi","state":"Delhi","zip":"110075","location":[28.5733056,77.0122136
],"fees":5000,"tags":["Good Faculty", "Great Sports"],"rating":"4.5"}
},{"_index":"schools_gov", "_type":"school", "_id":"2","_score":1.0,
"_source":{"name":"Government School", "description":"State Board Affiliation",
"street":"Hinjewadi", "city":"Pune", "state":"MH",
"zip":"411057","location":[18.599752,73.6821995],"fees":500,"tags":["Great
Sports"],"rating":"4"}
},{"_index":"schools","_type":"school","_id":"1","_score":1.0,"_source":{"name"
:"Central School", "description":"CBSE Affiliation", "street":"Nagan",
"city":"paprola", "state":"HP",
"zip":"176115","location":[31.8955385,76.8380405], "fees":2200,"tags":["Senior
Secondary", "beautiful
campus"],"rating":"3.3"}},{"_index":"schools_gov","_type":"school","_id":"1","_
score":1.0,"_source":{"name":"Model School", "description":"CBSE Affiliation",
"street":"silk city", "city":"Hyderabad", "state":"AP", "zip":"500030",
"location":[17.3903703,78.4752129], "fees":700,"tags":["Senior Secondary",
"beautiful
campus"],"rating":"3"}},{"_index":"schools","_type":"school","_id":"3","_score"
:1.0,"_source":{"name":"Crescent School", "description":"State Board
Affiliation", "street":"Tonk Road",
"city":"Jaipur","state":"RJ","zip":"176114","location":[26.8535922,75.7923988],
"fees":2500,"tags":["Well equipped labs"], "rating":"4.5"}}]}}
Match query
This query matches a text or phrase with the values of one or more fields. For example,
POST http://localhost:9200/schools*/_search
Request Body
{
"query":{
"match" : {
"city":"pune"
}
}
}
Response
{"took":1,"timed_out":false,"_shards":{"total":10,"successful":10,"failed":0},"
hits":{"total":1,"max_score":0.30685282,"hits":[{"_index":"schools_gov","_type"
:"school","_id":"2","_score":0.30685282,"_source":{"name":"Government School",
"description":"State Board
Afiliation","street":"Hinjewadi","city":"Pune","state":"MH","zip":"411057","loc
ation":[18.599752,73.6821995],"fees":500,"tags":["Great Sports"],"rating":"4"}
}]}}
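The query text pune matches the stored value Pune because both are analyzed before they are compared. The following Python sketch illustrates the idea under simplifying assumptions: analysis is approximated by lowercasing and splitting on non-word characters, and a document matches when any query term is present. The function names are illustrative.

```python
import re

def analyze(text):
    """Approximate analysis: lowercase word tokens."""
    return re.findall(r"\w+", text.lower())

def match(field_value, query_text):
    """True when any analyzed query term appears among the field's terms."""
    field_terms = set(analyze(field_value))
    return any(term in field_terms for term in analyze(query_text))

print(match("Pune", "pune"))        # same term after lowercasing
print(match("Hyderabad", "pune"))   # no shared term
```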
multi_match query
This query matches a text or phrase with more than one field. For example,
POST http://localhost:9200/schools*/_search
Request Body
{
"query":{
"multi_match" : {
"query": "hyderabad",
"fields": [ "city", "state" ]
}
}
}
Response
{"took":16,"timed_out":false,"_shards":{"total":10,"successful":10,"failed":0},
"hits":{"total":1,"max_score":0.09415865,"hits":[{"_index":"schools_gov","_type
":"school","_id":"1","_score":0.09415865,"_source":{"name":"Model School","
description":"CBSE Affiliation", "street":"silk
city","city":"Hyderabad","state":"AP","zip":"500030","location":[17.3903703,78.
4752129],"fees":700,"tags":["Senior Secondary", "beautiful
campus"],"rating":"3"}}]}}
POST http://localhost:9200/schools/_search
Request Body
{
"query":{
"query_string":{
"query":"good faculty"
}
}
}
Response
{"took":16,"timed_out":false,"_shards":{"total":10,"successful":10,"failed":0},
"hits":{"total":1,"max_score":0.09492774,"hits":[{"_index":"schools","_type":"s
chool","_id":"2","_score":0.09492774,"_source":{"name":"Saint Paul School",
"description":"ICSE Affiliation", "street":"Dawarka", "city":"Delhi",
"state":"Delhi", "zip":"110075", "location":[28.5733056,77.0122136],
"fees":5000, "tags":["Good Faculty", "Great Sports"], "rating":"4.5" }}]}}
POST http://localhost:9200/schools/_search
Request Body
{
"query":{
"term":{"zip":"176115"}
}
}
Response
{"took":1,"timed_out":false,"_shards":{"total":10,"successful":10,"failed":0},"
hits":{"total":1,"max_score":0.30685282,"hits":[{"_index":"schools","_type":"sc
hool","_id":"1","_score":0.30685282,"_source":{"name":"Central School",
"description":"CBSE Affiliation", "street":"Nagan", "city":"paprola",
"state":"HP", "zip":"176115", "location":[31.8955385,76.8380405],
"fees":2200,"tags":["Senior Secondary", "beautiful campus"],"rating":"3.3"}}]}}
Range Query
This query is used to find the objects having values within a given range. For this,
we need to use operators like gte (greater than or equal to), gt (greater than), lte (less than
or equal to), and lt (less than).
For example,
POST http://localhost:9200/schools*/_search
Request Body
{
"query":{
"range":{
"rating":{
"gte":3.5
}
}
}
}
Response
{"took":31,"timed_out":false,"_shards":{"total":10,"successful":10,"failed":0},
"hits":{"total":3,"max_score":1.0,"hits":[{"_index":"schools","_type":"school",
"_id":"2","_score":1.0,"_source":{"name":"Saint Paul School",
"description":"ICSE Affiliation", "street":"Dawarka",
"city":"Delhi","state":"Delhi","zip":"110075","location":[28.5733056,77.0122136
],"fees":5000,"tags":["Good Faculty", "Great Sports"],"rating":"4.5"}
},{"_index":"schools_gov", "_type":"school",
"_id":"2","_score":1.0,"_source":{"name":"Government School",
"description":"State Board Affiliation", "street":"Hinjewadi", "city":"Pune",
"state":"MH", "zip":"411057", "location":[18.599752,73.6821995],
"fees":500,"tags":["Great Sports"],"rating":"4"}}, {"_index":"schools",
"_type":"school", "_id":"3", "_score":1.0,"_source":{"name":"Crescent School",
"description":"State Board Affiliation", "street":"Tonk Road", "city":"Jaipur",
"state":"RJ", "zip":"176114",
"location":[26.8535922,75.7923988],"fees":2500,"tags":["Well equipped
labs"],"rating":"4.5"}}]}}
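The selection made by the range query can be reproduced locally. The following Python sketch applies the gte operator to the ratings of the five sample schools. It assumes a numeric comparison and therefore converts the ratings, which the sample documents store as strings, to floats; the helper name in_range is illustrative.

```python
schools = [
    {"name": "Saint Paul School", "rating": "4.5"},
    {"name": "Central School", "rating": "3.3"},
    {"name": "Crescent School", "rating": "4.5"},
    {"name": "Government School", "rating": "4"},
    {"name": "Model School", "rating": "3"},
]

def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Check a value against the optional range operators."""
    if gte is not None and not value >= gte:
        return False
    if gt is not None and not value > gt:
        return False
    if lte is not None and not value <= lte:
        return False
    if lt is not None and not value < lt:
        return False
    return True

matches = [s["name"] for s in schools if in_range(float(s["rating"]), gte=3.5)]
print(matches)
```

The three names selected are the same three schools returned in the response above.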
 Missing query: This is the opposite of the exists query; it searches
for objects without specific fields or with fields having a null value.
 Wildcard or regexp query: This query uses wildcards or regular expressions to find patterns
in the objects.
POST http://localhost:9200/schools*/_search
Request Body
{
"query":{
"type" : {
"value" : "school"
}
}
}
Response
All the school JSON objects present in the specified indices.
Compound Queries
These queries are a collection of different queries merged with each other by using Boolean
operators like and, or, and not, by targeting different indices, or by using function calls. For example,
POST http://localhost:9200/schools*/_search
Request Body
{
"query":{
"filtered":{
"query":{
"match":{
"state":"UP"
}
},
"filter":{
"range":{
"rating":{
"gte":4.0
}
}
}
}
}
}
Response
{"took":16,"timed_out":false,"_shards":{"total":10,"successful":10,"failed":0},
"hits":{"total":0,"max_score":null,"hits":[]}}
Joining Queries
These queries are used where more than one mapping or document is included. There are
two types of joining queries:
Nested Query
These queries deal with nested mapping (you will read more about it in the next chapter).
POST http://localhost:9200/tutorials/_search
Request Body
{
"query":
{
"has_child" : {
"type" : "article",
"query" : {
"match" : {
"Text" : "This is article 1 of chapter 1"
}
}
}
}
}
Response
{"took":21,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"h
its":{"total":1,"max_score":1.0,"hits":[{"_index":"tutorials","_type":"chapter"
,"_id":"1","_score":1.0,"_source":{
"Text":"this is chapter one"}}]}}
Geo Queries
These queries deal with geo locations and geo points. They help to find schools
or any other geographical object near a given location. You need to use the geo point data
type. For example,
POST http://localhost:9200/schools*/_search
Request Body
{
"query":{
"filtered":{
"filter":{
"geo_distance":{
"distance":"100km",
"location":[32.052098, 76.649294]
}
}
}
}
}
Response
{"took":6,"timed_out":false,"_shards":{"total":10,"successful":10,"failed":0},"
hits":{"total":2,"max_score":1.0,"hits":[{"_index":"schools","_type":"school","
_id":"2","_score":1.0,"_source":{"name":"Saint Paul School",
"description":"ICSE Affiliation",
"street":"Dawarka","city":"Delhi","state":"Delhi","zip":"110075","location":[28
.5733056,77.0122136],"fees":5000,"tags":["Good Faculty", "Great
Sports"],"rating":"4.5"}
},{"_index":"schools", "_type":"school", "_id":"1","_score":1.0,
"_source":{"name":"Central School", "description":"CBSE Affiliation",
"street":"Nagan", "city":"paprola", "state":"HP", "zip":"176115",
"location":[31.8955385,76.8380405], "fees":2000,"tags":["Senior Secondary",
"beautiful campus"],"rating":"3.5"}}]}}
Note: If you get an exception while performing the above example, please add the
following mapping to your index.
{
"mappings":{
"school":{
"_all":{
"enabled":true
},
"properties":{
"location":{
"type":"geo_point"
}
}
}}}
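The geo_distance example above can be checked locally with a great-circle distance. One point deserves attention: Elasticsearch reads a geo_point given as an array in [lon, lat] order, while the sample documents appear to list latitude first. The following Python sketch uses the haversine formula with a mean Earth radius of 6371 km (both are assumptions of this sketch, not necessarily the exact computation used by Elasticsearch) and evaluates the 100km filter under both readings of the arrays.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

schools = {
    "Saint Paul School": [28.5733056, 77.0122136],
    "Central School": [31.8955385, 76.8380405],
    "Crescent School": [26.8535922, 75.7923988],
    "Government School": [18.599752, 73.6821995],
    "Model School": [17.3903703, 78.4752129],
}
origin = [32.052098, 76.649294]

def within(distance_km, array_order):
    """Names of schools within distance_km, reading arrays as lon_lat or lat_lon."""
    hits = []
    for name, point in schools.items():
        if array_order == "lon_lat":   # the order Elasticsearch uses for arrays
            lon1, lat1 = origin
            lon2, lat2 = point
        else:                          # the order the sample data seems to intend
            lat1, lon1 = origin
            lat2, lon2 = point
        if haversine_km(lat1, lon1, lat2, lon2) <= distance_km:
            hits.append(name)
    return hits

print(within(100, "lon_lat"))
print(within(100, "lat_lon"))
```

Under the [lon, lat] reading the sketch selects the same two schools as the response above. If the values are meant as latitude followed by longitude, only Central School lies within 100 km, so the order of the array elements is worth double-checking in your own data.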
12. Elasticsearch Mapping
Mapping is the outline of the documents stored in an index. It defines the data type, such as
geo_point or string, and the format of the fields present in the documents, as well as rules to
control the mapping of dynamically added fields. For example,
POST http://localhost:9200/bankaccountdetails
Request Body
{
"mappings":{
"report":{
"_all":{
"enabled":true
},
"properties":{
"name":{ "type":"string"},
"date":{ "type":"date"},
"balance":{ "type":"double"},
"liability":{ "type":"double"}
}
}
}
}
Response
{"acknowledged":true}
POST http://localhost:9200/tabletennis/team/1
Request Body
{
"group" : "players",
"user" : [
{
"first" : "dave",
"last" : "jones"
},
{
"first" : "kevin",
"last" : "morris"
}
]
}
Response
{"_index":"tabletennis","_type":"team","_id":"1","_version":1,"_shards":{"total
":2,"successful":1,"failed":0},"created":true}
Mapping Types
Each index has one or more mapping types, which are used to divide the documents of an
index into logical groups. Mappings can differ from each other on the basis of the
following parameters:
Meta-Fields
These fields provide information about the mappings and the other objects associated with
them, for example the _index, _type, _id, and _source fields.
Fields
Different mappings contain different numbers of fields and fields with different data types.
Dynamic Mapping
Elasticsearch provides a user-friendly mechanism for the automatic creation of mapping.
A user can post the data directly to any undefined mapping and Elasticsearch will
automatically create the mapping, which is called dynamic mapping. For example,
POST http://localhost:9200/accountdetails/tansferreport
Request Body
{
"from_acc":"7056443341",
"to_acc":"7032460534",
"date":"11/1/2016",
"amount":10000
}
Response
{
"_index":"accountdetails",
"_type":"tansferreport",
"_id":"AVI3FeH0icjGpNBI4ake",
"_version":1,
"_shards":{"total":2,"successful":1,"failed":0},
"created":true
}
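The idea behind dynamic mapping can be sketched as a type guess per JSON value. The following Python function is a simplified illustration, not the actual Elasticsearch logic: it assumes that whole numbers become long, decimals become double, and strings become date only when they start with an ISO-style yyyy-MM-dd date. Real date detection depends on the date formats configured for the index, so the type chosen for the date field of this document may differ.

```python
import re

ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}")  # simplified date detection (assumption)

def infer_type(value):
    """Guess a field type from a JSON value (illustrative only)."""
    if isinstance(value, bool):   # checked first because bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "date" if ISO_DATE.match(value) else "string"
    return "object"

document = {
    "from_acc": "7056443341",
    "to_acc": "7032460534",
    "date": "11/1/2016",
    "amount": 10000,
}
guessed = {field: infer_type(value) for field, value in document.items()}
print(guessed)
```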
Mapping Parameters
The mapping parameters define the structure of mapping, information about fields and
about storage and how the mapped data will be analyzed at the time of searching. These
are the following mapping parameters:
analyzer
boost
coerce
copy_to
doc_values
dynamic
enabled
fielddata
geohash
geohash_precision
geohash_prefix
format
ignore_above
ignore_malformed
include_in_all
index_options
lat_lon
index
fields
norms
null_value
position_increment_gap
properties
search_analyzer
similarity
store
term_vector
13. Elasticsearch Analysis
When a query is processed during a search operation, the content in any index is analyzed
by the analysis module. This module consists of analyzers, tokenizers, token filters, and
character filters. If no analyzer is defined, then by default the built-in analyzers, token filters,
and tokenizers get registered with the analysis module. For example,
POST http://localhost:9200/pictures
Request Body
{
"settings": {
"analysis": {
"analyzer": {
"index_analyzer": {
"tokenizer": "standard",
"filter": ["standard", "my_delimiter", "lowercase", "stop",
"asciifolding", "porter_stem"]
},
"search_analyzer": {
"tokenizer": "standard",
"filter": ["standard", "lowercase", "stop", "asciifolding",
"porter_stem"]
}
},
"filter": {
"my_delimiter": {
"type": "word_delimiter",
"generate_word_parts": true,
"catenate_words": true,
"catenate_numbers": true,
"catenate_all": true,
"split_on_case_change": true,
"preserve_original": true,
"split_on_numerics": true,
"stem_english_possessive": true
}
}
}
}
}
Analyzers
An analyzer consists of a tokenizer and optional token filters. These analyzers are
registered in analysis module with logical names, which can be referenced either in
mapping definitions or in some APIs. There are a number of default analyzers as follows:
Tokenizers
Tokenizers are used for generating tokens from a text in Elasticsearch. Text can be broken
down into tokens by taking whitespace or other punctuations into account. Elasticsearch
has plenty of built-in tokenizers, which can be used in custom analyzer.
9 UAX Email URL Tokenizer (uax_url_email): This works the same as the standard tokenizer, but it
treats an email address or a URL as a single token.
10 Path hierarchy tokenizer (path_hierarchy): This tokenizer generates all the possible paths
present in the input directory path. Settings available for this tokenizer are delimiter (defaults
to /), replacement, buffer_size (defaults to 1024), reverse (defaults to false) and skip
(defaults to 0).
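The behaviour of the path hierarchy tokenizer can be sketched in a few lines of Python. The sketch below covers only the delimiter setting; replacement, buffer_size, reverse and skip are left out, and the function name is illustrative.

```python
def path_hierarchy(path, delimiter="/"):
    """Emit every ancestor path of the input, shortest first (simplified sketch)."""
    tokens = []
    current = ""
    for index, part in enumerate(path.split(delimiter)):
        # Rebuild the path one segment at a time.
        current = part if index == 0 else current + delimiter + part
        if current:   # skip the empty prefix produced by a leading delimiter
            tokens.append(current)
    return tokens

tokens = path_hierarchy("/usr/local/bin")
print(tokens)
```

Indexing these tokens makes a document findable by any of its parent directories.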
Token Filters
Token filters receive input from tokenizers and then these filters can modify, delete or add
text in that input. Elasticsearch offers plenty of built-in token filters. Most of them have
already been explained in previous sections.
Character Filters
These filters process the text before tokenizers. Character filters look for special characters,
HTML tags, or a specified pattern, and then either delete them or change them to appropriate
words, for example changing & to and or deleting HTML markup tags. Here is an example of an
analyzer with synonyms specified in synonym.txt:
{
"settings":{
"index":{
"analysis":{
"analyzer":{
"synonym":{
"tokenizer":"whitespace",
"filter":[
"synonym"
]
}
},
"filter":{
"synonym":{
"type":"synonym",
"synonyms_path":"synonym.txt",
"ignore_case":"true"
}
}
}
}
}
}
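The character filtering step described at the beginning of this section, which runs before tokenization, can be sketched in Python as follows. The function names are illustrative, and the regular expression is a deliberately simple way to drop markup; it does not handle HTML entities or malformed tags.

```python
import re

def strip_html(text):
    """Remove anything that looks like an HTML tag (simplified)."""
    return re.sub(r"<[^>]+>", "", text)

def apply_mapping(text, mappings):
    """Replace each mapped character sequence with its target word."""
    for source, target in mappings.items():
        text = text.replace(source, target)
    return text

def char_filter(text):
    """Strip markup first, then map special characters to words."""
    return apply_mapping(strip_html(text), {"&": "and"})

filtered = char_filter("<b>Tom & Jerry</b>")
print(filtered)
```

The cleaned text is what a tokenizer would receive next.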
14. Elasticsearch Modules
We will discuss the different modules of Elasticsearch in the following sections of this
chapter.
cluster.routing.allocation.balance.shard   Float value (by default 0.45f)   This defines the weight factor for shards allocated on every node.
Discovery
This module helps a cluster to discover and maintain the state of all the nodes in it. The
state of cluster changes when a node is added or deleted from a cluster. The cluster name
setting is used to create logical difference between different clusters. There are some
modules which help you to use the APIs provided by cloud vendors and those are:
Azure discovery
EC2 discovery
Google compute engine discovery
Zen discovery
Gateway
This module maintains the cluster state and the shard data across full cluster restarts.
Following are the static settings of this module:
HTTP
This module manages the communication between HTTP client and Elasticsearch APIs. This
module can be disabled by changing the value of http.enabled to false. The following are
the settings (configured in elasticsearch.yml) to control this module:
Setting Description
http.max_header_size   This is the maximum HTTP header size; its default value is 8kb.
Indices
This module maintains the settings, which are set globally for every index. The following
settings are mainly related to memory usage:
Circuit Breaker
This is used for preventing operations from causing an OutOfMemoryError. The setting
mainly restricts the JVM heap usage. For example, the indices.breaker.total.limit setting
defaults to 70% of the JVM heap.
Fielddata Cache
This is used mainly when aggregating on a field. It is recommended to have enough
memory to allocate it. The amount of memory used for the field data cache can be
controlled using indices.fielddata.cache.size setting.
Indexing Buffer
This buffer stores the newly created documents in the index and flushes them when the
buffer is full. Settings such as indices.memory.index_buffer_size control the amount of
heap allocated for this buffer.
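The buffer-then-flush behaviour described above can be pictured with a small model. The sketch below is purely conceptual: the class IndexBufferSketch, its byte accounting and its flush step are assumptions made for illustration and do not reflect how Elasticsearch implements the indexing buffer.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Conceptual model of a size-bounded buffer; all names are hypothetical.
final class IndexBufferSketch {
    private final long capacityBytes;
    private final List<String> pending = new ArrayList<>();
    private long usedBytes = 0;
    private int flushCount = 0;

    IndexBufferSketch(long capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    /** Buffers a document and flushes once the buffer has reached its capacity. */
    void add(String jsonDoc) {
        pending.add(jsonDoc);
        usedBytes += jsonDoc.getBytes(StandardCharsets.UTF_8).length;
        if (usedBytes >= capacityBytes) {
            flush();
        }
    }

    /** Stands in for writing the buffered documents out; here it only empties the buffer. */
    void flush() {
        pending.clear();
        usedBytes = 0;
        flushCount++;
    }

    int flushCount() { return flushCount; }

    int pendingCount() { return pending.size(); }
}
```

A larger capacity means fewer flushes at the cost of more heap held by buffered documents, which is the trade-off the buffer size setting controls.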
Indices Recovery
This module controls the resources used during the recovery process. The following are the
settings with their default values:
indices.recovery.concurrent_streams 3
indices.recovery.concurrent_small_file_streams 2
indices.recovery.file_chunk_size 512kb
indices.recovery.translog_ops 1000
indices.recovery.translog_size 512kb
indices.recovery.compress true
indices.recovery.max_bytes_per_sec 40mb
TTL Interval
The Time to Live (TTL) of a document defines the time after which the document gets
deleted. The following are the dynamic settings for controlling the deletion process:
indices.ttl.interval 60s
indices.ttl.bulk_size 1000
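One way to picture the two settings: a purge pass runs once per interval, collects the documents whose TTL has elapsed, and deletes them in batches of at most bulk_size. The sketch below models only the batching step. The Doc class, the field names and the method expiredBatches are hypothetical and are not Elasticsearch APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch of TTL expiry and batching; all names are hypothetical.
final class TtlPurgeSketch {

    /** Hypothetical document record: id, indexing time, and time to live. */
    static final class Doc {
        final String id;
        final long indexedAtMillis;
        final long ttlMillis;

        Doc(String id, long indexedAtMillis, long ttlMillis) {
            this.id = id;
            this.indexedAtMillis = indexedAtMillis;
            this.ttlMillis = ttlMillis;
        }

        /** A document is expired once its TTL has fully elapsed. */
        boolean isExpired(long nowMillis) {
            return nowMillis >= indexedAtMillis + ttlMillis;
        }
    }

    /** Collects the ids of expired documents in batches of at most bulkSize. */
    static List<List<String>> expiredBatches(List<Doc> docs, long nowMillis, int bulkSize) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (Doc doc : docs) {
            if (!doc.isExpired(nowMillis)) {
                continue; // still within its time to live
            }
            current.add(doc.id);
            if (current.size() == bulkSize) {
                batches.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            batches.add(current); // final, possibly smaller batch
        }
        return batches;
    }
}
```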
Node
Each node can either be a data node or not. You can change this property by changing the
node.data setting. Setting the value to false means that the node is not a data node.
15. Elasticsearch Testing
Elasticsearch provides a jar file, which can be added to any Java IDE and used to test
code that works with Elasticsearch. A range of tests can be performed by using
the framework provided by Elasticsearch:
Unit testing
Integration testing
Randomized testing
To start testing, you need to add the Elasticsearch testing dependency to your
project. You can use Maven for this purpose and add the following to pom.xml.
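<!-- Elasticsearch test framework (test-jar), needed only for tests -->
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>2.1.0</version>
    <scope>test</scope>
    <type>test-jar</type>
</dependency>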
EsSetup is initialized to start and stop an Elasticsearch node and also to create indices.
Calling the esSetup.execute() function with createIndex creates the indices; you need to
specify the settings, type and data.
Unit Testing
Unit testing is carried out by using JUnit and the Elasticsearch test framework. Nodes and
indices can be created using Elasticsearch classes and then used in test methods to perform
the testing. The ESTestCase and ESTokenStreamTestCase classes are used for this testing.
Integration Testing
Integration testing uses multiple nodes in a cluster. The ESIntegTestCase class is used for
this testing. It provides various methods that make the job of preparing a test case easier.
Methods Description
Accessing Clients
A client is used to access different nodes in a cluster and carry out some action. The
ESIntegTestCase.client() method is used for getting a random client. Elasticsearch also
offers other methods to access a client, and those methods can be reached through the
ESIntegTestCase.internalCluster() method.
Methods Description
Randomized Testing
This testing exercises the user's code with a wide range of randomly generated data, so
that failures caused by unexpected data are found early. Random data is the best option
to carry out this testing.
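The idea can be sketched without the Elasticsearch framework. The example below uses plain java.util.Random with an explicit seed, so that a failing run can be reproduced, and checks a property of a small function against many random inputs. The normalize function and all names here are hypothetical stand-ins for the code under test; the helpers of the Elasticsearch test framework are not used.

```java
import java.util.Random;

// Minimal randomized-testing sketch using only the standard library.
final class RandomizedSketch {

    /** Hypothetical code under test: trims and collapses whitespace runs to one space. */
    static String normalize(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    /** Generates a random string of letters and whitespace, up to maxLen characters. */
    static String randomText(Random rnd, int maxLen) {
        String alphabet = "ab \t\n";
        int len = rnd.nextInt(maxLen + 1);
        StringBuilder sb = new StringBuilder(len);
        for (int i = 0; i < len; i++) {
            sb.append(alphabet.charAt(rnd.nextInt(alphabet.length())));
        }
        return sb.toString();
    }

    /** Runs the given number of random cases; returns the first failing input, or null. */
    static String firstFailure(long seed, int iterations) {
        Random rnd = new Random(seed); // the same seed always yields the same inputs
        for (int i = 0; i < iterations; i++) {
            String input = randomText(rnd, 20);
            String once = normalize(input);
            // Properties: normalizing is idempotent and leaves no double spaces.
            if (!normalize(once).equals(once) || once.contains("  ")) {
                return input;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        long seed = System.nanoTime();
        System.out.println("seed=" + seed); // log the seed so a failure can be replayed
        String failure = firstFailure(seed, 1000);
        System.out.println(failure == null ? "all cases passed" : "failed for input: " + failure);
    }
}
```

Logging the seed matters: rerunning with the same seed regenerates exactly the inputs that caused a failure.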
Assertions
ElasticsearchAssertions and ElasticsearchGeoAssertions classes contain assertions, which
are used for performing some common checks at the time of testing. For example,