If you want to get serious about Elasticsearch, you'll have to learn about hardware. It might be an unpopular opinion in 2017, but don't run Elasticsearch in the cloud. It has nothing to do with latency or with losing your AWS spot instances because Netflix has just released a new show; it has to do with picking the right hardware for your needs. Cloud providers such as AWS provide vCPUs, but you have no way of knowing what you're going to get.
Because they have trouble with Java garbage collection, the first thing people ask for advice about is memory management. What you should actually care about is, in no particular order:
- CPU
- Memory
- Network
- Storage
CPU
Running complex filtered queries, intensive indexing, percolation and queries against non-Latin character sets have a strong impact on the CPU, so picking the right one is critical.
Dive into the CPU specs to understand how they behave with Java. It's been more than a decade since I last read Intel specs (I even used to have them as physical books), but it proved critical for picking the right hardware.
For example, the Xeon E5 v4 provides 60% better performance than the v3 version when running Java. The Xeon D works well when you want to scale your cluster horizontally, as long as heavy indexing is split evenly among the cluster nodes. Otherwise, prepare to get into trouble, with nodes popping out of the cluster like popcorn. So picking the right CPU and horizontal design is critical.
Speaking of CPU, Elasticsearch divides the CPU use into thread pools of various types:
- generic: for standard operations such as discovery
- index: for indexing
- get: for get operations, obviously
- bulk: for bulk operations such as bulk indexing
These are the most important ones you'll have to deal with; RTFM for everything else.
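To see how busy these pools are on a running cluster, you can ask the `_cat/thread_pool` API for the active, queued and rejected counts. Here is a minimal sketch, assuming a node reachable at `http://localhost:9200` (an assumption, change it to match your setup) and values that contain no spaces. The helper names `parse_cat_thread_pool`, `fetch_thread_pools` and `busy_pools` are made up for this example.

```python
import urllib.request

# Assumption: a node is reachable here; change to match your cluster.
ES_URL = "http://localhost:9200"
# The pools discussed in this section.
POOLS = {"generic", "index", "get", "bulk"}


def parse_cat_thread_pool(text):
    """Turn whitespace-separated _cat output (with a header row) into dicts."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def fetch_thread_pools(base_url=ES_URL):
    """Fetch pool statistics; `v` adds the header row, `h` selects columns."""
    url = base_url + "/_cat/thread_pool?v&h=node_name,name,active,queue,rejected"
    with urllib.request.urlopen(url, timeout=10) as response:
        return parse_cat_thread_pool(response.read().decode("utf-8"))


def busy_pools(rows, pools=POOLS):
    """Keep only the listed pools that have queued or rejected work."""
    return [
        row for row in rows
        if row["name"] in pools
        and (int(row["queue"]) > 0 or int(row["rejected"]) > 0)
    ]
```

Calling `busy_pools(fetch_thread_pools())` returns one dictionary per node and pool that is queueing or rejecting work, which is usually the first sign that a node is short on CPU.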
Each pool runs a configurable number of threads and has a queue, which is configurable too. The default number of threads is derived from a system variable called allocated_cpu, which is never greater than 32, even if you have 48 cores and the system variable available_cpu shows 48.
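The cap described above, and the way a fixed pool with a bounded queue behaves once it fills up, can be sketched in a few lines. This illustrates the concept and is not Elasticsearch's implementation: the names `allocated_cpus`, `BoundedPool` and `PoolFull` are hypothetical, and the cap of 32 is simply taken from the paragraph above.

```python
import queue
import threading

MAX_ALLOCATED = 32  # cap taken from the text above


def allocated_cpus(available, cap=MAX_ALLOCATED):
    """CPU count used to size the pools: never more than the cap."""
    return min(available, cap)


class PoolFull(Exception):
    """Raised when the queue is full (hypothetical stand-in for a rejection)."""


class BoundedPool:
    """A fixed number of worker threads fed by a bounded queue."""

    def __init__(self, size, queue_size):
        self._tasks = queue.Queue(maxsize=queue_size)
        self._workers = [
            threading.Thread(target=self._run, daemon=True) for _ in range(size)
        ]
        for worker in self._workers:
            worker.start()

    def _run(self):
        # Each worker loops forever, pulling one task at a time.
        while True:
            task = self._tasks.get()
            try:
                task()
            except Exception:
                pass  # a failing task must not kill the worker
            finally:
                self._tasks.task_done()

    def submit(self, task):
        """Queue a task, or reject it immediately if the queue is full."""
        try:
            self._tasks.put_nowait(task)
        except queue.Full:
            raise PoolFull("queue is full, task rejected") from None

    def join(self):
        """Block until every queued task has finished."""
        self._tasks.join()
```

With 48 available cores, `allocated_cpus(48)` returns 32, so a pool sized from it gets 32 threads. Once all workers are busy and the queue is full, `submit` raises instead of waiting, which is the kind of rejection you want to watch for under heavy indexing.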
I wouldn't recommend changing the thread pool size unless you really know what you're doing; the default settings are quite sensible. You might







