If you are in an indexing-heavy environment,((("indexing", "performance tips")))((("post-deployment", "indexing performance tips"))) such as indexing infrastructure
logs, you may be willing to sacrifice some search performance for faster indexing
rates. In these scenarios, searches tend to be relatively rare and performed
by people internal to your organization. They are willing to wait several
seconds for a search, as opposed to a consumer facing a search that must

Performance testing is always difficult, so try to be as scientific as possible
in your approach.((("performance testing")))((("indexing", "performance tips", "performance testing"))) Randomly fiddling with knobs and turning on ingestion is not
a good way to tune performance. If there are too many _causes_, it is impossible
to determine which one had the best _effect_. A reasonable approach to testing is as follows:

This should be fairly obvious, but use bulk indexing requests for optimal performance.((("indexing", "performance tips", "bulk requests, using and sizing")))((("bulk API", "using and sizing bulk requests")))
Bulk sizing is dependent on your data, analysis, and cluster configuration, but
a good starting point is 5–15 MB per bulk. Note that this is physical size.
Document count is not a good metric for bulk size. For example, if you are
indexing 1,000 documents per bulk, keep the following in mind:

- 1,000 documents at 1 KB each is 1 MB.
- 1,000 documents at 100 KB each is 100 MB.

Those are drastically different bulk sizes. Bulks need to be loaded into memory
at the coordinating node, so it is the physical size of the bulk that is more
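To make the sizing advice concrete, here is a minimal sketch that groups documents by serialized byte size rather than by count. The function name `batch_by_bytes` and the 10 MB target are assumptions chosen for illustration (the target simply sits inside the 5–15 MB starting range). The sketch measures only the JSON document lines, so the action metadata lines of a real bulk request would add a little on top.

```python
import json
from typing import Dict, Iterable, Iterator, List

# Assumed target, picked from inside the 5-15 MB starting range in the text.
TARGET_BULK_BYTES = 10 * 1024 * 1024


def batch_by_bytes(
    docs: Iterable[Dict], max_bytes: int = TARGET_BULK_BYTES
) -> Iterator[List[str]]:
    """Group JSON-serialized documents into batches of at most max_bytes.

    A single document larger than max_bytes is emitted in a batch of its own.
    """
    batch: List[str] = []
    batch_bytes = 0
    for doc in docs:
        line = json.dumps(doc)
        size = len(line.encode("utf-8")) + 1  # +1 for the newline delimiter
        # Close the current batch before it would exceed the byte budget.
        if batch and batch_bytes + size > max_bytes:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(line)
        batch_bytes += size
    if batch:
        yield batch
```

With this approach, 1,000 small documents may fit in a single bulk while 1,000 large ones are split across many, which is the behavior the physical-size guideline asks for.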

Monitor your nodes with Marvel and/or tools such as `iostat`, `top`, and `ps` to see
when resources start to bottleneck. If you start to receive `EsRejectedExecutionException`,
your cluster can no longer keep up: at least one resource has reached capacity. Either reduce concurrency, provide more of the limited resource (such as switching from spinning disks to SSDs), or add more nodes.
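One way a client might ease off when rejections appear is to pause and retry with a growing delay. The sketch below is only an illustration of that idea, not something the text prescribes: `send_bulk` is a hypothetical callable that returns `True` when the cluster accepted the bulk and `False` when it was rejected, and the attempt count and delays are arbitrary assumptions.

```python
import time
from typing import Callable


def send_with_backoff(
    send_bulk: Callable[[], bool],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send_bulk until it succeeds, doubling the pause after each rejection.

    send_bulk is a hypothetical callable: True means the bulk was accepted,
    False means the request was rejected.
    """
    delay = base_delay
    for attempt in range(max_attempts):
        if send_bulk():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < max_attempts - 1:
            sleep(delay)
            delay *= 2
    return False
```

Backing off only relieves pressure temporarily. If rejections persist, the remedies listed above (less concurrency, more of the limited resource, or more nodes) still apply.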
Disks are usually the bottleneck of any modern server. Elasticsearch heavily uses disks, and the more throughput your disks can handle, the more stable your nodes will be. Here are some tips for optimizing disk I/O:

Segment merging is computationally expensive,((("indexing", "performance tips", "segments and merging")))((("merging segments")))((("segments", "merging"))) and can eat up a lot of disk I/O.
Merges are scheduled to operate in the background because they can take a long
time to finish, especially for large segments. This is normally fine, because
large segment merges are relatively rare.

But sometimes merging falls behind the ingestion rate. If this happens, Elasticsearch
will automatically throttle indexing requests to a single thread. This prevents
a _segment explosion_ problem, in which hundreds of segments are generated before
they can be merged. Elasticsearch will log `INFO`-level messages stating
`now throttling indexing` when it detects merging falling behind indexing.

Spinning media has a harder time with concurrent I/O, so we need to decrease
the number of threads that can concurrently access the disk per index. This setting
will allow `max_thread_count + 2` threads to operate on the disk at one time,
so a setting of `1` will allow three threads.

For SSDs, you can ignore this setting. The default is
`Math.min(3, Runtime.getRuntime().availableProcessors() / 2)`, which works well
for SSD.
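The arithmetic in the two paragraphs above can be sketched as follows. The function names are made up for illustration. The first mirrors the default expression quoted in the text (Java's `/` on integers truncates, so integer division is used), and the second applies the `max_thread_count + 2` rule.

```python
def default_max_thread_count(available_processors: int) -> int:
    """Mirror the default quoted in the text: Math.min(3, availableProcessors / 2).

    Java's `/` on ints truncates, so integer division is used here.
    """
    return min(3, available_processors // 2)


def concurrent_disk_threads(max_thread_count: int) -> int:
    """Threads that may operate on the disk at once: max_thread_count + 2."""
    return max_thread_count + 2


print(default_max_thread_count(8))  # 3
print(concurrent_disk_threads(1))   # 3
```

On an eight-core machine the default works out to 3, which allows five threads on the disk at once. Setting the value to `1` for spinning media brings that down to three.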

Finally, you can increase `index.translog.flush_threshold_size` from the default
512 MB to something larger, such as 1 GB. This allows larger segments to accumulate
in the translog before a flush occurs. By letting larger segments build, you
flush less often, and the larger segments merge less often. All of this adds up
to less disk I/O overhead and better indexing rates. Of course, you will need
the corresponding amount of heap memory free to accumulate the extra buffering
space, so keep that in mind when adjusting this setting.
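As a small illustration, the settings body for that change could be built like this. The helper name `translog_settings` is hypothetical, and the sketch only produces the JSON body. Sending it to the index settings endpoint is left to whatever client you use.

```python
import json


def translog_settings(flush_threshold_size: str = "1gb") -> str:
    """Build a JSON settings body that raises the translog flush threshold.

    The setting name comes from the text; the default argument matches the
    1 GB example given there.
    """
    body = {"index.translog.flush_threshold_size": flush_threshold_size}
    return json.dumps(body)


print(translog_settings())
```

Treat the value as a starting point to measure against, since the extra buffering needs free heap memory.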

==== Other

Finally, there are some other considerations to keep in mind:

- If you don't need near real-time accuracy on your search results, consider
increasing the `index.refresh_interval` of((("indexing", "performance tips", "other considerations")))((("refresh_interval setting"))) each index from the default `1s` to `30s`. If you are doing
a large import, you can disable refreshes by setting this value to `-1` for the
duration of the import. Don't forget to reenable it when you are finished!

- If you are doing a large bulk import, consider disabling replicas by setting
`index.number_of_replicas: 0`.((("replicas, disabling during large bulk imports"))) When documents are replicated, the entire document
is sent to the replica node and the indexing process is repeated verbatim. This
means each replica will perform the analysis, indexing, and potentially merging
process.
+
In contrast, if you index with zero replicas and then enable replicas when ingestion
is finished, the recovery process is essentially a byte-for-byte network transfer.
This is much more efficient than duplicating the indexing process.

- If you don't have a natural ID for each document, use Elasticsearch's auto-ID
functionality.((("id", "auto-ID functionality of Elasticsearch"))) It is optimized to avoid version lookups, since the autogenerated
ID is unique.

- If you are using your own ID, try to pick an ID that is http://blog.mikemccandless.com/2014/05/choosing-fast-unique-identifier-uuid.html[friendly to Lucene]. ((("UUIDs (universally unique identifiers)"))) Examples include zero-padded
sequential IDs, UUID-1, and nanotime; these IDs have consistent, sequential
patterns that compress well. In contrast, IDs such as UUID-4 are essentially
random, which offers poor compression and slows down Lucene.
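The refresh and replica advice in the list above is easy to get half right: the settings are relaxed for the import and then never restored. A minimal sketch of one way to guard against that is shown below. `put_settings` is a hypothetical callable that sends a settings dictionary to your index, and the values to restore afterward are whatever your index normally uses.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator

# Values taken from the text: refreshes disabled (-1) and no replicas
# for the duration of the import.
IMPORT_SETTINGS = {"index.refresh_interval": "-1", "index.number_of_replicas": 0}


@contextmanager
def bulk_import_settings(
    put_settings: Callable[[Dict], None],
    restore: Dict,
) -> Iterator[None]:
    """Apply import-time settings, then restore the given ones even on failure.

    put_settings is a hypothetical callable supplied by the caller.
    """
    put_settings(IMPORT_SETTINGS)
    try:
        yield
    finally:
        # Runs whether the import succeeded or raised.
        put_settings(restore)
```

A caller would wrap the import in `with bulk_import_settings(put_settings, {"index.refresh_interval": "30s", "index.number_of_replicas": 1}):` so that refreshes and replicas come back once ingestion ends.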
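For the zero-padded sequential IDs mentioned in the last item, a minimal sketch might look like the following. The helper name and the width of 12 digits are assumptions for illustration. The point is only that padding gives every ID the same length and a shared prefix, so the strings sort in the same order as the numbers.

```python
def zero_padded_id(n: int, width: int = 12) -> str:
    """Format a non-negative sequence number as a fixed-width, zero-padded ID."""
    if n < 0:
        raise ValueError("sequence numbers must be non-negative")
    text = str(n)
    if len(text) > width:
        raise ValueError("sequence number does not fit in the chosen width")
    return text.zfill(width)


ids = [zero_padded_id(n) for n in (9, 10, 100)]
print(ids)  # ['000000000009', '000000000010', '000000000100']
```

Without the padding, `"10"` would sort before `"9"` and the sequential pattern would be lost.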