Commit 757a196

Merge pull request #145 from elasticsearch-cn/revert-141-chapter/chapter8_part4
Revert "chapter8_part4: /056_Sorting/95_Fielddata.asciidoc"
2 parents 5c7e9a9 + d818ea4 commit 757a196

1 file changed: +37 -34 lines changed


056_Sorting/95_Fielddata.asciidoc

Lines changed: 37 additions & 34 deletions
@@ -1,51 +1,54 @@
-[[字段数据介绍]]
-=== 字段数据
+[[fielddata-intro]]
+=== Fielddata
 
+Our final topic in this chapter is about an internal aspect of Elasticsearch.
+While we don't demonstrate any new techniques here, fielddata is an
+important topic that we will refer to repeatedly, and is something that you
+should be aware of.((("fielddata")))
 
+When you sort on a field, Elasticsearch needs access to the value of that
+field for every document that matches the query.((("inverted index", "sorting and"))) The inverted index, which
+performs very well when searching, is not the ideal structure for sorting on
+field values:
 
-我们这章的终极目标是关于Elasticsearch的一个内部的方面,且我们在这里并不会阐述任何新的技术,字段数据是我们将会重复提到的一个重要话题,并且你应当明确它。((("fielddata")))
+* When searching, we need to be able to map a term to a list of documents.
 
+* When sorting, we need to map a document to its terms. In other words, we
+need to ``uninvert'' the inverted index.
 
+To make sorting efficient, Elasticsearch loads all the values for
+the field that you want to sort on into memory. This is referred to as
+_fielddata_.
 
-当你以字段进行排序, Elasticsearch需要访问符合查询的每个文档的该字段的值。((("inverted index", "sorting and")))反转的索引(这会对搜索更加友好)在以字段值排序时不是理想的结构。
+WARNING: Elasticsearch doesn't just load the values for the documents that matched a
+particular query. It loads the values from _every document in your index_,
+regardless of the document `type`.
 
+The reason that Elasticsearch loads all values into memory is that uninverting the index
+from disk is slow. Even though you may need the values for only a few docs
+for the current request, you will probably need access to the values for other
+docs on the next request, so it makes sense to load all the values into memory
+at once, and to keep them there.
 
-* 当搜索时,我们需要能将一个文档列表映射到某一项上。
+Fielddata is used in several places in Elasticsearch:
 
+* Sorting on a field
+* Aggregations on a field
+* Certain filters (for example, geolocation filters)
+* Scripts that refer to fields
 
+Clearly, this can consume a lot of memory, especially for high-cardinality
+string fields--string fields that have many unique values--like the body
+of an email. Fortunately, insufficient memory is a problem that can be solved
+by horizontal scaling, by adding more nodes to your cluster.
 
-* 当排序时, 我们需要映射一个文档到它的某项。 换句话说, 我们需要 ``反向反转`` 已经反转的索引。
+For now, all you need to know is what fielddata is, and to be aware that it
+can be memory hungry. Later, we will show you how to determine the amount of memory that fielddata
+is using, how to limit the amount of memory that is available to it, and
+how to preload fielddata to improve the user experience.
 
 
 
 
-为了使得排序效率更高, Elasticsearch 会在内存中加载你想要以之排序的所有字段的值。 这便是提到的 _字段数据_
-
-
-
-
-WARNING: Elasticsearch 并不仅仅加载匹配特定查询的文档的值。 他会加载 _你的数据库中的每个文档_ , 无论这个文档的 `type`
-
-
-
-
-Elasticsearch在内存中加载所有的值的原因是在硬盘中逆反向索引是很慢的。虽然你当前的请求可能仅仅需要很少文档的值,你仍然可能在下次请求时需要可以访问其他文档的值,所以在内存中立即加载所有的值并驻留是有意义的。
-
-
-
-
-字段数据在Elasticsearch中被用于以下地方:
-
-* 按照字段排序
-* 按照字段聚合
-* 一些特定的筛选(例如,地理筛选)
-* 引入字段的脚本
-
-
-显然的,这会消耗大量的内存,特别是对于高基数的字符串字段--字符串字段有很多独特的值--例如email的body体。幸运的是,内存效率低的问题可以通过增加集群的节点进行水平扩展来解决。
-
-现在,所有你需要知道和明确的是它是极度需要内存的。稍后,我们会给你演示如何确定字段数据所占用的内存,如何限制可用的内存,和如何预加载字段数据来提高用户体验。
-
-
 
 

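Note on the restored text: the "uninvert" step it describes can be illustrated with a small Python sketch. This is a toy model only, not how Elasticsearch implements fielddata; the sample data and the names `inverted_index`, `uninvert`, and `fielddata` are assumptions made up for this illustration.

```python
from collections import defaultdict

# Toy inverted index: term -> set of document IDs containing that term.
# The data and all names here are hypothetical, for illustration only.
inverted_index = {
    "alice": {1, 3},
    "bob": {2},
    "carol": {4},
}


def uninvert(index):
    """Flip a term -> docs mapping into a doc -> sorted terms mapping."""
    doc_values = defaultdict(list)
    for term, doc_ids in index.items():
        for doc_id in doc_ids:
            doc_values[doc_id].append(term)
    return {doc_id: sorted(terms) for doc_id, terms in doc_values.items()}


# Values are built for every document in the index,
# not only for the documents that match one particular query.
fielddata = uninvert(inverted_index)

# Sorting the hits of a query now needs only one lookup per document.
matching_docs = [2, 3, 4]
sorted_docs = sorted(matching_docs, key=lambda doc_id: fielddata[doc_id][0])
print(sorted_docs)  # [3, 2, 4]
```

The sketch builds the whole document-to-terms mapping once and reuses it for later sorts, which mirrors the reasoning in the text for loading all values at once and keeping them in memory.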
0 commit comments