chapter8_part2: /056_Sorting/88_String_sorting.asciidoc #140

Merged
merged 13 commits into from
Jul 26, 2016
56 changes: 29 additions & 27 deletions 056_Sorting/88_String_sorting.asciidoc
@@ -1,28 +1,30 @@
[[multi-fields]]
=== String Sorting and Multifields
[[多字段]]
=== String Sorting and Multifields

Analyzed string fields are also multivalue fields,((("strings", "sorting on string fields")))((("analyzed fields", "string fields")))((("sorting", "string sorting and multifields"))) but sorting on them seldom
gives you the results you want. If you analyze a string like `fine old art`,
it results in three terms. We probably want to sort alphabetically on the
first term, then the second term, and so forth, but Elasticsearch doesn't have this
information at its disposal at sort time.

You could use the `min` and `max` sort modes (it uses `min` by default), but
that will result in sorting on either `art` or `old`, neither of which was the
intent.
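Analyzed string fields are also multivalue fields,((("strings", "sorting on string fields")))((("analyzed fields", "string fields")))((("sorting", "string sorting and multifields"))) but they rarely sort the way you want. If you analyze a string such as `fine old art`,
it yields three terms. We probably want to sort alphabetically on the first term, then on the second, and so on, but Elasticsearch does not have this information available at sort time.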

In order to sort on a string field, that field should contain one term only:
the whole `not_analyzed` string.((("not_analyzed string fields", "sorting on"))) But of course we still need the field to be
`analyzed` in order to be able to query it as full text.

The naive approach to indexing the same string in two ways would be to include
two separate fields in the document: one that is `analyzed` for searching,
and one that is `not_analyzed` for sorting.
You can use the `min` and `max` sort modes (`min` is the default), but that sorts on either `art` or `old`, neither of which is what we want.
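
To see why neither sort mode helps, the following is a minimal sketch of what happens to `fine old art`. The `analyze` and `sortValue` functions are hypothetical helpers written for this illustration, and the lowercase-and-split step is only a simplified stand-in for a real analyzer, not what Elasticsearch runs internally.

```javascript
// Simplified stand-in for an analyzer: lowercase, then split on whitespace.
// (Assumption: real analyzers may also stem terms or drop stopwords.)
function analyze(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

// Pick the single term that a multivalue sort would compare on.
function sortValue(terms, mode) {
  const sorted = [...terms].sort();
  return mode === "max" ? sorted[sorted.length - 1] : sorted[0];
}

const terms = analyze("fine old art");
console.log(terms);                    // [ 'fine', 'old', 'art' ]
console.log(sortValue(terms, "min"));  // art
console.log(sortValue(terms, "max"));  // old
```

Either way, the original word order (`fine` first) is lost, which is why the sort needs a field that holds the whole string as one term.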



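To sort on a string field, that field should contain only one term:
the whole `not_analyzed` string.((("not_analyzed string fields", "sorting on"))) But we still need the field to be `analyzed` so that it can be queried as full text.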



The naive approach is to index the same string in two ways by including two fields in the document: one `analyzed` for searching and one `not_analyzed` for sorting.



But storing the same string twice in the `_source` field is a waste of space.
What we really want is to pass in a _single field_ but index it in two different ways. All of the _core_ field types (strings, numbers, Booleans, dates) accept a `fields` parameter.((("mapping (types)", "transforming simple mapping to multifield mapping")))((("types", "core simple field types", "accepting fields parameter")))((("fields parameter")))((("multifield mapping")))

This parameter lets you transform a simple mapping like


But storing the same string twice in the `_source` field is a waste of space.
What we really want to do is to pass in a _single field_ but to _index it in two different ways_. All of the _core_ field types (strings, numbers,
Booleans, dates) accept a `fields` parameter((("mapping (types)", "transforming simple mapping to multifield mapping")))((("types", "core simple field types", "accepting fields parameter")))((("fields parameter")))((("multifield mapping"))) that allows you to transform a
simple mapping like

[source,js]
--------------------------------------------------
@@ -32,7 +34,7 @@ simple mapping like
}
--------------------------------------------------

into a _multifield_ mapping like this:
into a _multifield_ mapping like this:

[source,js]
--------------------------------------------------
@@ -49,12 +51,12 @@ into a _multifield_ mapping like this:
--------------------------------------------------
// SENSE: 056_Sorting/88_Multifield.json

<1> The main `tweet` field is just the same as before: an `analyzed` full-text
field.
<2> The new `tweet.raw` subfield is `not_analyzed`.
<1> The main `tweet` field is the same as before: an `analyzed` full-text field.
<2> The new `tweet.raw` subfield is `not_analyzed`.
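
Because the diff collapses the mapping itself, here is a rough sketch of the transformation the callouts describe, written as plain JavaScript objects. The `withRawSubfield` helper is hypothetical, and the simple mapping shown is an assumed minimal one (a bare `string` field with no analyzer setting); the actual mapping in the file may carry additional options.

```javascript
// Hypothetical helper: return a copy of a string field mapping with an
// extra not_analyzed subfield added under the `fields` parameter.
function withRawSubfield(mapping, subfieldName = "raw") {
  return {
    ...mapping,
    fields: {
      ...(mapping.fields || {}),
      [subfieldName]: { type: "string", index: "not_analyzed" },
    },
  };
}

// Assumed simple mapping: one analyzed full-text string field.
const simple = { tweet: { type: "string" } };

// Multifield mapping: `tweet` stays analyzed, `tweet.raw` is not_analyzed.
const multi = { tweet: withRawSubfield(simple.tweet) };

console.log(JSON.stringify(multi, null, 2));
```

The helper copies the mapping rather than mutating it, so the simple mapping stays available for comparison.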


Now, or at least once we have reindexed our data, we can use the `tweet` field for search and the `tweet.raw` field for sorting:

Now, or at least as soon as we have reindexed our data, we can use the `tweet`
field for search and the `tweet.raw` field for sorting:

[source,js]
--------------------------------------------------
@@ -70,6 +72,6 @@ GET /_search
--------------------------------------------------
// SENSE: 056_Sorting/88_Multifield.json
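
As an illustration of the search-on-one, sort-on-the-other pattern, the sketch below builds a request body that could be sent to `GET /_search`. The `searchSortedByRaw` helper and the query text `elasticsearch` are assumptions made for this example; they are not taken from the collapsed block above.

```javascript
// Hypothetical helper: build a search body that matches on the analyzed
// main field and sorts on its not_analyzed subfield.
function searchSortedByRaw(field, queryText, subfield = "raw") {
  return {
    query: { match: { [field]: queryText } },
    sort: `${field}.${subfield}`,
  };
}

// Assumed query text, chosen only for illustration.
const body = searchSortedByRaw("tweet", "elasticsearch");
console.log(JSON.stringify(body, null, 2));
```

The `match` query runs against the analyzed `tweet` terms, while the sort compares whole `tweet.raw` strings.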

WARNING: Sorting on a full-text `analyzed` field can use a lot of memory. See
WARNING: Sorting on a full-text `analyzed` field can consume a lot of memory. See
<<fielddata-intro>> for more information.