Commit 138a24e

/ 056_Sorting/ 88_String_sorting.asciidoc
1 parent 1e50bd0 commit 138a24e

1 file changed

+55
-43
lines changed

056_Sorting/85_Sorting.asciidoc

@@ -1,20 +1,21 @@
11
[[sorting]]
2-
== Sorting and Relevance
2+
== Sorting and Relevance
33

4+
By default, results are returned sorted by _relevance_—with the most
5+
relevant docs first.((("sorting", "by relevance")))((("relevance", "sorting results by"))) Later in this chapter, we explain what we mean by
6+
_relevance_ and how it is calculated, but let's start by looking at the `sort`
7+
parameter and how to use it.
48

5-
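By default, results are returned sorted by _relevance_, with the most relevant documents first.((("sorting", "by relevance")))((("relevance", "sorting results by"))) Later in this chapter, we explain what _relevance_ means and how it is calculated, but let's start by looking at the `sort` parameter and how to use it.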
9+
=== Sorting
610

11+
In order to sort by relevance, we need to represent relevance as a value. In
12+
Elasticsearch, the _relevance score_ is represented by the floating-point
13+
number returned in the search results as the `_score`, ((("relevance scores", "returned in search results score")))((("score", "relevance score of search results")))so the default sort
14+
order is `_score` descending.
715

8-
9-
=== Sorting
10-
11-
12-
13-
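To sort by relevance, relevance has to be represented as a value. In Elasticsearch, the _relevance score_ is a floating-point number returned in the results as `_score`,((("relevance scores", "returned in search results score")))((("score", "relevance score of search results"))) so the default sort order is `_score` descending.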
14-
15-
16-
Sometimes, though, you don't have a meaningful relevance score. For example, the following query returns all results whose `user_id` field contains `1`:
17-
16+
Sometimes, though, you don't have a meaningful relevance score. For instance,
17+
the following query just returns all tweets whose `user_id` field has the
18+
value `1`:
1819

1920
[source,js]
2021
--------------------------------------------------
@@ -32,13 +33,14 @@ GET /_search
3233
}
3334
--------------------------------------------------
3435

35-
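Filters have no bearing on `_score`, and((("score", seealso="relevance; relevance scores")))((("match_all query", "score as neutral 1")))((("filters", "score and"))) the implicit default `match_all` query just sets the `_score` of every document to a neutral `1`. In other words, all documents are considered equally relevant.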
36-
36+
Filters have no bearing on `_score`, and the((("score", seealso="relevance; relevance scores")))((("match_all query", "score as neutral 1")))((("filters", "score and"))) missing-but-implied `match_all`
37+
query just sets the `_score` to a neutral value of `1` for all documents. In
38+
other words, all documents are considered to be equally relevant.
3739

38-
==== Sorting by Field Values
40+
==== Sorting by Field Values
3941

40-
41-
In this case, it makes sense to sort by recency, with the newest results first.((("sorting", "by field values")))((("fields", "sorting search results by field values")))((("sort parameter"))) We can do this with the `sort` parameter:
42+
In this case, it probably makes sense to sort tweets by recency, with the most
43+
recent tweets first.((("sorting", "by field values")))((("fields", "sorting search results by field values")))((("sort parameter"))) We can do this with the `sort` parameter:
4244

4345
[source,js]
4446
--------------------------------------------------
@@ -54,7 +56,7 @@ GET /_search
5456
--------------------------------------------------
5557
// SENSE: 056_Sorting/85_Sort_by_date.json
5658

57-
You will notice two differences in the results:
59+
You will notice two differences in the results:
5860

5961
[source,js]
6062
--------------------------------------------------
@@ -75,33 +77,39 @@ GET /_search
7577
...
7678
}
7779
--------------------------------------------------
78-
<1> The `_score` is not calculated, because it is not used for sorting.
79-
<2> The value of the `date` field is converted to milliseconds since the Unix epoch and returned in the `sort` values.
80-
81-
82-
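The first is that each result has((("date field, sorting search results by"))) a new element called `sort`, which contains the value we used for sorting. In this case, we sorted on `date` (obtained as milliseconds since the Unix epoch). The long number `1411516800000` is equivalent to the timestamp string `2014-09-24 00:00:00
83-
UTC`.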
84-
85-
86-
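The second is that the `_score` and `max_score` fields are both `null`.((("score", "not calculating"))) Calculating the `_score` is expensive, and it is usually needed only for sorting; we are not sorting by relevance, so it makes no sense to keep track of the `_score`. If you want the `_score` calculated regardless, you can set the((("track_scores parameter"))) `track_scores` parameter to `true`.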
87-
80+
<1> The `_score` is not calculated, because it is not being used for sorting.
81+
<2> The value of the `date` field, expressed as milliseconds since the epoch,
82+
is returned in the `sort` values.
83+
84+
The first is that we have ((("date field, sorting search results by")))a new element in each result called `sort`, which
85+
contains the value(s) used for sorting. In this case, we sorted on
86+
`date`, which internally is((("milliseconds-since-the-epoch (date)"))) indexed as _milliseconds since the epoch_. The long
87+
number `1411516800000` is equivalent to the date string `2014-09-24 00:00:00
88+
UTC`.
89+
90+
The second is that the `_score` and `max_score` are both `null`. ((("score", "not calculating"))) Calculating
91+
the `_score` can be quite expensive, and usually its only purpose is for
92+
sorting; we're not sorting by relevance, so it doesn't make sense to keep
93+
track of the `_score`. If you want the `_score` to be calculated regardless,
94+
you can set((("track_scores parameter"))) the `track_scores` parameter to `true`.
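
The two observations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the search body is a plain dict that is never sent to a cluster, it omits the query part of the request, and `sort_value_to_utc` is a hypothetical helper name, not part of any Elasticsearch client.

```python
from datetime import datetime, timezone

# Sketch of a search body: sort by date, newest first, and ask for the
# _score to be computed anyway via track_scores.
search_body = {
    "sort": {"date": {"order": "desc"}},
    "track_scores": True,
}


def sort_value_to_utc(millis: int) -> datetime:
    """Convert a milliseconds-since-the-epoch sort value to a UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


# The sort value quoted in the text, rendered as a date string.
readable = sort_value_to_utc(1411516800000).strftime("%Y-%m-%d %H:%M:%S UTC")
print(readable)
```

Dividing by 1000 is needed because Python timestamps are in seconds, while the `sort` value is in milliseconds.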
8895

8996
[TIP]
9097
====
91-
As a shortcut, you can ((("sorting", "specifying just the field name to sort on")))specify just the name of the field to sort on:
98+
As a shortcut, you can ((("sorting", "specifying just the field name to sort on")))specify just the name of the field to sort on:
9299
93100
[source,js]
94101
--------------------------------------------------
95102
"sort": "number_of_children"
96103
--------------------------------------------------
97104
98-
Fields are sorted in ascending order by default,((("sorting", "default ordering"))) and the `_score` value in descending order.
105+
Fields will be sorted in ((("sorting", "default ordering")))ascending order by default, and
106+
the `_score` value in descending order.
99107
====
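
To make the default ordering in the tip concrete, here is a minimal sketch that expands the shortcut form into the explicit form. `normalize_sort` is a hypothetical helper written for this illustration. It mimics the rule described above and is not an Elasticsearch API.

```python
def normalize_sort(spec):
    """Expand shortcut sort specs into explicit {field: {"order": ...}} clauses."""
    items = spec if isinstance(spec, list) else [spec]
    clauses = []
    for item in items:
        if isinstance(item, str):
            # Bare field name: ascending by default, except _score,
            # which defaults to descending.
            order = "desc" if item == "_score" else "asc"
            clauses.append({item: {"order": order}})
        else:
            # Already an explicit clause; leave it untouched.
            clauses.append(item)
    return clauses


explicit = normalize_sort("number_of_children")
print(explicit)
```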
100108

101109
==== Multilevel Sorting
102110

103-
104-
Perhaps we want to combine `date` and `_score` in a query, with the matching results sorted first by date and then by relevance:
111+
Perhaps we want to combine the `_score` from a((("sorting", "multilevel")))((("multilevel sorting"))) query with the `date`, and
112+
show all matching results sorted first by date, then by relevance:
105113

106114
[source,js]
107115
--------------------------------------------------
@@ -121,30 +129,34 @@ GET /_search
121129
--------------------------------------------------
122130
// SENSE: 056_Sorting/85_Multilevel_sort.json
123131

132+
Order is important. Results are sorted by the first criterion first. Only
133+
results whose first `sort` value is identical will then be sorted by the
134+
second criterion, and so on.
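
The tie-breaking behavior can be imitated locally. The sketch below assumes a small, made-up list of hits (the IDs, dates, and scores are illustrative only) and reproduces "date descending, then `_score` descending" with two stable sorts. It shows the ordering rule, not how Elasticsearch implements it.

```python
# Hypothetical hits; values are invented for illustration.
hits = [
    {"_id": "a", "date": "2014-09-22", "_score": 0.5},
    {"_id": "b", "date": "2014-09-24", "_score": 0.2},
    {"_id": "c", "date": "2014-09-24", "_score": 0.9},
    {"_id": "d", "date": "2014-09-22", "_score": 0.7},
]

# Python's sort is stable, so sort by the secondary criterion first,
# then by the primary one; ties on date keep their _score order.
by_score = sorted(hits, key=lambda h: h["_score"], reverse=True)
ranked = sorted(by_score, key=lambda h: h["date"], reverse=True)

ids = [h["_id"] for h in ranked]
print(ids)
```

Only `b` and `c` (and likewise `a` and `d`) share a date, so `_score` decides the order within each pair and nowhere else.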
124135

125-
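Order is important. Results are sorted by the first criterion first; only results that tie on the first criterion are then sorted by the second, and so on.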
126-
127-
128-
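Multilevel sorting does not have to involve the `_score`. You can sort on several different fields,((("fields", "sorting by multiple fields"))) such as geo-distance or a custom value calculated in a script.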
136+
Multilevel sorting doesn't have to involve the `_score`. You could sort
137+
on several different fields,((("fields", "sorting by multiple fields"))) on geo-distance, or on a custom value
138+
calculated in a script.
129139

130140
[NOTE]
131141
====
132-
133-
Query-string search((("sorting", "in query string searches")))((("sort parameter", "using in query strings")))((("query strings", "sorting search results for"))) also supports custom sorting, using the `sort` parameter in the query string:
134-
142+
Query-string search((("sorting", "in query string searches")))((("sort parameter", "using in query strings")))((("query strings", "sorting search results for"))) also supports custom sorting, using the `sort` parameter
143+
in the query string:
135144
136145
[source,js]
137146
--------------------------------------------------
138147
GET /_search?sort=date:desc&sort=_score&q=search
139148
--------------------------------------------------
140149
====
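
If the query string is assembled in code, the repeated `sort` parameter has to be preserved in order. A minimal sketch using only Python's standard library follows. It builds the path shown in the note and does not send a request.

```python
from urllib.parse import urlencode

# A list of pairs (rather than a dict) keeps both sort parameters
# and their order.
params = [("sort", "date:desc"), ("sort", "_score"), ("q", "search")]

# safe=":" keeps the colon literal instead of percent-encoding it as %3A.
query_string = urlencode(params, safe=":")
path = "/_search?" + query_string
print(path)
```

A dict would silently drop one of the two `sort` keys, which is why the parameters are given as a list of tuples.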
141150

142-
==== Sorting on Multivalue Fields
143-
144-
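When sorting on a field with more than one value,((("sorting", "on multivalue fields")))((("fields", "multivalue", "sorting on"))) remember that the values have no intrinsic order; a multivalue field is just a bag of values. Which one should you choose to sort on?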
151+
==== Sorting on Multivalue Fields
145152

146-
For numbers and dates, you can reduce a multivalue field to a single value by using the `min`, `max`, `avg`, or `sum` _sort modes_.((("sum sort mode")))((("avg sort mode")))((("max sort mode")))((("min sort mode")))((("sort modes")))((("dates field, sorting on earliest value"))) For example, you can sort on the earliest date in each `date` field as follows:
153+
When sorting on fields with more than one value,((("sorting", "on multivalue fields")))((("fields", "multivalue", "sorting on"))) remember that the values do
154+
not have any intrinsic order; a multivalue field is just a bag of values.
155+
Which one do you choose to sort on?
147156

157+
For numbers and dates, you can reduce a multivalue field to a single value
158+
by using the `min`, `max`, `avg`, or `sum` _sort modes_. ((("sum sort mode")))((("avg sort mode")))((("max sort mode")))((("min sort mode")))((("sort modes")))((("dates field, sorting on earliest value")))For instance, you
159+
could sort on the earliest date in each `dates` field by using the following:
148160

149161
[role="pagebreak-before"]
150162
[source,js]
