Commit 6acb4e4

Revert "chapter05_part2:/050_Search/10_Multi_index_multi_type.asciidoc" (#326)
1 parent 1521c5c commit 6acb4e4

4 files changed: +155 −62 lines changed

050_Search/05_Empty_search.asciidoc

Lines changed: 42 additions & 12 deletions
@@ -1,14 +1,17 @@
[[empty-search]]
=== The Empty Search

The most basic form of the((("searching", "empty search")))((("empty search"))) search API is the _empty search_, which doesn't
specify any query but simply returns all documents in all indices in the
cluster:

[source,js]
--------------------------------------------------
GET /_search
--------------------------------------------------
// SENSE: 050_Search/05_Empty_search.json

The response (edited for brevity) looks something like this:

[source,js]
--------------------------------------------------
@@ -45,39 +48,66 @@ GET /_search

==== hits

The most important section of the response is `hits`, which((("searching", "empty search", "hits")))((("hits"))) contains the
`total` number of documents that matched our query, and a `hits` array
containing the first 10 of those matching documents--the results.

Each result in the `hits` array contains the `_index`, `_type`, and `_id` of
the document, plus the `_source` field. This means that the whole document is
immediately available to us directly from the search results. This is unlike
other search engines, which return just the document ID, requiring you to fetch
the document itself in a separate step.

Each element also ((("score", "for empty search")))((("relevance scores")))has a `_score`. This is the _relevance score_, which is a
measure of how well the document matches the query. By default, results are
returned with the most relevant documents first; that is, in descending order
of `_score`. In this case, we didn't specify any query, so all documents are
equally relevant, hence the neutral `_score` of `1` for all results.

The `max_score` value is the highest `_score` of any document that matches our
query.((("max_score value")))
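The fields described above are easy to pull out of a response once it has been decoded from JSON; a minimal sketch in Python, against a hypothetical, hand-written response dict (the values are illustrative, not output from a real cluster):

```python
# A minimal, hypothetical search response shaped like the one above
# (values are illustrative only, not real cluster output).
response = {
    "hits": {
        "total": 14,
        "max_score": 1.0,
        "hits": [
            {"_index": "us", "_type": "tweet", "_id": "7",
             "_score": 1.0, "_source": {"tweet": "Elasticsearch is great"}},
        ],
    },
}

# `total` counts every match; the `hits` array holds at most the first 10.
total = response["hits"]["total"]

# Each hit carries its identity plus the full `_source` document,
# so no second fetch is needed to get the document body.
for hit in response["hits"]["hits"]:
    doc = hit["_source"]      # the whole original document
    score = hit["_score"]     # relevance; 1 for an empty search
```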

==== took

The `took` value((("took value (empty search)"))) tells us how many milliseconds the entire search request took
to execute.

==== shards

The `_shards` element((("shards", "number involved in an empty search"))) tells us the `total` number of shards that were involved
in the query and,((("failed shards (in a search)")))((("successful shards (in a search)"))) of them, how many were `successful` and how many `failed`.
We wouldn't normally expect shards to fail, but it can happen. If we were to
suffer a major disaster in which we lost both the primary and the replica copy
of the same shard, there would be no copies of that shard available to respond
to search requests. In this case, Elasticsearch would report the shard as
`failed`, but continue to return results from the remaining shards.

==== timeout

The `timed_out` value tells((("timed_out value in search results"))) us whether the query timed out. By
default, search requests do not time out.((("timeout parameter", "specifying in a request"))) If low response times are more
important to you than complete results, you can specify a `timeout` as `10`
or `10ms` (10 milliseconds), or `1s` (1 second):

[source,js]
--------------------------------------------------
GET /_search?timeout=10ms
--------------------------------------------------

Elasticsearch will return any results that it has managed to gather from
each shard before the request timed out.

[WARNING]
================================================

It should be noted that this `timeout` does not((("timeout parameter", "not halting query execution"))) halt the execution of the
query; it merely tells the coordinating node to return the results collected
_so far_ and to close the connection. In the background, other shards may
still be processing the query even though results have been sent.

Use the time-out because it is important to your SLA, not because you want
to abort the execution of long-running queries.

================================================
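Taken together, `timed_out` and `_shards` tell you whether a response is complete; a minimal client-side check in Python, against a hypothetical, hand-written response dict (the values are illustrative, not output from a real cluster):

```python
# Hypothetical response metadata from a search where one shard failed
# and the timeout fired (illustrative values only).
response = {
    "took": 10,
    "timed_out": True,
    "_shards": {"total": 10, "successful": 9, "failed": 1},
}

# A search can succeed overall while still being partial: check both
# `timed_out` and `_shards.failed` before trusting the result set.
partial = response["timed_out"] or response["_shards"]["failed"] > 0
```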

050_Search/10_Multi_index_multi_type.asciidoc

Lines changed: 26 additions & 14 deletions
@@ -1,42 +1,54 @@
[[multi-index-multi-type]]
=== Multi-index, Multitype

Did you notice that the results from the preceding <<empty-search,empty search>>
contained documents ((("searching", "multi-index, multi-type search")))of different types&#x2014;`user` and `tweet`&#x2014;from two
different indices&#x2014;`us` and `gb`?

By not limiting our search to a particular index or type, we have searched
across _all_ documents in the cluster. Elasticsearch forwarded the search
request in parallel to a primary or replica of every shard in the cluster,
gathered the results to select the overall top 10, and returned them to us.

Usually, however, you will((("types", "specifying in search requests")))((("indices", "specifying in search requests"))) want to search within one or more specific indices,
and probably one or more specific types. We can do this by specifying the
index and type in the URL, as follows:

`/_search`::
Search all types in all indices

`/gb/_search`::
Search all types in the `gb` index

`/gb,us/_search`::
Search all types in the `gb` and `us` indices

`/g*,u*/_search`::
Search all types in any indices beginning with `g` or beginning with `u`

`/gb/user/_search`::
Search type `user` in the `gb` index

`/gb,us/user,tweet/_search`::
Search types `user` and `tweet` in the `gb` and `us` indices

`/_all/user,tweet/_search`::
Search types `user` and `tweet` in all indices

When you search within a single index, Elasticsearch forwards the search
request to a primary or replica of every shard in that index, and then gathers the
results from each shard. Searching within multiple indices works in exactly
the same way--there are just more shards involved.

[TIP]
================================================

Searching one index that has five primary shards is _exactly equivalent_ to
searching five indices that have one primary shard each.

================================================

Later, you will see how this simple fact makes it easy to scale flexibly
as your requirements change.
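The URL patterns above are just comma-joined lists of index and type names; a small sketch of how a client might build them (the helper name `search_path` is mine, not part of any Elasticsearch client):

```python
def search_path(indices=None, types=None):
    """Build a search URL like the examples above.

    `indices` and `types` are lists of names; omitting both yields
    the cluster-wide /_search.
    """
    path = ""
    if indices:
        path += "/" + ",".join(indices)
    elif types:
        path += "/_all"  # all indices, but only specific types
    if types:
        path += "/" + ",".join(types)
    return path + "/_search"
```

For example, `search_path(["gb", "us"], ["user", "tweet"])` yields `/gb,us/user,tweet/_search`.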

050_Search/15_Pagination.asciidoc

Lines changed: 29 additions & 12 deletions
@@ -1,17 +1,21 @@
[[pagination]]
=== Pagination

Our preceding <<empty-search,empty search>> told us that 14 documents in the((("pagination")))
cluster match our (empty) query. But there were only 10 documents in
the `hits` array. How can we see the other documents?

In the same way as SQL uses the `LIMIT` keyword to return a single ``page'' of
results, Elasticsearch accepts ((("from parameter")))((("size parameter")))the `from` and `size` parameters:

`size`::
Indicates the number of results that should be returned, defaults to `10`

`from`::
Indicates the number of initial results that should be skipped, defaults to `0`

If you wanted to show five results per page, then pages 1 to 3
could be requested as follows:

[source,js]
--------------------------------------------------
@@ -22,17 +26,30 @@ GET /_search?size=5&from=10
// SENSE: 050_Search/15_Pagination.json


Beware of paging too deep or requesting too many results at once. Results are
sorted before being returned. But remember that a search request usually spans
multiple shards. Each shard generates its own sorted results, which then need
to be sorted centrally to ensure that the overall order is correct.
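A 1-based page number maps onto `from` and `size` with one line of arithmetic; a tiny sketch (the helper name `page_params` is mine):

```python
def page_params(page, size=5):
    """Return (from, size) for a 1-based page number, mirroring the
    ?size=...&from=... query parameters described above."""
    return ((page - 1) * size, size)

# With five results per page:
# page 1 -> from=0, page 2 -> from=5, page 3 -> from=10
```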

.Deep Paging in Distributed Systems
****

To understand why ((("deep paging, problems with")))deep paging is problematic, let's imagine that we are
searching within a single index with five primary shards. When we request the
first page of results (results 1 to 10), each shard produces its own top 10
results and returns them to the _coordinating node_, which then sorts all 50
results in order to select the overall top 10.

Now imagine that we ask for page 1,000--results 10,001 to 10,010. Everything
works in the same way except that each shard has to produce its top 10,010
results. The coordinating node then sorts through all 50,050 results and
discards 50,040 of them!

You can see that, in a distributed system, the cost of sorting results
grows exponentially the deeper we page. There is a good reason
that web search engines don't return more than 1,000 results for any query.

****

TIP: In <<reindex>> we explain how you _can_ retrieve large numbers of
documents efficiently.
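The sidebar's arithmetic can be checked directly; a small Python model of the coordinating node's workload, assuming five shards and ten results per page as above (the function name is mine):

```python
def deep_paging_cost(from_, size=10, shards=5):
    """Work done to serve a single page at offset `from_`:
    results each shard must produce, results the coordinating node
    must sort, and how many of those it then discards."""
    per_shard = from_ + size           # every shard returns its own top (from + size)
    sorted_total = per_shard * shards  # all of them are merged on the coordinating node
    discarded = sorted_total - size    # only `size` results survive the merge
    return per_shard, sorted_total, discarded
```

For the first page (`from=0`), each shard sends 10 results and the node sorts 50; at offset 10,000, each shard sends 10,010 and the node sorts 50,050, discarding 50,040.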
