Commit ef1fc4c

Merge pull request #147 from elasticsearch-cn/revert-143-chapter/chapter24_part5
Revert "chapter24_part5: /270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc"
2 parents 5ddf939 + 87d08dc commit ef1fc4c

1 file changed, with 22 additions and 18 deletions

@@ -1,29 +1,33 @@
 [[fuzzy-scoring]]
-=== 模糊性评分
+=== Scoring Fuzziness
 
+Users love fuzzy queries. They assume that these queries will somehow magically find
+the right combination of proper spellings.((("fuzzy queries", "scoring fuzziness")))((("typoes and misspellings", "scoring fuzziness")))((("relevance scores", "fuzziness and"))) Unfortunately, the truth is
+somewhat more prosaic.
 
-用户喜欢模糊查询。他们认为这种查询会魔法般的找到正确拼写组合。
-((("fuzzy queries", "scoring fuzziness")))((("typoes and misspellings", "scoring fuzziness")))((("relevance scores", "fuzziness and")))
-很遗憾,实际效果平平。
+Imagine that we have 1,000 documents containing ``Schwarzenegger,'' and just
+one document with the misspelling ``Schwarzeneger.'' According to the theory
+of <<tfidf,term frequency/inverse document frequency>>, the misspelling is
+much more relevant than the correct spelling, because it appears in far fewer
+documents!
 
+In other words, if we were to treat fuzzy matches((("match query", "fuzzy match query"))) like any other match, we
+would favor misspellings over correct spellings, which would make for grumpy
+users.
 
-假设我们有1000个文档包含 ``Schwarzenegger'' ,只是一个文档的出现拼写错误 ``Schwarzeneger'' 。
-根据 <<tfidf,term frequency/inverse document frequency>> 理论,这个拼写错误文档比拼写正确的相关度更高,因为它更少在文档中出现!
-
-
-换句话说,如果我们对待模糊匹配((("match query", "fuzzy match query")))类似其他匹配方法,我们将偏爱错误的拼写超过了正确的拼写,这会让用户发狂。
-
-
-TIP: 模糊匹配不应用于参与评分--只能在有拼写错误时扩大匹配项的范围。
-
-
-默认情况下, `match` 查询给定所有的模糊匹配的恒定评分为1。这可以满足在结果列表的末尾添加潜在的匹配记录,并且没有干扰非模糊查询的相关性评分。
+TIP: Fuzzy matching should not be used for scoring purposes--only to widen
+the net of matching terms in case there are misspellings.
 
+By default, the `match` query gives all fuzzy matches the constant score of 1.
+This is sufficient to add potential matches onto the end of the result list,
+without interfering with the relevance scoring of nonfuzzy queries.
 
 [TIP]
 ==================================================
 
-在模糊查询最初出现时很少能单独使用。他们更好的作为一个 ``bigger'' 场景的部分功能特性,如 _search-as-you-type_
-{ref}/search-suggesters-completion.html[`完成` 建议]或
-_did-you-mean_ {ref}/search-suggesters-phrase.html[`短语` 建议]。
+Fuzzy queries alone are much less useful than they initially appear. They are
+better used as part of a ``bigger'' feature, such as the _search-as-you-type_
+{ref}/search-suggesters-completion.html[`completion` suggester] or the
+_did-you-mean_ {ref}/search-suggesters-phrase.html[`phrase` suggester].
+
 ==================================================
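
The TF/IDF argument in the restored English text can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the commit. It assumes the classic Lucene-style formula `idf = 1 + ln(num_docs / (doc_freq + 1))` and a hypothetical index of exactly 1,001 documents (the 1,000 correct spellings plus the one misspelling); the function name `idf` and the constant `NUM_DOCS` are made up for this example.

```python
import math


def idf(doc_freq: int, num_docs: int) -> float:
    """Inverse document frequency, assuming the classic Lucene-style formula."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical index size: 1,000 docs with the correct spelling + 1 misspelling.
NUM_DOCS = 1001

idf_correct = idf(doc_freq=1000, num_docs=NUM_DOCS)   # "Schwarzenegger"
idf_misspelled = idf(doc_freq=1, num_docs=NUM_DOCS)   # "Schwarzeneger"

# The rarer term gets the larger weight, so a naive fuzzy match
# would rank the misspelled document above the correct ones.
print(f"correct spelling idf:    {idf_correct:.3f}")
print(f"misspelled spelling idf: {idf_misspelled:.3f}")
```

Under these assumptions the misspelling receives a much larger weight than the correct spelling, which is the ranking problem the constant score of 1 for fuzzy matches is meant to avoid.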
