Commit 50173e8

Merge pull request #143 from luotitan/chapter/chapter24_part5
chapter24_part5: /270_Fuzzy_matching/50_Scoring_fuzziness.asciidoc
2 parents 0379912 + 424f2fd commit 50173e8

1 file changed

+18
-22
lines changed

@@ -1,33 +1,29 @@
 [[fuzzy-scoring]]
-=== Scoring Fuzziness
+=== Fuzziness Scoring
 
-Users love fuzzy queries. They assume that these queries will somehow magically find
-the right combination of proper spellings.((("fuzzy queries", "scoring fuzziness")))((("typoes and misspellings", "scoring fuzziness")))((("relevance scores", "fuzziness and"))) Unfortunately, the truth is
-somewhat more prosaic.
 
-Imagine that we have 1,000 documents containing ``Schwarzenegger,'' and just
-one document with the misspelling ``Schwarzeneger.'' According to the theory
-of <<tfidf,term frequency/inverse document frequency>>, the misspelling is
-much more relevant than the correct spelling, because it appears in far fewer
-documents!
+Users like fuzzy queries. They assume that such queries will magically find the right combination of correct spellings.
+((("fuzzy queries", "scoring fuzziness")))((("typoes and misspellings", "scoring fuzziness")))((("relevance scores", "fuzziness and")))
+Unfortunately, the results in practice are unremarkable.
 
-In other words, if we were to treat fuzzy matches((("match query", "fuzzy match query"))) like any other match, we
-would favor misspellings over correct spellings, which would make for grumpy
-users.
 
-TIP: Fuzzy matching should not be used for scoring purposes--only to widen
-the net of matching terms in case there are misspellings.
+Suppose we have 1,000 documents containing ``Schwarzenegger,'' and just one document with the misspelling ``Schwarzeneger.''
+According to the theory of <<tfidf,term frequency/inverse document frequency>>, the misspelling is more relevant than the correct spelling, because it appears in fewer documents!
+
+
+In other words, if we treated fuzzy matches((("match query", "fuzzy match query"))) like any other match, we would favor misspellings over correct spellings, which would drive users crazy.
+
+
+TIP: Fuzzy matching should not take part in scoring; it should only widen the range of matching terms when there are misspellings.
+
+
+By default, the `match` query gives all fuzzy matches a constant score of 1. This is enough to add potential matches to the end of the result list without interfering with the relevance scoring of nonfuzzy queries.
 
-By default, the `match` query gives all fuzzy matches the constant score of 1.
-This is sufficient to add potential matches onto the end of the result list,
-without interfering with the relevance scoring of nonfuzzy queries.
 
 [TIP]
 ==================================================
 
-Fuzzy queries alone are much less useful than they initially appear. They are
-better used as part of a ``bigger'' feature, such as the _search-as-you-type_
-{ref}/search-suggesters-completion.html[`completion` suggester] or the
-_did-you-mean_ {ref}/search-suggesters-phrase.html[`phrase` suggester].
-
+Fuzzy queries alone are much less useful than they first appear. They work better as part of a ``bigger'' feature, such as the _search-as-you-type_
+{ref}/search-suggesters-completion.html[`completion` suggester] or the
+_did-you-mean_ {ref}/search-suggesters-phrase.html[`phrase` suggester].
 ==================================================
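
The TF/IDF argument in the changed text can be checked with a small calculation. The sketch below is illustrative only and makes several assumptions that are not in the source: it uses the classic Lucene-style formula `1 + ln(numDocs / (docFreq + 1))` for inverse document frequency, a hypothetical index of 100,000 documents, and the hypothetical helper names `classic_idf` and `score_term`. It is not how Elasticsearch is implemented; it only shows why a rare misspelling would outrank the correct spelling if fuzzy matches were scored like ordinary matches, and how a constant score of 1 avoids that.

```python
import math

# Assumption: classic Lucene-style IDF; the source only refers to "TF/IDF".
def classic_idf(doc_freq: int, num_docs: int) -> float:
    """Inverse document frequency: rarer terms get higher weights."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))

NUM_DOCS = 100_000           # hypothetical index size, not from the source
DOC_FREQS = {
    "schwarzenegger": 1000,  # correct spelling, per the text
    "schwarzeneger": 1,      # misspelling, per the text
}
FUZZY_CONSTANT_SCORE = 1.0   # constant score for fuzzy matches, per the text

def score_term(term: str, query_term: str, constant_fuzzy: bool) -> float:
    """Score a matching term; fuzzy matches optionally get a constant score."""
    is_fuzzy = term != query_term
    if is_fuzzy and constant_fuzzy:
        return FUZZY_CONSTANT_SCORE
    return classic_idf(DOC_FREQS[term], NUM_DOCS)

query = "schwarzenegger"

# Scored like any other match: the misspelling wins because it is rarer.
naive = {t: score_term(t, query, constant_fuzzy=False) for t in DOC_FREQS}

# With a constant fuzzy score: the exact spelling ranks first.
fixed = {t: score_term(t, query, constant_fuzzy=True) for t in DOC_FREQS}

if __name__ == "__main__":
    print("naive:", naive)
    print("constant fuzzy score:", fixed)
```

Under these assumptions, the naive scoring ranks the single misspelled document above the 1,000 correctly spelled ones, while the constant fuzzy score pushes it to the end of the result list.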

0 commit comments