| 1 | +--- |
| 2 | +template: overrides/blog.html |
| 3 | +title: Chinese search support |
| 4 | +description: > |
| 5 | + Insiders adds Chinese language support for the built-in search plugin – a |
| 6 | + feature that has been requested many times |
| 7 | +hide: |
| 8 | + - feedback |
| 9 | +--- |
| 10 | + |
| 11 | +# Chinese search support – 中文搜索支持 |
| 12 | + |
| 13 | +__Insiders adds experimental Chinese language support for the [built-in search |
| 14 | +plugin] – a feature that has been requested for a long time given the large |
| 15 | +number of Chinese users.__ |
| 16 | + |
| 17 | +<aside class="mdx-author" markdown> |
| 18 | +![@squidfunk][@squidfunk avatar] |
| 19 | + |
| 20 | +<span>__Martin Donath__ · @squidfunk</span> |
| 21 | +<span> |
| 22 | +:octicons-calendar-24: May 5, 2022 · |
| 23 | +:octicons-clock-24: 5 min read · |
| 24 | +[:octicons-tag-24: 8.2.13+insiders-4.14.0][insiders-4.14.0] |
| 25 | +</span> |
| 26 | +</aside> |
| 27 | + |
| 28 | + [built-in search plugin]: ../../setup/setting-up-site-search.md#built-in-search-plugin |
| 29 | + [@squidfunk avatar]: https://avatars.githubusercontent.com/u/932156 |
| 30 | + [insiders-4.14.0]: ../../insiders/changelog.md#4.14.0 |
| 31 | + |
| 32 | +--- |
| 33 | + |
| 34 | +After the United States and Germany, China is the third-largest country of |
| 35 | +origin of Material for MkDocs users. For a long time, the built-in search plugin |
| 36 | +didn't allow for proper segmentation of Chinese characters, mainly due to |
| 37 | +missing support in [lunr-languages], which is used for search tokenization and |
| 38 | +stemming. The latest Insiders release adds the long-awaited Chinese language |
| 39 | +support for the built-in search plugin. |
| 40 | + |
| 41 | +_Material for MkDocs終於支持中文了!文本被正確分割並且更容易找到。_ |
| 42 | +{ style="display: inline" } |
| 43 | + |
| 44 | +_This article explains how to set up Chinese language support for the built-in |
| 45 | +search plugin in a few minutes._ |
| 46 | +{ style="display: inline" } |
| 47 | + |
| 48 | + [lunr-languages]: https://github.com/MihaiValentin/lunr-languages |
| 49 | + |
| 50 | +## Configuration |
| 51 | + |
| 52 | +Chinese language support for Material for MkDocs is provided by [jieba], an |
| 53 | +excellent Chinese text segmentation library. If [jieba] is installed, the |
| 54 | +built-in search plugin automatically detects Chinese characters and runs them |
| 55 | +through the segmenter. You can install [jieba] with: |
| 56 | + |
| 57 | +``` sh |
| 58 | +pip install jieba |
| 59 | +``` |
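To confirm that the install worked, a short check along the following lines may help. This is a minimal sketch, not part of the search plugin: the helper names `jieba_available` and `segment` are hypothetical, and the only jieba call used is `jieba.lcut`, which returns the segmented words as a list. If jieba is missing, the sketch falls back to returning the text unsegmented.

```python
import importlib.util


def jieba_available() -> bool:
    """Return True if the jieba package can be imported in this environment."""
    return importlib.util.find_spec("jieba") is not None


def segment(text: str) -> list[str]:
    """Segment text with jieba if it is installed, else return it as one piece.

    Hypothetical helper for illustration; the search plugin does its own
    detection and segmentation internally.
    """
    if not jieba_available():
        return [text]
    import jieba  # imported lazily so the sketch also runs without jieba

    return jieba.lcut(text)


if __name__ == "__main__":
    print("jieba installed:", jieba_available())
    print(segment("支持中文搜索"))
```

The exact word boundaries depend on jieba's dictionary, so the printed segments may differ between versions.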
| 60 | + |
| 61 | +The next step is only required if you have set a custom [separator] in |
| 62 | +`mkdocs.yml`. Segmented words are delimited with invisible [zero-width whitespace] |
| 63 | +characters, so the text renders exactly the same in the search modal. Adjust |
| 64 | +`mkdocs.yml` so that the [separator] includes the `\u200b` character: |
| 65 | + |
| 66 | +``` yaml |
| 67 | +plugins: |
| 68 | +  - search: |
| 69 | +      separator: '[\s\u200b\-]' |
| 70 | +``` |
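To see why the separator needs `\u200b`, the following sketch mimics the splitting step in Python. It is an illustration under stated assumptions only: the real tokenization happens in the browser, the word list is a hand-written stand-in for segmenter output, and `join_segments` and `tokenize` are hypothetical helper names.

```python
import re

ZERO_WIDTH_SPACE = "\u200b"

# Same character class as the separator configured in mkdocs.yml.
# U+200B is not matched by \s, so it has to be listed explicitly.
SEPARATOR = r"[\s\u200b\-]"


def join_segments(words):
    """Join segmented words with zero-width spaces (invisible when rendered)."""
    return ZERO_WIDTH_SPACE.join(words)


def tokenize(text, separator=SEPARATOR):
    """Split text on the separator pattern and drop empty tokens."""
    return [token for token in re.split(separator, text) if token]


# Assumed segmentation result, written by hand for illustration.
words = ["支持", "中文", "搜索"]
indexed = join_segments(words)

tokens = tokenize(indexed)
without_zwsp = tokenize(indexed, separator=r"[\s\-]")
```

With the zero-width space in the pattern, `tokens` contains the three words separately. Without it, `without_zwsp` holds a single unsplit token, so a search for one of the words would have nothing to match against.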
| 71 | + |
| 72 | +That's all that is necessary. |
| 73 | + |
| 74 | +## Usage |
| 75 | + |
| 76 | +If you followed the instructions in the configuration guide, Chinese words will |
| 77 | +now be tokenized using [jieba]. Try searching for |
| 78 | +[:octicons-search-24: 支持][q=支持] to see how it integrates with the |
| 79 | +built-in search plugin. |
| 80 | + |
| 81 | +--- |
| 82 | + |
| 83 | +Note that this is an experimental feature, and I, @squidfunk, am not |
| 84 | +proficient in Chinese (yet?). If you find a bug or think something can be |
| 85 | +improved, please [open an issue]. |
| 86 | + |
| 87 | + [jieba]: https://pypi.org/project/jieba/ |
| 88 | + [zero-width whitespace]: https://en.wikipedia.org/wiki/Zero-width_space |
| 89 | + [separator]: ../../setup/setting-up-site-search.md#separator |
| 90 | + [q=支持]: ?q=支持 |
| 91 | + [open an issue]: https://github.com/squidfunk/mkdocs-material/issues/new/choose |