Commit 94de212

Fix broken diffWords test
Somehow I screwed up #613 and merged it with the tests failing and with the sentence I was actually using in the test inconsistent with the one I claimed to be using in the comment above. Also, even if I'd got it right, I wouldn't've actually avoided hitting the inconsistency in Intl.Segmenter's tokenization rules that that PR was specifically trying to avoid, because it considers 他有 (he has) to be one word; I should've used 她有 (she has) which the segmenter sees as two words. This fixes both mistakes.
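
To see how a given runtime splits these sentences, the segments can be inspected directly. This is a minimal sketch, not part of the commit: `wordSegments` is a hypothetical helper, and the exact word boundaries depend on the ICU data bundled with the JavaScript engine, so it prints the results instead of assuming them.

```javascript
// Hypothetical helper (not part of jsdiff): list the word-like segments
// that Intl.Segmenter produces for a string in the given locale.
function wordSegments(text, locale = 'zh') {
  const segmenter = new Intl.Segmenter(locale, {granularity: 'word'});
  return Array.from(segmenter.segment(text))
    .filter(s => s.isWordLike)   // drop punctuation such as the trailing 。
    .map(s => s.segment);
}

// Word boundaries come from the runtime's ICU dictionary, so print them
// rather than hard-coding an expectation.
for (const sentence of ['他有很多桌子。', '她有很多桌子。', '梅有很多儿子。']) {
  console.log(sentence, wordSegments(sentence));
}
```

If the first token printed for a sentence covers more than one character (for example 他有 as a single segment), a diff against the 梅 sentence cannot report 有很多 as common text, which is the situation the test is meant to avoid.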
1 parent 27e6c81 commit 94de212

1 file changed

+3
-3
lines changed

test/diff/word.js

Lines changed: 3 additions & 3 deletions
@@ -240,14 +240,14 @@ describe('WordDiff', function() {
 
     it('supports tokenizing with an Intl.Segmenter', () => {
       // Example 1: Diffing Chinese text with no spaces.
-      // a. "He (他) has (有) many (很多) tables (桌子)"
+      // a. "She (她) has (有) many (很多) tables (桌子)"
       // b. "Mei (梅) has (有) many (很多) sons (儿子)"
       // We want to see that diffWords will get the word counts right and won't try to treat the
       // trailing 子 as common to both texts (since it's part of a different word each time).
       const chineseSegmenter = new Intl.Segmenter('zh', {granularity: 'word'});
-      const diffResult = diffWords('我有很多桌子。', '梅有很多儿子。', {intlSegmenter: chineseSegmenter});
+      const diffResult = diffWords('她有很多桌子。', '梅有很多儿子。', {intlSegmenter: chineseSegmenter});
       expect(diffResult).to.deep.equal([
-        { count: 1, added: false, removed: true, value: '' },
+        { count: 1, added: false, removed: true, value: '她' },
         { count: 1, added: true, removed: false, value: '梅' },
         { count: 2, added: false, removed: false, value: '有很多' },
         { count: 1, added: false, removed: true, value: '桌子' },
