Commit d497c0e

JIS7/JIS8 encoding: use JISX0201 for U+203E (overline)
In other legacy Japanese encodings like Shift-JIS, we now map the Unicode overline (U+203E) to a specific JISX 0208 character. Previously the single byte 0x7E was used, but ASCII 0x7E is a tilde, not an overline, so that mapping was changed. JIS7/JIS8, however, can also represent characters from the JISX 0201 character set. That character set includes an overline character of its own, which takes fewer bytes to encode than the corresponding JISX 0208 character, so we'll use it. This is what mbstring had been doing for a long time, but it changed as a side effect of the recent changes to how U+203E is encoded in Shift-JIS and similar encodings. So change it back.
1 parent 40384da commit d497c0e

1 file changed, 2 additions and 2 deletions

ext/mbstring/libmbfl/filters/mbfilter_jis.c

Lines changed: 2 additions & 2 deletions
@@ -278,6 +278,8 @@ mbfl_filt_conv_wchar_jis(int c, mbfl_convert_filter *filter)
 
 	if (c >= ucs_a1_jis_table_min && c < ucs_a1_jis_table_max) {
 		s = ucs_a1_jis_table[c - ucs_a1_jis_table_min];
+	} else if (c == 0x203E) { /* OVERLINE */
+		s = 0x1007E; /* Convert to JISX 0201 OVERLINE */
 	} else if (c >= ucs_a2_jis_table_min && c < ucs_a2_jis_table_max) {
 		s = ucs_a2_jis_table[c - ucs_a2_jis_table_min];
 	} else if (c >= ucs_i_jis_table_min && c < ucs_i_jis_table_max) {
@@ -288,8 +290,6 @@ mbfl_filt_conv_wchar_jis(int c, mbfl_convert_filter *filter)
 	if (s <= 0) {
 		if (c == 0xa5) { /* YEN SIGN */
 			s = 0x1005c;
-		} else if (c == 0x203e) { /* OVER LINE */
-			s = 0x1007e;
 		} else if (c == 0xff3c) { /* FULLWIDTH REVERSE SOLIDUS */
 			s = 0x2140;
 		} else if (c == 0x2225) { /* PARALLEL TO */
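
The byte-count argument in the commit message can be illustrated with a small standalone sketch. It is not libmbfl code: `encode_overline` and the `overline_charset` enum are hypothetical names, the escape sequences are the usual ISO-2022-JP style designators (ESC ( J for JIS X 0201 Latin, ESC $ B for JIS X 0208), and the JIS X 0208 code 0x2131 for the overline is an assumption, since the commit does not state which JISX 0208 character the other encodings use. The escape needed to switch back to ASCII afterwards is ignored, as it would be the same in both cases.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical selector, for illustration only (not part of libmbfl). */
enum overline_charset { USE_JISX0201_LATIN, USE_JISX0208 };

/*
 * Write the byte sequence for U+203E OVERLINE into `out`, assuming the
 * stream is currently in ASCII mode, and return the number of bytes
 * written. `out` must have room for at least 5 bytes.
 */
size_t encode_overline(enum overline_charset cs, unsigned char *out)
{
	if (cs == USE_JISX0201_LATIN) {
		/* ESC ( J designates JIS X 0201 Latin; 0x7E there is the overline. */
		static const unsigned char latin_seq[] = { 0x1B, 0x28, 0x4A, 0x7E };
		memcpy(out, latin_seq, sizeof(latin_seq));
		return sizeof(latin_seq);
	}

	/* ESC $ B designates JIS X 0208; 0x21 0x31 is assumed to be the overline. */
	static const unsigned char kanji_seq[] = { 0x1B, 0x24, 0x42, 0x21, 0x31 };
	memcpy(out, kanji_seq, sizeof(kanji_seq));
	return sizeof(kanji_seq);
}
```

Under these assumptions the JIS X 0201 form is one byte shorter per mode switch, and one byte shorter again for each further overline emitted while the same character set stays selected (one byte per character versus two).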
