Commit 4ab95bc

SimonSapin authored and alexcrichton committed
char: s/character/Unicode scalar value/
Tweak the definition of `char` to use the appropriate Unicode terminology.
1 parent 87c7c03 commit 4ab95bc

1 file changed, 6 additions and 2 deletions

src/doc/rust.md

Lines changed: 6 additions & 2 deletions
@@ -3136,8 +3136,12 @@ machine.

 The types `char` and `str` hold textual data.

-A value of type `char` is a Unicode character,
-represented as a 32-bit unsigned word holding a UCS-4 codepoint.
+A value of type `char` is a [Unicode scalar value](
+http://www.unicode.org/glossary/#unicode_scalar_value)
+(ie. a code point that is not a surrogate),
+represented as a 32-bit unsigned word in the 0x0000 to 0xD7FF
+or 0xE000 to 0x10FFFF range.
+A `[char]` vector is effectively an UCS-4 / UTF-32 string.

 A value of type `str` is a Unicode string,
 represented as a vector of 8-bit unsigned bytes holding a sequence of UTF-8 codepoints.
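
The two ranges in the new wording can be checked directly. The following is a minimal sketch, assuming a current Rust toolchain rather than the one this commit targeted; `is_scalar_value` and `code_unit_counts` are hypothetical helper names introduced only for illustration, while `char::from_u32` and `str::chars` are standard library functions.

```rust
/// Hypothetical helper: true if `n` lies in one of the two ranges
/// named in the revised text (a code point that is not a surrogate).
fn is_scalar_value(n: u32) -> bool {
    matches!(n, 0x0000..=0xD7FF | 0xE000..=0x10FFFF)
}

/// Hypothetical helper: returns (number of `char`s, number of UTF-8 bytes).
fn code_unit_counts(s: &str) -> (usize, usize) {
    (s.chars().count(), s.len())
}

fn main() {
    // Boundary values of both ranges, plus the surrogate block and
    // the first value past the end of the code space.
    let samples = [0x0000u32, 0xD7FF, 0xD800, 0xDFFF, 0xE000, 0x10FFFF, 0x110000];
    for n in samples {
        // `char::from_u32` returns `None` exactly when `n` is not a scalar value.
        assert_eq!(is_scalar_value(n), char::from_u32(n).is_some());
    }

    // A `char` occupies 32 bits, so a sequence of `char`s is laid out like UTF-32,
    // while `str` stores the same text as UTF-8 bytes.
    assert_eq!(std::mem::size_of::<char>(), 4);
    let (chars, bytes) = code_unit_counts("h\u{e9}llo");
    assert_eq!(chars, 5);
    assert_eq!(bytes, 6);
}
```

The sketch compares the hand-written range test against `char::from_u32` instead of asserting fixed results for each sample, so it also documents that the standard library enforces the same definition.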
