Commit 4674809b, author: Rich Felker

fix case mapping for U+00DF (ß)

U+00DF ('ß') has had an uppercase form (U+1E9E) available since
Unicode 5.1, but Unicode lacks the case mappings for it due to
stability policy. when I added support for the new character in commit
1a63a9fc, I omitted the mapping in the
lowercase-to-uppercase direction. this choice was not based on any
actual information, only assumptions.

this commit adds bidirectional case mappings between U+00DF and
U+1E9E, and removes the special-case hack that allowed U+00DF to be
identified as lowercase despite lacking a mapping. aside from strong
evidence that this is the "right" behavior for real-world usage of
these characters, several factors informed this decision:

- the other "potentially correct" mapping, to "SS", is not
  representable in the C case-mapping system anyway.

- leaving one letter in lowercase form when transforming a string to
  uppercase is obviously wrong.

- having a character which is nominally lowercase but which is fixed
  under case mapping violates reasonable invariants.
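
The first point in the list above can be illustrated with a minimal sketch. It is not taken from the commit: `sketch_upper_simple`, `sketch_upper_full`, and `sketch_full_mapping_check` are hypothetical names, and only U+00DF is handled. A `towupper`-shaped function returns exactly one `wint_t`, so a one-to-many result such as "SS" would need a different interface, for example one that takes an output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Simple (one-to-one) mapping: same shape as towupper(), one code point
 * in and one code point out. Only U+00DF is handled in this sketch. */
wint_t sketch_upper_simple(wint_t wc)
{
	return wc == 0xdf ? 0x1e9e : wc;
}

/* Hypothetical full mapping: the result may be two code points, so the
 * caller has to supply a buffer. towupper() has no room for this. */
size_t sketch_upper_full(wint_t wc, wint_t out[2])
{
	if (wc == 0xdf) {
		out[0] = 'S';
		out[1] = 'S';
		return 2;
	}
	out[0] = wc;
	return 1;
}

/* Usage: compare what each interface can express for U+00DF. */
void sketch_full_mapping_check(void)
{
	wint_t buf[2];
	assert(sketch_upper_simple(0xdf) == 0x1e9e);  /* one code point */
	assert(sketch_upper_full(0xdf, buf) == 2);    /* two code points */
	assert(buf[0] == 'S' && buf[1] == 'S');
}
```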
parent fff54693
@@ -3,7 +3,7 @@
 int iswlower(wint_t wc)
 {
-	return towupper(wc) != wc || wc == 0xdf;
+	return towupper(wc) != wc;
 }
 int __iswlower_l(wint_t c, locale_t l)
......
@@ -151,7 +151,6 @@ static const unsigned short pairs[][2] = {
 	{ 0x03f7, 0x03f8 },
 	{ 0x03fa, 0x03fb },
 	{ 0x1e60, 0x1e9b },
-	{ 0xdf, 0xdf },
 	{ 0x1e9e, 0xdf },
 	{ 0x1f59, 0x1f51 },
......
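
The combined effect of the two hunks above can be sketched in isolation. This is an illustrative miniature under stated assumptions, not the library's implementation: the table is a small subset of the entries visible in the diff, the linear search stands in for whatever lookup the real code performs, and every `sketch_*` name is hypothetical. Once the self-mapping `{ 0xdf, 0xdf }` entry is gone, the remaining `{ 0x1e9e, 0xdf }` pair maps in both directions, and the lowercase test no longer needs a special case.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical miniature of the pairs table: { uppercase, lowercase }.
 * Subset of the entries shown in the hunk, after the change. */
static const unsigned short sketch_pairs[][2] = {
	{ 0x03f7, 0x03f8 },
	{ 0x03fa, 0x03fb },
	{ 0x1e9e, 0xdf },
	{ 0x1f59, 0x1f51 },
};

/* lower == 1 maps uppercase to lowercase, lower == 0 the reverse.
 * Code points not in the table are returned unchanged. */
static wint_t sketch_map(wint_t wc, int lower)
{
	size_t n = sizeof sketch_pairs / sizeof sketch_pairs[0];
	for (size_t i = 0; i < n; i++)
		if (sketch_pairs[i][1 - lower] == wc)
			return sketch_pairs[i][lower];
	return wc;
}

static wint_t sketch_towupper(wint_t wc) { return sketch_map(wc, 0); }
static wint_t sketch_towlower(wint_t wc) { return sketch_map(wc, 1); }

/* Same shape as the new iswlower(): no special case for 0xdf. */
static int sketch_iswlower(wint_t wc) { return sketch_towupper(wc) != wc; }

/* Usage: the mapping round-trips and U+00DF is still seen as lowercase. */
void sketch_pairs_check(void)
{
	assert(sketch_towupper(0xdf) == 0x1e9e);
	assert(sketch_towlower(0x1e9e) == 0xdf);
	assert(sketch_iswlower(0xdf));
	assert(!sketch_iswlower(0x1e9e));
}
```

With the old `{ 0xdf, 0xdf }` entry in place, a lookup of this kind would return U+00DF unchanged, which is why the removed `|| wc == 0xdf` clause was needed.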