make nl_langinfo(CODESET) always return "UTF-8" (844212d9) · Commit · OpenHarmony / Third Party Musl

Commit 844212d9 authored Sep 09, 2015 by Rich Felker

make nl_langinfo(CODESET) always return "UTF-8"

this restores the original behavior prior to the addition of the
byte-based C locale and fixes what is effectively a regression in
musl's property of always providing working UTF-8 support.

commit 1507ebf8 introduced the codeset
name "UTF-8-CODE-UNITS" for the byte-based C locale to represent that
the semantic content is UTF-8 but that it is being processed as code
units (bytes) rather than whole multibyte characters. however, many
programs assume that the codeset name is usable with iconv and/or
comes from a set of standard/widely-used names known to the
application. such programs are likely to produce warnings or errors,
run with reduced functionality, or mangle character data when run
explicitly in the C locale.
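
The failure mode described above can be sketched in C. This is an illustration only, not code from the commit: both helper names are hypothetical, and whether a given name is accepted depends on the iconv implementation in use. The assumption is that an unrecognized name such as "UTF-8-CODE-UNITS" makes iconv_open() fail, which is what leaves a program without a working converter.

```c
#include <iconv.h>
#include <langinfo.h>
#include <locale.h>

/* Hypothetical helper (not part of musl or of this commit):
 * returns 1 if iconv_open() accepts `name` as a source encoding
 * for conversion to UTF-8, 0 otherwise. */
int codeset_usable_with_iconv(const char *name)
{
	iconv_t cd = iconv_open("UTF-8", name);
	if (cd == (iconv_t)-1) return 0;
	iconv_close(cd);
	return 1;
}

/* Hypothetical helper: the pattern many programs follow, passing the
 * codeset name reported for the current locale straight to iconv.
 * The C locale is selected explicitly, as in the commit message. */
int locale_codeset_usable_with_iconv(void)
{
	setlocale(LC_ALL, "C");
	return codeset_usable_with_iconv(nl_langinfo(CODESET));
}
```

A program that gets 0 back from the second helper typically has to warn, fall back to a hard-coded encoding, or skip conversion entirely.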

the standard places basically no requirements on the string returned
by nl_langinfo(CODESET) and how it interacts with other interfaces, so
returning "UTF-8" is permissible. moreover, it seems like the right
thing to do, since the identity of the character encoding as "UTF-8"
is independent of whether it is being processed as bytes or characters
by the standard library functions.

Parent 426a0e29


@@ -33,8 +33,7 @@ char *__nl_langinfo_l(nl_item item, locale_t loc)
 	int idx = item & 65535;
 	const char *str;
 
-	if (item == CODESET)
-		return MB_CUR_MAX==1 ? "UTF-8-CODE-UNITS" : "UTF-8";
+	if (item == CODESET) return "UTF-8";
 
 	switch (cat) {
 	case LC_NUMERIC:
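
One way to observe the effect of this change from application code is sketched below. It is a minimal illustration, not part of the commit: the function names are hypothetical, and the string actually returned depends on the C library and its version. Per the commit message, a musl build containing this change would report "UTF-8" here, while the build before it would report "UTF-8-CODE-UNITS" in the byte-based C locale.

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: selects the C locale explicitly and returns
 * the codeset name the C library reports for it. */
const char *c_locale_codeset(void)
{
	setlocale(LC_ALL, "C");
	return nl_langinfo(CODESET);
}

/* Hypothetical application-side check of the kind the commit message
 * describes: the program only recognizes a fixed, widely used name. */
int is_known_utf8_name(const char *name)
{
	return strcmp(name, "UTF-8") == 0;
}
```

A program using a check like is_known_utf8_name(c_locale_codeset()) treats any unfamiliar name as "not UTF-8", which is why a nonstandard codeset name can silently disable its UTF-8 handling.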
