Commits · 628cf979b249fa76a80962e2eefe05073216a4db · OpenHarmony / Third Party Musl

19 Dec, 2017 6 commits

fix iconv output of surrogate pairs in ucs2 · 628cf979

Authored by Rich Felker on Dec 18, 2017

in the unified code for handling utf-16 and ucs2 output, the check for
ucs2 wrongly looked at the source charset rather than the destination
charset.

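The destination charset matters here because UCS-2 has no surrogate mechanism: a code point above U+FFFF becomes a surrogate pair in UTF-16 but cannot be represented in UCS-2 at all. The following is a minimal sketch of that distinction; `encode_utf16` and `encode_ucs2` are hypothetical helpers for illustration, not musl's iconv code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode code point c as UTF-16 code units.
 * Returns the number of units written (1 or 2), or 0 if c is invalid. */
size_t encode_utf16(uint32_t c, uint16_t out[2])
{
    if (c >= 0xD800 && c <= 0xDFFF) return 0; /* lone surrogate */
    if (c > 0x10FFFF) return 0;               /* outside Unicode */
    if (c < 0x10000) {
        out[0] = (uint16_t)c;
        return 1;
    }
    c -= 0x10000;
    out[0] = (uint16_t)(0xD800 | (c >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (c & 0x3FF)); /* low surrogate */
    return 2;
}

/* Hypothetical helper: UCS-2 output. Anything that would need a
 * surrogate pair is rejected instead of being split. */
size_t encode_ucs2(uint32_t c, uint16_t out[1])
{
    if (c >= 0x10000) return 0;               /* not representable */
    if (c >= 0xD800 && c <= 0xDFFF) return 0; /* lone surrogate */
    out[0] = (uint16_t)c;
    return 1;
}
```

The decision between the two encoders has to be driven by the output charset, which is the check the commit corrects.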

add support for BOM-determined-endian UCS2, UTF-16, and UTF-32 to iconv · 95c6044e

Authored by Rich Felker on Dec 18, 2017

previously, the charset names without endianness specified were always
interpreted as big endian. unicode specifies that UTF-16 and UTF-32
have BOM-determined endianness if BOM is present, and are otherwise
big endian. since commit 5b546faa
added support for stateful encodings, it is now possible to implement
BOM support via the conversion descriptor state.

for conversions to these charsets, the output is always big endian and
does not have a BOM.

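The input-side rule described above (use the BOM if present, otherwise assume big endian) can be sketched as follows. This is an illustration under assumptions, not the code from the commit: `utf16_detect`, `utf16_unit`, and the `utf16_order` enum are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

enum utf16_order { UTF16_BE, UTF16_LE };

/* Inspect the start of a UTF-16 byte stream. *skip receives the number
 * of BOM bytes to skip (0 or 2). Without a BOM, big endian is assumed. */
enum utf16_order utf16_detect(const unsigned char *s, size_t n, size_t *skip)
{
    *skip = 0;
    if (n >= 2) {
        if (s[0] == 0xFE && s[1] == 0xFF) { *skip = 2; return UTF16_BE; }
        if (s[0] == 0xFF && s[1] == 0xFE) { *skip = 2; return UTF16_LE; }
    }
    return UTF16_BE;
}

/* Read one 16-bit code unit from s in the given byte order. */
uint16_t utf16_unit(const unsigned char *s, enum utf16_order order)
{
    if (order == UTF16_BE)
        return (uint16_t)(s[0] << 8 | s[1]);
    return (uint16_t)(s[1] << 8 | s[0]);
}
```

In a stateful converter the detected order would be stored in the conversion descriptor after the first call, so later calls on the same stream do not look for a BOM again.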

add cp866 (dos cyrillic) to iconv · 9d4d0ee4

Authored by Rich Felker on Dec 18, 2017

update case mappings to unicode 10.0 · 54941edd

Authored by Rich Felker on Dec 18, 2017

the mapping tables and code are not automatically generated; they were
produced by comparing the output of towupper/towlower against the
mappings in the UCD, ignoring characters that were previously excluded
from case mappings or from alphabetic status (micro sign and circled
letters), and adding table entries or code for everything else
missing.

based very loosely on a patch by Reini Urban.

update ctype tables to unicode 10.0 · c72c1c52

Authored by Rich Felker on Dec 18, 2017

reformat ctype tables to be diff-friendly, match tool output · d3f23337

Authored by Rich Felker on Dec 18, 2017

the new version of the code used to generate these tables forces a
newline every 256 entries, whereas at the time these files were
originally generated and committed, it only wrapped them at 80
columns. the new behavior ensures that localized changes to the
tables, if they are ever needed, will produce localized diffs.

commit d060edf6 made the corresponding
changes to the iconv tables.

16 Dec, 2017 1 commit

fix endian errors in netinet/icmp6.h due to failure to include endian.h · d5029bb8

Authored by Rich Felker on Dec 15, 2017

15 Dec, 2017 6 commits

fix endian errors in arpa/nameser.h due to failure to include endian.h · 14cec867

Authored by Jo-Philipp Wich on Dec 04, 2017

remove unused explicit dependency rules for crti/crtn · 2a831786

Authored by Nicholas Wilson on Dec 07, 2017

notes by maintainer:

commit 2f853dd6 added these rules
because the new system for handling arch-provided replacement files
introduced for out-of-tree builds did not apply to the crt tree.

commit 63bcda4d later adapted the
makefile logic so that the crt and ldso trees go through the same
replacement logic as everything else, but failed to remove the
explicit rules that assumed the arch would always provide asm
replacements.

in addition to cleaning things up, removing these spurious rules
allows crti/crtn asm to be omitted by an arch (thereby using the empty
C files instead) if they are not needed.


use the name UTC instead of GMT for UTC timezone · eb7f93c4

Authored by Natanael Copa on Dec 07, 2017

notes by maintainer:

both C and POSIX use the term UTC to specify related functionality,
despite POSIX defining it as something more like UT1 or historical
(pre-UTC) GMT without leap seconds. neither specifies the associated
string for %Z. old choice of "GMT" violated principle of least
surprise for users and some applications/tests. use "UTC" instead.

fix sysconf for infinite rlimits · 3ec82877

Authored by Natanael Copa on Dec 07, 2017

sysconf should return -1 for infinity, not LONG_MAX.
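A minimal sketch of the mapping this fix describes, from a resource limit to a `sysconf`-style return value. The helper name `limit_to_sysconf` is hypothetical, and clamping oversized finite limits to `LONG_MAX` is an assumption of this sketch rather than something stated in the commit message.

```c
#include <limits.h>
#include <sys/resource.h>

/* Hypothetical helper: convert an rlimit value to a sysconf result.
 * An infinite limit is reported as -1 ("no definite limit"). */
long limit_to_sysconf(rlim_t lim)
{
    if (lim == RLIM_INFINITY)
        return -1;
    if (lim > (rlim_t)LONG_MAX) /* assumption: clamp finite overflow */
        return LONG_MAX;
    return (long)lim;
}
```

Callers that treat -1 from `sysconf` as "indeterminate" need to check `errno` to tell it apart from a genuine failure.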

fix x32 unistd macros to report as ILP32 not LP64 · 13127680

Authored by Nicholas Wilson on Dec 12, 2017

fix data race in at_quick_exit · 64303156

Authored by Rich Felker on Dec 14, 2017

aside from theoretical arbitrary results due to UB, this could
practically cause unbounded overflow of a static array if hit, but
hitting it depends on having more than 32 calls to at_quick_exit and
having them sufficiently often.

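One way to avoid this kind of race is to make the bounds check and the array store a single critical section. The sketch below assumes a 32-slot table (the number mentioned above) and a pthread mutex; musl uses its own internal locking primitives, and `register_quick_exit_handler` and `example_handler` are hypothetical names.

```c
#include <pthread.h>

#define HANDLER_SLOTS 32

static void (*handlers[HANDLER_SLOTS])(void);
static int handler_count;
static pthread_mutex_t handler_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical registration function. The check against the table size
 * and the store happen under one lock, so two concurrent callers cannot
 * both pass the check and write past the end of the array. */
int register_quick_exit_handler(void (*func)(void))
{
    int r = 0;
    pthread_mutex_lock(&handler_lock);
    if (handler_count == HANDLER_SLOTS)
        r = -1;                          /* table full */
    else
        handlers[handler_count++] = func;
    pthread_mutex_unlock(&handler_lock);
    return r;
}

/* Placeholder handler used only to exercise the function. */
void example_handler(void)
{
}
```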

13 Dec, 2017 1 commit

add ibm1047 codepage (ebcdic representation of latin1) to iconv · 01957bed

Authored by Rich Felker on Dec 12, 2017

12 Dec, 2017 1 commit

implement strftime padding specifier extensions · 8a6bd730

Authored by Timo Teräs on Nov 22, 2016

notes added by maintainer:

the '-' specifier allows default padding to be suppressed, and '_'
allows padding with spaces instead of the default (zeros).

these extensions seem to be included in several other implementations
including FreeBSD and derivatives, and Solaris. while portable
software should not depend on them, time format strings are often
exposed to the user for configurable time display. reportedly some
python programs also use and depend on them.

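As an illustration of the two flags, the sketch below formats a day of the month with the default padding, with `-`, and with `_`. It assumes a libc that supports these extensions; `format_mday` is a hypothetical wrapper, and the other `struct tm` fields are arbitrary.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical wrapper: format only the day of month with a given
 * strftime format. Returns the number of bytes written. */
size_t format_mday(char *buf, size_t n, const char *fmt, int mday)
{
    struct tm tm = {0};
    tm.tm_year = 117;  /* 2017 */
    tm.tm_mon = 11;    /* December */
    tm.tm_mday = mday;
    return strftime(buf, n, fmt, &tm);
}
```

With day 5, `"%d"` gives the zero-padded default, `"%-d"` drops the padding, and `"%_d"` pads with a space instead, on implementations that accept the flags.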

07 Dec, 2017 2 commits

adjust fopencookie structure tag for ABI-compat · 2488d31f

Authored by Rich Felker on Dec 06, 2017

stdio types use the struct tag names from glibc libio to match C++
ABI.

implement the fopencookie extension to stdio · 06184334

Authored by William Pitcock on Dec 05, 2017

notes added by maintainer:

this function is a GNU extension. it was chosen over the similar BSD
function funopen because the latter depends on fpos_t being an
arithmetic type as part of its public API, conflicting with our
definition of fpos_t and with the intent that it be an opaque type. it
was accepted for inclusion because, despite not being widely used, it
is usually very difficult to extricate software using it from the
dependency on it.

calling pattern for the read and write callbacks is not likely to
match glibc or other implementations, but should work with any
reasonable callbacks. in particular the read function is never called
without at least one byte being needed to satisfy its caller, so that
spurious blocking is not introduced.

contracts for what callbacks called from inside libc/stdio can do are
always complicated, and at some point still need to be specified
explicitly. at the very least, the callbacks must return or block
indefinitely (they cannot perform nonlocal exits) and they should not
make calls to stdio using their own FILE as an argument.

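A small usage sketch of `fopencookie` with a write callback that appends to a fixed in-memory buffer. It assumes a libc that provides the extension (it is exposed under `_GNU_SOURCE`); `struct membuf`, `membuf_write`, and `write_via_cookie` are hypothetical names. The callback respects the constraints noted above: it always returns and never calls stdio on its own FILE.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical cookie: a small NUL-terminated buffer. */
struct membuf {
    char data[64];
    size_t len;
};

/* Write callback: append up to n bytes, truncating when full.
 * Returning 0 for a nonempty request is treated as an error by stdio. */
static ssize_t membuf_write(void *cookie, const char *buf, size_t n)
{
    struct membuf *m = cookie;
    size_t room = sizeof m->data - 1 - m->len;
    if (n > room)
        n = room;
    memcpy(m->data + m->len, buf, n);
    m->len += n;
    m->data[m->len] = '\0';
    return (ssize_t)n;
}

/* Open a write-only stream over the cookie, write s, and close it.
 * Returns 0 on success, -1 on failure. */
int write_via_cookie(struct membuf *m, const char *s)
{
    cookie_io_functions_t io = { .write = membuf_write };
    FILE *f = fopencookie(m, "w", io);
    int r;
    if (!f)
        return -1;
    r = fputs(s, f) < 0 ? -1 : 0;
    if (fclose(f) != 0) /* fclose flushes, invoking the callback */
        r = -1;
    return r;
}
```

Because stdio buffers the output, the callback may not run until the stream is flushed or closed, so the buffer contents should only be inspected after `fclose`.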

21 Nov, 2017 2 commits

make fgetwc handling of encoding errors consistent with/without buffer · 4000b010

Authored by Rich Felker on Nov 20, 2017

previously, fgetwc left all but the first byte of an illegal sequence
unread (available for subsequent calls) when reading out of the FILE
buffer, but dropped all bytes contributing to the error when falling
back to reading a byte at a time. neither behavior was ideal. in the
buffered case, each malformed character produced one error per byte,
rather than one per character. in the unbuffered case, consuming the
last byte that caused the transition from "incomplete" to "invalid"
state potentially dropped (and produced additional spurious encoding
errors for) the next valid character.

to handle both cases uniformly without duplicate code, revise the
buffered case to only cover situations where a complete and valid
character is present in the buffer, and fall back to byte-at-a-time
for all other cases. this allows using mbtowc (stateless) instead of
mbrtowc, which may slightly improve performance too.

when an encoding error has been hit in the byte-at-a-time case, leave
the final byte that produced the error unread (via ungetc) except in
the case of single-byte errors (for UTF-8, bytes c0, c1, f5-ff, and
continuation bytes with no lead byte). single-byte errors are fully
consumed so as not to leave the caller in an infinite loop repeating
the same error.

none of these changes are distinguished from a conformance standpoint,
since the file position is unspecified after encoding errors. they are
intended merely as QoI/consistency improvements.

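The single-byte error cases listed above for UTF-8 can be written as a small classifier. This is an illustrative sketch, not code from the commit; `utf8_single_byte_error` is a hypothetical name, and it is meant to be applied to the first byte of a would-be character.

```c
/* Hypothetical helper: returns 1 if byte b, seen where a new character
 * should start, is an encoding error all by itself. */
int utf8_single_byte_error(unsigned char b)
{
    if (b >= 0x80 && b <= 0xBF)
        return 1; /* continuation byte with no lead byte */
    if (b == 0xC0 || b == 0xC1)
        return 1; /* could only begin an overlong 2-byte form */
    if (b >= 0xF5)
        return 1; /* would encode values beyond U+10FFFF */
    return 0;
}
```

Such bytes have to be consumed when the error is reported; pushing them back would make the next call fail on the same byte again.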

fix treatment by fgetws of encoding errors as eof · a90d9da1

Authored by Rich Felker on Nov 20, 2017

fgetwc does not set the stream's error indicator on encoding errors,
making ferror insufficient to distinguish between error and eof
conditions. feof is also insufficient, since it will return true if
the file ended with a partial character encoding error.

whether fgetwc should be setting the error indicator itself is a
question with conflicting answers. the POSIX text for the function
states it as a requirement, but the ISO C text seems to require that
it not. this may be revisited in the future based on the outcome of
Austin Group issue #1170.

19 Nov, 2017 1 commit

fix fgetwc when decoding a character that crosses buffer boundary · 72656157

Authored by Szabolcs Nagy on Nov 18, 2017

Update the buffer position according to the bytes consumed into st when
decoding an incomplete character at the end of the buffer.

15 Nov, 2017 1 commit

add reverse iconv mappings for JIS-based encodings · a223dbd2

Authored by Rich Felker on Nov 14, 2017

these encodings are still commonly used in messaging protocols and
such. the reverse mapping is implemented as a binary search of a list
of the jis 0208 characters in unicode order; the existing forward
table is used to perform the comparison in the search.

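The reverse-mapping technique described above, a binary search over indices sorted by Unicode value that uses the forward table for comparisons, can be sketched with a toy table. The tables `fwd` and `rev` and the function `reverse_lookup` are hypothetical and far smaller than the real JIS X 0208 data.

```c
#include <stddef.h>

/* Hypothetical forward table: legacy index -> Unicode code point.
 * It is ordered by legacy index, not by Unicode value. */
static const unsigned short fwd[] = {
    0x3042, 0x00D7, 0x4E00, 0x3001, 0x30A2
};

/* Legacy indices listed in increasing order of fwd[] value. */
static const unsigned char rev[] = { 1, 3, 0, 4, 2 };

/* Find the legacy index whose forward mapping is c, or -1 if none.
 * Only indices are stored in rev[]; the code point to compare against
 * is fetched from the existing forward table. */
int reverse_lookup(unsigned c)
{
    size_t lo = 0, hi = sizeof rev / sizeof rev[0];
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        unsigned u = fwd[rev[mid]];
        if (u == c)
            return rev[mid];
        if (u < c)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

Storing only sorted indices keeps the added data small, since the code points themselves are not duplicated.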

14 Nov, 2017 2 commits

generalize iconv framework for 8-bit codepages · 105eff9d

Authored by Rich Felker on Nov 13, 2017

previously, 8-bit codepages could only remap the high 128 bytes; the
low range was assumed/forced to agree with ascii. interpretation of
codepage table headers has been changed so that it's possible to
represent mappings for up to 256 slots (fewer if the initial portion
of the map is elided because it coincides with unicode codepoints).
this requires consuming a bit more of the 10-bit space of characters
that can be represented in 8-bit codepages, but there's still plenty
left. the size of the legacy_chars table is actually reduced now by
eliding the first 256 entries and considering them to map implicitly
via the identity map.

before these changes, there seem to have been minor bugs/omissions in
codepage table generation, so it's likely that some actual bug fixes
are silently included in this commit. round-trip testing of a few
codepages was performed on the new version of the code, but no
differential testing against the old version was done.

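The idea of eliding the initial portion of a map that coincides with Unicode can be illustrated with a deliberately simplified layout. This is not musl's packed table format: `struct codepage`, its fields, and the toy table are assumptions made for this sketch.

```c
/* Hypothetical, unpacked codepage description. Bytes below `elided`
 * map to the identical Unicode code point; the rest are looked up in
 * `map`, which holds only 256 - elided entries. */
struct codepage {
    unsigned short elided;
    const unsigned short *map;
};

/* Toy table: bytes 0x00..0xFD are identity, 0xFE and 0xFF are remapped. */
static const unsigned short toy_map[] = { 0x25A0, 0x00A0 };
static const struct codepage toy_cp = { 0xFE, toy_map };

/* Decode one byte of the given codepage to a Unicode code point. */
unsigned cp_to_unicode(const struct codepage *cp, unsigned char b)
{
    if (b < cp->elided)
        return b;                   /* implicit identity mapping */
    return cp->map[b - cp->elided]; /* explicit table entry */
}
```

A codepage that remaps the low range, such as an EBCDIC variant, would set `elided` to 0 and supply all 256 entries.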

fix malloc state corruption when ldso rejects loading a second libc · a71b46cf

Authored by Rich Felker on Nov 13, 2017

commit c49d3c8a added logic to detect
attempts to load libc.so via another name and instead redirect to the
existing libc, rather than loading two and producing dangerously
inconsistent state. however, the check for and unmapping of the
duplicate libc happened after reclaim_gaps was already called,
donating the slack space around the writable segment to malloc.
subsequent unmapping of the library then invalidated malloc's free
lists.

fix the issue by moving the call to reclaim_gaps out of map_library
into load_library, after the duplicate libc check but before the first
call to calloc, so that the gaps can still be used to satisfy the
allocation of struct dso. this change also eliminates the need for an
ugly hack (temporarily setting runtime=1) to avoid reclaim_gaps when
loading the main program via map_library, which happens when ldso is
invoked as a command.

only programs/libraries erroneously containing a DT_NEEDED reference
to libc.so via an absolute pathname or symlink were affected by this
issue.

11 Nov, 2017 6 commits

reformat cjk iconv tables to be diff-friendly, match tool output · d060edf6

Authored by Rich Felker on Nov 11, 2017

the new version of the code used to generate these tables forces a
newline every 256 entries, whereas at the time these files were
originally generated and committed, it only wrapped them at 80
columns. the new behavior ensures that localized changes to the
tables, if they are ever needed, will produce localized diffs. other
tables including hkscs were already committed in the new format.

binary comparison of the generated object files was performed to
confirm that no spurious changes slipped in.


prevent fork's errno from being clobbered by atfork handlers · c21051e9

Authored by Bobby Bingham on Nov 10, 2017

If the syscall fails, errno must be set correctly for the caller.
There's no guarantee that the handlers registered with pthread_atfork
won't clobber errno, so we need to ensure it gets set after they are
called.

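The ordering issue can be sketched as follows: keep the raw, kernel-style result while the handlers run, and only translate it into `errno` afterwards. This is an illustration, not musl's `fork` implementation; `finish_fork` and `clobbering_handler` are hypothetical names.

```c
#include <errno.h>

/* Hypothetical stand-in for an atfork handler that overwrites errno. */
void clobbering_handler(void)
{
    errno = 0;
}

/* raw is a kernel-style result: >= 0 on success, -errno on failure.
 * The handler runs first; errno is set only after it returns, so
 * nothing the handler does can change what the caller sees. */
long finish_fork(long raw, void (*handler)(void))
{
    if (handler)
        handler();
    if (raw < 0) {
        errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

Setting `errno` before calling the handler would leave the caller with whatever value the handler last wrote.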

add iso-2022-jp support (decoding only) to iconv · a39f20bf