Commits · c4bc0b1a64e1ef1e105df84401805a16e8dbe82a · OpenHarmony / Third Party Musl

10 Jan, 2018 2 commits

consistently use the LOCK and UNLOCK macros · c4bc0b1a

Committed by Jens Gustedt on Jan 03, 2018

In some places there has been a direct usage of the functions. Use the
macros consistently everywhere, such that it might be easier later on to
capture the fast path directly inside the macro and only have the call
overhead on the slow path.

c4bc0b1a

new lock algorithm with state and congestion count in one atomic int · 47d0bcd4

Committed by Jens Gustedt on Jan 03, 2018

A variant of this new lock algorithm has been presented at SAC'16, see
https://hal.inria.fr/hal-01304108. A full version of that paper is
available at https://hal.inria.fr/hal-01236734.

The main motivation of this is to improve on the safety of the basic lock
implementation in musl. This is achieved by squeezing a lock flag and a
congestion count (= threads inside the critical section) into a single
int. Thereby an unlock operation does exactly one memory
transfer (a_fetch_add) and never touches the value again, but still
detects if a waiter has to be woken up.
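
The single-int idea described above can be sketched as follows. This is only an illustration under stated assumptions: it uses C11 `<stdatomic.h>` instead of musl's internal `a_cas`/`a_fetch_add`, it leaves out the spinning and futex wait/wake parts entirely, and all `sketch_*` names are hypothetical. The sign bit plays the role of the lock flag and the remaining bits hold the congestion count.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout: sign bit = lock flag, other bits = congestion count. */
typedef atomic_int sketch_lock;

/* Fast path: an idle lock (0) becomes "locked, one thread inside". */
static bool sketch_trylock(sketch_lock *l)
{
    assert(l != NULL);
    int expected = 0;
    return atomic_compare_exchange_strong(l, &expected, INT_MIN + 1);
}

/* A contending thread announces itself by bumping the count. */
static void sketch_announce(sketch_lock *l)
{
    assert(l != NULL);
    atomic_fetch_add(l, 1);
}

/* Unlock with one atomic read-modify-write: clear the flag and remove
 * ourselves from the count in a single addition. The value observed
 * before the addition tells us whether anyone else was counted, i.e.
 * whether a waiter would have to be woken. The lock word is not
 * touched again afterwards. */
static bool sketch_unlock(sketch_lock *l)
{
    assert(l != NULL);
    int old = atomic_fetch_add(l, -(INT_MIN + 1));
    return old != INT_MIN + 1;
}
```

In a real lock the `true` result of `sketch_unlock` would trigger a wake-up of one waiter; here it is simply returned to the caller.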

This is a fix of a use-after-free bug in pthread_detach that had
temporarily been patched. Therefore this patch also reverts

         c1e27367

This is also the only place where internal knowledge of the lock
algorithm is used.

The main price for the improved safety is slightly larger code.

Under high congestion, the scheduling behavior will be different
compared to the previous algorithm. In that case, a successful
put-to-sleep may appear out of order compared to the arrival in the
critical section.

47d0bcd4

19 12月, 2017 6 次提交

fix iconv output of surrogate pairs in ucs2 · 628cf979

Committed by Rich Felker on Dec 18, 2017

in the unified code for handling utf-16 and ucs2 output, the check for
ucs2 wrongly looked at the source charset rather than the destination
charset.

628cf979

add support for BOM-determined-endian UCS2, UTF-16, and UTF-32 to iconv · 95c6044e

Committed by Rich Felker on Dec 18, 2017

previously, the charset names without endianness specified were always
interpreted as big endian. unicode specifies that UTF-16 and UTF-32
have BOM-determined endianness if a BOM is present, and are otherwise
big endian. since commit 5b546faa added support for stateful
encodings, it is now possible to implement BOM support via the
conversion descriptor state.

for conversions to these charsets, the output is always big endian and
does not have a BOM.
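
The decoding rule described above (a BOM selects the byte order, otherwise big endian) might look roughly like this for UTF-16. It is a standalone sketch, not the iconv code from this commit; the `sketch_*` names are hypothetical and UTF-32 is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_endian { SKETCH_BE, SKETCH_LE };

/* Inspect the start of UTF-16 input. Returns the number of BOM bytes to
 * skip (0 or 2) and stores the byte order to use for the rest of the
 * input. Without a BOM the input is treated as big endian. */
static size_t sketch_utf16_bom(const unsigned char *p, size_t n,
                               enum sketch_endian *e)
{
    assert(p != NULL && e != NULL);
    *e = SKETCH_BE;
    if (n >= 2) {
        if (p[0] == 0xFE && p[1] == 0xFF) return 2;
        if (p[0] == 0xFF && p[1] == 0xFE) { *e = SKETCH_LE; return 2; }
    }
    return 0;
}

/* Read one 16-bit code unit in the given byte order. */
static uint16_t sketch_get16(const unsigned char *p, enum sketch_endian e)
{
    assert(p != NULL);
    return e == SKETCH_LE ? (uint16_t)(p[0] | p[1] << 8)
                          : (uint16_t)(p[0] << 8 | p[1]);
}
```

In a stateful converter the detected byte order would be remembered in the conversion descriptor; here the caller keeps it in a local variable.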

95c6044e

add cp866 (dos cyrillic) to iconv · 9d4d0ee4

Committed by Rich Felker on Dec 18, 2017

9d4d0ee4

update case mappings to unicode 10.0 · 54941edd

Committed by Rich Felker on Dec 18, 2017

the mapping tables and code are not automatically generated; they were
produced by comparing the output of towupper/towlower against the
mappings in the UCD, ignoring characters that were previously excluded
from case mappings or from alphabetic status (micro sign and circled
letters), and adding table entries or code for everything else
missing.

based very loosely on a patch by Reini Urban.

54941edd

update ctype tables to unicode 10.0 · c72c1c52

Committed by Rich Felker on Dec 18, 2017

c72c1c52

reformat ctype tables to be diff-friendly, match tool output · d3f23337

Committed by Rich Felker on Dec 18, 2017

the new version of the code used to generate these tables forces a
newline every 256 entries, whereas at the time these files were
originally generated and committed, it only wrapped them at 80
columns. the new behavior ensures that localized changes to the
tables, if they are ever needed, will produce localized diffs.

commit d060edf6 made the corresponding
changes to the iconv tables.

d3f23337

15 Dec, 2017 3 commits

use the name UTC instead of GMT for UTC timezone · eb7f93c4

Committed by Natanael Copa on Dec 07, 2017

notes by maintainer:

both C and POSIX use the term UTC to specify related functionality,
despite POSIX defining it as something more like UT1 or historical
(pre-UTC) GMT without leap seconds. neither specifies the associated
string for %Z. old choice of "GMT" violated principle of least
surprise for users and some applications/tests. use "UTC" instead.

eb7f93c4

fix sysconf for infinite rlimits · 3ec82877

Committed by Natanael Copa on Dec 07, 2017

sysconf should return -1 for infinity, not LONG_MAX.

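A minimal sketch of that rule, assuming the value comes from a `struct rlimit` soft limit: an infinite limit is reported as -1, and a finite one is clamped to what fits in a `long`. The helper name `sketch_limit_to_long` is hypothetical and is not the function changed by this commit.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: turn a soft resource limit into a
 * sysconf-style return value. */
static long sketch_limit_to_long(const struct rlimit *lim)
{
    assert(lim != NULL);
    if (lim->rlim_cur == RLIM_INFINITY)
        return -1;                       /* "no limit", not LONG_MAX */
    if (lim->rlim_cur > (rlim_t)LONG_MAX)
        return LONG_MAX;                 /* finite but too large for long */
    return (long)lim->rlim_cur;
}
```

A caller that receives -1 then has to check whether errno changed to tell "no limit" apart from a failure, which is the usual convention for sysconf.
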
3ec82877

fix data race in at_quick_exit · 64303156

Committed by Rich Felker on Dec 14, 2017

aside from theoretical arbitrary results due to UB, this could
practically cause unbounded overflow of a static array if hit, but
hitting it depends on having more than 32 calls to at_quick_exit and
having them sufficiently often.
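
One way to avoid this kind of overflow is to perform both the capacity check and the store while holding the lock. The sketch below illustrates that pattern with a C11 `atomic_flag` spinlock. It is an assumption about the general shape of such a fix, not the code from this commit, and every `sketch_*` name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_COUNT 32

static void (*sketch_funcs[SKETCH_COUNT])(void);
static int sketch_count;
static atomic_flag sketch_table_lock = ATOMIC_FLAG_INIT;

/* Register a handler. Returns 0 on success, -1 when the table is full. */
static int sketch_at_quick_exit(void (*func)(void))
{
    assert(func != NULL);
    int r = 0;
    while (atomic_flag_test_and_set(&sketch_table_lock)) {
        /* spin until the lock is free */
    }
    /* Check and store happen under the same lock, so no other thread
     * can take the last slot between the two steps. */
    if (sketch_count == SKETCH_COUNT) r = -1;
    else sketch_funcs[sketch_count++] = func;
    atomic_flag_clear(&sketch_table_lock);
    return r;
}

/* Trivial handler used for demonstration. */
static void sketch_noop(void) {}
```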

64303156

13 Dec, 2017 1 commit

add ibm1047 codepage (ebcdic representation of latin1) to iconv · 01957bed

Committed by Rich Felker on Dec 12, 2017

01957bed

12 Dec, 2017 1 commit

implement strftime padding specifier extensions · 8a6bd730

Committed by Timo Teräs on Nov 22, 2016

notes added by maintainer:

the '-' specifier allows default padding to be suppressed, and '_'
allows padding with spaces instead of the default (zeros).

these extensions seem to be included in several other implementations
including FreeBSD and derivatives, and Solaris. while portable
software should not depend on them, time format strings are often
exposed to the user for configurable time display. reportedly some
python programs also use and depend on them.
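
As a usage illustration, the helper below formats the 5th day of a month with a caller-supplied format string. On a libc that implements these extensions, `%d` would give `05`, `%-d` would give `5`, and `%_d` would give a space followed by `5`. The helper name `sketch_format_day` is hypothetical, and the result on a libc without the extensions is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Format a fixed date (day 5 of January 2017) with the given format.
 * Returns whatever strftime returns: the number of bytes written,
 * excluding the terminating null byte. */
static size_t sketch_format_day(char *buf, size_t n, const char *fmt)
{
    assert(buf != NULL && fmt != NULL);
    struct tm tm = {0};
    tm.tm_mday = 5;     /* day of month */
    tm.tm_year = 117;   /* years since 1900 */
    return strftime(buf, n, fmt, &tm);
}
```

Because format strings are often user-configurable, code that passes them through to strftime inherits whatever flags the underlying libc supports.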

8a6bd730

07 Dec, 2017 1 commit

implement the fopencookie extension to stdio · 06184334

Committed by William Pitcock on Dec 05, 2017

notes added by maintainer:

this function is a GNU extension. it was chosen over the similar BSD
function funopen because the latter depends on fpos_t being an
arithmetic type as part of its public API, conflicting with our
definition of fpos_t and with the intent that it be an opaque type. it
was accepted for inclusion because, despite not being widely used, it
is usually very difficult to extricate software using it from the
dependency on it.

calling pattern for the read and write callbacks is not likely to
match glibc or other implementations, but should work with any
reasonable callbacks. in particular the read function is never called
without at least one byte being needed to satisfy its caller, so that
spurious blocking is not introduced.

contracts for what callbacks called from inside libc/stdio can do are
always complicated, and at some point still need to be specified
explicitly. at the very least, the callbacks must return or block
indefinitely (they cannot perform nonlocal exits) and they should not
make calls to stdio using their own FILE as an argument.
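
A small usage sketch of `fopencookie`, following the constraints listed above: the write callback returns normally and never calls stdio on its own FILE. The `sketch_sink` type and the function names are hypothetical; only `fopencookie` and `cookie_io_functions_t` come from the extension itself.

```c
#define _GNU_SOURCE   /* fopencookie is a GNU extension */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical cookie: a fixed-size in-memory sink. */
struct sketch_sink {
    char data[64];
    size_t len;
};

/* Write callback: append to the sink and report how many bytes were
 * accepted. Returning 0 signals an error to stdio. */
static ssize_t sketch_sink_write(void *cookie, const char *buf, size_t size)
{
    struct sketch_sink *s = cookie;
    size_t room = sizeof s->data - s->len;
    if (size > room) size = room;
    memcpy(s->data + s->len, buf, size);
    s->len += size;
    return (ssize_t)size;
}

/* Open a write-only stream backed by the sink. Callbacks that are not
 * needed (read, seek, close) are left as null pointers. */
static FILE *sketch_sink_open(struct sketch_sink *s)
{
    assert(s != NULL);
    s->len = 0;
    cookie_io_functions_t io = { .write = sketch_sink_write };
    return fopencookie(s, "w", io);
}
```

The sink only sees data once stdio flushes its buffer, for example on `fclose`, so the number and size of callback invocations should not be relied upon.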

06184334

21 Nov, 2017 2 commits

make fgetwc handling of encoding errors consistent with/without buffer · 4000b010

Committed by Rich Felker on Nov 20, 2017

previously, fgetwc left all but the first byte of an illegal sequence
unread (available for subsequent calls) when reading out of the FILE
buffer, but dropped all bytes contributing to the error when falling
back to reading a byte at a time. neither behavior was ideal. in the
buffered case, each malformed character produced one error per byte,
rather than one per character. in the unbuffered case, consuming the
last byte that caused the transition from "incomplete" to "invalid"
state potentially dropped (and produced additional spurious encoding
errors for) the next valid character.

to handle both cases uniformly without duplicate code, revise the
buffered case to only cover situations where a complete and valid
character is present in the buffer, and fall back to byte-at-a-time
for all other cases. this allows using mbtowc (stateless) instead of
mbrtowc, which may slightly improve performance too.

when an encoding error has been hit in the byte-at-a-time case, leave
the final byte that produced the error unread (via ungetc) except in
the case of single-byte errors (for UTF-8, bytes c0, c1, f5-ff, and
continuation bytes with no lead byte). single-byte errors are fully
consumed so as not to leave the caller in an infinite loop repeating
the same error.
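
The set of single-byte UTF-8 errors listed above can be written as a small predicate. This is a standalone sketch with a hypothetical name, not code from fgetwc. It answers whether a byte, seen where a new character should start, is an error on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* True if byte value b (0..255), appearing where a character should
 * begin, is an encoding error all by itself. */
static bool sketch_is_single_byte_error(int b)
{
    assert(b >= 0 && b <= 0xFF);
    if (b >= 0x80 && b <= 0xBF) return true;  /* continuation byte, no lead byte */
    if (b == 0xC0 || b == 0xC1) return true;  /* never valid as a lead byte */
    return b >= 0xF5;                         /* f5-ff: never valid in UTF-8 */
}
```

Bytes for which this returns true would be consumed fully, while the final byte of a longer invalid sequence would be pushed back for the next call.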

none of these changes are distinguished from a conformance standpoint,
since the file position is unspecified after encoding errors. they are
intended merely as QoI/consistency improvements.

4000b010

fix treatment by fgetws of encoding errors as eof · a90d9da1

Committed by Rich Felker on Nov 20, 2017

fgetwc does not set the stream's error indicator on encoding errors,
making ferror insufficient to distinguish between error and eof
conditions. feof is also insufficient, since it will return true if
the file ended with a partial character encoding error.

whether fgetwc should be setting the error indicator itself is a
question with conflicting answers. the POSIX text for the function
states it as a requirement, but the ISO C text seems to require that
it not. this may be revisited in the future based on the outcome of
Austin Group issue #1170.

a90d9da1

19 Nov, 2017 1 commit

fix fgetwc when decoding a character that crosses buffer boundary · 72656157

Committed by Szabolcs Nagy on Nov 18, 2017

Update the buffer position according to the bytes consumed into st when
decoding an incomplete character at the end of the buffer.

72656157

15 Nov, 2017 1 commit

add reverse iconv mappings for JIS-based encodings · a223dbd2

Committed by Rich Felker on Nov 14, 2017

these encodings are still commonly used in messaging protocols and
such. the reverse mapping is implemented as a binary search of a list
of the jis 0208 characters in unicode order; the existing forward
table is used to perform the comparison in the search.
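
The reverse lookup described above can be sketched with toy data: a forward table indexed by legacy code, plus a list of legacy codes sorted by the Unicode value they map to. The search compares through the forward table, so no second table of Unicode values is needed. The table contents are arbitrary illustrative values, not real jis 0208 data, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy forward table: legacy code (index) -> Unicode code point. */
static const unsigned short sketch_fwd[] = { 0x3042, 0x00A7, 0x4E00, 0x2190 };

/* Reverse list: legacy codes ordered by their Unicode value. */
static const unsigned char sketch_rev[] = { 1, 3, 0, 2 };

/* Returns the legacy code for Unicode value c, or -1 if unmapped. */
static int sketch_reverse(unsigned c)
{
    size_t lo = 0, hi = sizeof sketch_rev / sizeof sketch_rev[0];
    assert(hi == sizeof sketch_fwd / sizeof sketch_fwd[0]);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        /* The comparison key is looked up in the forward table. */
        unsigned u = sketch_fwd[sketch_rev[mid]];
        if (u == c) return sketch_rev[mid];
        if (u < c) lo = mid + 1;
        else hi = mid;
    }
    return -1;
}
```

The extra cost over the forward direction is the sorted list of codes, plus one forward-table lookup per search step.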

a223dbd2

14 Nov, 2017 1 commit

generalize iconv framework for 8-bit codepages · 105eff9d

Committed by Rich Felker on Nov 13, 2017

previously, 8-bit codepages could only remap the high 128 bytes; the
low range was assumed/forced to agree with ascii. interpretation of
codepage table headers has been changed so that it's possible to
represent mappings for up to 256 slots (fewer if the initial portion
of the map is elided because it coincides with unicode codepoints).
this requires consuming a bit more of the 10-bit space of characters
that can be represented in 8-bit codepages, but there's still plenty
left. the size of the legacy_chars table is actually reduced now by
eliding the first 256 entries and considering them to map implicitly
via the identity map.
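
The identity-map elision mentioned above can be illustrated with a toy table. Indices below 256 are not stored and map to themselves, and only the remaining entries occupy table space. The table contents and the names `sketch_legacy_chars` and `sketch_lookup` are hypothetical stand-ins, not the actual data or code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the stored part of the table: entry i here
 * corresponds to character index 256 + i. */
static const unsigned short sketch_legacy_chars[] = { 0x0410, 0x0411, 0x2591 };
#define SKETCH_NCHARS (sizeof sketch_legacy_chars / sizeof sketch_legacy_chars[0])

/* Map a character index to a Unicode code point. */
static unsigned sketch_lookup(size_t idx)
{
    assert(idx < 256 + SKETCH_NCHARS);
    if (idx < 256) return (unsigned)idx;     /* implicit identity map */
    return sketch_legacy_chars[idx - 256];   /* stored entries */
}
```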

before these changes, there seem to have been minor bugs/omissions in
codepage table generation, so it's likely that some actual bug fixes
are silently included in this commit. round-trip testing of a few
codepages was performed on the new version of the code, but no
differential testing against the old version was done.

105eff9d

11 Nov, 2017 6 commits

reformat cjk iconv tables to be diff-friendly, match tool output · d060edf6

Committed by Rich Felker on Nov 11, 2017

the new version of the code used to generate these tables forces a
newline every 256 entries, whereas at the time these files were
originally generated and committed, it only wrapped them at 80
columns. the new behavior ensures that localized changes to the
tables, if they are ever needed, will produce localized diffs. other
tables including hkscs were already committed in the new format.

binary comparison of the generated object files was performed to
confirm that no spurious changes slipped in.

d060edf6

prevent fork's errno from being clobbered by atfork handlers · c21051e9

Committed by Bobby Bingham on Nov 10, 2017

If the syscall fails, errno must be set correctly for the caller.
There's no guarantee that the handlers registered with pthread_atfork
won't clobber errno, so we need to ensure it gets set after they are
called.
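
The ordering problem can be sketched without calling fork at all. In the illustration below a failing operation sets errno, a handler then overwrites it, and the wrapper makes sure the caller still sees the original error. All names are hypothetical; in the real code the operation is the fork syscall and the handlers are those registered with pthread_atfork.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Run op, then the handler, and leave errno describing op's failure. */
static int sketch_call_then_handler(int (*op)(void), void (*handler)(void))
{
    assert(op != NULL && handler != NULL);
    int ret = op();
    int saved = errno;            /* capture before the handler runs */
    handler();                    /* may clobber errno */
    if (ret < 0) errno = saved;   /* set errno after the handler */
    return ret;
}

/* Stand-ins: an operation that fails and a handler that clobbers errno. */
static int sketch_failing_op(void) { errno = EAGAIN; return -1; }
static void sketch_noisy_handler(void) { errno = ENOENT; }
```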

c21051e9

add iso-2022-jp support (decoding only) to iconv · a39f20bf