    NLS: update handling of Unicode · 74675a58
    Committed by Alan Stern
    This patch (as1239) updates the kernel's treatment of Unicode.  The
    character-set conversion routines are well behind the current state of
    the Unicode specification: They don't recognize the existence of code
    points beyond plane 0 or of surrogate pairs in the UTF-16 encoding.
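
For readers unfamiliar with surrogate pairs, here is a minimal userspace sketch of how a code point beyond plane 0 is split into two 16-bit units. The helper name `encode_utf16` is hypothetical and is not taken from the patch; the arithmetic follows the standard UTF-16 encoding rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's code): encode one Unicode code
 * point as UTF-16. Returns the number of 16-bit units written (1 or 2),
 * or 0 if the value is not a valid Unicode scalar value.
 */
static int encode_utf16(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);

    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                   /* surrogate range is reserved */
    if (cp > 0x10FFFF)
        return 0;                   /* beyond the last Unicode plane */

    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;      /* plane 0: one unit is enough */
        return 1;
    }

    cp -= 0x10000;                  /* 20 significant bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

A converter that only knows plane 0 has no branch for the two-unit case, which is the gap the paragraph above describes.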
    
    The old wchar_t 16-bit type is retained because it's still used in
    lots of places.  This shouldn't cause any new problems; if a
    conversion now results in an invalid 16-bit code then before it must
    have yielded an undefined code.
    
    Difficult-to-read names like "utf_mbstowcs" are replaced with more
    transparent names like "utf8s_to_utf16s" and the ordering of the
    parameters is rationalized (buffer lengths come immediately after the
    pointers they refer to, and the inputs precede the outputs).
    Fortunately the low-level conversion routines are used in only a few
    places; the interfaces to the higher-level uni2char and char2uni
    methods have been left unchanged.
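
The parameter-ordering convention described above can be sketched in userspace C as follows. This is an illustration only: the function name `sketch_utf8s_to_utf16s`, its exact signature, and its error handling are assumptions and do not reproduce the kernel's `utf8s_to_utf16s`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace sketch, not the kernel routine.
 * Inputs (src, src_len) precede outputs (dst, dst_len), and each length
 * directly follows the pointer it describes.
 * Returns the number of UTF-16 units written, or -1 on malformed input
 * or insufficient output space.
 */
static int sketch_utf8s_to_utf16s(const uint8_t *src, size_t src_len,
                                  uint16_t *dst, size_t dst_len)
{
    size_t i = 0, o = 0;

    assert(src != NULL && dst != NULL);

    while (i < src_len) {
        uint32_t cp = src[i];
        uint32_t min;       /* smallest value allowed for this length */
        size_t extra;       /* number of continuation bytes expected */

        if (cp < 0x80) {
            extra = 0; min = 0;
        } else if ((cp & 0xE0) == 0xC0) {
            cp &= 0x1F; extra = 1; min = 0x80;
        } else if ((cp & 0xF0) == 0xE0) {
            cp &= 0x0F; extra = 2; min = 0x800;
        } else if ((cp & 0xF8) == 0xF0) {
            cp &= 0x07; extra = 3; min = 0x10000;
        } else {
            return -1;      /* stray continuation or invalid lead byte */
        }

        if (extra > src_len - i - 1)
            return -1;      /* sequence is truncated */
        i++;
        for (; extra > 0; extra--, i++) {
            if ((src[i] & 0xC0) != 0x80)
                return -1;  /* not a continuation byte */
            cp = (cp << 6) | (src[i] & 0x3F);
        }

        /* Reject overlong forms, surrogates, and out-of-range values. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return -1;

        if (cp < 0x10000) {
            if (o + 1 > dst_len)
                return -1;
            dst[o++] = (uint16_t)cp;
        } else {
            if (o + 2 > dst_len)
                return -1;
            cp -= 0x10000;
            dst[o++] = (uint16_t)(0xD800 | (cp >> 10));    /* high */
            dst[o++] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low */
        }
    }
    return (int)o;
}
```

Keeping each length next to its buffer makes a call site such as `sketch_utf8s_to_utf16s(in, in_len, out, out_len)` easy to check at a glance.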
    Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
    Acked-by: Clemens Ladisch <clemens@ladisch.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
message.c 55.6 KB