    t3900: ISO-2022-JP has more than one popular variants · eb127887
    Committed by Junio C Hamano
    When converting from other encodings (e.g. EUC-JP or UTF-8), the result
    can be any of several subtly different variants of ISO-2022-JP, all of
    which are valid.  At the end of a line, or when a run of text switches to
    a 1-byte sequence, either ESC ( B (switch to ASCII) or ESC ( J (switch to
    ISO 646:JP, i.e. JIS X 0201) can be used; the two are essentially the
    same character set and are used interchangeably.  Similarly, ESC $ @
    (switch to JIS X 0208-1978) and ESC $ B (switch to JIS X 0208-1983) are
    in practice used interchangeably.
    
    Depending on the iconv library and the locale definition on the system, a
    program that converts from another encoding to ISO-2022-JP can produce
    different byte sequences, and GIT_TEST_CMP (aka "diff -u") will report the
    difference as a failure.
    
    Fix this by converting the expected and the actual output to UTF-8 before
    comparing when the end result is ISO-2022-JP.  The test vector string in
    t3900/ISO-2022-JP.txt is expressed with ASCII and JIS X 0208-1983, but it
    can be expressed with any other possible variant, and when converted back
    to UTF-8, these variants produce identical byte sequences.
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
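
The idea behind the fix can be illustrated with a small sketch. This is not the test script's code (the test itself is a shell script); it uses Python's built-in `iso2022_jp` codec purely for illustration. The sample string and the helper name `to_utf8` are assumptions made up for this example. It builds two valid ISO-2022-JP encodings of the same text that differ only in their escape sequences, and shows that they stop differing once both are converted to UTF-8.

```python
# Hypothetical sample text. It avoids "\" and "~" because JIS X 0201
# maps those two byte values to different characters than ASCII does.
text = "日本語 test"

# Python's encoder uses ESC $ B (JIS X 0208-1983) and ESC ( B (ASCII).
canonical = text.encode("iso2022_jp")

# Build another valid variant by swapping in ESC $ @ (JIS X 0208-1978)
# and ESC ( J (JIS X 0201). The ESC byte never occurs inside character
# data, so a plain byte replacement only touches the escape sequences.
variant = canonical.replace(b"\x1b$B", b"\x1b$@").replace(b"\x1b(B", b"\x1b(J")


def to_utf8(data: bytes) -> bytes:
    """Convert ISO-2022-JP bytes to UTF-8 bytes (hypothetical helper)."""
    return data.decode("iso2022_jp").encode("utf-8")


# A byte-for-byte comparison reports a difference...
print(canonical != variant)
# ...but the UTF-8 forms are identical.
print(to_utf8(canonical) == to_utf8(variant))
```

Comparing after conversion to UTF-8 checks that the text is the same without depending on which escape sequences a particular converter happens to choose.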
t3900-i18n-commit.sh 2.9 KB