Commit a3e5cd34, authored by Fredrik Roubert

Let mk_hyb_file.py replace ßSS in .chr.txt files with ßẞ.

Here these mappings are used to convert from uppercase to lowercase,
and mk_hyb_file.py doesn't handle multi-character uppercase sequences.
Therefore, if the sequence ßSS is encountered in a .chr.txt file,
replace it internally with ßẞ.

Test: make -j
Change-Id: I8f678aad9298784f70645c453ec07da5bf43cb66
Parent 415ff26d
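The replacement described in the commit message can be sketched in isolation as follows. This is a minimal illustration, not the script's actual code path: `normalize_chr_line` is a hypothetical helper name, and the sketch assumes that each `.chr.txt` line holds a lowercase character followed by its uppercase form, as the message implies.

```python
# U+00DF is LATIN SMALL LETTER SHARP S
# U+1E9E is LATIN CAPITAL LETTER SHARP S
SHARP_S_TO_DOUBLE = u'\u00dfSS'
SHARP_S_TO_CAPITAL = u'\u00df\u1e9e'


def normalize_chr_line(line):
    """Hypothetical helper: reduce one .chr.txt line to at most two characters."""
    line = line.strip()
    if len(line) > 2:
        if line == SHARP_S_TO_DOUBLE:
            # Swap the two-character uppercase "SS" for the single capital sharp s,
            # so the line becomes an ordinary lowercase/uppercase pair.
            return SHARP_S_TO_CAPITAL
        # Any other multi-character uppercase sequence: keep only the lowercase char.
        return line[:1]
    assert len(line) == 2, 'expected 2 chars in chr'
    return line
```

With this sketch, the line `ßSS` comes back as the pair `ßẞ`, while a line whose uppercase form is some other multi-character sequence is cut down to its lowercase character alone.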
@@ -35,6 +35,10 @@ import getopt
 VERBOSE = False
+# U+00DF is LATIN SMALL LETTER SHARP S
+# U+1E9E is LATIN CAPITAL LETTER SHARP S
+SHARP_S_TO_DOUBLE = u'\u00dfSS'
+SHARP_S_TO_CAPITAL = u'\u00df\u1e9e'
 if sys.version_info[0] >= 3:
     def unichr(x):
@@ -283,8 +287,12 @@ def load_chr(fn):
         for i, l in enumerate(f):
             l = l.strip()
             if len(l) > 2:
-                # lowercase maps to multi-character uppercase sequence, ignore uppercase for now
-                l = l[:1]
+                if l == SHARP_S_TO_DOUBLE:
+                    # replace with lowercasing from capital letter sharp s
+                    l = SHARP_S_TO_CAPITAL
+                else:
+                    # lowercase maps to multi-character uppercase sequence, ignore uppercase for now
+                    l = l[:1]
             else:
                 assert len(l) == 2, 'expected 2 chars in chr'
             for c in l:
@@ -419,6 +427,9 @@ def verify_file_sorted(lines, fn):
     file_lines = [l.strip() for l in io.open(fn, encoding='UTF-8')]
     line_set = set(lines)
     file_set = set(file_lines)
+    if SHARP_S_TO_DOUBLE in file_set:
+        # ignore difference of double capital letter s and capital letter sharp s
+        file_set.symmetric_difference_update([SHARP_S_TO_DOUBLE, SHARP_S_TO_CAPITAL])
     if line_set == file_set:
         return True
     for line in line_set - file_set:
......
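The set manipulation in the `verify_file_sorted` hunk can be tried on its own. The sketch below is an assumption-laden reduction of that check: `same_lines` is a hypothetical name, and it takes the file's lines as a list rather than reading them from disk.

```python
SHARP_S_TO_DOUBLE = u'\u00dfSS'
SHARP_S_TO_CAPITAL = u'\u00df\u1e9e'


def same_lines(lines, file_lines):
    """Hypothetical reduction of the comparison done in verify_file_sorted."""
    line_set = set(lines)
    file_set = set(file_lines)
    if SHARP_S_TO_DOUBLE in file_set:
        # Symmetric difference toggles membership: the SS form is removed and,
        # if the capital form is not already present, the capital form is added.
        # (If both forms were present, both would be removed.)
        file_set.symmetric_difference_update([SHARP_S_TO_DOUBLE, SHARP_S_TO_CAPITAL])
    return line_set == file_set
```

Calling `same_lines([u'aA', SHARP_S_TO_CAPITAL], [u'aA', SHARP_S_TO_DOUBLE])` treats the two inputs as equal, because the file's `ßSS` entry is swapped for `ßẞ` before the comparison.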