1. 24 Jul, 2012 4 commits
    • [Indic] Limit syllables to at most five consonants · 9fa05273
      Behdad Esfahbod committed
      Seems to be about what Uniscribe does.  Not exactly.  But close enough.
      More consonants will start a new cluster.
      
      A few scripts went way down in failures.  In particular:
      
        - Devanagari failures went down from 490 to 56.
        - Telugu went down from 113 to 49.
      
      Other scripts went down slightly or didn't change.  New numbers:
      
      BENGALI: 353908 out of 354285 tests passed. 377 failed (0.106412%)
      DEVANAGARI: 693572 out of 693628 tests passed. 56 failed (0.00807349%)
      GUJARATI: 366485 out of 366506 tests passed. 21 failed (0.00572978%)
      GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
      KANNADA: 950730 out of 951913 tests passed. 1183 failed (0.124276%)
      KHMER: 298613 out of 299124 tests passed. 511 failed (0.170832%)
      MALAYALAM: 1046881 out of 1048416 tests passed. 1535 failed (0.146411%)
      ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
      SINHALA: 271333 out of 271847 tests passed. 514 failed (0.189077%)
      TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
      TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
      
      Some of the remaining Telugu and Devanagari issues seem to be Uniscribe
      eating Anusvara when placed before a non-joiner.  Ouch!
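The consonant limit described in this commit can be illustrated with a minimal sketch. It is not the actual shaper grammar: the one-letter categories (`C` consonant, `H` halant, `M` matra), the regular expression, and the `split_syllables` helper are all assumptions made for illustration. Only the limit of five consonants comes from the commit message.

```python
import re

MAX_CONSONANTS = 5  # limit named in the commit message

# Simplified, hypothetical syllable shape: up to MAX_CONSONANTS - 1
# (consonant, halant) pairs, a final consonant, optional matras, and an
# optional trailing halant. The real grammar has many more categories.
SYLLABLE = re.compile(r"(?:CH){0,%d}CM*H?" % (MAX_CONSONANTS - 1))

def split_syllables(categories):
    """Split a string of category letters into clusters."""
    clusters = []
    pos = 0
    while pos < len(categories):
        match = SYLLABLE.match(categories, pos)
        # Anything the pattern cannot start on becomes a one-unit cluster.
        end = match.end() if match else pos + 1
        clusters.append(categories[pos:end])
        pos = end
    return clusters
```

With seven consonants joined by halants (`"CHCHCHCHCHCHC"`), the first cluster keeps five consonants and the sixth consonant starts a new cluster.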
    • [Thai] Fix SARA AM handling · 093cd583
      Behdad Esfahbod committed
      Oops, thinko.
    • [Thai] Reorder U+0E3A THAI VOWEL SIGN PHINTHU · 42848453
      Behdad Esfahbod committed
      Uniscribe reorders U+0E3A to be after U+0E38 and U+0E39.  We do that by
      modifying the ccc for U+0E3A.
      
      Fixes the two remaining Thai failures (see previous commit).
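Why changing the ccc moves U+0E3A can be seen in a small sketch. In the Unicode data, U+0E3A has canonical combining class 9, while U+0E38 and U+0E39 have 103, so ordinary canonical ordering puts U+0E3A first. The override value 104 below is an assumption (the commit message does not state the value used), and `reorder_marks` is a hypothetical helper, not the shaper's own code.

```python
import unicodedata

# Hypothetical override: any class above 103 sorts U+0E3A after
# U+0E38 and U+0E39. The value actually used is not given in the commit.
CCC_OVERRIDES = {0x0E3A: 104}

def ccc(ch):
    """Combining class of ch, with the local override applied."""
    return CCC_OVERRIDES.get(ord(ch), unicodedata.combining(ch))

def reorder_marks(text):
    """Stable-sort each run of combining marks by (modified) ccc."""
    chars = list(text)
    i = 0
    while i < len(chars):
        if ccc(chars[i]) == 0:
            i += 1
            continue
        j = i
        while j < len(chars) and ccc(chars[j]) != 0:
            j += 1
        # sorted() is stable, so marks with equal ccc keep their order.
        chars[i:j] = sorted(chars[i:j], key=ccc)
        i = j
    return "".join(chars)
```

For example, `reorder_marks("\u0e01\u0e3a\u0e38")` places U+0E38 before U+0E3A, and input that is already in that order is left unchanged.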
    • [Thai] Adjust SARA AM reordering to match Uniscribe · 4a7f4f3e
      Behdad Esfahbod committed
      Adjust the list of marks before SARA AM that get the reordering
      treatment.  Also adjust cluster formation to match Uniscribe.
      
      With Wikipedia test data, now I see:
      
        - For Thai, with the Angsana New font from Win7, I see 54 failures out
          of over 4M tests  (0.00129107%).  Of the 54, two are legitimate
          reordering issues (fix coming soon), and the other 52 are simply
          Uniscribe using a zero-width space char instead of an unknown
          character for missing glyphs.  No idea why.  The missing-glyph
          sequences include one that is a Thai character followed by an Arabic
          Sokun.  Someone confused it with Nikhahit I assume!
      
        - For Lao, with the Dokchampa font from Win7, 33 tests fail out of
          54k (0.0615167%).  All seem to be insignificant mark positioning
          with two marks on a base.  Have to investigate.
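The SARA AM reordering mentioned above can be sketched roughly as follows: SARA AM (U+0E33) is split into NIKHAHIT (U+0E4D) and SARA AA (U+0E32), and the NIKHAHIT is moved back over the marks that precede it. The set of marks used here (tone marks U+0E48 to U+0E4B) is an assumption. The commit adjusts that list but does not spell it out, and the function names are hypothetical.

```python
SARA_AM = "\u0e33"
NIKHAHIT = "\u0e4d"
SARA_AA = "\u0e32"

def is_reordered_mark(ch):
    # Assumption: only the four Thai tone marks are skipped over.
    # The list the commit settles on may be different.
    return "\u0e48" <= ch <= "\u0e4b"

def decompose_sara_am(text):
    """Replace SARA AM with NIKHAHIT + SARA AA, moving NIKHAHIT
    in front of any immediately preceding marks."""
    out = []
    for ch in text:
        if ch != SARA_AM:
            out.append(ch)
            continue
        k = len(out)
        while k > 0 and is_reordered_mark(out[k - 1]):
            k -= 1
        out.insert(k, NIKHAHIT)
        out.append(SARA_AA)
    return "".join(out)
```

Under these assumptions, the sequence U+0E14 U+0E4B U+0E33 becomes U+0E14 U+0E4D U+0E4B U+0E32. A NIKHAHIT that is already present in the input is left where it is.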
2. 23 Jul, 2012 7 commits
3. 21 Jul, 2012 17 commits
4. 20 Jul, 2012 12 commits