    json: Reject invalid UTF-8 sequences · e59f39d4
    Markus Armbruster committed
    We reject bytes that can't occur in valid UTF-8 (\xC0..\xC1,
    \xF5..\xFF) in the lexer.  That's insufficient; there's plenty of
    invalid UTF-8 not containing these bytes, as demonstrated by
    check-qjson:
    
    * Malformed sequences
    
      - Unexpected continuation bytes
    
      - Missing continuation bytes after start bytes other than
        \xC0..\xC1, \xF5..\xFD.
    
    * Overlong sequences with start bytes other than \xC0..\xC1,
      \xF5..\xFD.
    
    * Invalid code points
    
    Fixing this in the lexer would be bothersome.  Fixing it in the parser
    is straightforward, so do that.
    Signed-off-by: Markus Armbruster <armbru@redhat.com>
    Reviewed-by: Eric Blake <eblake@redhat.com>
    Message-Id: <20180823164025.12553-23-armbru@redhat.com>
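
The categories of invalid UTF-8 listed in the commit message can be illustrated with a small standalone checker. This is a minimal sketch, not the code from this commit: the names `utf8_status` and `utf8_check` are hypothetical, and treating only surrogates and values above U+10FFFF as invalid code points is a simplifying assumption (the actual parser may draw that line differently).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes, one per category named in the commit message. */
enum utf8_status {
    UTF8_OK,
    UTF8_UNEXPECTED_CONTINUATION,   /* \x80..\xBF where a start byte is due */
    UTF8_INVALID_START_BYTE,        /* \xC0..\xC1, \xF5..\xFF */
    UTF8_MISSING_CONTINUATION,      /* sequence cut short */
    UTF8_OVERLONG,                  /* more bytes than the value needs */
    UTF8_INVALID_CODE_POINT         /* surrogate or above U+10FFFF (assumption) */
};

/* Check the len bytes at s and report the first problem found. */
enum utf8_status utf8_check(const unsigned char *s, size_t len)
{
    /* Smallest code point that legitimately needs an n-byte sequence. */
    static const unsigned long min_cp[] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t i = 0;

    assert(s != NULL || len == 0);

    while (i < len) {
        unsigned char b = s[i++];
        unsigned long cp;
        int n, k;

        if (b < 0x80) {
            continue;                           /* plain ASCII */
        } else if (b < 0xC0) {
            return UTF8_UNEXPECTED_CONTINUATION;
        } else if (b < 0xC2 || b > 0xF4) {
            return UTF8_INVALID_START_BYTE;
        } else if (b < 0xE0) {
            n = 2; cp = b & 0x1F;
        } else if (b < 0xF0) {
            n = 3; cp = b & 0x0F;
        } else {
            n = 4; cp = b & 0x07;
        }

        /* Each remaining byte of the sequence must look like 10xxxxxx. */
        for (k = 1; k < n; k++) {
            if (i >= len || (s[i] & 0xC0) != 0x80) {
                return UTF8_MISSING_CONTINUATION;
            }
            cp = (cp << 6) | (s[i++] & 0x3F);
        }

        if (cp < min_cp[n]) {
            return UTF8_OVERLONG;
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return UTF8_INVALID_CODE_POINT;
        }
    }
    return UTF8_OK;
}
```

For example, the three bytes `\xE0\x80\x80` contain none of the bytes that the lexer rejects, yet `utf8_check` reports `UTF8_OVERLONG` for them. That is the kind of input a byte-level filter alone lets through.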
json-parser.c 14.8 KB