Unverified commit e52ab312, authored by Ashe Connor

URI.unescape handles mixed Unicode/escaped input

Previously, URI.unescape could handle Unicode input (without any actual
escaped characters), or input with escaped characters (but no actual
Unicode characters) - not both.

    URI.unescape("\xe3\x83\x90")  # => "バ"
    URI.unescape("%E3%83%90")  # => "バ"
    URI.unescape("\xe3\x83\x90%E3%83%90")  # =>
                                         # Encoding::CompatibilityError

We need to let `gsub` handle this for us, and then force back to the
original encoding of the input.  The result String will be mangled if
the percent-encoded characters don't conform to the encoding of the
String itself, but that goes without saying.

Signed-off-by: Ashe Connor <ashe@kivikakk.ee>
Parent e126078a
## Rails 6.0.0.alpha (Unreleased) ##
* Fix bug where `URI.unescape` would fail with mixed Unicode/escaped character input:
    URI.unescape("\xe3\x83\x90")  # => "バ"
    URI.unescape("%E3%83%90")  # => "バ"
    URI.unescape("\xe3\x83\x90%E3%83%90")  # => Encoding::CompatibilityError
GH#32183
*Ashe Connor*, *Aaron Patterson*
* Add `:private` option to ActiveSupport's `Module#delegate`
in order to delegate methods as private:
......
......@@ -13,7 +13,7 @@ def unescape(str, escaped = /%[a-fA-F\d]{2}/)
# YK: My initial experiments say yes, but let's be sure please
enc = str.encoding
enc = Encoding::UTF_8 if enc == Encoding::US_ASCII
-str.gsub(escaped) { |match| [match[1, 2].hex].pack("C") }.force_encoding(enc)
+str.dup.force_encoding(Encoding::ASCII_8BIT).gsub(escaped) { |match| [match[1, 2].hex].pack("C") }.force_encoding(enc)
end
end
end
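
The change above can be tried outside Rails with a minimal sketch like the one below. The helper name `unescape_mixed` and the constant `ESCAPED` are hypothetical and used only for illustration; the body mirrors the patched line. The idea is that the input is treated as raw bytes while `gsub` substitutes the percent-escapes, so literal multibyte characters and decoded bytes never meet under incompatible encodings, and the original encoding is restored at the end.

```ruby
# Hypothetical standalone helper mirroring the patched line; not the Rails API itself.
ESCAPED = /%[a-fA-F\d]{2}/

def unescape_mixed(str, escaped = ESCAPED)
  enc = str.encoding
  enc = Encoding::UTF_8 if enc == Encoding::US_ASCII
  # Work on a binary copy so that literal multibyte characters and the bytes
  # decoded from %XX escapes are all plain bytes while gsub builds the result.
  str.dup.force_encoding(Encoding::ASCII_8BIT)
     .gsub(escaped) { |match| [match[1, 2].hex].pack("C") }
     .force_encoding(enc)
end

mixed = "\xe3\x83\x90%E3%83%90" # one literal character followed by its escaped form

# The pre-patch form: per the commit message, this raises on mixed input.
begin
  mixed.gsub(ESCAPED) { |match| [match[1, 2].hex].pack("C") }
rescue Encoding::CompatibilityError => e
  puts "old form failed: #{e.class}"
end

puts unescape_mixed(mixed)
puts unescape_mixed(mixed).encoding
```

As the commit message notes, the final `force_encoding` only relabels the bytes, so escapes that do not form valid sequences in the input's encoding produce a string for which `valid_encoding?` is false.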
......
......@@ -9,6 +9,6 @@ def test_uri_decode_handle_multibyte
str = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E" # Ni-ho-nn-go in UTF-8, means Japanese.
parser = URI.parser
assert_equal str, parser.unescape(parser.escape(str))
assert_equal str + str, parser.unescape(str + parser.escape(str).encode(Encoding::UTF_8))
end
end