提交 124f88ea 编写于 作者: G Gareth Rees 提交者: Gareth Rees

Deal with regex match groups in excerpt

Original implementation has bugs if the regex contains a match group.

Example:

    excerpt('This is a beautiful? morning', /\b(beau\w*)\b/i, :radius => 5)
    Expected: "...is a beautiful? mor..."
    Actual: "...is a beautifulbeaut..."

The original phrase was being converted to a regex and returning the text
either side of the phrase as expected:

    'This is a beautiful? morning'.split(/beautiful/i, 2)
    # => ["This is a ", "? morning"]

When we have a match with groups the match is returned in the array.

Quoting the ruby docs: "If pattern is a Regexp, str is divided where the
pattern matches. [...] If pattern contains groups, the respective matches will
be returned in the array as well."

    'This is a beautiful? morning'.split(/\b(beau\w*)\b/iu, 2)
    # => ["This is a ", "beautiful", "? morning"]

If we assume we want to split on the first match – this fix makes that
assumption – we can pass the already assigned `phrase` variable as the place
to split (because we already know that a match exists from line 168).

Originally spotted by Louise Crow (@crowbot) at
https://github.com/mysociety/alaveteli/pull/1557
上级 a1bd00d5
......@@ -188,7 +188,7 @@ def excerpt(text, phrase, options = {})
end
end
first_part, second_part = text.split(regex, 2)
first_part, second_part = text.split(phrase, 2)
prefix, first_part = cut_excerpt_part(:first, first_part, separator, options)
postfix, second_part = cut_excerpt_part(:second, second_part, separator, options)
......
......@@ -280,8 +280,13 @@ def test_excerpt
end
def test_excerpt_with_regex
assert_equal('...is a beautiful! mor...', excerpt('This is a beautiful! morning', 'beautiful', :radius => 5))
assert_equal('...is a beautiful? mor...', excerpt('This is a beautiful? morning', 'beautiful', :radius => 5))
assert_equal('...is a beautiful? mor...', excerpt('This is a beautiful? morning', /\bbeau\w*\b/i, :radius => 5))
assert_equal('...is a beautiful? mor...', excerpt('This is a beautiful? morning', /\b(beau\w*)\b/i, :radius => 5))
assert_equal("...udge Allen and...", excerpt("This day was challenging for judge Allen and his colleagues.", /\ballen\b/i, :radius => 5))
assert_equal("...judge Allen and...", excerpt("This day was challenging for judge Allen and his colleagues.", /\ballen\b/i, :radius => 1, :separator => ' '))
assert_equal("...was challenging for...", excerpt("This day was challenging for judge Allen and his colleagues.", /\b(\w*allen\w*)\b/i, :radius => 5))
end
def test_excerpt_should_not_be_html_safe
......@@ -305,11 +310,6 @@ def test_excerpt_in_borderline_cases
assert_equal("...abc...", excerpt("z abc d", "b", :radius => 1))
end
def test_excerpt_with_regex
assert_equal('...is a beautiful! mor...', excerpt('This is a beautiful! morning', 'beautiful', :radius => 5))
assert_equal('...is a beautiful? mor...', excerpt('This is a beautiful? morning', 'beautiful', :radius => 5))
end
def test_excerpt_with_omission
assert_equal("[...]is a beautiful morn[...]", excerpt("This is a beautiful morning", "beautiful", :omission => "[...]",:radius => 5))
assert_equal(
......
Markdown is supported
0% .
You are about to add 0 people to the discussion. Proceed with caution.
先完成此消息的编辑!
想要评论请 注册