    Fix potential infinite loop in regular expression execution. · f2c4ffc3
    Tom Lane authored
    In cfindloop(), if the initial call to shortest() reports that a
    zero-length match is possible at the current search start point, but no
    actual match can then be constructed there, the code just loops around
    with the same start point, and thus makes no progress.  We need to force the
    start point to be advanced.  This is safe because the loop over "begin"
    points has already tried and failed to match starting at "close", so there
    is surely no need to try that again.
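
The progress guarantee described above can be illustrated with a toy search loop. This is only a sketch under stated assumptions, not the code in regexec.c: `toy_prefilter()` stands in for shortest() (it optimistically reports a possible zero-length match at every position), `toy_confirm()` stands in for the precise matching step that can still fail, and all names and the `x`-only matching rule are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for an optimistic prefilter such as shortest():
 * it claims a zero-length match is possible at any position up to the
 * end of the string, and returns -1 past the end. */
static int toy_prefilter(const char *s, int from)
{
    return from <= (int) strlen(s) ? from : -1;
}

/* Hypothetical stand-in for the precise check that may reject what the
 * prefilter accepted.  Here it only succeeds on the character 'x'. */
static int toy_confirm(const char *s, int pos)
{
    return s[pos] == 'x';
}

/* Returns the match position, or -1 if there is none.
 * *iterations receives the number of passes through the loop. */
int toy_search(const char *s, int *iterations)
{
    int len;
    int close = 0;

    assert(s != NULL && iterations != NULL);
    len = (int) strlen(s);
    *iterations = 0;

    do
    {
        (*iterations)++;
        close = toy_prefilter(s, close);
        if (close < 0)
            break;              /* prefilter says no match is possible */
        if (toy_confirm(s, close))
            return close;       /* precise check agreed: real match */

        /* The precise check already failed at "close", so retrying
         * there cannot help.  Without this increment the loop would
         * revisit the same start point forever. */
        close++;
    } while (close <= len);

    return -1;
}
```

Removing the `close++` line makes `toy_search("abc", &n)` spin forever, because the prefilter keeps returning the same position that the precise check keeps rejecting.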
    
    This bug was introduced in commit e2bd9049,
    wherein we allowed continued searching after we'd run out of match
    possibilities, but evidently failed to think hard enough about exactly
    where we needed to search next.
    
    Because of the way this code works, such a match failure is only possible
    in the presence of backrefs --- otherwise, shortest()'s judgment that a
    match is possible should always be correct.  That probably explains why
    the bug has escaped detection for several years.
    
    The actual fix is a one-liner, but I took the trouble to add/improve some
    comments related to the loop logic.
    
    After fixing that, the submitted test case "()*\1" didn't loop anymore.
    But it reported failure, though it seems like it ought to match a
    zero-length string; both Tcl and Perl think it does.  That seems to be from
    overenthusiastic optimization on my part when I rewrote the iteration match
    logic in commit 173e29aa: we can't just
    "declare victory" for a zero-length match without bothering to set match
    data for capturing parens inside the iterator node.
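
Why the capture data matters can be shown with a small model of "()*\1". This is a hedged sketch, not the regex engine's data structures: `ToySpan`, `toy_empty_iteration()` and `toy_backref_matches()` are hypothetical names, and the model assumes that a back-reference to a group that was never set fails to match.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical capture record: start == -1 means "group never set". */
typedef struct
{
    int start;
    int end;
} ToySpan;

/* Model of matching the iteration node as a zero-length string at pos.
 * If record_capture is zero, this mimics the over-eager shortcut that
 * reports success without touching the capture data. */
static int toy_empty_iteration(int pos, ToySpan *cap, int record_capture)
{
    assert(cap != NULL);
    if (record_capture)
    {
        cap->start = pos;       /* the group matched the empty string here */
        cap->end = pos;
    }
    return 1;                   /* a zero-length match always succeeds */
}

/* Model of a back-reference: assumed to fail on an unset group,
 * otherwise compares the captured text with the text at pos. */
static int toy_backref_matches(const char *s, int pos, ToySpan cap)
{
    if (cap.start < 0)
        return 0;
    return strncmp(s + pos, s + cap.start,
                   (size_t) (cap.end - cap.start)) == 0;
}

/* Model of matching "()*\1" at the start of s. */
int toy_match(const char *s, int record_capture)
{
    ToySpan cap = {-1, -1};

    if (!toy_empty_iteration(0, &cap, record_capture))
        return 0;
    return toy_backref_matches(s, 0, cap);
}
```

In this model `toy_match("abc", 1)` succeeds with a zero-length match, while `toy_match("abc", 0)` fails because the back-reference sees an unset group.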
    
    Per fuzz testing by Greg Stark.  The first part of this is a bug in all
    supported branches, and the second part is a bug since 9.2 where the
    iteration rewrite happened.
regexec.c 34 KB