    Prevent regexp back-refs from sometimes matching when they shouldn't. · 779557bd
    Tom Lane authored
    The recursion in cdissect() was careless about clearing match data
    for capturing parentheses after rejecting a partial match.  This
    could allow a later back-reference to succeed when by rights it
    should fail for lack of a defined referent.
    
    To fix, think a little more rigorously about what the contract
    between different levels of cdissect's recursion needs to be.
    With the right spec, we can fix this using fewer rather than more
    resets of the match data; the key decision being that a failed
    sub-match is now explicitly responsible for clearing any matches
    it may have set.
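
The contract described above can be illustrated with a toy backtracking matcher. This is only a sketch under stated assumptions, not the cdissect() code: the tuple-based pattern format, the `match` and `fullmatch` helpers, and the `tidy` flag are all hypothetical names invented for the illustration. With `tidy=True`, a capture that ends up on a failed path clears the match data it set; with `tidy=False`, the stale capture leaks and a later back-reference wrongly succeeds.

```python
# Toy pattern nodes (hypothetical format, for illustration only):
#   ("lit", s)       literal string
#   ("cap", n, sub)  capturing group number n around sub-pattern sub
#   ("ref", n)       back-reference to group n
#   ("seq", [..])    concatenation
#   ("alt", [..])    alternation

def match(node, text, pos, caps, k, tidy=True):
    """Match node at pos, then call continuation k(end_pos).

    Contract when tidy is True: if this returns False, every capture
    set during the attempt has been cleared again by whoever set it.
    """
    kind = node[0]
    if kind == "lit":
        return text.startswith(node[1], pos) and k(pos + len(node[1]))
    if kind == "ref":
        span = caps.get(node[1])
        if span is None:
            return False  # no defined referent: the back-ref must fail
        s = text[span[0]:span[1]]
        return text.startswith(s, pos) and k(pos + len(s))
    if kind == "cap":
        n, sub = node[1], node[2]

        def after(end):
            caps[n] = (pos, end)
            if k(end):
                return True
            if tidy:
                caps.pop(n, None)  # failed sub-match clears what it set
            return False

        return match(sub, text, pos, caps, after, tidy)
    if kind == "seq":
        items = node[1]

        def step(i, p):
            if i == len(items):
                return k(p)
            return match(items[i], text, p, caps,
                         lambda e: step(i + 1, e), tidy)

        return step(0, pos)
    if kind == "alt":
        return any(match(b, text, pos, caps, k, tidy) for b in node[1])
    raise ValueError(f"unknown node kind: {kind}")


def fullmatch(node, text, tidy=True):
    """Return (matched, captures) for a whole-string match attempt."""
    caps = {}
    ok = match(node, text, 0, caps, lambda end: end == len(text), tidy)
    return ok, caps


# Roughly (?:(a)c|a)\1 : group 1 is set only inside the first branch.
PATTERN = ("seq", [
    ("alt", [("seq", [("cap", 1, ("lit", "a")), ("lit", "c")]),
             ("lit", "a")]),
    ("ref", 1),
])

leaky = fullmatch(PATTERN, "aa", tidy=False)   # stale capture survives
strict = fullmatch(PATTERN, "aa", tidy=True)   # capture cleared on failure
```

On the input "aa", the first branch sets group 1 and then fails on the missing "c". Without the cleanup, the second branch matches and the back-reference is satisfied by the leftover capture; with the cleanup, the back-reference has no referent and the overall match fails, which is the behavior the commit message argues for.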
    
    There are enough other cross-checks and optimizations in the code
    that it's not especially easy to exhibit this problem; usually, the
    match will fail as expected.  Plus, regexps that are even potentially
    vulnerable are most likely user errors, since there's just not much
    point in writing a back-ref that doesn't always have a referent.
    These facts perhaps explain why the issue hasn't been detected,
    even though it's almost certainly a couple of decades old.
    
    Discussion: https://postgr.es/m/151435.1629733387@sss.pgh.pa.us
regex.sql 5.73 KB