• Tom Lane's avatar
    Further fix pg_trgm's extraction of trigrams from regular expressions. · 1dffabed
    Tom Lane authored
    Commit 9e43e871 turns out to have been insufficient: not only is it
    necessary to track tentative parent links while considering a set of
    arc removals, but it's necessary to track tentative flag additions
    as well.  This is because we always merge arc target states into
    arc source states; therefore, when considering a merge of the final
    state with some other, it is the other state that will acquire a new
    TSTATE_FIN bit.  If there's another arc for the same color trigram
    that would cause merging of that state with the initial state, we
    failed to recognize the problem.  The test cases for the prior commit
    evidently only exercised situations where a tentative merge with the
    initial state occurs before one with the final state.  If it goes the
    other way around, we'll happily merge the initial and final states,
    either producing a broken final graph that would never match anything,
    or triggering the Assert added by the prior commit.
    
    It's tempting to consider switching the merge direction when the merge
    involves the final state, but I lack the time to analyze that idea in
    detail.  Instead just keep track of the flag changes that would result
    from proposed merges, in the same way that the prior commit tracked
    proposed parent links.
    
    Along the way, add some more debugging support, because I'm not entirely
    confident that this is the last bug here.  And tweak matters so that
    the transformed.dot file uses small integers rather than pointer values
    to identify states; that makes it more readable if you're just eyeballing
    it rather than fooling with Graphviz.  And rename a couple of identically
    named struct fields to reduce confusion.
    
    Per report from Corey Csuhta.  Add a test case based on his example.
    (Note: this case does not trigger the bug under 9.3, apparently because
    its different measurement of costs causes it to stop merging states before
    it hits the failure.  I spent some time trying to find a variant that would
    fail in 9.3, without success; but I'm sure such cases exist.)
    
    Like the previous patch, back-patch to 9.3 where this code was added.
    
    Report: https://postgr.es/m/E2B01A4B-4530-406B-8D17-2F67CF9A16BA@csuhta.com
    1dffabed
pg_trgm.out 90.6 KB