• Tom Lane's avatar
    Fix nasty performance problem in tsquery_rewrite(). · 5ec81ace
    Tom Lane authored
    tsquery_rewrite() tries to find matches to subsets of AND/OR conditions;
    for example, in the query 'a | b | c' the substitution subquery 'a | c'
    should match and lead to replacement of the first and third items.
    That's fine, but the matching algorithm apparently takes about O(2^N)
    for an N-clause query (I say "apparently" because the code is also both
    unintelligible and uncommented).  We could probably do better than that
    even without any extra assumptions --- but actually, we know that the
    subclauses are sorted, indeed are depending on that elsewhere in this very
    same function.  So we can just scan the two lists a single time to detect
    matches, as though we were doing a merge join.
    
    Also do a re-flattening call (QTNTernary()) in tsquery_rewrite_query, just
    to make sure that the tree fits the expectations of the next search cycle.
    I didn't try to devise a test case for this, but I'm pretty sure that the
    oversight could have led to failure to match in some cases where a match
    would be expected.
    
    Improve comments, and also stick a CHECK_FOR_INTERRUPTS into
    dofindsubquery, just in case it's still too slow for somebody.
    
    Per report from Andreas Seltenreich.  Back-patch to all supported branches.
    
    Discussion: <8760oasf2y.fsf@credativ.de>
    5ec81ace
tsquery_rewrite.c 10.1 KB