• Tom Lane's avatar
    Invent "rainbow" arcs within the regex engine. · 08c0d6ad
    Tom Lane authored
    Some regular expression constructs, most notably the "." match-anything
    metacharacter, produce a sheaf of parallel NFA arcs covering all
    possible colors (that is, character equivalence classes).  We can make
    a noticeable improvement in the space and time needed to process large
    regexes by replacing such cases with a single arc bearing the special
    color code "RAINBOW".  This requires only minor additional complication
    in places such as pull() and push().
    
    Callers of pg_reg_getoutarcs() must now be prepared for the possibility
    of seeing a RAINBOW arc.  For the one known user, contrib/pg_trgm,
    that's a net benefit since it cuts the number of arcs to be dealt with,
    and the handling isn't any different than for other colors that contain
    too many characters to be dealt with individually.
    
    This is part of a patch series that in total reduces the regex engine's
    runtime by about a factor of four on a large corpus of real-world regexes.
    
    Patch by me, reviewed by Joel Jacobson
    
    Discussion: https://postgr.es/m/1340281.1613018383@sss.pgh.pa.us
    08c0d6ad
rege_dfa.c 23 KB