• Micro-optimize GenericXLogFinish(). · 68689c66
    Tom Lane authored
    Make the inner comparison loops of computeDelta() as tight as possible by
    pulling considerations of valid and invalid ranges out of the inner loops,
    and extending a match or non-match detection as far as possible before
    deciding what to do next.  To keep this tractable, give up the possibility
    of merging fragments across the pd_lower to pd_upper gap.  The fraction of
    pages where that could happen (i.e., there are 4 or fewer bytes in the gap,
    *and* data changes immediately adjacent to it on both sides) is too small
    to be worth spending cycles on.
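
    The scanning strategy described above can be sketched roughly as follows.
    This is not the code from generic_xlog.c: diff_region, Fragment and the
    value of MATCH_THRESHOLD are assumptions made for illustration. Each inner
    loop does a single byte comparison per step, and a run of matching bytes is
    extended as far as possible before deciding whether it is long enough to
    end the current fragment.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed: a matching run this short is cheaper to log than a new fragment. */
#define MATCH_THRESHOLD 4

/* Hypothetical record of one changed byte range. */
typedef struct
{
    size_t offset;
    size_t length;
} Fragment;

/*
 * Compare cur[start..end) with target[start..end) and record the differing
 * ranges in frags[].  Fragments separated by at most MATCH_THRESHOLD matching
 * bytes are merged.  Returns the number of fragments written.
 */
static size_t
diff_region(const unsigned char *cur, const unsigned char *target,
            size_t start, size_t end, Fragment *frags, size_t maxfrags)
{
    size_t nfrags = 0;
    size_t i = start;

    assert(start <= end);

    while (i < end)
    {
        size_t fragstart;
        size_t fragend;

        /* Tight loop: skip bytes that already match. */
        while (i < end && cur[i] == target[i])
            i++;
        if (i >= end)
            break;

        fragstart = i;
        for (;;)
        {
            size_t j;

            /* Tight loop: extend the mismatch as far as it goes. */
            while (i < end && cur[i] != target[i])
                i++;
            fragend = i;

            /* Tight loop: extend the following match as far as it goes. */
            j = i;
            while (j < end && cur[j] == target[j])
                j++;

            /* A long match, or the end of the region, closes the fragment. */
            if (j >= end || j - i > MATCH_THRESHOLD)
            {
                i = j;
                break;
            }
            /* Short match followed by more changes: merge and keep going. */
            i = j;
        }

        if (nfrags < maxfrags)
        {
            frags[nfrags].offset = fragstart;
            frags[nfrags].length = fragend - fragstart;
            nfrags++;
        }
    }
    return nfrags;
}
```

    Under this sketch, the regions below pd_lower and above pd_upper would each
    get their own diff_region() call, so fragments are never merged across the
    gap between them.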
    
    Also, avoid two BLCKSZ-length memcpy()s by computing the delta before
    moving data into the target buffer, instead of after.  This doesn't save
    nearly as many cycles as tightening computeDelta(), but it still
    seems worth doing.
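
    The reordering can be illustrated with a minimal sketch. The names
    apply_image and count_changed, and the PAGE_SIZE value, are hypothetical
    stand-ins rather than the real functions: the point is only that when the
    delta is computed while the target buffer still holds the old contents,
    no saved copy of the old page is needed and one copy into the target
    suffices.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for BLCKSZ; the value is an assumption for this sketch. */
#define PAGE_SIZE 8192

/* Hypothetical stand-in for the delta computation: counts differing bytes. */
static size_t
count_changed(const unsigned char *old_page, const unsigned char *new_page)
{
    size_t changed = 0;
    size_t i;

    for (i = 0; i < PAGE_SIZE; i++)
    {
        if (old_page[i] != new_page[i])
            changed++;
    }
    return changed;
}

/*
 * Compute the delta first, while target still holds the old page, and only
 * then overwrite target with the new image.
 */
static size_t
apply_image(unsigned char *target, const unsigned char *image)
{
    size_t changed;

    assert(target != image);    /* memcpy requires distinct buffers */

    changed = count_changed(target, image);
    memcpy(target, image, PAGE_SIZE);
    return changed;
}
```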
    
    On my machine, this patch cuts a full 40% off the runtime of
    contrib/bloom's regression test.
generic_xlog.c 13.8 KB