    Improve FSM management for BRIN indexes. · 1383e2a1
    Tom Lane authored
    BRIN indexes like to propagate additions of free space into the upper pages
    of their free space maps as soon as the new space is known, even when it's
    just on one individual index page.  Previously this required calling
    FreeSpaceMapVacuum, which is quite an expensive thing if the map is large.
    Use the FreeSpaceMapVacuumRange function recently added by commit c79f6df7
    to reduce the amount of work done for this purpose.
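
To illustrate why a range-limited vacuum is cheaper, here is a toy model, not PostgreSQL code. It assumes the map is a complete binary max-tree held in an array, which is a simplification of the real multi-level page structure. Every name here (ToyFsm, toy_record, toy_vacuum_all, toy_vacuum_range) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LEAVES 1024			/* assumed size; must be a power of two */

typedef struct ToyFsm
{
	/* node[1] is the root; leaves occupy node[TOY_LEAVES .. 2*TOY_LEAVES-1] */
	unsigned char node[2 * TOY_LEAVES];
} ToyFsm;

static unsigned char
toy_max(unsigned char a, unsigned char b)
{
	return a > b ? a : b;
}

/* Record free space for one leaf only; upper nodes are left stale. */
static void
toy_record(ToyFsm *fsm, size_t leaf, unsigned char cat)
{
	assert(leaf < TOY_LEAVES);
	fsm->node[TOY_LEAVES + leaf] = cat;
}

/* Recompute every internal node, bottom-up; returns nodes recomputed. */
static size_t
toy_vacuum_all(ToyFsm *fsm)
{
	size_t		visited = 0;

	for (size_t i = TOY_LEAVES - 1; i >= 1; i--)
	{
		fsm->node[i] = toy_max(fsm->node[2 * i], fsm->node[2 * i + 1]);
		visited++;
	}
	return visited;
}

/* Recompute only the ancestors of leaves in [start, end). */
static size_t
toy_vacuum_range(ToyFsm *fsm, size_t start, size_t end)
{
	size_t		lo;
	size_t		hi;
	size_t		visited = 0;

	assert(start < end && end <= TOY_LEAVES);
	lo = TOY_LEAVES + start;
	hi = TOY_LEAVES + end - 1;

	/* Walk up one level at a time; lo and hi stay on the same level. */
	while (lo > 1)
	{
		lo /= 2;
		hi /= 2;
		for (size_t i = lo; i <= hi; i++)
		{
			fsm->node[i] = toy_max(fsm->node[2 * i], fsm->node[2 * i + 1]);
			visited++;
		}
	}
	return visited;
}
```

In this model, propagating one leaf's new value to the root touches 10 nodes, while a full pass touches all 1023 internal nodes.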
    
    Fix a couple of places that neglected to do the upper-page vacuuming at all
    after recording new free space.  If the policy is to be that BRIN should do
    that, it should do it everywhere.
    
    Do RecordPageWithFreeSpace unconditionally in brin_page_cleanup, and do
    FreeSpaceMapVacuum unconditionally in brin_vacuum_scan.  Because of the
    FSM's imprecise storage of free space, the old complications here seldom
    bought anything; they just slowed things down.  This approach also
    provides a predictable path for FSM corruption to be repaired.
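
A rough sketch of what "imprecise storage" means in practice. It assumes an 8 kB block and one byte of free-space category per page, giving 32-byte steps, and it omits the special handling of the largest category. The names are hypothetical, and the "only update if actual exceeds recorded" check is an assumed example of such a complication, not a quote of the old code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BLCKSZ		8192	/* assumed default block size */
#define TOY_CATEGORIES	256		/* one byte of category per page */
#define TOY_CAT_STEP	(TOY_BLCKSZ / TOY_CATEGORIES)	/* 32 bytes */

/* Quantize a byte count into a one-byte category (rounds down). */
static unsigned char
toy_space_to_cat(size_t avail)
{
	assert(avail < TOY_BLCKSZ);
	return (unsigned char) (avail / TOY_CAT_STEP);
}

/* What the map would report back after storing "avail". */
static size_t
toy_recorded_space(size_t avail)
{
	return (size_t) toy_space_to_cat(avail) * TOY_CAT_STEP;
}

/*
 * A conditional update of the form "record only if the page has more
 * space than the map says".  Because the stored value is rounded down,
 * this is true for nearly every page even when nothing has changed.
 */
static int
toy_check_says_update(size_t actual)
{
	return actual > toy_recorded_space(actual);
}
```

Under these assumptions the check fires for any free-space value that is not an exact multiple of 32, so it filters out very little work.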
    
    Remove premature RecordPageWithFreeSpace call in brin_getinsertbuffer
    where it's about to return an extended page to the caller.  The caller
    should do that, instead, after it's inserted its new tuple.  Fix the
    one caller that forgot to do so.
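
The ordering problem can be shown with a small stand-alone model; the types and functions below are hypothetical and only mimic the idea of recording a page's free space in a separate map entry.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyPage
{
	size_t		free_bytes;		/* space actually left on the page */
} ToyPage;

typedef struct ToyFsmEntry
{
	size_t		recorded_free;	/* what the map believes */
} ToyFsmEntry;

/* Consume space on the page for a new tuple. */
static void
toy_insert_tuple(ToyPage *page, size_t tuple_size)
{
	assert(tuple_size <= page->free_bytes);
	page->free_bytes -= tuple_size;
}

/* Copy the page's current free space into the map entry. */
static void
toy_record_free_space(ToyFsmEntry *entry, const ToyPage *page)
{
	entry->recorded_free = page->free_bytes;
}
```

If toy_record_free_space runs before toy_insert_tuple, the entry overstates the page's free space by the size of the tuple; recording after the insert keeps the two in agreement.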
    
    Simplify logic in brin_doupdate's same-page-update case by postponing
    brin_initialize_empty_new_buffer to after the critical section; I see
    little point in doing it before.
    
    Avoid repeat calls of RelationGetNumberOfBlocks in brin_vacuum_scan.
    Avoid duplicate BufferGetBlockNumber and BufferGetPage calls in
    a couple of places where we already had the right values.
    
    Move a BRIN_elog debug logging call out of a critical section; that's
    pretty unsafe and I don't think it buys us anything to not wait till
    after the critical section.
    
    Move the "*extended = false" step in brin_getinsertbuffer into the
    routine's main loop.  There's no actual bug there, since the loop can't
    iterate with *extended still true, but it doesn't seem very future-proof
    as coded; and it's certainly not documented as a loop invariant.
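
The pattern of resetting an output flag at the top of each loop iteration can be sketched as follows. This is a hypothetical stand-in, not the real routine: toy_get_insert_page and its parameters are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return the index of the first page with at least "needed" free bytes,
 * or npages (a notional brand-new page) if none fits.  *extended tells
 * the caller whether the returned page is new.
 */
static int
toy_get_insert_page(const int *free_bytes, int npages, int needed,
					bool *extended)
{
	int			candidate = 0;

	for (;;)
	{
		/*
		 * Reset on every iteration, so the flag always describes the page
		 * chosen in this attempt, whatever earlier iterations did.
		 */
		*extended = false;

		if (candidate >= npages)
		{
			*extended = true;
			return npages;
		}
		if (free_bytes[candidate] >= needed)
			return candidate;
		candidate++;
	}
}
```

As in the case described above, this toy loop never iterates after setting the flag to true, so the reset inside the loop is defensive rather than a bug fix; it keeps the invariant local to the loop body.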
    
    This is all from follow-on investigation inspired by commit c79f6df7.
    
    Discussion: https://postgr.es/m/5801.1522429460@sss.pgh.pa.us