• Peter Geoghegan's avatar
    Consider secondary factors during nbtree splits. · fab25024
    Peter Geoghegan authored
    Teach nbtree to give some consideration to how "distinguishing"
    candidate leaf page split points are.  This should not noticeably affect
    the balance of free space within each half of the split, while still
    making suffix truncation truncate away significantly more attributes on
    average.
    
    The logic for choosing a leaf split point now uses a fallback mode in
    the case where the page is full of duplicates and it isn't possible to
    find even a minimally distinguishing split point.  When the page is full
    of duplicates, the split should pack the left half very tightly, while
    leaving the right half mostly empty.  Our assumption is that logical
    duplicates will almost always be inserted in ascending heap TID order
    with v4 indexes.  This strategy leaves most of the free space on the
    half of the split that will likely be where future logical duplicates of
    the same value need to be placed.
    
    The number of cycles added is not very noticeable.  This is important
    because deciding on a split point takes place while at least one
    exclusive buffer lock is held.  We avoid using authoritative insertion
    scankey comparisons to save cycles, unlike suffix truncation proper.  We
    use a faster binary comparison instead.
    
    Note that even pg_upgrade'd v3 indexes make use of these optimizations.
    Benchmarking has shown that even v3 indexes benefit, despite the fact
    that suffix truncation will only truncate non-key attributes in INCLUDE
    indexes.  Grouping relatively similar tuples together is beneficial in
    and of itself, since it reduces the number of leaf pages that must be
    accessed by subsequent index scans.
    
    Author: Peter Geoghegan
    Reviewed-By: Heikki Linnakangas
    Discussion: https://postgr.es/m/CAH2-WzmmoLNQOj9mAD78iQHfWLJDszHEDrAzGTUMG3mVh5xWPw@mail.gmail.com
    fab25024
nbtree.h 32.9 KB