    Improve hash index bucket split behavior. · 6d46f478
    Robert Haas authored
    Previously, the right to split a bucket was represented by a
    heavyweight lock on the page number of the primary bucket page.
    Unfortunately, this meant that every scan needed to take a heavyweight
    lock on that bucket also, which was bad for concurrency.  Instead, use
    a cleanup lock on the primary bucket page to indicate the right to
    begin a split, so that scans only need to retain a pin on that page,
    which they would have to acquire anyway, and which is also much
    cheaper.
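
To make the pin-versus-cleanup-lock distinction concrete, here is a minimal toy model in C. It is a sketch only: `ToyBuffer` and the `toy_*` functions are hypothetical names invented for illustration and are not the structures or functions used in hash.c. The one property it models is that a cleanup lock can be taken only when the caller holds the sole pin on the page, so a scan that merely keeps its pin is enough to hold off the start of a split.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a shared buffer; hypothetical, not PostgreSQL's type. */
typedef struct ToyBuffer {
    int  pin_count;         /* how many backends currently pin the page */
    bool exclusive_locked;  /* true while the content lock is held exclusively */
} ToyBuffer;

/* A scan only needs to pin the primary bucket page. */
static void toy_pin(ToyBuffer *buf)
{
    buf->pin_count++;
}

static void toy_unpin(ToyBuffer *buf)
{
    assert(buf->pin_count > 0);
    buf->pin_count--;
}

/*
 * Try to take a cleanup lock.  The caller must already hold its own pin.
 * Succeeds only if nobody else has the page pinned or locked, i.e. the
 * caller's pin is the only one.
 */
static bool toy_try_cleanup_lock(ToyBuffer *buf)
{
    if (buf->exclusive_locked || buf->pin_count != 1)
        return false;
    buf->exclusive_locked = true;
    return true;
}

static void toy_unlock(ToyBuffer *buf)
{
    buf->exclusive_locked = false;
}
```

In this model, a would-be splitter that calls `toy_try_cleanup_lock` while a scan still holds a pin is refused, and succeeds once the scan unpins.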
    
    In addition to reducing the locking cost, this also avoids locking out
    scans and inserts for the entire lifetime of the split: while the new
    bucket is being populated with copies of the appropriate tuples from
    the old bucket, scans and inserts can happen in parallel.  There are
    minor concurrency improvements for vacuum operations as well, though
    the situation there is still far from ideal.
    
    This patch also removes the unworldly assumption that a split will
    never be interrupted.  With the new code, a split is done in a series
    of small steps and the system can pick up where it left off if it is
    interrupted prior to completion.  While this patch does not itself add
    write-ahead logging for hash indexes, it is clearly a necessary first
    step, since one of the things that could interrupt a split is the
    removal of electrical power from the machine performing it.
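
The idea of a split that proceeds in small steps and can be resumed might be sketched as follows. This is an illustrative toy, not the patch's code: `ToyBucket`, `toy_split_step`, the odd/even rule for choosing which tuples move, and the assumption that keys are unique are all made up for the example. What it shows is that copying is idempotent (tuples already present in the new bucket are skipped) and that the in-progress flag stays set until the final cleanup of the old bucket.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TUPLES 8

/* Hypothetical bucket: a small array of unique keys plus a split flag. */
typedef struct ToyBucket {
    unsigned keys[TOY_MAX_TUPLES];
    size_t   ntuples;
    bool     split_in_progress;
} ToyBucket;

static bool toy_contains(const ToyBucket *b, unsigned key)
{
    for (size_t i = 0; i < b->ntuples; i++)
        if (b->keys[i] == key)
            return true;
    return false;
}

/*
 * Copy at most max_steps tuples that belong in the new bucket (odd keys,
 * by assumption).  Returns true once the split is complete.  Returning
 * false models an interruption; calling again resumes where it left off.
 */
static bool toy_split_step(ToyBucket *old_b, ToyBucket *new_b, size_t max_steps)
{
    size_t done = 0;
    size_t kept = 0;

    assert(old_b != NULL && new_b != NULL);
    old_b->split_in_progress = true;

    for (size_t i = 0; i < old_b->ntuples; i++) {
        unsigned key = old_b->keys[i];

        if ((key & 1u) == 0 || toy_contains(new_b, key))
            continue;           /* stays put, or was copied earlier */
        if (done == max_steps || new_b->ntuples == TOY_MAX_TUPLES)
            return false;       /* interrupted; flag remains set */
        new_b->keys[new_b->ntuples++] = key;
        done++;
    }

    /* Everything is copied: drop moved tuples from the old bucket. */
    for (size_t i = 0; i < old_b->ntuples; i++)
        if ((old_b->keys[i] & 1u) == 0)
            old_b->keys[kept++] = old_b->keys[i];
    old_b->ntuples = kept;
    old_b->split_in_progress = false;
    return true;
}
```

Until the final step, the old bucket still holds every tuple, which is why readers of the old bucket are not disturbed while the copy is under way in this toy.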
    
    Amit Kapila.  I wrote the original design on which this patch is
    based, and did a good bit of work on the comments and README through
    multiple rounds of review, but all of the code is Amit's.  Also
    reviewed by Jesper Pedersen, Jeff Janes, and others.
    
    Discussion: http://postgr.es/m/CAA4eK1LfzcZYxLoXS874Ad0+S-ZM60U9bwcyiUZx9mHZ-KCWhw@mail.gmail.com
hash.c 23.5 KB