    Rethink checkpointer's fsync-request table representation. · be86e3dd
    Tom Lane authored
    Instead of having one hash table entry per relation/fork/segment, just have
    one per relation, and use bitmapsets to represent which specific segments
    need to be fsync'd.  This eliminates the need to scan the whole hash table
    to implement FORGET_RELATION_FSYNC, which fixes the O(N^2) behavior
    recently demonstrated by Jeff Janes for cases involving lots of TRUNCATE or
    DROP TABLE operations during a single checkpoint cycle.  Per an idea from
    Robert Haas.
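
The representation described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in md.c: the names `PendingEntry`, `SegBitmap`, `remember_fsync`, `forget_relation`, and `is_pending`, the toy fixed-size table, and the `NFORKS`/`NBUCKETS` constants are all assumptions made for the example. The point it shows is that forgetting a relation becomes a single lookup plus clearing that entry's bitmaps, rather than a scan over every relation/fork/segment entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NFORKS   4    /* assumed number of forks per relation */
#define NBUCKETS 64   /* toy table size, for illustration only */

/* Growable bitmap: bit N set means segment N needs an fsync. */
typedef struct SegBitmap { uint64_t *words; size_t nwords; } SegBitmap;

/* One entry per relation (hypothetical layout), one bitmap per fork. */
typedef struct PendingEntry {
    int       in_use;
    uint32_t  relid;
    SegBitmap requests[NFORKS];
} PendingEntry;

static PendingEntry table[NBUCKETS];

/* Linear-probing lookup keyed by relation only. Entries are never
 * deleted in this sketch, so probe chains stay intact. */
static PendingEntry *lookup(uint32_t relid, int create)
{
    for (unsigned i = 0; i < NBUCKETS; i++) {
        PendingEntry *e = &table[(relid + i) % NBUCKETS];
        if (e->in_use && e->relid == relid)
            return e;
        if (!e->in_use) {
            if (!create)
                return NULL;
            memset(e, 0, sizeof(*e));
            e->in_use = 1;
            e->relid = relid;
            return e;
        }
    }
    return NULL;  /* table full */
}

/* Record that (relid, fork, segno) needs an fsync. Returns 0 on success. */
static int remember_fsync(uint32_t relid, unsigned fork, unsigned segno)
{
    assert(fork < NFORKS);
    PendingEntry *e = lookup(relid, 1);
    if (e == NULL)
        return -1;
    SegBitmap *bm = &e->requests[fork];
    size_t word = segno / 64;
    if (word >= bm->nwords) {
        uint64_t *p = realloc(bm->words, (word + 1) * sizeof(uint64_t));
        if (p == NULL)
            return -1;
        memset(p + bm->nwords, 0, (word + 1 - bm->nwords) * sizeof(uint64_t));
        bm->words = p;
        bm->nwords = word + 1;
    }
    bm->words[word] |= (uint64_t) 1 << (segno % 64);
    return 0;
}

/* Forget every pending request for one relation: one lookup, no table scan. */
static void forget_relation(uint32_t relid)
{
    PendingEntry *e = lookup(relid, 0);
    if (e == NULL)
        return;
    for (unsigned f = 0; f < NFORKS; f++) {
        free(e->requests[f].words);
        e->requests[f].words = NULL;
        e->requests[f].nwords = 0;
    }
}

/* Is an fsync still pending for (relid, fork, segno)? */
static int is_pending(uint32_t relid, unsigned fork, unsigned segno)
{
    assert(fork < NFORKS);
    PendingEntry *e = lookup(relid, 0);
    if (e == NULL)
        return 0;
    const SegBitmap *bm = &e->requests[fork];
    size_t word = segno / 64;
    return word < bm->nwords && ((bm->words[word] >> (segno % 64)) & 1);
}
```

In this sketch, calling `remember_fsync(42, 0, 3)` and `remember_fsync(42, 1, 130)` touches a single entry for relation 42, and a later `forget_relation(42)` clears both requests without visiting any other relation's entry.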
    
    (FORGET_DATABASE_FSYNC still has to scan the whole hash table, but since
    dropping a database is a pretty expensive operation anyway, we'll live
    with that.)
    
    In passing, improve the delayed-unlink code: remove the pass over the list
    in mdpreckpt, since it wasn't doing anything for us except supporting a
    useless Assert in mdpostckpt, and fix mdpostckpt so that it will absorb
    fsync requests every so often when clearing a large backlog of deletion
    requests.
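
The "absorb every so often" pattern in the last paragraph might look something like the sketch below. It is an illustration under stated assumptions, not the actual mdpostckpt code: `process_unlinks`, the callback types, the `fake_unlink`/`fake_absorb` stand-ins, and the batch size of 10 are all hypothetical names and values chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define UNLINKS_PER_ABSORB 10   /* hypothetical batch size */

typedef void (*unlink_fn)(int item);
typedef void (*absorb_fn)(void);

/* Stand-ins that only count calls, so the sketch is self-contained. */
static int unlinked_count = 0;
static int absorbed_count = 0;
static void fake_unlink(int item) { (void) item; unlinked_count++; }
static void fake_absorb(void)     { absorbed_count++; }

/*
 * Work through a backlog of deletion requests, pausing after every
 * UNLINKS_PER_ABSORB items to drain newly queued fsync requests so the
 * request queue cannot fill up while a long backlog is being cleared.
 * Returns the number of times the absorb callback was invoked.
 */
static int process_unlinks(const int *items, size_t n,
                           unlink_fn do_unlink, absorb_fn absorb)
{
    assert(do_unlink != NULL && absorb != NULL);

    int absorb_counter = UNLINKS_PER_ABSORB;
    int absorbs = 0;

    for (size_t i = 0; i < n; i++) {
        do_unlink(items[i]);

        if (--absorb_counter <= 0) {
            absorb();
            absorbs++;
            absorb_counter = UNLINKS_PER_ABSORB;
        }
    }
    return absorbs;
}
```

A countdown counter keeps the per-item overhead to a decrement and a compare, while bounding how long incoming requests can go unabsorbed.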
md.c 54.3 KB