    Improve parallel scheduling logic in pg_dump/pg_restore. · 548e5097
    Tom Lane authored
    Previously, the way this worked was that a parallel pg_dump would
    re-order the TABLE_DATA items in the dump's TOC into decreasing size
    order, and separately re-order (some of) the INDEX items into decreasing
    size order.  Then pg_dump would dump the items in that order.  Later,
    parallel pg_restore just followed the TOC order.  This method had lots
    of deficiencies:
    
    * TOC ordering randomly differed between parallel and non-parallel
    dumps, and was hard to predict in the former case, causing problems
    for building stable pg_dump test cases.
    
    * Parallel restore only followed a well-chosen order if the dump had
    been done in parallel; in particular, this never happened for restore
    from custom-format dumps.
    
    * The best order for restore isn't necessarily the same as for dump,
    and it's not really static either because of locking considerations.
    
    * TABLE_DATA and INDEX items aren't the only things that might take a lot
    of work during restore.  Scheduling was particularly stupid for the BLOBS
    item, which might require lots of work during dump as well as restore,
    but was left to the end in either case.
    
    This patch removes the logic that changed the TOC order, fixing the
    test instability problem.  Instead, we sort the parallelizable items
    just before processing them during a parallel dump.  Independently
    of that, parallel restore prioritizes the ready-to-execute tasks
    based on the size of the underlying table.  In the case of dependent
    tasks such as index, constraint, or foreign key creation, the largest
    relevant table is used as the metric for estimating the task length.
    (This is pretty crude, but it should be enough to prevent the case we
    care about, which is ending the run with just a few large tasks left,
    such that we can't make use of all N workers.)
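
The scheduling idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this patch: SketchItem and the sketch_* functions are hypothetical names, and size_est stands in for "size of the largest relevant table" in arbitrary units. The dump side sorts a working array of parallelizable items by decreasing size just before dispatch, leaving the TOC order alone; the restore side repeatedly picks the largest ready-to-execute item.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a TOC entry (not pg_dump's struct). */
typedef struct SketchItem
{
    int           id;        /* TOC position; used here only to break ties */
    unsigned long size_est;  /* assumed metric: largest relevant table size */
    bool          ready;     /* true once all dependencies are satisfied */
    bool          done;      /* true once handed to a worker */
} SketchItem;

/* qsort-style comparator: larger size_est first, then lower id first. */
static int
sketch_cmp_size_desc(const void *a, const void *b)
{
    const SketchItem *x = (const SketchItem *) a;
    const SketchItem *y = (const SketchItem *) b;

    if (x->size_est != y->size_est)
        return (x->size_est < y->size_est) ? 1 : -1;
    return (x->id > y->id) - (x->id < y->id);
}

/*
 * Dump side: sort a working array of parallelizable items just before
 * processing.  The caller is assumed to pass a copy, so the TOC order
 * itself is never changed.
 */
static void
sketch_sort_for_dump(SketchItem *items, size_t n)
{
    qsort(items, n, sizeof(SketchItem), sketch_cmp_size_desc);
}

/*
 * Restore side: among items that are ready and not yet dispatched, return
 * the index of the one with the largest size estimate, or -1 if none.
 */
static int
sketch_pick_next_ready(const SketchItem *items, size_t n)
{
    int best = -1;

    assert(items != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (!items[i].ready || items[i].done)
            continue;
        if (best < 0 || sketch_cmp_size_desc(&items[i], &items[best]) < 0)
            best = (int) i;
    }
    return best;
}
```

A scheduler loop would call sketch_pick_next_ready each time a worker becomes idle, mark the chosen item done, and flip ready on dependents as their prerequisites finish. Starting the big items first is what keeps the tail of the run from being dominated by a few long tasks.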
    
    Patch by me, responding to a complaint from Peter Eisentraut,
    who also reviewed the patch.
    
    Discussion: https://postgr.es/m/5137fe12-d0a2-4971-61b6-eb4e7e8875f8@2ndquadrant.com
pg_backup_archiver.c 128 KB