    Add parallel-aware hash joins. · 18042840
    Andres Freund authored
    Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel
    Hash Join with Parallel Hash.  While hash joins could already appear in
    parallel queries, they were previously always parallel-oblivious and had a
    partial subplan only on the outer side, meaning that the work of the inner
    subplan was duplicated in every worker.
    
    After this commit, the planner will consider using a partial subplan on the
    inner side too, using the Parallel Hash node to divide the work over the
    available CPU cores and combine its results in shared memory.  If the join
    needs to be split into multiple batches in order to respect work_mem, then
    workers process different batches as much as possible and then work together
    on the remaining batches.
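
    As a rough illustration of that batch-sharing idea, here is a toy model
    in Python.  It is only a sketch under simplifying assumptions: every batch
    takes one "round" to process, and the function name assign_batches is
    hypothetical.  It does not reflect the executor's actual scheduling.

```python
def assign_batches(n_workers, n_batches):
    """Toy model: return one {worker: batch} mapping per round.

    Assumption (not from the commit): each batch takes exactly one round.
    """
    remaining = list(range(n_batches))
    rounds = []
    while remaining:
        if len(remaining) >= n_workers:
            # Enough batches left: every worker gets a batch of its own.
            chosen = remaining[:n_workers]
            rounds.append({w: chosen[w] for w in range(n_workers)})
            remaining = remaining[n_workers:]
        else:
            # Fewer batches than workers: workers share the leftover batches.
            rounds.append(
                {w: remaining[w % len(remaining)] for w in range(n_workers)}
            )
            remaining = []
    return rounds


if __name__ == "__main__":
    # Two workers, three batches: separate batches first, then both on batch 2.
    print(assign_batches(2, 3))
```

    With two workers and three batches, the workers take batches 0 and 1
    separately in the first round and then both work on batch 2.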
    
    The advantages of a parallel-aware hash join over a parallel-oblivious hash
    join used in a parallel query are that it:
    
     * avoids wasting memory on duplicated hash tables
     * avoids wasting disk space on duplicated batch files
     * divides the work of building the hash table over the CPUs
    
    One disadvantage is that there is some communication between the participating
    CPUs which might outweigh the benefits of parallelism in the case of small
    hash tables.  This is avoided by the planner's existing reluctance to supply
    partial plans for small scans, but it may be necessary to estimate
    synchronization costs in future if that situation changes.  Another
    disadvantage is that outer batch 0 must be written to disk if multiple
    batches are required.
    
    A potential future advantage of parallel-aware hash joins is that right and
    full outer joins could be supported, since there is a single set of matched
    bits for each hash table, but that is not yet implemented.
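
    To see why a single set of matched bits matters, consider the following
    Python sketch.  It models the hash table as a set of inner keys and the
    matched bits as a set of matched keys; all names are hypothetical and the
    model is far simpler than the real executor.  With private copies, each
    worker sees only its own matches, so no worker can tell which inner rows
    are unmatched overall.

```python
def probe(inner_keys, outer_keys, matched):
    """Probe with outer keys, recording which inner keys found a match."""
    for key in outer_keys:
        if key in inner_keys:
            matched.add(key)


def unmatched_with_private_tables(inner_keys, outer_per_worker):
    """Each worker has its own copy of the table and its own matched bits."""
    results = []
    for outer_keys in outer_per_worker:
        matched = set()  # private to this worker
        probe(inner_keys, outer_keys, matched)
        results.append(inner_keys - matched)
    return results


def unmatched_with_shared_table(inner_keys, outer_per_worker):
    """All workers mark matches in one shared set of matched bits."""
    matched = set()  # shared by every worker
    for outer_keys in outer_per_worker:
        probe(inner_keys, outer_keys, matched)
    return inner_keys - matched


if __name__ == "__main__":
    inner = {1, 2, 3}
    outer = [[1], [2]]  # outer rows split across two workers
    print(unmatched_with_private_tables(inner, outer))  # [{2, 3}, {1, 3}]
    print(unmatched_with_shared_table(inner, outer))    # {3}
```

    Only the shared version identifies key 3 as the one inner row with no
    match anywhere, which is what a right or full outer join would need to
    emit.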
    
    A new GUC enable_parallel_hash is defined to control the feature, defaulting
    to on.
    
    Author: Thomas Munro
    Reviewed-By: Andres Freund, Robert Haas
    Tested-By: Rafia Sabih, Prabhat Sahu
    Discussion:
        https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
        https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com