• Peter Geoghegan's avatar
    Add hash_mem_multiplier GUC. · d6c08e29
    Peter Geoghegan authored
    Add a GUC that acts as a multiplier on work_mem.  It gets applied when
    sizing executor node hash tables that were previously size constrained
    using work_mem alone.
    
    The new GUC can be used to preferentially give hash-based nodes more
    memory than the generic work_mem limit.  It is intended to enable admin
    tuning of the executor's memory usage.  Overall system throughput and
    system responsiveness can be improved by giving hash-based executor
    nodes more memory (especially over sort-based alternatives, which are
    often much less sensitive to being memory constrained).
    
    The default value for hash_mem_multiplier is 1.0, which is also the
    minimum valid value.  This means that hash-based nodes continue to apply
    work_mem in the traditional way by default.
    
    hash_mem_multiplier is generally useful.  However, it is being added now
    due to concerns about hash aggregate performance stability for users
    that upgrade to Postgres 13 (which added disk-based hash aggregation in
    commit 1f39bce0).  While the old hash aggregate behavior risked
    out-of-memory errors, it is nevertheless likely that many users actually
    benefited.  Hash agg's previous indifference to work_mem during query
    execution was not just faster; it also accidentally made aggregation
    resilient to grouping estimate problems (at least in cases where this
    didn't create destabilizing memory pressure).
    
    hash_mem_multiplier can provide a certain kind of continuity with the
    behavior of Postgres 12 hash aggregates in cases where the planner
    incorrectly estimates that all groups (plus related allocations) will
    fit in work_mem/hash_mem.  This seems necessary because hash-based
    aggregation is usually much slower when only a small fraction of all
    groups can fit.  Even when it isn't possible to totally avoid hash
    aggregates that spill, giving hash aggregation more memory will reliably
    improve performance (the same cannot be said for external sort
    operations, which appear to be almost unaffected by memory availability
    provided it's at least possible to get a single merge pass).
    
    The PostgreSQL 13 release notes should advise users that increasing
    hash_mem_multiplier can help with performance regressions associated
    with hash aggregation.  That can be taken care of by a later commit.
    
    Author: Peter Geoghegan
    Reviewed-By: Álvaro Herrera, Jeff Davis
    Discussion: https://postgr.es/m/20200625203629.7m6yvut7eqblgmfo@alap3.anarazel.de
    Discussion: https://postgr.es/m/CAH2-WzmD%2Bi1pG6rc1%2BCjc4V6EaFJ_qSuKCCHVnH%3DoruqD-zqow%40mail.gmail.com
    Backpatch: 13-, where disk-based hash aggregation was introduced.
    d6c08e29
guc.c 309 KB