    Fix bug in HashAgg's selective-column-spilling logic. · 0ff865fb
    Tom Lane authored
    Commit 23023022 taught nodeAgg.c that, when spilling tuples from
    memory in an oversized hash aggregation, it only needed to spill
    input columns referenced in the node's tlist and quals.  Unfortunately,
    that's wrong: we also have to save the grouping columns.  The error
    is masked in common cases because the grouping columns also appear
    in the tlist, but that's not necessarily true.  The main category
    of plans where it's not true seems to come from semijoins ("WHERE
    outercol IN (SELECT innercol FROM innertable)") where the innercol
    needs an implicit promotion to make it comparable to the outercol.
    The grouping column will be "innercol::promotedtype", but that
    expression appears nowhere in the Agg node's own tlist and quals;
    only the bare "innercol" is found in the tlist.
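
To make the failure mode concrete, here is a minimal sketch of the column-selection rule described above. It is not the nodeAgg.c code: the ColSet bitmask type and the helper names are hypothetical, and it assumes for illustration that the Agg node's input tuple carries the bare innercol at position 0 and the promoted expression at position 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical column set: bit i is set when input column i is needed. */
typedef uint64_t ColSet;

/* Return the set with input column attno added. */
static ColSet
colset_add(ColSet set, int attno)
{
    assert(attno >= 0 && attno < 64);
    return set | ((ColSet) 1 << attno);
}

/* Return 1 if input column attno is in the set, else 0. */
static int
colset_has(ColSet set, int attno)
{
    assert(attno >= 0 && attno < 64);
    return (int) ((set >> attno) & 1);
}

/*
 * Behavior as described for the earlier commit: spill only the columns
 * referenced in the tlist and quals.  The grouping columns are ignored.
 */
static ColSet
spill_cols_buggy(ColSet tlist_qual_cols, ColSet grouping_cols)
{
    (void) grouping_cols;
    return tlist_qual_cols;
}

/*
 * Behavior as described for the fix: spill the union of the tlist/qual
 * columns and the grouping columns.
 */
static ColSet
spill_cols_fixed(ColSet tlist_qual_cols, ColSet grouping_cols)
{
    return tlist_qual_cols | grouping_cols;
}
```

With the tlist referencing only column 0 and the grouping key being column 1, spill_cols_buggy() omits column 1, so a spilled tuple could not be regrouped when read back; spill_cols_fixed() keeps both columns.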
    
    I spent quite a bit of time looking for a suitable regression test
    case for this, without much success.  If the number of distinct
    values of the innercol is large enough to make spilling happen,
    the planner tends to prefer a non-HashAgg plan, at least for
    problem sizes that are reasonable to use in the regression tests.
    So, no new regression test.  However, this patch does demonstrably
    fix the originally-reported test case.
    
    Per report from s.p.e (at) gmx-topmail.de.  Backpatch to v13
    where the troublesome code came in.
    
    Discussion: https://postgr.es/m/trinity-1c565d44-159f-488b-a518-caf13883134f-1611835701633@3c-app-gmx-bap78
nodeAgg.c 148 KB