• Tom Lane's avatar
    Improve speed of aggregates that use array_append as transition function. · 9a00f03e
    Tom Lane authored
    In the previous coding, if an aggregate's transition function returned an
    expanded array, nodeAgg.c and nodeWindowAgg.c would always copy it and thus
    force it into the flat representation.  This led to ping-ponging between
    flat and expanded formats, which costs a lot.  For an aggregate using
    array_append as transition function, I measured about a 15X slowdown
    compared to the pre-9.5 code, when working on simple int[] arrays.
    Of course, the old code was already O(N^2) in this usage due to copying
    flat arrays all the time, but it wasn't quite this inefficient.
    
    To fix, teach nodeAgg.c and nodeWindowAgg.c to allow expanded transition
    values without copying, so long as the transition function takes care to
    return the transition value already properly parented under the aggcontext.
    That puts a bit of extra responsibility on the transition function, but
    doing it this way allows us to not need any extra logic in the fast path
    of advance_transition_function (ie, with a pass-by-value transition value,
    or with a modified-in-place pass-by-reference value).  We already know
    that that's a hot spot so I'm loath to add any cycles at all there.  Also,
    while only array_append currently knows how to follow this convention,
    this solution allows other transition functions to opt-in without needing
    to have a whitelist in the core aggregation code.
    
    (The reason we would need a whitelist is that currently, if you pass a
    R/W expanded-object pointer to an arbitrary function, it's allowed to do
    anything with it including deleting it; that breaks the core agg code's
    assumption that it should free discarded values.  Returning a value under
    aggcontext is the transition function's signal that it knows it is an
    aggregate transition function and will play nice.  Possibly the API rules
    for expanded objects should be refined, but that would not be a
    back-patchable change.)
    
    With this fix, an aggregate using array_append is no longer O(N^2), so it's
    much faster than pre-9.5 code rather than much slower.  It's still a bit
    slower than the bespoke infrastructure for array_agg, but the differential
    seems to be only about 10%-20% rather than orders of magnitude.
    
    Discussion: <6315.1477677885@sss.pgh.pa.us>
    9a00f03e
nodeWindowAgg.c 85.8 KB