    Improve planner's handling of set-returning functions in grouping columns. · df3a66e2
    Tom Lane authored
    Improve query_is_distinct_for() to accept SRFs in the targetlist when
    we can prove distinctness from a DISTINCT clause.  In that case the
    de-duplication will surely happen after SRF expansion, so the proof
    still works.  Continue to punt in the case where we'd try to prove
    distinctness from GROUP BY (or, in the future, source relations).
    To do that, we'd have to determine whether the SRFs were in the
    grouping columns or elsewhere in the tlist, and it still doesn't
    seem worth the trouble.  But this trivial change allows us to
    recognize that "SELECT DISTINCT unnest(foo) FROM ..." produces
    unique-ified output, which seems worth having.
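
The order of checks described above can be sketched as follows. This is a simplified illustration, not the actual body of query_is_distinct_for(): the SketchQuery struct, its fields, and sketch_is_distinct_for() are hypothetical names, and the real column-matching logic is reduced to two precomputed flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, flattened stand-in for the planner's Query node. */
typedef struct SketchQuery
{
    bool has_distinct;        /* query has a DISTINCT clause */
    bool has_group_by;        /* query has a GROUP BY clause */
    bool has_target_srfs;     /* set-returning functions in the targetlist */
    bool distinct_cols_match; /* assumed precomputed: DISTINCT columns are
                               * all among the columns required to be unique */
    bool group_cols_match;    /* same, for the GROUP BY columns */
} SketchQuery;

/* Returns true only when distinctness of the output can be proven. */
bool
sketch_is_distinct_for(const SketchQuery *q)
{
    assert(q != NULL);

    /*
     * DISTINCT de-duplicates after SRF expansion, so the proof holds
     * whether or not the targetlist contains SRFs.
     */
    if (q->has_distinct && q->distinct_cols_match)
        return true;

    /*
     * For every other proof method, punt when SRFs are present: we would
     * need to know whether they sit in the grouping columns or elsewhere.
     */
    if (q->has_target_srfs)
        return false;

    if (q->has_group_by && q->group_cols_match)
        return true;

    return false;
}
```

In this sketch, a query shaped like "SELECT DISTINCT unnest(foo) FROM ..." sets has_distinct, has_target_srfs, and distinct_cols_match, and is therefore reported as distinct, while the same SRF under GROUP BY alone is not.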
    
    Also, fix estimate_num_groups() to consider the possibility of SRFs in
    the grouping columns.  Its failure to do so was masked before v10 because
    grouping_planner() scaled up plan rowcount estimates by the estimated SRF
    multiplier after performing grouping.  That doesn't happen anymore, which
    is more correct, but it means we need an adjustment in the estimate for
    the number of groups.  Failure to do this leads to an underestimate for
    the number of output rows of subqueries like "SELECT DISTINCT unnest(foo)"
    compared to what 9.6 and earlier estimated, thus breaking plan choices
    in some cases.
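
As a rough illustration of the kind of adjustment described above, the sketch below scales a group-count estimate that was computed as though every grouping expression were scalar. It is not the code in estimate_num_groups(): the function name, its parameters, and the choice to use the largest per-expression SRF row estimate as the multiplier are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL's actual code).
 *
 * scalar_groups: group estimate obtained while ignoring the SRF nature
 *                of the grouping expressions
 * input_rows:    estimated input rows before SRF expansion
 * srf_rows:      per-expression estimate of rows returned per call
 *                (1.0 for a non-set-returning expression)
 */
double
sketch_adjust_groups_for_srfs(double scalar_groups, double input_rows,
                              const double *srf_rows, size_t n_exprs)
{
    double srf_multiplier = 1.0;
    double expanded_rows;
    double groups;
    size_t i;

    assert(srf_rows != NULL || n_exprs == 0);

    /* Assumption: take the largest SRF row estimate as the multiplier. */
    for (i = 0; i < n_exprs; i++)
    {
        if (srf_rows[i] > srf_multiplier)
            srf_multiplier = srf_rows[i];
    }

    /* SRF expansion multiplies both the row count and the group count. */
    expanded_rows = input_rows * srf_multiplier;
    groups = scalar_groups * srf_multiplier;

    /* There cannot be more groups than expanded rows, nor fewer than one. */
    if (groups > expanded_rows)
        groups = expanded_rows;
    if (groups < 1.0)
        groups = 1.0;

    return groups;
}
```

With 1000 input rows, a scalar estimate of 50 groups, and an SRF expected to return 10 rows per call, this sketch yields 500 groups rather than 50, which is the direction of correction the commit message describes.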
    
    Per report from Dmitry Shalashov.  Back-patch to v10 to avoid degraded
    plan choices compared to previous releases.
    
    Discussion: https://postgr.es/m/CAKPeCUGAeHgoh5O=SvcQxREVkoX7UdeJUMj1F5=aBNvoTa+O8w@mail.gmail.com
selfuncs.c 239 KB