    Make the planner assume that the entries in a VALUES list are distinct. · 2b743036
    Tom Lane authored
    Previously, if we had to estimate the number of distinct values in a
    VALUES column, we fell back on the default behavior used whenever we lack
    statistics, which effectively is that there are Min(# of entries, 200)
    distinct values.  This can be very badly off with a large VALUES list,
    as noted by Jeff Janes.
    
    We could consider actually running an ANALYZE-like scan on the VALUES,
    but that seems unduly expensive, and anyway it could not deliver reliable
    info if the entries are not all constants.  What seems like a better choice
    is to assume that the values are all distinct.  This will sometimes be just
    as wrong as the old code, but it seems more likely to be more nearly right
    in many common cases.  Also, it is more consistent with what happens in
    some related cases, for example WHERE x = ANY(ARRAY[1,2,3,...,n]) and
    WHERE x = ANY(VALUES (1),(2),(3),...,(n)) now are estimated similarly.
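
    As a rough illustration of the difference, the two estimation rules
    described above can be sketched as follows.  This is not the code in
    selfuncs.c; the function names and the DISTINCT_FALLBACK_CAP macro are
    hypothetical, and only the cap of 200 comes from the text above.

```c
/*
 * Illustrative sketch only: hypothetical names, not the planner's
 * actual implementation.
 */
#define DISTINCT_FALLBACK_CAP 200.0   /* the "200" quoted in the message */

/* Old rule: with no statistics, assume Min(# of entries, 200) distinct values. */
double
old_values_ndistinct(double nentries)
{
    return (nentries < DISTINCT_FALLBACK_CAP) ? nentries : DISTINCT_FALLBACK_CAP;
}

/* New rule: assume every entry in the VALUES list is distinct. */
double
new_values_ndistinct(double nentries)
{
    return nentries;
}
```

    For lists of up to 200 entries the two rules agree; for a list of
    10000 entries the old rule would report 200 distinct values while the
    new one reports 10000.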
    
    This was discussed some time ago, but the consensus was that it'd be
    better to slip it in at the start of a development cycle, not near the end.
    (It should've gone into v10, really, but I forgot about it.)
    
    Discussion: https://postgr.es/m/CAMkU=1xHkyPa8VQgGcCNg3RMFFvVxUdOpus1gKcFuvVi0w6Acg@mail.gmail.com