Commit 33e52ad9 authored by Tomas Vondra

Fix ndistinct estimates with system attributes

When estimating the number of groups using extended statistics, the code
was discarding information about system attributes. This led to the
strange situation that

    SELECT 1 FROM t GROUP BY ctid;

could have produced a higher estimate (equal to pg_class.reltuples) than

    SELECT 1 FROM t GROUP BY a, b, ctid;

with extended statistics on (a,b). Fixed by retaining information about
the system attribute.

Backpatch all the way to 10, where extended statistics were introduced.

Author: Tomas Vondra
Backpatch-through: 10
parent a14a0118
@@ -3987,10 +3987,10 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
         attnum = ((Var *) varinfo->var)->varattno;
-        if (!AttrNumberIsForUserDefinedAttr(attnum))
+        if (AttrNumberIsForUserDefinedAttr(attnum) &&
+            bms_is_member(attnum, matched))
             continue;
-        if (!bms_is_member(attnum, matched))
         newlist = lappend(newlist, varinfo);
     }
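
The effect of the hunk above can be illustrated outside the PostgreSQL tree. The following is a minimal standalone sketch, not the planner code itself: `AttrNumber` is redeclared locally, a plain `uint32_t` bitmask stands in for the `matched` Bitmapset, and `attr_is_user_defined`, `is_matched`, `filter_old`, and `filter_new` are hypothetical names introduced only for this example. It also leaves out the handling of non-Var expressions. It assumes that user columns have positive attribute numbers and system columns such as ctid have negative ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's AttrNumber (assumption for this sketch):
 * user columns are > 0, system columns such as ctid are < 0. */
typedef int16_t AttrNumber;

/* Hypothetical helper mirroring the idea of AttrNumberIsForUserDefinedAttr. */
static bool attr_is_user_defined(AttrNumber attnum)
{
    return attnum > 0;
}

/* Simplified "matched" set: bit i is set when user attribute i is covered
 * by the chosen statistics object. Only valid for 1 <= attnum <= 31. */
static bool is_matched(uint32_t matched, AttrNumber attnum)
{
    assert(attnum > 0 && attnum < 32);
    return ((matched >> attnum) & 1u) != 0;
}

/* Old behaviour: every system attribute is dropped, and of the user
 * attributes only the unmatched ones are kept. Returns the number kept. */
static size_t filter_old(const AttrNumber *in, size_t n,
                         uint32_t matched, AttrNumber *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (!attr_is_user_defined(in[i]))
            continue;               /* system attribute silently discarded */
        if (!is_matched(matched, in[i]))
            out[kept++] = in[i];
    }
    return kept;
}

/* New behaviour: only user attributes covered by the statistics are
 * dropped; everything else, including system attributes, is retained. */
static size_t filter_new(const AttrNumber *in, size_t n,
                         uint32_t matched, AttrNumber *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (attr_is_user_defined(in[i]) && is_matched(matched, in[i]))
            continue;               /* already accounted for by the statistics */
        out[kept++] = in[i];
    }
    return kept;
}
```

With the input {-1, 1, 2} (standing for ctid, a, b) and statistics covering attributes 1 and 2, `filter_old` keeps nothing, so the grouping on ctid no longer contributes to the estimate, while `filter_new` keeps the single entry -1 so that it can still be estimated separately.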
@@ -260,7 +260,7 @@ SELECT s.stxkind, d.stxdndistinct
 SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct GROUP BY ctid, a, b');
  estimated | actual
 -----------+--------
-        11 |   1000
+      1000 |   1000
 (1 row)
 -- Hash Aggregate, thanks to estimates improved by the statistic