Commit 7300a699 authored by Tomas Vondra's avatar Tomas Vondra

Add support for multivariate MCV lists

Introduce a third extended statistic type, supported by the CREATE
STATISTICS command - MCV lists, a generalization of the statistic
already built and used for individual columns.

Compared to the already supported types (n-distinct coefficients and
functional dependencies), MCV lists are more complex, include column
values and allow estimation of much wider range of common clauses
(equality and inequality conditions, IS NULL, IS NOT NULL etc.).
Similarly to the other types, a new pseudo-type (pg_mcv_list) is used.

Author: Tomas Vondra
Reviewed-by: Dean Rasheed, David Rowley, Mark Dilger, Alvaro Herrera
Discussion: https://postgr.es/m/dfdac334-9cf2-2597-fb27-f0fb3753f435@2ndquadrant.com
parent 333ed246
......@@ -6562,7 +6562,8 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
An array containing codes for the enabled statistic kinds;
valid values are:
<literal>d</literal> for n-distinct statistics,
<literal>f</literal> for functional dependency statistics
<literal>f</literal> for functional dependency statistics, and
<literal>m</literal> for most common values (MCV) list statistics
</entry>
</row>
......@@ -6585,6 +6586,16 @@ SCRAM-SHA-256$<replaceable>&lt;iteration count&gt;</replaceable>:<replaceable>&l
</entry>
</row>
<row>
<entry><structfield>stxmcv</structfield></entry>
<entry><type>pg_mcv_list</type></entry>
<entry></entry>
<entry>
MCV (most-common values) list statistics, serialized as
<structname>pg_mcv_list</structname> type.
</entry>
</row>
</tbody>
</tgroup>
</table>
......
......@@ -22174,4 +22174,86 @@ CREATE EVENT TRIGGER test_table_rewrite_oid
</sect2>
</sect1>
<sect1 id="functions-statistics">
<title>Statistics Information Functions</title>
<indexterm zone="functions-statistics">
<primary>function</primary>
<secondary>statistics</secondary>
</indexterm>
<para>
<productname>PostgreSQL</productname> provides a function to inspect complex
statistics defined using the <command>CREATE STATISTICS</command> command.
</para>
<sect2 id="functions-statistics-mcv">
<title>Inspecting MCV lists</title>
<indexterm>
<primary>pg_mcv_list_items</primary>
<secondary>pg_mcv_list</secondary>
</indexterm>
<para>
<function>pg_mcv_list_items</function> returns a list of all items
stored in a multi-column <acronym>MCV</acronym> list, and returns the
following columns:
<informaltable>
<tgroup cols="3">
<thead>
<row>
<entry>Name</entry>
<entry>Type</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>index</literal></entry>
<entry><type>int</type></entry>
<entry>index of the item in the <acronym>MCV</acronym> list</entry>
</row>
<row>
<entry><literal>values</literal></entry>
<entry><type>text[]</type></entry>
<entry>values stored in the MCV item</entry>
</row>
<row>
<entry><literal>nulls</literal></entry>
<entry><type>boolean[]</type></entry>
<entry>flags identifying <literal>NULL</literal> values</entry>
</row>
<row>
<entry><literal>frequency</literal></entry>
<entry><type>double precision</type></entry>
<entry>frequency of this <acronym>MCV</acronym> item</entry>
</row>
<row>
<entry><literal>base_frequency</literal></entry>
<entry><type>double precision</type></entry>
<entry>base frequency of this <acronym>MCV</acronym> item</entry>
</row>
</tbody>
</tgroup>
</informaltable>
</para>
<para>
The <function>pg_mcv_list_items</function> function can be used like this:
<programlisting>
SELECT m.* FROM pg_statistic_ext,
pg_mcv_list_items(stxmcv) m WHERE stxname = 'stts';
</programlisting>
Values of the <type>pg_mcv_list</type> can be obtained only from the
<literal>pg_statistic_ext.stxmcv</literal> column.
</para>
</sect2>
</sect1>
</chapter>
......@@ -1285,6 +1285,72 @@ nd | {"1, 2": 33178, "1, 5": 33178, "2, 5": 27435, "1, 2, 5": 33178}
plans. Otherwise, the <command>ANALYZE</command> cycles are just wasted.
</para>
</sect3>
<sect3>
<title>Multivariate MCV lists</title>
<para>
Another type of statistics stored for each column are most-common value
lists. This allows very accurate estimates for individual columns, but
may result in significant misestimates for queries with conditions on
multiple columns.
</para>
<para>
To improve such estimates, <command>ANALYZE</command> can collect MCV
lists on combinations of columns. Similarly to functional dependencies
and n-distinct coefficients, it's impractical to do this for every
possible column grouping. Even more so in this case, as the MCV list
(unlike functional dependencies and n-distinct coefficients) does store
the common column values. So data is collected only for those groups
of columns appearing together in a statistics object defined with the
<literal>mcv</literal> option.
</para>
<para>
Continuing the previous example, the MCV list for a table of ZIP codes
might look like the following (unlike for simpler types of statistics,
a function is required for inspection of MCV contents):
<programlisting>
CREATE STATISTICS stts3 (mcv) ON state, city FROM zipcodes;
ANALYZE zipcodes;
SELECT m.* FROM pg_statistic_ext,
pg_mcv_list_items(stxmcv) m WHERE stxname = 'stts3';
index | values | nulls | frequency | base_frequency
-------+------------------------+-------+-----------+----------------
0 | {Washington, DC} | {f,f} | 0.003467 | 2.7e-05
1 | {Apo, AE} | {f,f} | 0.003067 | 1.9e-05
2 | {Houston, TX} | {f,f} | 0.002167 | 0.000133
3 | {El Paso, TX} | {f,f} | 0.002 | 0.000113
4 | {New York, NY} | {f,f} | 0.001967 | 0.000114
5 | {Atlanta, GA} | {f,f} | 0.001633 | 3.3e-05
6 | {Sacramento, CA} | {f,f} | 0.001433 | 7.8e-05
7 | {Miami, FL} | {f,f} | 0.0014 | 6e-05
8 | {Dallas, TX} | {f,f} | 0.001367 | 8.8e-05
9 | {Chicago, IL} | {f,f} | 0.001333 | 5.1e-05
...
(99 rows)
</programlisting>
This indicates that the most common combination of city and state is
Washington in DC, with actual frequency (in the sample) about 0.35%.
The base frequency of the combination (as computed from the simple
per-column frequencies) is only 0.0027%, resulting in two orders of
magnitude under-estimates.
</para>
<para>
It's advisable to create <acronym>MCV</acronym> statistics objects only
on combinations of columns that are actually used in conditions together,
and for which misestimation of the number of groups is resulting in bad
plans. Otherwise, the <command>ANALYZE</command> and planning cycles
are just wasted.
</para>
</sect3>
</sect2>
</sect1>
......
......@@ -455,7 +455,7 @@ rows = (outer_cardinality * inner_cardinality) * selectivity
<secondary>multivariate</secondary>
</indexterm>
<sect2>
<sect2 id="functional-dependencies">
<title>Functional Dependencies</title>
<para>
......@@ -540,7 +540,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
</para>
</sect2>
<sect2>
<sect2 id="multivariate-ndistinct-counts">
<title>Multivariate N-Distinct Counts</title>
<para>
......@@ -585,6 +585,118 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
</para>
</sect2>
<sect2 id="mcv-lists">
<title>MCV lists</title>
<para>
As explained in <xref linkend="functional-dependencies"/>, functional
dependencies are very cheap and efficient type of statistics, but their
main limitation is their global nature (only tracking dependencies at
the column level, not between individual column values).
</para>
<para>
This section introduces multivariate variant of <acronym>MCV</acronym>
(most-common values) lists, a straightforward extension of the per-column
statistics described in <xref linkend="row-estimation-examples"/>. These
statistics address the limitation by storing individual values, but it is
naturally more expensive, both in terms of building the statistics in
<command>ANALYZE</command>, storage and planning time.
</para>
<para>
Let's look at the query from <xref linkend="functional-dependencies"/>
again, but this time with a <acronym>MCV</acronym> list created on the
same set of columns (be sure to drop the functional dependencies, to
make sure the planner uses the newly created statistics).
<programlisting>
DROP STATISTICS stts;
CREATE STATISTICS stts2 (mcv) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
QUERY PLAN
-------------------------------------------------------------------------------
Seq Scan on t (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1)
Filter: ((a = 1) AND (b = 1))
Rows Removed by Filter: 9900
</programlisting>
The estimate is as accurate as with the functional dependencies, mostly
thanks to the table being fairly small and having a simple distribution
with a low number of distinct values. Before looking at the second query,
which was not handled by functional dependencies particularly well,
let's inspect the <acronym>MCV</acronym> list a bit.
</para>
<para>
Inspecting the <acronym>MCV</acronym> list is possible using
<function>pg_mcv_list_items</function> set-returning function.
<programlisting>
SELECT m.* FROM pg_statistic_ext,
pg_mcv_list_items(stxmcv) m WHERE stxname = 'stts2';
index | values | nulls | frequency | base_frequency
-------+----------+-------+-----------+----------------
0 | {0, 0} | {f,f} | 0.01 | 0.0001
1 | {1, 1} | {f,f} | 0.01 | 0.0001
...
49 | {49, 49} | {f,f} | 0.01 | 0.0001
50 | {50, 50} | {f,f} | 0.01 | 0.0001
...
97 | {97, 97} | {f,f} | 0.01 | 0.0001
98 | {98, 98} | {f,f} | 0.01 | 0.0001
99 | {99, 99} | {f,f} | 0.01 | 0.0001
(100 rows)
</programlisting>
This confirms there are 100 distinct combinations in the two columns, and
all of them are about equally likely (1% frequency for each one). The
base frequency is the frequency computed from per-column statistics, as if
there were no multi-column statistics. Had there been any null values in
either of the columns, this would be identified in the
<structfield>nulls</structfield> column.
</para>
<para>
When estimating the selectivity, the planner applies all the conditions
on items in the <acronym>MCV</acronym> list, and then sums the frequencies
of the matching ones. See <function>mcv_clauselist_selectivity</function>
in <filename>src/backend/statistics/mcv.c</filename> for details.
</para>
<para>
Compared to functional dependencies, <acronym>MCV</acronym> lists have two
major advantages. Firstly, the list stores actual values, making it possible
to decide which combinations are compatible.
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 10;
QUERY PLAN
---------------------------------------------------------------------------
Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)
Filter: ((a = 1) AND (b = 10))
Rows Removed by Filter: 10000
</programlisting>
Secondly, <acronym>MCV</acronym> lists handle a wider range of clause types,
not just equality clauses like functional dependencies. See for example the
example range query, presented earlier:
<programlisting>
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a &lt;= 49 AND b &gt; 49;
QUERY PLAN
---------------------------------------------------------------------------
Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)
Filter: ((a &lt;= 49) AND (b &gt; 49))
Rows Removed by Filter: 10000
</programlisting>
</para>
</sect2>
</sect1>
<sect1 id="planner-stats-security">
......
......@@ -81,9 +81,10 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="parameter">statistics_na
<para>
A statistics kind to be computed in this statistics object.
Currently supported kinds are
<literal>ndistinct</literal>, which enables n-distinct statistics, and
<literal>ndistinct</literal>, which enables n-distinct statistics,
<literal>dependencies</literal>, which enables functional
dependency statistics.
dependency statistics, and <literal>mcv</literal> which enables
most-common values lists.
If this clause is omitted, all supported statistics kinds are
included in the statistics object.
For more information, see <xref linkend="planner-stats-extended"/>
......@@ -164,6 +165,36 @@ EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 0);
conditions are redundant and does not underestimate the row count.
</para>
<para>
Create table <structname>t2</structname> with two perfectly correlated columns
(containing identical data), and a MCV list on those columns:
<programlisting>
CREATE TABLE t2 (
a int,
b int
);
INSERT INTO t2 SELECT mod(i,100), mod(i,100)
FROM generate_series(1,1000000) s(i);
CREATE STATISTICS s2 (mcv) ON (a, b) FROM t2;
ANALYZE t2;
-- valid combination (found in MCV)
EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 1);
-- invalid combination (not found in MCV)
EXPLAIN ANALYZE SELECT * FROM t2 WHERE (a = 1) AND (b = 2);
</programlisting>
The MCV list gives the planner more detailed information about the
specific values that commonly appear in the table, as well as an upper
bound on the selectivities of combinations of values that do not appear
in the table, allowing it to generate better estimates in both cases.
</para>
</refsect1>
<refsect1>
......
......@@ -73,11 +73,12 @@ CreateStatistics(CreateStatsStmt *stmt)
Oid relid;
ObjectAddress parentobject,
myself;
Datum types[2]; /* one for each possible type of statistic */
Datum types[3]; /* one for each possible type of statistic */
int ntypes;
ArrayType *stxkind;
bool build_ndistinct;
bool build_dependencies;
bool build_mcv;
bool requested_type = false;
int i;
ListCell *cell;
......@@ -272,6 +273,7 @@ CreateStatistics(CreateStatsStmt *stmt)
*/
build_ndistinct = false;
build_dependencies = false;
build_mcv = false;
foreach(cell, stmt->stat_types)
{
char *type = strVal((Value *) lfirst(cell));
......@@ -286,6 +288,11 @@ CreateStatistics(CreateStatsStmt *stmt)
build_dependencies = true;
requested_type = true;
}
else if (strcmp(type, "mcv") == 0)
{
build_mcv = true;
requested_type = true;
}
else
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
......@@ -297,6 +304,7 @@ CreateStatistics(CreateStatsStmt *stmt)
{
build_ndistinct = true;
build_dependencies = true;
build_mcv = true;
}
/* construct the char array of enabled statistic types */
......@@ -305,6 +313,8 @@ CreateStatistics(CreateStatsStmt *stmt)
types[ntypes++] = CharGetDatum(STATS_EXT_NDISTINCT);
if (build_dependencies)
types[ntypes++] = CharGetDatum(STATS_EXT_DEPENDENCIES);
if (build_mcv)
types[ntypes++] = CharGetDatum(STATS_EXT_MCV);
Assert(ntypes > 0 && ntypes <= lengthof(types));
stxkind = construct_array(types, ntypes, CHAROID, 1, true, 'c');
......@@ -329,6 +339,7 @@ CreateStatistics(CreateStatsStmt *stmt)
/* no statistics built yet */
nulls[Anum_pg_statistic_ext_stxndistinct - 1] = true;
nulls[Anum_pg_statistic_ext_stxdependencies - 1] = true;
nulls[Anum_pg_statistic_ext_stxmcv - 1] = true;
/* insert it into pg_statistic_ext */
htup = heap_form_tuple(statrel->rd_att, values, nulls);
......@@ -424,23 +435,72 @@ RemoveStatisticsById(Oid statsOid)
* null until the next ANALYZE. (Note that the type change hasn't actually
* happened yet, so one option that's *not* on the table is to recompute
* immediately.)
*
* For both ndistinct and functional-dependencies stats, the on-disk
* representation is independent of the source column data types, and it is
* plausible to assume that the old statistic values will still be good for
* the new column contents. (Obviously, if the ALTER COLUMN TYPE has a USING
* expression that substantially alters the semantic meaning of the column
* values, this assumption could fail. But that seems like a corner case
* that doesn't justify zapping the stats in common cases.)
*
* For MCV lists that's not the case, as those statistics store the datums
* internally. In this case we simply reset the statistics value to NULL.
*/
void
UpdateStatisticsForTypeChange(Oid statsOid, Oid relationOid, int attnum,
Oid oldColumnType, Oid newColumnType)
{
HeapTuple stup,
oldtup;
Relation rel;
Datum values[Natts_pg_statistic_ext];
bool nulls[Natts_pg_statistic_ext];
bool replaces[Natts_pg_statistic_ext];
oldtup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statsOid));
if (!oldtup)
elog(ERROR, "cache lookup failed for statistics object %u", statsOid);
/*
* Currently, we don't actually need to do anything here. For both
* ndistinct and functional-dependencies stats, the on-disk representation
* is independent of the source column data types, and it is plausible to
* assume that the old statistic values will still be good for the new
* column contents. (Obviously, if the ALTER COLUMN TYPE has a USING
* expression that substantially alters the semantic meaning of the column
* values, this assumption could fail. But that seems like a corner case
* that doesn't justify zapping the stats in common cases.)
*
* Future types of extended stats will likely require us to work harder.
* When none of the defined statistics types contain datum values
* from the table's columns then there's no need to reset the stats.
* Functional dependencies and ndistinct stats should still hold true.
*/
if (!statext_is_kind_built(oldtup, STATS_EXT_MCV))
{
ReleaseSysCache(oldtup);
return;
}
/*
* OK, we need to reset some statistics. So let's build the new tuple,
* replacing the affected statistics types with NULL.
*/
memset(nulls, 0, Natts_pg_statistic_ext * sizeof(bool));
memset(replaces, 0, Natts_pg_statistic_ext * sizeof(bool));
memset(values, 0, Natts_pg_statistic_ext * sizeof(Datum));
replaces[Anum_pg_statistic_ext_stxmcv - 1] = true;
nulls[Anum_pg_statistic_ext_stxmcv - 1] = true;
rel = heap_open(StatisticExtRelationId, RowExclusiveLock);
/* replace the old tuple */
stup = heap_modify_tuple(oldtup,
RelationGetDescr(rel),
values,
nulls,
replaces);
ReleaseSysCache(oldtup);
CatalogTupleUpdate(rel, &stup->t_self, stup);
heap_freetuple(stup);
heap_close(rel, RowExclusiveLock);
}
/*
......
......@@ -443,6 +443,50 @@ bms_is_member(int x, const Bitmapset *a)
return false;
}
/*
* bms_member_index
* determine 0-based index of member x in the bitmap
*
* Returns (-1) when x is not a member.
*/
int
bms_member_index(Bitmapset *a, int x)
{
int i;
int bitnum;
int wordnum;
int result = 0;
bitmapword mask;
/* return -1 if not a member of the bitmap */
if (!bms_is_member(x, a))
return -1;
wordnum = WORDNUM(x);
bitnum = BITNUM(x);
/* count bits in preceding words */
for (i = 0; i < wordnum; i++)
{
bitmapword w = a->words[i];
/* No need to count the bits in a zero word */
if (w != 0)
result += bmw_popcount(w);
}
/*
* Now add bits of the last word, but only those before the item.
* We can do that by applying a mask and then using popcount again.
* To get 0-based index, we want to count only preceding bits, not
* the item itself, so we subtract 1.
*/
mask = ((bitmapword) 1 << bitnum) - 1;
result += bmw_popcount(a->words[wordnum] & mask);
return result;
}
/*
* bms_overlap - do sets overlap (ie, have a nonempty intersection)?
*/
......
......@@ -60,17 +60,67 @@ static RelOptInfo *find_single_rel_for_clauses(PlannerInfo *root,
*
* See clause_selectivity() for the meaning of the additional parameters.
*
* The basic approach is to apply extended statistics first, on as many
* clauses as possible, in order to capture cross-column dependencies etc.
* The remaining clauses are then estimated using regular statistics tracked
* for individual columns. This is done by simply passing the clauses to
* clauselist_selectivity_simple.
*/
Selectivity
clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo)
{
Selectivity s1 = 1.0;
RelOptInfo *rel;
Bitmapset *estimatedclauses = NULL;
/*
* Determine if these clauses reference a single relation. If so, and if
* it has extended statistics, try to apply those.
*/
rel = find_single_rel_for_clauses(root, clauses);
if (rel && rel->rtekind == RTE_RELATION && rel->statlist != NIL)
{
/*
* Estimate as many clauses as possible using extended statistics.
*
* 'estimatedclauses' tracks the 0-based list position index of
* clauses that we've estimated using extended statistics, and that
* should be ignored.
*/
s1 *= statext_clauselist_selectivity(root, clauses, varRelid,
jointype, sjinfo, rel,
&estimatedclauses);
}
/*
* Apply normal selectivity estimates for the remaining clauses, passing
* 'estimatedclauses' so that it skips already estimated ones.
*/
return s1 * clauselist_selectivity_simple(root, clauses, varRelid,
jointype, sjinfo,
estimatedclauses);
}
/*
* clauselist_selectivity_simple -
* Compute the selectivity of an implicitly-ANDed list of boolean
* expression clauses. The list can be empty, in which case 1.0
* must be returned. List elements may be either RestrictInfos
* or bare expression clauses --- the former is preferred since
* it allows caching of results. The estimatedclauses bitmap tracks
* clauses that have already been estimated by other means.
*
* See clause_selectivity() for the meaning of the additional parameters.
*
* Our basic approach is to take the product of the selectivities of the
* subclauses. However, that's only right if the subclauses have independent
* probabilities, and in reality they are often NOT independent. So,
* we want to be smarter where we can.
*
* If the clauses taken together refer to just one relation, we'll try to
* apply selectivity estimates using any extended statistics for that rel.
* Currently we only have (soft) functional dependencies, so apply these in as
* many cases as possible, and fall back on normal estimates for remaining
* clauses.
*
* We also recognize "range queries", such as "x > 34 AND x < 42". Clauses
* are recognized as possible range query components if they are restriction
* opclauses whose operators have scalarltsel or a related function as their
......@@ -98,54 +148,29 @@ static RelOptInfo *find_single_rel_for_clauses(PlannerInfo *root,
* selectivity functions; perhaps some day we can generalize the approach.
*/
Selectivity
clauselist_selectivity(PlannerInfo *root,
clauselist_selectivity_simple(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo)
SpecialJoinInfo *sjinfo,
Bitmapset *estimatedclauses)
{
Selectivity s1 = 1.0;
RelOptInfo *rel;
Bitmapset *estimatedclauses = NULL;
RangeQueryClause *rqlist = NULL;
ListCell *l;
int listidx;
/*
* If there's exactly one clause, just go directly to
* clause_selectivity(). None of what we might do below is relevant.
* If there's exactly one clause (and it was not estimated yet), just
* go directly to clause_selectivity(). None of what we might do below
* is relevant.
*/
if (list_length(clauses) == 1)
if ((list_length(clauses) == 1) &&
bms_num_members(estimatedclauses) == 0)
return clause_selectivity(root, (Node *) linitial(clauses),
varRelid, jointype, sjinfo);
/*
* Determine if these clauses reference a single relation. If so, and if
* it has extended statistics, try to apply those.
*/
rel = find_single_rel_for_clauses(root, clauses);
if (rel && rel->rtekind == RTE_RELATION && rel->statlist != NIL)
{
/*
* Perform selectivity estimations on any clauses found applicable by
* dependencies_clauselist_selectivity. 'estimatedclauses' will be
* filled with the 0-based list positions of clauses used that way, so
* that we can ignore them below.
*/
s1 *= dependencies_clauselist_selectivity(root, clauses, varRelid,
jointype, sjinfo, rel,
&estimatedclauses);
/*
* This would be the place to apply any other types of extended
* statistics selectivity estimations for remaining clauses.
*/
}
/*
* Apply normal selectivity estimates for remaining clauses. We'll be
* careful to skip any clauses which were already estimated above.
*
* Anything that doesn't look like a potential rangequery clause gets
* multiplied into s1 and forgotten. Anything that does gets inserted into
* an rqlist entry.
......
......@@ -1363,6 +1363,18 @@ get_relation_statistics(RelOptInfo *rel, Relation relation)
stainfos = lcons(info, stainfos);
}
if (statext_is_kind_built(htup, STATS_EXT_MCV))
{
StatisticExtInfo *info = makeNode(StatisticExtInfo);
info->statOid = statOid;
info->rel = rel;
info->kind = STATS_EXT_MCV;
info->keys = bms_copy(keys);
stainfos = lcons(info, stainfos);
}
ReleaseSysCache(htup);
bms_free(keys);
}
......
......@@ -1652,6 +1652,8 @@ generateClonedExtStatsStmt(RangeVar *heapRel, Oid heapRelid,
stat_types = lappend(stat_types, makeString("ndistinct"));
else if (enabled[i] == STATS_EXT_DEPENDENCIES)
stat_types = lappend(stat_types, makeString("dependencies"));
else if (enabled[i] == STATS_EXT_MCV)
stat_types = lappend(stat_types, makeString("mcv"));
else
elog(ERROR, "unrecognized statistics kind %c", enabled[i]);
}
......
......@@ -12,6 +12,6 @@ subdir = src/backend/statistics
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
OBJS = extended_stats.o dependencies.o mvdistinct.o
OBJS = extended_stats.o dependencies.o mcv.o mvdistinct.o
include $(top_srcdir)/src/backend/common.mk
......@@ -18,6 +18,8 @@ There are currently two kinds of extended statistics:
(b) soft functional dependencies (README.dependencies)
(c) MCV lists (README.mcv)
Compatible clause types
-----------------------
......@@ -26,6 +28,8 @@ Each type of statistics may be used to estimate some subset of clause types.
(a) functional dependencies - equality clauses (AND), possibly IS NULL
(b) MCV lists - equality and inequality clauses (AND, OR, NOT), IS [NOT] NULL
Currently, only OpExprs in the form Var op Const, or Const op Var are
supported, however it's feasible to expand the code later to also estimate the
selectivities on clauses such as Var op Var.
......
MCV lists
=========
Multivariate MCV (most-common values) lists are a straightforward extension of
regular MCV list, tracking most frequent combinations of values for a group of
attributes.
This works particularly well for columns with a small number of distinct values,
as the list may include all the combinations and approximate the distribution
very accurately.
For columns with a large number of distinct values (e.g. those with continuous
domains), the list will only track the most frequent combinations. If the
distribution is mostly uniform (all combinations about equally frequent), the
MCV list will be empty.
Estimates of some clauses (e.g. equality) based on MCV lists are more accurate
than when using histograms.
Also, MCV lists don't necessarily require sorting of the values (the fact that
we use sorting when building them is implementation detail), but even more
importantly the ordering is not built into the approximation (while histograms
are built on ordering). So MCV lists work well even for attributes where the
ordering of the data type is disconnected from the meaning of the data. For
example we know how to sort strings, but it's unlikely to make much sense for
city names (or other label-like attributes).
Selectivity estimation
----------------------
The estimation, implemented in mcv_clauselist_selectivity(), is quite simple
in principle - we need to identify MCV items matching all the clauses and sum
frequencies of all those items.
Currently MCV lists support estimation of the following clause types:
(a) equality clauses WHERE (a = 1) AND (b = 2)
(b) inequality clauses WHERE (a < 1) AND (b >= 2)
(c) NULL clauses WHERE (a IS NULL) AND (b IS NOT NULL)
(d) OR clauses WHERE (a < 1) OR (b >= 2)
It's possible to add support for additional clauses, for example:
(e) multi-var clauses WHERE (a > b)
and possibly others. These are tasks for the future, not yet implemented.
Hashed MCV (not yet implemented)
--------------------------------
Regular MCV lists have to include actual values for each item, so if those items
are large the list may be quite large. This is especially true for multivariate
MCV lists, although the current implementation partially mitigates this by
performing de-duplicating the values before storing them on disk.
It's possible to only store hashes (32-bit values) instead of the actual values,
significantly reducing the space requirements. Obviously, this would only make
the MCV lists useful for estimating equality conditions (assuming the 32-bit
hashes make the collisions rare enough).
This might also complicate matching the columns to available stats.
TODO Consider implementing hashed MCV list, storing just 32-bit hashes instead
of the actual values. This type of MCV list will be useful only for
estimating equality clauses, and will reduce space requirements for large
varlena types (in such cases we usually only want equality anyway).
Inspecting the MCV list
-----------------------
Inspecting the regular (per-attribute) MCV lists is trivial, as it's enough
to select the columns from pg_stats. The data is encoded as anyarrays, and
all the items have the same data type, so anyarray provides a simple way to
get a text representation.
With multivariate MCV lists the columns may use different data types, making
it impossible to use anyarrays. It might be possible to produce a similar
array-like representation, but that would complicate further processing and
analysis of the MCV list.
So instead the MCV lists are stored in a custom data type (pg_mcv_list),
which however makes it more difficult to inspect the contents. To make that
easier, there's a SRF returning detailed information about the MCV lists.
SELECT m.* FROM pg_statistic_ext,
pg_mcv_list_items(stxmcv) m WHERE stxname = 'stts2';
It accepts one parameter - a pg_mcv_list value (which can only be obtained
from pg_statistic_ext catalog, to defend against malicious input), and
returns these columns:
- item index (0, ..., (nitems-1))
- values (string array)
- nulls only (boolean array)
- frequency (double precision)
- base_frequency (double precision)
......@@ -202,13 +202,12 @@ dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
VacAttrStats **stats, Bitmapset *attrs)
{
int i,
j;
int nvalues = numrows * k;
nitems;
MultiSortSupport mss;
SortItem *items;
Datum *values;
bool *isnull;
int *attnums;
AttrNumber *attnums;
AttrNumber *attnums_dep;
int numattrs;
/* counters valid within a group */
int group_size = 0;
......@@ -223,26 +222,16 @@ dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
/* sort info for all attributes columns */
mss = multi_sort_init(k);
/* data for the sort */
items = (SortItem *) palloc(numrows * sizeof(SortItem));
values = (Datum *) palloc(sizeof(Datum) * nvalues);
isnull = (bool *) palloc(sizeof(bool) * nvalues);
/* fix the pointers to values/isnull */
for (i = 0; i < numrows; i++)
{
items[i].values = &values[i * k];
items[i].isnull = &isnull[i * k];
}
/*
* Transform the bms into an array, to make accessing i-th member easier.
* Transform the attrs from bitmap to an array to make accessing the i-th
* member easier, and then construct a filtered version with only attnums
* referenced by the dependency we validate.
*/
attnums = (int *) palloc(sizeof(int) * bms_num_members(attrs));
i = 0;
j = -1;
while ((j = bms_next_member(attrs, j)) >= 0)
attnums[i++] = j;
attnums = build_attnums_array(attrs, &numattrs);
attnums_dep = (AttrNumber *) palloc(k * sizeof(AttrNumber));
for (i = 0; i < k; i++)
attnums_dep[i] = attnums[dependency[i]];
/*
* Verify the dependency (a,b,...)->z, using a rather simple algorithm:
......@@ -257,7 +246,7 @@ dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
* perhaps at some point it'd be worth using column-specific collations?
*/
/* prepare the sort function for the first dimension, and SortItem array */
/* prepare the sort function for the dimensions */
for (i = 0; i < k; i++)
{
VacAttrStats *colstat = stats[dependency[i]];
......@@ -270,19 +259,17 @@ dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
/* prepare the sort function for this dimension */
multi_sort_add_dimension(mss, i, type->lt_opr, type->typcollation);
/* accumulate all the data for both columns into an array and sort it */
for (j = 0; j < numrows; j++)
{
items[j].values[i] =
heap_getattr(rows[j], attnums[dependency[i]],
stats[i]->tupDesc, &items[j].isnull[i]);
}
}
/* sort the items so that we can detect the groups */
qsort_arg((void *) items, numrows, sizeof(SortItem),
multi_sort_compare, mss);
/*
* build an array of SortItem(s) sorted using the multi-sort support
*
* XXX This relies on all stats entries pointing to the same tuple
* descriptor. For now that assumption holds, but it might change in
* the future for example if we support statistics on multiple tables.
*/
items = build_sorted_items(numrows, &nitems, rows, stats[0]->tupDesc,
mss, k, attnums_dep);
/*
* Walk through the sorted array, split it into rows according to the
......@@ -295,14 +282,14 @@ dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
group_size = 1;
/* loop 1 beyond the end of the array so that we count the final group */
for (i = 1; i <= numrows; i++)
for (i = 1; i <= nitems; i++)
{
/*
* Check if the group ended, which may be either because we processed
* all the items (i==numrows), or because the i-th item is not equal
* all the items (i==nitems), or because the i-th item is not equal
* to the preceding one.
*/
if (i == numrows ||
if (i == nitems ||
multi_sort_compare_dims(0, k - 2, &items[i - 1], &items[i], mss) != 0)
{
/*
......@@ -324,10 +311,12 @@ dependency_degree(int numrows, HeapTuple *rows, int k, AttrNumber *dependency,
group_size++;
}
if (items)
pfree(items);
pfree(values);
pfree(isnull);
pfree(mss);
pfree(attnums);
pfree(attnums_dep);
/* Compute the 'degree of validity' as (supporting/total). */
return (n_supporting_rows * 1.0 / numrows);
......@@ -354,24 +343,17 @@ statext_dependencies_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
VacAttrStats **stats)
{
int i,
j,
k;
int numattrs;
int *attnums;
AttrNumber *attnums;
/* result */
MVDependencies *dependencies = NULL;
numattrs = bms_num_members(attrs);
/*
* Transform the bms into an array, to make accessing i-th member easier.
*/
attnums = palloc(sizeof(int) * bms_num_members(attrs));
i = 0;
j = -1;
while ((j = bms_next_member(attrs, j)) >= 0)
attnums[i++] = j;
attnums = build_attnums_array(attrs, &numattrs);
Assert(numattrs >= 2);
......@@ -918,9 +900,9 @@ find_strongest_dependency(StatisticExtInfo *stats, MVDependencies *dependencies,
* using functional dependency statistics, or 1.0 if no useful functional
* dependency statistic exists.
*
* 'estimatedclauses' is an output argument that gets a bit set corresponding
* to the (zero-based) list index of each clause that is included in the
* estimated selectivity.
* 'estimatedclauses' is an input/output argument that gets a bit set
* corresponding to the (zero-based) list index of each clause that is included
* in the estimated selectivity.
*
* Given equality clauses on attributes (a,b) we find the strongest dependency
* between them, i.e. either (a=>b) or (b=>a). Assuming (a=>b) is the selected
......@@ -955,9 +937,6 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
AttrNumber *list_attnums;
int listidx;
/* initialize output argument */
*estimatedclauses = NULL;
/* check if there's any stats that might be useful for us. */
if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
return 1.0;
......@@ -972,6 +951,9 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
* the attnums for each clause in a list which we'll reference later so we
* don't need to repeat the same work again. We'll also keep track of all
* attnums seen.
*
* We also skip clauses that we already estimated using different types of
* statistics (we treat them as incompatible).
*/
listidx = 0;
foreach(l, clauses)
......@@ -979,7 +961,8 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
Node *clause = (Node *) lfirst(l);
AttrNumber attnum;
if (dependency_is_compatible_clause(clause, rel->relid, &attnum))
if (!bms_is_member(listidx, *estimatedclauses) &&
dependency_is_compatible_clause(clause, rel->relid, &attnum))
{
list_attnums[listidx] = attnum;
clauses_attnums = bms_add_member(clauses_attnums, attnum);
......@@ -1049,8 +1032,7 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
/*
* Skip incompatible clauses, and ones we've already estimated on.
*/
if (list_attnums[listidx] == InvalidAttrNumber ||
bms_is_member(listidx, *estimatedclauses))
if (list_attnums[listidx] == InvalidAttrNumber)
continue;
/*
......
......@@ -19,9 +19,13 @@
#include "access/genam.h"
#include "access/htup_details.h"
#include "access/table.h"
#include "access/tuptoaster.h"
#include "catalog/indexing.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_statistic_ext.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/optimizer.h"
#include "postmaster/autovacuum.h"
#include "statistics/extended_stats_internal.h"
#include "statistics/statistics.h"
......@@ -30,8 +34,19 @@
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "utils/selfuncs.h"
#include "utils/syscache.h"
/*
* To avoid consuming too much memory during analysis and/or too much space
* in the resulting pg_statistic rows, we ignore varlena datums that are wider
* than WIDTH_THRESHOLD (after detoasting!). This is legitimate for MCV
* and distinct-value calculations since a wide value is unlikely to be
* duplicated at all, much less be a most-common value. For the same reason,
* ignoring wide values will not affect our estimates of histogram bin
* boundaries very much.
*/
#define WIDTH_THRESHOLD 1024
/*
* Used internally to refer to an individual statistics object, i.e.,
......@@ -52,7 +67,7 @@ static VacAttrStats **lookup_var_attr_stats(Relation rel, Bitmapset *attrs,
int nvacatts, VacAttrStats **vacatts);
static void statext_store(Relation pg_stext, Oid relid,
MVNDistinct *ndistinct, MVDependencies *dependencies,
VacAttrStats **stats);
MCVList * mcvlist, VacAttrStats **stats);
/*
......@@ -86,6 +101,7 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
StatExtEntry *stat = (StatExtEntry *) lfirst(lc);
MVNDistinct *ndistinct = NULL;
MVDependencies *dependencies = NULL;
MCVList *mcv = NULL;
VacAttrStats **stats;
ListCell *lc2;
......@@ -123,10 +139,13 @@ BuildRelationExtStatistics(Relation onerel, double totalrows,
else if (t == STATS_EXT_DEPENDENCIES)
dependencies = statext_dependencies_build(numrows, rows,
stat->columns, stats);
else if (t == STATS_EXT_MCV)
mcv = statext_mcv_build(numrows, rows, stat->columns, stats,
totalrows);
}
/* store the statistics in the catalog */
statext_store(pg_stext, stat->statOid, ndistinct, dependencies, stats);
statext_store(pg_stext, stat->statOid, ndistinct, dependencies, mcv, stats);
}
table_close(pg_stext, RowExclusiveLock);
......@@ -154,6 +173,10 @@ statext_is_kind_built(HeapTuple htup, char type)
attnum = Anum_pg_statistic_ext_stxdependencies;
break;
case STATS_EXT_MCV:
attnum = Anum_pg_statistic_ext_stxmcv;
break;
default:
elog(ERROR, "unexpected statistics type requested: %d", type);
}
......@@ -218,7 +241,8 @@ fetch_statentries_for_relation(Relation pg_statext, Oid relid)
for (i = 0; i < ARR_DIMS(arr)[0]; i++)
{
Assert((enabled[i] == STATS_EXT_NDISTINCT) ||
(enabled[i] == STATS_EXT_DEPENDENCIES));
(enabled[i] == STATS_EXT_DEPENDENCIES) ||
(enabled[i] == STATS_EXT_MCV));
entry->types = lappend_int(entry->types, (int) enabled[i]);
}
......@@ -293,7 +317,7 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs,
static void
statext_store(Relation pg_stext, Oid statOid,
MVNDistinct *ndistinct, MVDependencies *dependencies,
VacAttrStats **stats)
MCVList * mcv, VacAttrStats **stats)
{
HeapTuple stup,
oldtup;
......@@ -324,9 +348,18 @@ statext_store(Relation pg_stext, Oid statOid,
values[Anum_pg_statistic_ext_stxdependencies - 1] = PointerGetDatum(data);
}
if (mcv != NULL)
{
bytea *data = statext_mcv_serialize(mcv, stats);
nulls[Anum_pg_statistic_ext_stxmcv - 1] = (data == NULL);
values[Anum_pg_statistic_ext_stxmcv - 1] = PointerGetDatum(data);
}
/* always replace the value (either by bytea or NULL) */
replaces[Anum_pg_statistic_ext_stxndistinct - 1] = true;
replaces[Anum_pg_statistic_ext_stxdependencies - 1] = true;
replaces[Anum_pg_statistic_ext_stxmcv - 1] = true;
/* there should already be a pg_statistic_ext tuple */
oldtup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statOid));
......@@ -433,6 +466,202 @@ multi_sort_compare_dims(int start, int end,
return 0;
}
int
compare_scalars_simple(const void *a, const void *b, void *arg)
{
return compare_datums_simple(*(Datum *) a,
*(Datum *) b,
(SortSupport) arg);
}
int
compare_datums_simple(Datum a, Datum b, SortSupport ssup)
{
return ApplySortComparator(a, false, b, false, ssup);
}
/* simple counterpart to qsort_arg */
void *
bsearch_arg(const void *key, const void *base, size_t nmemb, size_t size,
int (*compar) (const void *, const void *, void *),
void *arg)
{
size_t l,
u,
idx;
const void *p;
int comparison;
l = 0;
u = nmemb;
while (l < u)
{
idx = (l + u) / 2;
p = (void *) (((const char *) base) + (idx * size));
comparison = (*compar) (key, p, arg);
if (comparison < 0)
u = idx;
else if (comparison > 0)
l = idx + 1;
else
return (void *) p;
}
return NULL;
}
/*
* build_attnums_array
* Transforms a bitmap into an array of AttrNumber values.
*
* This is used for extended statistics only, so all the attribute must be
* user-defined. That means offsetting by FirstLowInvalidHeapAttributeNumber
* is not necessary here (and when querying the bitmap).
*/
AttrNumber *
build_attnums_array(Bitmapset *attrs, int *numattrs)
{
int i,
j;
AttrNumber *attnums;
int num = bms_num_members(attrs);
if (numattrs)
*numattrs = num;
/* build attnums from the bitmapset */
attnums = (AttrNumber *) palloc(sizeof(AttrNumber) * num);
i = 0;
j = -1;
while ((j = bms_next_member(attrs, j)) >= 0)
{
/*
* Make sure the bitmap contains only user-defined attributes. As
* bitmaps can't contain negative values, this can be violated in
* two ways. Firstly, the bitmap might contain 0 as a member, and
* secondly the integer value might be larger than MaxAttrNumber.
*/
Assert(AttrNumberIsForUserDefinedAttr(j));
Assert(j <= MaxAttrNumber);
attnums[i++] = (AttrNumber) j;
/* protect against overflows */
Assert(i <= num);
}
return attnums;
}
/*
* build_sorted_items
* build a sorted array of SortItem with values from rows
*
* Note: All the memory is allocated in a single chunk, so that the caller
* can simply pfree the return value to release all of it.
*/
SortItem *
build_sorted_items(int numrows, int *nitems, HeapTuple *rows, TupleDesc tdesc,
MultiSortSupport mss, int numattrs, AttrNumber *attnums)
{
int i,
j,
len,
idx;
int nvalues = numrows * numattrs;
SortItem *items;
Datum *values;
bool *isnull;
char *ptr;
/* Compute the total amount of memory we need (both items and values). */
len = numrows * sizeof(SortItem) + nvalues * (sizeof(Datum) + sizeof(bool));
/* Allocate the memory and split it into the pieces. */
ptr = palloc0(len);
/* items to sort */
items = (SortItem *) ptr;
ptr += numrows * sizeof(SortItem);
/* values and null flags */
values = (Datum *) ptr;
ptr += nvalues * sizeof(Datum);
isnull = (bool *) ptr;
ptr += nvalues * sizeof(bool);
/* make sure we consumed the whole buffer exactly */
Assert((ptr - (char *) items) == len);
/* fix the pointers to Datum and bool arrays */
idx = 0;
for (i = 0; i < numrows; i++)
{
bool toowide = false;
items[idx].values = &values[idx * numattrs];
items[idx].isnull = &isnull[idx * numattrs];
/* load the values/null flags from sample rows */
for (j = 0; j < numattrs; j++)
{
Datum value;
bool isnull;
value = heap_getattr(rows[i], attnums[j], tdesc, &isnull);
/*
* If this is a varlena value, check if it's too wide and if yes
* then skip the whole item. Otherwise detoast the value.
*
* XXX It may happen that we've already detoasted some preceding
* values for the current item. We don't bother to cleanup those
* on the assumption that those are small (below WIDTH_THRESHOLD)
* and will be discarded at the end of analyze.
*/
if ((!isnull) &&
(TupleDescAttr(tdesc, attnums[j] - 1)->attlen == -1))
{
if (toast_raw_datum_size(value) > WIDTH_THRESHOLD)
{
toowide = true;
break;
}
value = PointerGetDatum(PG_DETOAST_DATUM(value));
}
items[idx].values[j] = value;
items[idx].isnull[j] = isnull;
}
if (toowide)
continue;
idx++;
}
/* store the actual number of items (ignoring the too-wide ones) */
*nitems = idx;
/* all items were too wide */
if (idx == 0)
{
/* everything is allocated as a single chunk */
pfree(items);
return NULL;
}
/* do the sort, using the multi-sort */
qsort_arg((void *) items, idx, sizeof(SortItem),
multi_sort_compare, mss);
return items;
}
/*
* has_stats_of_kind
* Check whether the list contains statistic of a given kind
......@@ -463,7 +692,7 @@ has_stats_of_kind(List *stats, char requiredkind)
* object referencing the most of the requested attributes, breaking ties
* in favor of objects with fewer keys overall.
*
* XXX if multiple statistics objects tie on both criteria, then which object
* XXX If multiple statistics objects tie on both criteria, then which object
* is chosen depends on the order that they appear in the stats list. Perhaps
* further tiebreakers are needed.
*/
......@@ -513,3 +742,391 @@ choose_best_statistics(List *stats, Bitmapset *attnums, char requiredkind)
return best_match;
}
/*
* statext_is_compatible_clause_internal
* Determines if the clause is compatible with MCV lists.
*
* Does the heavy lifting of actually inspecting the clauses for
* statext_is_compatible_clause. It needs to be split like this because
* of recursion. The attnums bitmap is an input/output parameter collecting
* attribute numbers from all compatible clauses (recursively).
*/
static bool
statext_is_compatible_clause_internal(Node *clause, Index relid, Bitmapset **attnums)
{
/* Look inside any binary-compatible relabeling (as in examine_variable) */
if (IsA(clause, RelabelType))
clause = (Node *) ((RelabelType *) clause)->arg;
/* plain Var references (boolean Vars or recursive checks) */
if (IsA(clause, Var))
{
Var *var = (Var *) clause;
/* Ensure var is from the correct relation */
if (var->varno != relid)
return false;
/* we also better ensure the Var is from the current level */
if (var->varlevelsup > 0)
return false;
/* Also skip system attributes (we don't allow stats on those). */
if (!AttrNumberIsForUserDefinedAttr(var->varattno))
return false;
*attnums = bms_add_member(*attnums, var->varattno);
return true;
}
/* (Var op Const) or (Const op Var) */
if (is_opclause(clause))
{
OpExpr *expr = (OpExpr *) clause;
Var *var;
bool varonleft = true;
bool ok;
/* Only expressions with two arguments are considered compatible. */
if (list_length(expr->args) != 2)
return false;
/* see if it actually has the right shape (one Var, one Const) */
ok = (NumRelids((Node *) expr) == 1) &&
(is_pseudo_constant_clause(lsecond(expr->args)) ||
(varonleft = false,
is_pseudo_constant_clause(linitial(expr->args))));
/* unsupported structure (two variables or so) */
if (!ok)
return false;
/*
* If it's not one of the supported operators ("=", "<", ">", etc.),
* just ignore the clause, as it's not compatible with MCV lists.
*
* This uses the function for estimating selectivity, not the operator
* directly (a bit awkward, but well ...).
*/
switch (get_oprrest(expr->opno))
{
case F_EQSEL:
case F_NEQSEL:
case F_SCALARLTSEL:
case F_SCALARLESEL:
case F_SCALARGTSEL:
case F_SCALARGESEL:
/* supported, will continue with inspection of the Var */
break;
default:
/* other estimators are considered unknown/unsupported */
return false;
}
var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
return statext_is_compatible_clause_internal((Node *) var, relid, attnums);
}
/* AND/OR/NOT clause */
if (is_andclause(clause) ||
is_orclause(clause) ||
is_notclause(clause))
{
/*
* AND/OR/NOT-clauses are supported if all sub-clauses are supported
*
* Perhaps we could improve this by handling mixed cases, when some of
* the clauses are supported and some are not. Selectivity for the
* supported subclauses would be computed using extended statistics,
* and the remaining clauses would be estimated using the traditional
* algorithm (product of selectivities).
*
* It however seems overly complex, and in a way we already do that
* because if we reject the whole clause as unsupported here, it will
* be eventually passed to clauselist_selectivity() which does exactly
* this (split into supported/unsupported clauses etc).
*/
BoolExpr *expr = (BoolExpr *) clause;
ListCell *lc;
foreach(lc, expr->args)
{
/*
* Had we found incompatible clause in the arguments, treat the
* whole clause as incompatible.
*/
if (!statext_is_compatible_clause_internal((Node *) lfirst(lc),
relid, attnums))
return false;
}
return true;
}
/* Var IS NULL */
if (IsA(clause, NullTest))
{
NullTest *nt = (NullTest *) clause;
/*
* Only simple (Var IS NULL) expressions supported for now. Maybe we
* could use examine_variable to fix this?
*/
if (!IsA(nt->arg, Var))
return false;
return statext_is_compatible_clause_internal((Node *) (nt->arg), relid, attnums);
}
return false;
}
/*
* statext_is_compatible_clause
* Determines if the clause is compatible with MCV lists.
*
* Currently, we only support three types of clauses:
*
* (a) OpExprs of the form (Var op Const), or (Const op Var), where the op
* is one of ("=", "<", ">", ">=", "<=")
*
* (b) (Var IS [NOT] NULL)
*
* (c) combinations using AND/OR/NOT
*
* In the future, the range of supported clauses may be expanded to more
* complex cases, for example (Var op Var).
*/
static bool
statext_is_compatible_clause(Node *clause, Index relid, Bitmapset **attnums)
{
RestrictInfo *rinfo = (RestrictInfo *) clause;
if (!IsA(rinfo, RestrictInfo))
return false;
/* Pseudoconstants are not really interesting here. */
if (rinfo->pseudoconstant)
return false;
/* clauses referencing multiple varnos are incompatible */
if (bms_membership(rinfo->clause_relids) != BMS_SINGLETON)
return false;
return statext_is_compatible_clause_internal((Node *) rinfo->clause,
relid, attnums);
}
/*
* statext_mcv_clauselist_selectivity
* Estimate clauses using the best multi-column statistics.
*
* Selects the best extended (multi-column) statistic on a table (measured by
* the number of attributes extracted from the clauses and covered by it), and
* computes the selectivity for the supplied clauses.
*
* One of the main challenges with using MCV lists is how to extrapolate the
* estimate to the data not covered by the MCV list. To do that, we compute
* not only the "MCV selectivity" (selectivities for MCV items matching the
* supplied clauses), but also a couple of derived selectivities:
*
* - simple selectivity: Computed without extended statistic, i.e. as if the
* columns/clauses were independent
*
* - base selectivity: Similar to simple selectivity, but is computed using
* the extended statistic by adding up the base frequencies (that we compute
* and store for each MCV item) of matching MCV items.
*
* - total selectivity: Selectivity covered by the whole MCV list.
*
* - other selectivity: A selectivity estimate for data not covered by the MCV
* list (i.e. satisfying the clauses, but not common enough to make it into
* the MCV list)
*
* Note: While simple and base selectivities are defined in a quite similar
* way, the values are computed differently and are not therefore equal. The
* simple selectivity is computed as a product of per-clause estimates, while
* the base selectivity is computed by adding up base frequencies of matching
* items of the multi-column MCV list. So the values may differ for two main
* reasons - (a) the MCV list may not cover 100% of the data and (b) some of
* the MCV items did not match the estimated clauses.
*
* As both (a) and (b) reduce the base selectivity value, it generally holds
* that (simple_selectivity >= base_selectivity). If the MCV list covers all
* the data, the values may be equal.
*
* So, (simple_selectivity - base_selectivity) is an estimate for the part
* not covered by the MCV list, and (mcv_selectivity - base_selectivity) may
* be seen as a correction for the part covered by the MCV list. Those two
* statements are actually equivalent.
*
* Note: Due to rounding errors and minor differences in how the estimates
* are computed, the inequality may not always hold. Which is why we clamp
* the selectivities to prevent strange estimate (negative etc.).
*
* 'estimatedclauses' is an input/output parameter. We set bits for the
* 0-based 'clauses' indexes we estimate for and also skip clause items that
* already have a bit set.
*
* XXX If we were to use multiple statistics, this is where it would happen.
* We would simply repeat this on a loop on the "remaining" clauses, possibly
* using the already estimated clauses as conditions (and combining the values
* using conditional probability formula).
*/
static Selectivity
statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varRelid,
JoinType jointype, SpecialJoinInfo *sjinfo,
RelOptInfo *rel, Bitmapset **estimatedclauses)
{
ListCell *l;
Bitmapset *clauses_attnums = NULL;
Bitmapset **list_attnums;
int listidx;
StatisticExtInfo *stat;
List *stat_clauses;
Selectivity simple_sel,
mcv_sel,
mcv_basesel,
mcv_totalsel,
other_sel,
sel;
/* check if there's any stats that might be useful for us. */
if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
return 1.0;
list_attnums = (Bitmapset **) palloc(sizeof(Bitmapset *) *
list_length(clauses));
/*
* Pre-process the clauses list to extract the attnums seen in each item.
* We need to determine if there's any clauses which will be useful for
* selectivity estimations with extended stats. Along the way we'll record
* all of the attnums for each clause in a list which we'll reference later
* so we don't need to repeat the same work again. We'll also keep track of
* all attnums seen.
*
* We also skip clauses that we already estimated using different types of
* statistics (we treat them as incompatible).
*/
listidx = 0;
foreach(l, clauses)
{
Node *clause = (Node *) lfirst(l);
Bitmapset *attnums = NULL;
if (!bms_is_member(listidx, *estimatedclauses) &&
statext_is_compatible_clause(clause, rel->relid, &attnums))
{
list_attnums[listidx] = attnums;
clauses_attnums = bms_add_members(clauses_attnums, attnums);
}
else
list_attnums[listidx] = NULL;
listidx++;
}
/* We need at least two attributes for multivariate statistics. */
if (bms_membership(clauses_attnums) != BMS_MULTIPLE)
return 1.0;
/* find the best suited statistics object for these attnums */
stat = choose_best_statistics(rel->statlist, clauses_attnums, STATS_EXT_MCV);
/* if no matching stats could be found then we've nothing to do */
if (!stat)
return 1.0;
/* Ensure choose_best_statistics produced an expected stats type. */
Assert(stat->kind == STATS_EXT_MCV);
/* now filter the clauses to be estimated using the selected MCV */
stat_clauses = NIL;
listidx = 0;
foreach(l, clauses)
{
/*
* If the clause is compatible with the selected statistics, mark it
* as estimated and add it to the list to estimate.
*/
if (list_attnums[listidx] != NULL &&
bms_is_subset(list_attnums[listidx], stat->keys))
{
stat_clauses = lappend(stat_clauses, (Node *) lfirst(l));
*estimatedclauses = bms_add_member(*estimatedclauses, listidx);
}
listidx++;
}
/*
* First compute "simple" selectivity, i.e. without the extended statistics,
* and essentially assuming independence of the columns/clauses. We'll then
* use the various selectivities computed from MCV list to improve it.
*/
simple_sel = clauselist_selectivity_simple(root, stat_clauses, varRelid,
jointype, sjinfo, NULL);
/*
* Now compute the multi-column estimate from the MCV list, along with the
* other selectivities (base & total selectivity).
*/
mcv_sel = mcv_clauselist_selectivity(root, stat, stat_clauses, varRelid,
jointype, sjinfo, rel,
&mcv_basesel, &mcv_totalsel);
/* Estimated selectivity of values not covered by MCV matches */
other_sel = simple_sel - mcv_basesel;
CLAMP_PROBABILITY(other_sel);
/* The non-MCV selectivity can't exceed the 1 - mcv_totalsel. */
if (other_sel > 1.0 - mcv_totalsel)
other_sel = 1.0 - mcv_totalsel;
/* Overall selectivity is the combination of MCV and non-MCV estimates. */
sel = mcv_sel + other_sel;
CLAMP_PROBABILITY(sel);
return sel;
}
/*
* statext_clauselist_selectivity
* Estimate clauses using the best multi-column statistics.
*/
Selectivity
statext_clauselist_selectivity(PlannerInfo *root, List *clauses, int varRelid,
JoinType jointype, SpecialJoinInfo *sjinfo,
RelOptInfo *rel, Bitmapset **estimatedclauses)
{
Selectivity sel;
/* First, try estimating clauses using a multivariate MCV list. */
sel = statext_mcv_clauselist_selectivity(root, clauses, varRelid, jointype,
sjinfo, rel, estimatedclauses);
/*
* Then, apply functional dependencies on the remaining clauses by
* calling dependencies_clauselist_selectivity. Pass 'estimatedclauses'
* so the function can properly skip clauses already estimated above.
*
* The reasoning for applying dependencies last is that the more complex
* stats can track more complex correlations between the attributes, and
* so may be considered more reliable.
*
* For example, MCV list can give us an exact selectivity for values in
* two columns, while functional dependencies can only provide information
* about the overall strength of the dependency.
*/
sel *= dependencies_clauselist_selectivity(root, clauses, varRelid,
jointype, sjinfo, rel,
estimatedclauses);
return sel;
}
/*-------------------------------------------------------------------------
*
* mcv.c
* POSTGRES multivariate MCV lists
*
*
* Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* IDENTIFICATION
* src/backend/statistics/mcv.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include <math.h>
#include "access/htup_details.h"
#include "catalog/pg_collation.h"
#include "catalog/pg_statistic_ext.h"
#include "fmgr.h"
#include "funcapi.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/clauses.h"
#include "statistics/extended_stats_internal.h"
#include "statistics/statistics.h"
#include "utils/builtins.h"
#include "utils/bytea.h"
#include "utils/fmgroids.h"
#include "utils/fmgrprotos.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
#include "utils/typcache.h"
/*
* Computes size of a serialized MCV item, depending on the number of
* dimensions (columns) the statistic is defined on. The datum values are
* stored in a separate array (deduplicated, to minimize the size), and
* so the serialized items only store uint16 indexes into that array.
*
* Each serialized item store (in this order):
*
* - indexes to values (ndim * sizeof(uint16))
* - null flags (ndim * sizeof(bool))
* - frequency (sizeof(double))
* - base_frequency (sizeof(double))
*
* So in total each MCV item requires this many bytes:
*
* ndim * (sizeof(uint16) + sizeof(bool)) + 2 * sizeof(double)
*/
#define ITEM_SIZE(ndims) \
((ndims) * (sizeof(uint16) + sizeof(bool)) + 2 * sizeof(double))
/*
* Macros for convenient access to parts of a serialized MCV item.
*/
#define ITEM_INDEXES(item) ((uint16 *) (item))
#define ITEM_NULLS(item,ndims) ((bool *) (ITEM_INDEXES(item) + (ndims)))
#define ITEM_FREQUENCY(item,ndims) ((double *) (ITEM_NULLS(item, ndims) + (ndims)))
#define ITEM_BASE_FREQUENCY(item,ndims) ((double *) (ITEM_FREQUENCY(item, ndims) + 1))
static MultiSortSupport build_mss(VacAttrStats **stats, int numattrs);
static SortItem *build_distinct_groups(int numrows, SortItem *items,
MultiSortSupport mss, int *ndistinct);
static int count_distinct_groups(int numrows, SortItem *items,
MultiSortSupport mss);
/*
* get_mincount_for_mcv_list
* Determine the minimum number of times a value needs to appear in
* the sample for it to be included in the MCV list.
*
* We want to keep only values that appear sufficiently often in the
* sample that it is reasonable to extrapolate their sample frequencies to
* the entire table. We do this by placing an upper bound on the relative
* standard error of the sample frequency, so that any estimates the
* planner generates from the MCV statistics can be expected to be
* reasonably accurate.
*
* Since we are sampling without replacement, the sample frequency of a
* particular value is described by a hypergeometric distribution. A
* common rule of thumb when estimating errors in this situation is to
* require at least 10 instances of the value in the sample, in which case
* the distribution can be approximated by a normal distribution, and
* standard error analysis techniques can be applied. Given a sample size
* of n, a population size of N, and a sample frequency of p=cnt/n, the
* standard error of the proportion p is given by
* SE = sqrt(p*(1-p)/n) * sqrt((N-n)/(N-1))
* where the second term is the finite population correction. To get
* reasonably accurate planner estimates, we impose an upper bound on the
* relative standard error of 20% -- i.e., SE/p < 0.2. This 20% relative
* error bound is fairly arbitrary, but has been found empirically to work
* well. Rearranging this formula gives a lower bound on the number of
* instances of the value seen:
* cnt > n*(N-n) / (N-n+0.04*n*(N-1))
* This bound is at most 25, and approaches 0 as n approaches 0 or N. The
* case where n approaches 0 cannot happen in practice, since the sample
* size is at least 300. The case where n approaches N corresponds to
* sampling the whole the table, in which case it is reasonable to keep
* the whole MCV list (have no lower bound), so it makes sense to apply
* this formula for all inputs, even though the above derivation is
* technically only valid when the right hand side is at least around 10.
*
* An alternative way to look at this formula is as follows -- assume that
* the number of instances of the value seen scales up to the entire
* table, so that the population count is K=N*cnt/n. Then the distribution
* in the sample is a hypergeometric distribution parameterised by N, n
* and K, and the bound above is mathematically equivalent to demanding
* that the standard deviation of that distribution is less than 20% of
* its mean. Thus the relative errors in any planner estimates produced
* from the MCV statistics are likely to be not too large.
*/
static double
get_mincount_for_mcv_list(int samplerows, double totalrows)
{
double n = samplerows;
double N = totalrows;
double numer,
denom;
numer = n * (N - n);
denom = N - n + 0.04 * n * (N - 1);
/* Guard against division by zero (possible if n = N = 1) */
if (denom == 0.0)
return 0.0;
return numer / denom;
}
/*
* Builds MCV list from the set of sampled rows.
*
* The algorithm is quite simple:
*
* (1) sort the data (default collation, '<' for the data type)
*
* (2) count distinct groups, decide how many to keep
*
* (3) build the MCV list using the threshold determined in (2)
*
* (4) remove rows represented by the MCV from the sample
*
*/
MCVList *
statext_mcv_build(int numrows, HeapTuple *rows, Bitmapset *attrs,
VacAttrStats **stats, double totalrows)
{
int i,
numattrs,
ngroups,
nitems;
AttrNumber *attnums = build_attnums_array(attrs, &numattrs);
SortItem *items;
SortItem *groups;
MCVList *mcvlist = NULL;
/* comparator for all the columns */
MultiSortSupport mss = build_mss(stats, numattrs);
/* sort the rows */
items = build_sorted_items(numrows, &nitems, rows, stats[0]->tupDesc,
mss, numattrs, attnums);
if (!items)
return NULL;
/* transform the sorted rows into groups (sorted by frequency) */
groups = build_distinct_groups(nitems, items, mss, &ngroups);
/*
* Maximum number of MCV items to store, based on the attribute with the
* largest stats target (and the number of groups we have available).
*/
nitems = stats[0]->attr->attstattarget;
for (i = 1; i < numattrs; i++)
{
if (stats[i]->attr->attstattarget > nitems)
nitems = stats[i]->attr->attstattarget;
}
if (nitems > ngroups)
nitems = ngroups;
/*
* Decide how many items to keep in the MCV list. We can't use the same
* algorithm as per-column MCV lists, because that only considers the
* actual group frequency - but we're primarily interested in how the
* actual frequency differs from the base frequency (product of simple
* per-column frequencies, as if the columns were independent).
*
* Using the same algorithm might exclude items that are close to the
* "average" frequency. But it does not say whether the frequency is
* close to base frequency or not. We also need to consider unexpectedly
* uncommon items (compared to base frequency), and the single-column
* algorithm ignores that entirely.
*
* If we can fit all the items onto the MCV list, do that. Otherwise
* use get_mincount_for_mcv_list to decide which items to keep in the
* MCV list, based on the number of occurrences in the sample.
*/
if (ngroups > nitems)
{
double mincount;
mincount = get_mincount_for_mcv_list(numrows, totalrows);
/*
* Walk the groups until we find the first group with a count below
* the mincount threshold (the index of that group is the number of
* groups we want to keep).
*/
for (i = 0; i < nitems; i++)
{
if (groups[i].count < mincount)
{
nitems = i;
break;
}
}
}
/*
* At this point we know the number of items for the MCV list. There might
* be none (for uniform distribution with many groups), and in that case
* there will be no MCV list. Otherwise construct the MCV list.
*/
if (nitems > 0)
{
int j;
/*
* Allocate the MCV list structure, set the global parameters.
*/
mcvlist = (MCVList *) palloc0(sizeof(MCVList));
mcvlist->magic = STATS_MCV_MAGIC;
mcvlist->type = STATS_MCV_TYPE_BASIC;
mcvlist->ndimensions = numattrs;
mcvlist->nitems = nitems;
/* store info about data type OIDs */
for (i = 0; i < numattrs; i++)
mcvlist->types[i] = stats[i]->attrtypid;
/*
* Preallocate Datum/isnull arrays for all items.
*
* XXX Perhaps we might allocate this in a single chunk, to reduce
* the palloc overhead. We're the only ones dealing with the built
* MCV lists anyway. Not sure it's worth it, though, as we're not
* re-building stats very often.
*/
mcvlist->items = (MCVItem **) palloc(sizeof(MCVItem *) * nitems);
for (i = 0; i < nitems; i++)
{
mcvlist->items[i] = (MCVItem *) palloc(sizeof(MCVItem));
mcvlist->items[i]->values = (Datum *) palloc(sizeof(Datum) * numattrs);
mcvlist->items[i]->isnull = (bool *) palloc(sizeof(bool) * numattrs);
}
/* Copy the first chunk of groups into the result. */
for (i = 0; i < nitems; i++)
{
/* just pointer to the proper place in the list */
MCVItem *item = mcvlist->items[i];
/* copy values for the group */
memcpy(item->values, groups[i].values, sizeof(Datum) * numattrs);
memcpy(item->isnull, groups[i].isnull, sizeof(bool) * numattrs);
/* groups should be sorted by frequency in descending order */
Assert((i == 0) || (groups[i - 1].count >= groups[i].count));
/* group frequency */
item->frequency = (double) groups[i].count / numrows;
/* base frequency, if the attributes were independent */
item->base_frequency = 1.0;
for (j = 0; j < numattrs; j++)
{
int count = 0;
int k;
for (k = 0; k < ngroups; k++)
{
if (multi_sort_compare_dim(j, &groups[i], &groups[k], mss) == 0)
count += groups[k].count;
}
item->base_frequency *= (double) count / numrows;
}
}
}
pfree(items);
pfree(groups);
return mcvlist;
}
/*
* build_mss
* build MultiSortSupport for the attributes passed in attrs
*/
static MultiSortSupport
build_mss(VacAttrStats **stats, int numattrs)
{
int i;
/* Sort by multiple columns (using array of SortSupport) */
MultiSortSupport mss = multi_sort_init(numattrs);
/* prepare the sort functions for all the attributes */
for (i = 0; i < numattrs; i++)
{
VacAttrStats *colstat = stats[i];
TypeCacheEntry *type;
type = lookup_type_cache(colstat->attrtypid, TYPECACHE_LT_OPR);
if (type->lt_opr == InvalidOid) /* shouldn't happen */
elog(ERROR, "cache lookup failed for ordering operator for type %u",
colstat->attrtypid);
multi_sort_add_dimension(mss, i, type->lt_opr, type->typcollation);
}
return mss;
}
/*
* count_distinct_groups
* count distinct combinations of SortItems in the array
*
* The array is assumed to be sorted according to the MultiSortSupport.
*/
static int
count_distinct_groups(int numrows, SortItem *items, MultiSortSupport mss)
{
int i;
int ndistinct;
ndistinct = 1;
for (i = 1; i < numrows; i++)
{
/* make sure the array really is sorted */
Assert(multi_sort_compare(&items[i], &items[i - 1], mss) >= 0);
if (multi_sort_compare(&items[i], &items[i - 1], mss) != 0)
ndistinct += 1;
}
return ndistinct;
}
/*
* compare_sort_item_count
* comparator for sorting items by count (frequencies) in descending order
*/
static int
compare_sort_item_count(const void *a, const void *b)
{
SortItem *ia = (SortItem *) a;
SortItem *ib = (SortItem *) b;
if (ia->count == ib->count)
return 0;
else if (ia->count > ib->count)
return -1;
return 1;
}
/*
* build_distinct_groups
* build an array of SortItems for distinct groups and counts matching items
*
* The input array is assumed to be sorted
*/
static SortItem *
build_distinct_groups(int numrows, SortItem *items, MultiSortSupport mss,
int *ndistinct)
{
int i,
j;
int ngroups = count_distinct_groups(numrows, items, mss);
SortItem *groups = (SortItem *) palloc(ngroups * sizeof(SortItem));
j = 0;
groups[0] = items[0];
groups[0].count = 1;
for (i = 1; i < numrows; i++)
{
/* Assume sorted in ascending order. */
Assert(multi_sort_compare(&items[i], &items[i - 1], mss) >= 0);
/* New distinct group detected. */
if (multi_sort_compare(&items[i], &items[i - 1], mss) != 0)
{
groups[++j] = items[i];
groups[j].count = 0;
}
groups[j].count++;
}
/* ensure we filled the expected number of distinct groups */
Assert(j + 1 == ngroups);
/* Sort the distinct groups by frequency (in descending order). */
pg_qsort((void *) groups, ngroups, sizeof(SortItem),
compare_sort_item_count);
*ndistinct = ngroups;
return groups;
}
/*
* statext_mcv_load
* Load the MCV list for the indicated pg_statistic_ext tuple
*/
MCVList *
statext_mcv_load(Oid mvoid)
{
bool isnull;
Datum mcvlist;
HeapTuple htup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(mvoid));
if (!HeapTupleIsValid(htup))
elog(ERROR, "cache lookup failed for statistics object %u", mvoid);
mcvlist = SysCacheGetAttr(STATEXTOID, htup,
Anum_pg_statistic_ext_stxmcv, &isnull);
ReleaseSysCache(htup);
if (isnull)
return NULL;
return statext_mcv_deserialize(DatumGetByteaP(mcvlist));
}
/*
* Serialize MCV list into a bytea value.
*
* The basic algorithm is simple:
*
* (1) perform deduplication (for each attribute separately)
* (a) collect all (non-NULL) attribute values from all MCV items
* (b) sort the data (using 'lt' from VacAttrStats)
* (c) remove duplicate values from the array
*
* (2) serialize the arrays into a bytea value
*
* (3) process all MCV list items
* (a) replace values with indexes into the arrays
*
* Each attribute has to be processed separately, as we may be mixing different
* datatypes, with different sort operators, etc.
*
* We use uint16 values for the indexes in step (3), as we currently don't allow
* more than 8k MCV items anyway, although that's mostly arbitrary limit. We might
* increase this to 65k and still fit into uint16. Furthermore, this limit is on
* the number of distinct values per column, and we usually have few of those
* (and various combinations of them for the those MCV list). So uint16 seems fine.
*
* We don't really expect the serialization to save as much space as for
* histograms, as we are not doing any bucket splits (which is the source
* of high redundancy in histograms).
*
* TODO: Consider packing boolean flags (NULL) for each item into a single char
* (or a longer type) instead of using an array of bool items.
*/
bytea *
statext_mcv_serialize(MCVList * mcvlist, VacAttrStats **stats)
{
int i;
int dim;
int ndims = mcvlist->ndimensions;
int itemsize = ITEM_SIZE(ndims);
SortSupport ssup;
DimensionInfo *info;
Size total_length;
/* allocate the item just once */
char *item = palloc0(itemsize);
/* serialized items (indexes into arrays, etc.) */
bytea *output;
char *data = NULL;
/* values per dimension (and number of non-NULL values) */
Datum **values = (Datum **) palloc0(sizeof(Datum *) * ndims);
int *counts = (int *) palloc0(sizeof(int) * ndims);
/*
* We'll include some rudimentary information about the attributes (type
* length, etc.), so that we don't have to look them up while
* deserializing the MCV list.
*
* XXX Maybe this is not a great idea? Or maybe we should actually copy
* more fields, e.g. typeid, which would allow us to display the MCV list
* using only the serialized representation (currently we have to fetch
* this info from the relation).
*/
info = (DimensionInfo *) palloc0(sizeof(DimensionInfo) * ndims);
/* sort support data for all attributes included in the MCV list */
ssup = (SortSupport) palloc0(sizeof(SortSupportData) * ndims);
/* collect and deduplicate values for each dimension (attribute) */
for (dim = 0; dim < ndims; dim++)
{
int ndistinct;
TypeCacheEntry *typentry;
/*
* Lookup the LT operator (can't get it from stats extra_data, as we
* don't know how to interpret that - scalar vs. array etc.).
*/
typentry = lookup_type_cache(stats[dim]->attrtypid, TYPECACHE_LT_OPR);
/* copy important info about the data type (length, by-value) */
info[dim].typlen = stats[dim]->attrtype->typlen;
info[dim].typbyval = stats[dim]->attrtype->typbyval;
/* allocate space for values in the attribute and collect them */
values[dim] = (Datum *) palloc0(sizeof(Datum) * mcvlist->nitems);
for (i = 0; i < mcvlist->nitems; i++)
{
/* skip NULL values - we don't need to deduplicate those */
if (mcvlist->items[i]->isnull[dim])
continue;
/* append the value at the end */
values[dim][counts[dim]] = mcvlist->items[i]->values[dim];
counts[dim] += 1;
}
/* if there are just NULL values in this dimension, we're done */
if (counts[dim] == 0)
continue;
/* sort and deduplicate the data */
ssup[dim].ssup_cxt = CurrentMemoryContext;
ssup[dim].ssup_collation = DEFAULT_COLLATION_OID;
ssup[dim].ssup_nulls_first = false;
PrepareSortSupportFromOrderingOp(typentry->lt_opr, &ssup[dim]);
qsort_arg(values[dim], counts[dim], sizeof(Datum),
compare_scalars_simple, &ssup[dim]);
/*
* Walk through the array and eliminate duplicate values, but keep the
* ordering (so that we can do bsearch later). We know there's at
* least one item as (counts[dim] != 0), so we can skip the first
* element.
*/
ndistinct = 1; /* number of distinct values */
for (i = 1; i < counts[dim]; i++)
{
/* expect sorted array */
Assert(compare_datums_simple(values[dim][i - 1], values[dim][i], &ssup[dim]) <= 0);
/* if the value is the same as the previous one, we can skip it */
if (!compare_datums_simple(values[dim][i - 1], values[dim][i], &ssup[dim]))
continue;
values[dim][ndistinct] = values[dim][i];
ndistinct += 1;
}
/* we must not exceed PG_UINT16_MAX, as we use uint16 indexes */
Assert(ndistinct <= PG_UINT16_MAX);
/*
* Store additional info about the attribute - number of deduplicated
* values, and also size of the serialized data. For fixed-length data
* types this is trivial to compute, for varwidth types we need to
* actually walk the array and sum the sizes.
*/
info[dim].nvalues = ndistinct;
if (info[dim].typlen > 0) /* fixed-length data types */
info[dim].nbytes = info[dim].nvalues * info[dim].typlen;
else if (info[dim].typlen == -1) /* varlena */
{
info[dim].nbytes = 0;
for (i = 0; i < info[dim].nvalues; i++)
{
values[dim][i] = PointerGetDatum(PG_DETOAST_DATUM(values[dim][i]));
info[dim].nbytes += VARSIZE_ANY(values[dim][i]);
}
}
else if (info[dim].typlen == -2) /* cstring */
{
info[dim].nbytes = 0;
for (i = 0; i < info[dim].nvalues; i++)
{
/* c-strings include terminator, so +1 byte */
values[dim][i] = PointerGetDatum(PG_DETOAST_DATUM(values[dim][i]));
info[dim].nbytes += strlen(DatumGetCString(values[dim][i])) + 1;
}
}
/* we know (count>0) so there must be some data */
Assert(info[dim].nbytes > 0);
}
/*
* Now we can finally compute how much space we'll actually need for the
* whole serialized MCV list, as it contains these fields:
*
* - length (4B) for varlena - magic (4B) - type (4B) - ndimensions (4B) -
* nitems (4B) - info (ndim * sizeof(DimensionInfo) - arrays of values for
* each dimension - serialized items (nitems * itemsize)
*
* So the 'header' size is 20B + ndim * sizeof(DimensionInfo) and then we
* will place all the data (values + indexes). We'll however use offsetof
* and sizeof to compute sizes of the structs.
*/
total_length = (sizeof(int32) + offsetof(MCVList, items)
+ (ndims * sizeof(DimensionInfo))
+ mcvlist->nitems * itemsize);
/* add space for the arrays of deduplicated values */
for (i = 0; i < ndims; i++)
total_length += info[i].nbytes;
/* allocate space for the serialized MCV list, set header fields */
output = (bytea *) palloc0(total_length);
SET_VARSIZE(output, total_length);
/* 'data' points to the current position in the output buffer */
data = VARDATA(output);
/* MCV list header (number of items, ...) */
memcpy(data, mcvlist, offsetof(MCVList, items));
data += offsetof(MCVList, items);
/* information about the attributes */
memcpy(data, info, sizeof(DimensionInfo) * ndims);
data += sizeof(DimensionInfo) * ndims;
/* Copy the deduplicated values for all attributes to the output. */
for (dim = 0; dim < ndims; dim++)
{
#ifdef USE_ASSERT_CHECKING
/* remember the starting point for Asserts later */
char *tmp = data;
#endif
for (i = 0; i < info[dim].nvalues; i++)
{
Datum v = values[dim][i];
if (info[dim].typbyval) /* passed by value */
{
memcpy(data, &v, info[dim].typlen);
data += info[dim].typlen;
}
else if (info[dim].typlen > 0) /* pased by reference */
{
memcpy(data, DatumGetPointer(v), info[dim].typlen);
data += info[dim].typlen;
}
else if (info[dim].typlen == -1) /* varlena */
{
int len = VARSIZE_ANY(v);
memcpy(data, DatumGetPointer(v), len);
data += len;
}
else if (info[dim].typlen == -2) /* cstring */
{
Size len = strlen(DatumGetCString(v)) + 1; /* terminator */
memcpy(data, DatumGetCString(v), len );
data += len;
}
/* no underflows or overflows */
Assert((data > tmp) && ((data - tmp) <= info[dim].nbytes));
}
/*
* check we got exactly the amount of data we expected for this
* dimension
*/
Assert((data - tmp) == info[dim].nbytes);
}
/* Serialize the items, with uint16 indexes instead of the values. */
for (i = 0; i < mcvlist->nitems; i++)
{
MCVItem *mcvitem = mcvlist->items[i];
/* don't write beyond the allocated space */
Assert(data <= (char *) output + total_length - itemsize);
/* reset the item (we only allocate it once and reuse it) */
memset(item, 0, itemsize);
for (dim = 0; dim < ndims; dim++)
{
Datum *value;
/* do the lookup only for non-NULL values */
if (mcvlist->items[i]->isnull[dim])
continue;
value = (Datum *) bsearch_arg(&mcvitem->values[dim], values[dim],
info[dim].nvalues, sizeof(Datum),
compare_scalars_simple, &ssup[dim]);
Assert(value != NULL); /* serialization or deduplication error */
/* compute index within the array */
ITEM_INDEXES(item)[dim] = (uint16) (value - values[dim]);
/* check the index is within expected bounds */
Assert(ITEM_INDEXES(item)[dim] >= 0);
Assert(ITEM_INDEXES(item)[dim] < info[dim].nvalues);
}
/* copy NULL and frequency flags into the item */
memcpy(ITEM_NULLS(item, ndims), mcvitem->isnull, sizeof(bool) * ndims);
memcpy(ITEM_FREQUENCY(item, ndims), &mcvitem->frequency, sizeof(double));
memcpy(ITEM_BASE_FREQUENCY(item, ndims), &mcvitem->base_frequency, sizeof(double));
/* copy the serialized item into the array */
memcpy(data, item, itemsize);
data += itemsize;
}
/* at this point we expect to match the total_length exactly */
Assert((data - (char *) output) == total_length);
pfree(item);
pfree(values);
pfree(counts);
return output;
}
/*
* Reads serialized MCV list into MCVList structure.
*
* Unlike with histograms, we deserialize the MCV list fully (i.e. we don't
* keep the deduplicated arrays and pointers into them), as we don't expect
* there to be a lot of duplicate values. But perhaps that's not true and we
* should keep the MCV in serialized form too.
*
* XXX See how much memory we could save by keeping the deduplicated version
* (both for typical and corner cases with few distinct values but many items).
*/
MCVList *
statext_mcv_deserialize(bytea *data)
{
int dim,
i;
Size expected_size;
MCVList *mcvlist;
char *tmp;
int ndims,
nitems,
itemsize;
DimensionInfo *info = NULL;
Datum **values = NULL;
/* local allocation buffer (used only for deserialization) */
int bufflen;
char *buff;
char *ptr;
/* buffer used for the result */
int rbufflen;
char *rbuff;
char *rptr;
if (data == NULL)
return NULL;
/*
* We can't possibly deserialize a MCV list if there's not even a complete
* header.
*/
if (VARSIZE_ANY_EXHDR(data) < offsetof(MCVList, items))
elog(ERROR, "invalid MCV size %ld (expected at least %zu)",
VARSIZE_ANY_EXHDR(data), offsetof(MCVList, items));
/* read the MCV list header */
mcvlist = (MCVList *) palloc0(sizeof(MCVList));
/* initialize pointer to the data part (skip the varlena header) */
tmp = VARDATA_ANY(data);
/* get the header and perform further sanity checks */
memcpy(mcvlist, tmp, offsetof(MCVList, items));
tmp += offsetof(MCVList, items);
if (mcvlist->magic != STATS_MCV_MAGIC)
elog(ERROR, "invalid MCV magic %u (expected %u)",
mcvlist->magic, STATS_MCV_MAGIC);
if (mcvlist->type != STATS_MCV_TYPE_BASIC)
elog(ERROR, "invalid MCV type %u (expected %u)",
mcvlist->type, STATS_MCV_TYPE_BASIC);
if (mcvlist->ndimensions == 0)
elog(ERROR, "invalid zero-length dimension array in MCVList");
else if ((mcvlist->ndimensions > STATS_MAX_DIMENSIONS) ||
(mcvlist->ndimensions < 0))
elog(ERROR, "invalid length (%d) dimension array in MCVList",
mcvlist->ndimensions);
if (mcvlist->nitems == 0)
elog(ERROR, "invalid zero-length item array in MCVList");
else if (mcvlist->nitems > STATS_MCVLIST_MAX_ITEMS)
elog(ERROR, "invalid length (%u) item array in MCVList",
mcvlist->nitems);
nitems = mcvlist->nitems;
ndims = mcvlist->ndimensions;
itemsize = ITEM_SIZE(ndims);
/*
* Check amount of data including DimensionInfo for all dimensions and
* also the serialized items (including uint16 indexes). Also, walk
* through the dimension information and add it to the sum.
*/
expected_size = offsetof(MCVList, items) +
ndims * sizeof(DimensionInfo) +
(nitems * itemsize);
/*
* Check that we have at least the dimension and info records, along with
* the items. We don't know the size of the serialized values yet. We need
* to do this check first, before accessing the dimension info.
*/
if (VARSIZE_ANY_EXHDR(data) < expected_size)
elog(ERROR, "invalid MCV size %ld (expected %zu)",
VARSIZE_ANY_EXHDR(data), expected_size);
/* Now it's safe to access the dimension info. */
info = (DimensionInfo *) (tmp);
tmp += ndims * sizeof(DimensionInfo);
/* account for the value arrays */
for (dim = 0; dim < ndims; dim++)
{
/*
* XXX I wonder if we can/should rely on asserts here. Maybe those
* checks should be done every time?
*/
Assert(info[dim].nvalues >= 0);
Assert(info[dim].nbytes >= 0);
expected_size += info[dim].nbytes;
}
/*
* Now we know the total expected MCV size, including all the pieces
* (header, dimension info. items and deduplicated data). So do the final
* check on size.
*/
if (VARSIZE_ANY_EXHDR(data) != expected_size)
elog(ERROR, "invalid MCV size %ld (expected %zu)",
VARSIZE_ANY_EXHDR(data), expected_size);
/*
* Allocate one large chunk of memory for the intermediate data, needed
* only for deserializing the MCV list (and allocate densely to minimize
* the palloc overhead).
*
* Let's see how much space we'll actually need, and also include space
* for the array with pointers.
*
* We need an array of Datum pointers values for each dimension, so that
* we can easily translate the uint16 indexes. We also need a top-level
* array of pointers to those per-dimension arrays.
*
* For byval types with size matching sizeof(Datum) we can reuse the
* serialized array directly.
*/
bufflen = sizeof(Datum **) * ndims; /* space for top-level pointers */
for (dim = 0; dim < ndims; dim++)
{
/* for full-size byval types, we reuse the serialized value */
if (!(info[dim].typbyval && info[dim].typlen == sizeof(Datum)))
bufflen += (sizeof(Datum) * info[dim].nvalues);
}
buff = palloc0(bufflen);
ptr = buff;
values = (Datum **) buff;
ptr += (sizeof(Datum *) * ndims);
/*
* XXX This uses pointers to the original data array (the types not passed
* by value), so when someone frees the memory, e.g. by doing something
* like this:
*
* bytea * data = ... fetch the data from catalog ...
*
* MCVList mcvlist = deserialize_mcv_list(data);
*
* pfree(data);
*
* then 'mcvlist' references the freed memory. Should copy the pieces.
*/
for (dim = 0; dim < ndims; dim++)
{
#ifdef USE_ASSERT_CHECKING
/* remember where data for this dimension starts */
char *start = tmp;
#endif
if (info[dim].typbyval)
{
/* passed by value / size matches Datum - just reuse the array */
if (info[dim].typlen == sizeof(Datum))
{
values[dim] = (Datum *) tmp;
tmp += info[dim].nbytes;
/* no overflow of input array */
Assert(tmp <= start + info[dim].nbytes);
}
else
{
values[dim] = (Datum *) ptr;
ptr += (sizeof(Datum) * info[dim].nvalues);
for (i = 0; i < info[dim].nvalues; i++)
{
/* just point into the array */
memcpy(&values[dim][i], tmp, info[dim].typlen);
tmp += info[dim].typlen;
/* no overflow of input array */
Assert(tmp <= start + info[dim].nbytes);
}
}
}
else
{
/* all the other types need a chunk of the buffer */
values[dim] = (Datum *) ptr;
ptr += (sizeof(Datum) * info[dim].nvalues);
/* passed by reference, but fixed length (name, tid, ...) */
if (info[dim].typlen > 0)
{
for (i = 0; i < info[dim].nvalues; i++)
{
/* just point into the array */
values[dim][i] = PointerGetDatum(tmp);
tmp += info[dim].typlen;
/* no overflow of input array */
Assert(tmp <= start + info[dim].nbytes);
}
}
else if (info[dim].typlen == -1)
{
/* varlena */
for (i = 0; i < info[dim].nvalues; i++)
{
/* just point into the array */
values[dim][i] = PointerGetDatum(tmp);
tmp += VARSIZE_ANY(tmp);
/* no overflow of input array */
Assert(tmp <= start + info[dim].nbytes);
}
}
else if (info[dim].typlen == -2)
{
/* cstring */
for (i = 0; i < info[dim].nvalues; i++)
{
/* just point into the array */
values[dim][i] = PointerGetDatum(tmp);
tmp += (strlen(tmp) + 1); /* don't forget the \0 */
/* no overflow of input array */
Assert(tmp <= start + info[dim].nbytes);
}
}
}
/* check we consumed the serialized data for this dimension exactly */
Assert((tmp - start) == info[dim].nbytes);
}
/* we should have exhausted the buffer exactly */
Assert((ptr - buff) == bufflen);
/* allocate space for all the MCV items in a single piece */
rbufflen = (sizeof(MCVItem *) + sizeof(MCVItem) +
sizeof(Datum) * ndims + sizeof(bool) * ndims) * nitems;
rbuff = palloc0(rbufflen);
rptr = rbuff;
mcvlist->items = (MCVItem * *) rbuff;
rptr += (sizeof(MCVItem *) * nitems);
/* deserialize the MCV items and translate the indexes to Datums */
for (i = 0; i < nitems; i++)
{
uint16 *indexes = NULL;
MCVItem *item = (MCVItem *) rptr;
rptr += (sizeof(MCVItem));
item->values = (Datum *) rptr;
rptr += (sizeof(Datum) * ndims);
item->isnull = (bool *) rptr;
rptr += (sizeof(bool) * ndims);
/* just point to the right place */
indexes = ITEM_INDEXES(tmp);
memcpy(item->isnull, ITEM_NULLS(tmp, ndims), sizeof(bool) * ndims);
memcpy(&item->frequency, ITEM_FREQUENCY(tmp, ndims), sizeof(double));
memcpy(&item->base_frequency, ITEM_BASE_FREQUENCY(tmp, ndims), sizeof(double));
/* translate the values */
for (dim = 0; dim < ndims; dim++)
if (!item->isnull[dim])
item->values[dim] = values[dim][indexes[dim]];
mcvlist->items[i] = item;
tmp += ITEM_SIZE(ndims);
/* check we're not overflowing the input */
Assert(tmp <= (char *) data + VARSIZE_ANY(data));
}
/* check that we processed all the data */
Assert(tmp == (char *) data + VARSIZE_ANY(data));
/* release the temporary buffer */
pfree(buff);
return mcvlist;
}
/*
* SRF with details about buckets of a histogram:
*
* - item ID (0...nitems)
* - values (string array)
* - nulls only (boolean array)
* - frequency (double precision)
* - base_frequency (double precision)
*
* The input is the OID of the statistics, and there are no rows returned if
* the statistics contains no histogram.
*/
Datum
pg_stats_ext_mcvlist_items(PG_FUNCTION_ARGS)
{
FuncCallContext *funcctx;
int call_cntr;
int max_calls;
AttInMetadata *attinmeta;
/* stuff done only on the first call of the function */
if (SRF_IS_FIRSTCALL())
{
MemoryContext oldcontext;
MCVList *mcvlist;
TupleDesc tupdesc;
/* create a function context for cross-call persistence */
funcctx = SRF_FIRSTCALL_INIT();
/* switch to memory context appropriate for multiple function calls */
oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
mcvlist = statext_mcv_deserialize(PG_GETARG_BYTEA_P(0));
funcctx->user_fctx = mcvlist;
/* total number of tuples to be returned */
funcctx->max_calls = 0;
if (funcctx->user_fctx != NULL)
funcctx->max_calls = mcvlist->nitems;
/* Build a tuple descriptor for our result type */
if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("function returning record called in context "
"that cannot accept type record")));
/*
* generate attribute metadata needed later to produce tuples from raw
* C strings
*/
attinmeta = TupleDescGetAttInMetadata(tupdesc);
funcctx->attinmeta = attinmeta;
MemoryContextSwitchTo(oldcontext);
}
/* stuff done on every call of the function */
funcctx = SRF_PERCALL_SETUP();
call_cntr = funcctx->call_cntr;
max_calls = funcctx->max_calls;
attinmeta = funcctx->attinmeta;
if (call_cntr < max_calls) /* do when there is more left to send */
{
char **values;
HeapTuple tuple;
Datum result;
StringInfoData itemValues;
StringInfoData itemNulls;
int i;
Oid *outfuncs;
FmgrInfo *fmgrinfo;
MCVList *mcvlist;
MCVItem *item;
mcvlist = (MCVList *) funcctx->user_fctx;
Assert(call_cntr < mcvlist->nitems);
item = mcvlist->items[call_cntr];
/*
* Prepare a values array for building the returned tuple. This should
* be an array of C strings which will be processed later by the type
* input functions.
*/
values = (char **) palloc0(5 * sizeof(char *));
values[0] = (char *) palloc(64 * sizeof(char)); /* item index */
values[3] = (char *) palloc(64 * sizeof(char)); /* frequency */
values[4] = (char *) palloc(64 * sizeof(char)); /* base frequency */
outfuncs = (Oid *) palloc0(sizeof(Oid) * mcvlist->ndimensions);
fmgrinfo = (FmgrInfo *) palloc0(sizeof(FmgrInfo) * mcvlist->ndimensions);
for (i = 0; i < mcvlist->ndimensions; i++)
{
bool isvarlena;
getTypeOutputInfo(mcvlist->types[i], &outfuncs[i], &isvarlena);
fmgr_info(outfuncs[i], &fmgrinfo[i]);
}
/* build the arrays of values / nulls */
initStringInfo(&itemValues);
initStringInfo(&itemNulls);
appendStringInfoChar(&itemValues, '{');
appendStringInfoChar(&itemNulls, '{');
for (i = 0; i < mcvlist->ndimensions; i++)
{
Datum val,
valout;
if (i > 0)
{
appendStringInfoString(&itemValues, ", ");
appendStringInfoString(&itemNulls, ", ");
}
if (item->isnull[i])
valout = CStringGetDatum("NULL");
else
{
val = item->values[i];
valout = FunctionCall1(&fmgrinfo[i], val);
}
appendStringInfoString(&itemValues, DatumGetCString(valout));
appendStringInfoString(&itemNulls, item->isnull[i] ? "t" : "f");
}
appendStringInfoChar(&itemValues, '}');
appendStringInfoChar(&itemNulls, '}');
snprintf(values[0], 64, "%d", call_cntr);
snprintf(values[3], 64, "%f", item->frequency);
snprintf(values[4], 64, "%f", item->base_frequency);
values[1] = itemValues.data;
values[2] = itemNulls.data;
/* build a tuple */
tuple = BuildTupleFromCStrings(attinmeta, values);
/* make the tuple into a datum */
result = HeapTupleGetDatum(tuple);
/* clean up (this is not really necessary) */
pfree(itemValues.data);
pfree(itemNulls.data);
pfree(values[0]);
pfree(values[3]);
pfree(values[4]);
pfree(values);
SRF_RETURN_NEXT(funcctx, result);
}
else /* do when there is no more left */
{
SRF_RETURN_DONE(funcctx);
}
}
/*
* pg_mcv_list_in - input routine for type pg_mcv_list.
*
* pg_mcv_list is real enough to be a table column, but it has no operations
* of its own, and disallows input too
*/
Datum
pg_mcv_list_in(PG_FUNCTION_ARGS)
{
/*
* pg_mcv_list stores the data in binary form and parsing text input is
* not needed, so disallow this.
*/
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot accept a value of type %s", "pg_mcv_list")));
PG_RETURN_VOID(); /* keep compiler quiet */
}
/*
* pg_mcv_list_out - output routine for type PG_MCV_LIST.
*
* MCV lists are serialized into a bytea value, so we simply call byteaout()
* to serialize the value into text. But it'd be nice to serialize that into
* a meaningful representation (e.g. for inspection by people).
*
* XXX This should probably return something meaningful, similar to what
* pg_dependencies_out does. Not sure how to deal with the deduplicated
* values, though - do we want to expand that or not?
*/
Datum
pg_mcv_list_out(PG_FUNCTION_ARGS)
{
return byteaout(fcinfo);
}
/*
* pg_mcv_list_recv - binary input routine for type pg_mcv_list.
*/
Datum
pg_mcv_list_recv(PG_FUNCTION_ARGS)
{
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot accept a value of type %s", "pg_mcv_list")));
PG_RETURN_VOID(); /* keep compiler quiet */
}
/*
* pg_mcv_list_send - binary output routine for type pg_mcv_list.
*
* MCV lists are serialized in a bytea value (although the type is named
* differently), so let's just send that.
*/
Datum
pg_mcv_list_send(PG_FUNCTION_ARGS)
{
return byteasend(fcinfo);
}
/*
* mcv_get_match_bitmap
* Evaluate clauses using the MCV list, and update the match bitmap.
*
* A match bitmap keeps match/mismatch status for each MCV item, and we
* update it based on additional clauses. We also use it to skip items
* that can't possibly match (e.g. item marked as "mismatch" can't change
* to "match" when evaluating AND clause list).
*
* The function also returns a flag indicating whether there was an
* equality condition for all attributes, the minimum frequency in the MCV
* list, and a total MCV frequency (sum of frequencies for all items).
*
* XXX Currently the match bitmap uses a bool for each MCV item, which is
* somewhat wasteful as we could do with just a single bit, thus reducing
* the size to ~1/8. It would also allow us to combine bitmaps simply using
* & and |, which should be faster than min/max. The bitmaps are fairly
* small, though (as we cap the MCV list size to 8k items).
*/
static bool *
mcv_get_match_bitmap(PlannerInfo *root, List *clauses,
Bitmapset *keys, MCVList * mcvlist, bool is_or)
{
int i;
ListCell *l;
bool *matches;
/* The bitmap may be partially built. */
Assert(clauses != NIL);
Assert(list_length(clauses) >= 1);
Assert(mcvlist != NULL);
Assert(mcvlist->nitems > 0);
Assert(mcvlist->nitems <= STATS_MCVLIST_MAX_ITEMS);
matches = palloc(sizeof(bool) * mcvlist->nitems);
memset(matches, (is_or) ? false : true,
sizeof(bool) * mcvlist->nitems);
/*
* Loop through the list of clauses, and for each of them evaluate all the
* MCV items not yet eliminated by the preceding clauses.
*/
foreach(l, clauses)
{
Node *clause = (Node *) lfirst(l);
/* if it's a RestrictInfo, then extract the clause */
if (IsA(clause, RestrictInfo))
clause = (Node *) ((RestrictInfo *) clause)->clause;
/*
* Handle the various types of clauses - OpClause, NullTest and
* AND/OR/NOT
*/
if (is_opclause(clause))
{
OpExpr *expr = (OpExpr *) clause;
bool varonleft = true;
bool ok;
FmgrInfo opproc;
/* get procedure computing operator selectivity */
RegProcedure oprrest = get_oprrest(expr->opno);
fmgr_info(get_opcode(expr->opno), &opproc);
ok = (NumRelids(clause) == 1) &&
(is_pseudo_constant_clause(lsecond(expr->args)) ||
(varonleft = false,
is_pseudo_constant_clause(linitial(expr->args))));
if (ok)
{
TypeCacheEntry *typecache;
FmgrInfo gtproc;
Var *var;
Const *cst;
bool isgt;
int idx;
/* extract the var and const from the expression */
var = (varonleft) ? linitial(expr->args) : lsecond(expr->args);
cst = (varonleft) ? lsecond(expr->args) : linitial(expr->args);
isgt = (!varonleft);
/* strip binary-compatible relabeling */
if (IsA(var, RelabelType))
var = (Var *) ((RelabelType *) var)->arg;
/* match the attribute to a dimension of the statistic */
idx = bms_member_index(keys, var->varattno);
/* get information about the >= procedure */
typecache = lookup_type_cache(var->vartype, TYPECACHE_GT_OPR);
fmgr_info(get_opcode(typecache->gt_opr), &gtproc);
/*
* Walk through the MCV items and evaluate the current clause.
* We can skip items that were already ruled out, and
* terminate if there are no remaining MCV items that might
* possibly match.
*/
for (i = 0; i < mcvlist->nitems; i++)
{
bool mismatch = false;
MCVItem *item = mcvlist->items[i];
/*
* For AND-lists, we can also mark NULL items as 'no
* match' (and then skip them). For OR-lists this is not
* possible.
*/
if ((!is_or) && item->isnull[idx])
matches[i] = false;
/* skip MCV items that were already ruled out */
if ((!is_or) && (matches[i] == false))
continue;
else if (is_or && (matches[i] == true))
continue;
switch (oprrest)
{
case F_EQSEL:
case F_NEQSEL:
/*
* We don't care about isgt in equality, because
* it does not matter whether it's (var op const)
* or (const op var).
*/
mismatch = !DatumGetBool(FunctionCall2Coll(&opproc,
DEFAULT_COLLATION_OID,
cst->constvalue,
item->values[idx]));
break;
case F_SCALARLTSEL: /* column < constant */
case F_SCALARLESEL: /* column <= constant */
case F_SCALARGTSEL: /* column > constant */
case F_SCALARGESEL: /* column >= constant */
/*
* First check whether the constant is below the
* lower boundary (in that case we can skip the
* bucket, because there's no overlap).
*/
if (isgt)
mismatch = !DatumGetBool(FunctionCall2Coll(&opproc,
DEFAULT_COLLATION_OID,
cst->constvalue,
item->values[idx]));
else
mismatch = !DatumGetBool(FunctionCall2Coll(&opproc,
DEFAULT_COLLATION_OID,
item->values[idx],
cst->constvalue));
break;
}
/*
* XXX The conditions on matches[i] are not needed, as we
* skip MCV items that can't become true/false, depending
* on the current flag. See beginning of the loop over MCV
* items.
*/
if ((is_or) && (!mismatch))
{
/* OR - was not a match before, matches now */
matches[i] = true;
continue;
}
else if ((!is_or) && mismatch)
{
/* AND - was a match before, does not match anymore */
matches[i] = false;
continue;
}
}
}
}
else if (IsA(clause, NullTest))
{
NullTest *expr = (NullTest *) clause;
Var *var = (Var *) (expr->arg);
/* match the attribute to a dimension of the statistic */
int idx = bms_member_index(keys, var->varattno);
/*
* Walk through the MCV items and evaluate the current clause. We
* can skip items that were already ruled out, and terminate if
* there are no remaining MCV items that might possibly match.
*/
for (i = 0; i < mcvlist->nitems; i++)
{
bool match = false; /* assume mismatch */
MCVItem *item = mcvlist->items[i];
/* if the clause mismatches the MCV item, update the bitmap */
switch (expr->nulltesttype)
{
case IS_NULL:
match = (item->isnull[idx]) ? true : match;
break;
case IS_NOT_NULL:
match = (!item->isnull[idx]) ? true : match;
break;
}
/* now, update the match bitmap, depending on OR/AND type */
if (is_or)
matches[i] = Max(matches[i], match);
else
matches[i] = Min(matches[i], match);
}
}
else if (is_orclause(clause) || is_andclause(clause))
{
/* AND/OR clause, with all subclauses being compatible */
int i;
BoolExpr *bool_clause = ((BoolExpr *) clause);
List *bool_clauses = bool_clause->args;
/* match/mismatch bitmap for each MCV item */
bool *bool_matches = NULL;
Assert(bool_clauses != NIL);
Assert(list_length(bool_clauses) >= 2);
/* build the match bitmap for the OR-clauses */
bool_matches = mcv_get_match_bitmap(root, bool_clauses, keys,
mcvlist, is_orclause(clause));
/*
* Merge the bitmap produced by mcv_get_match_bitmap into the
* current one. We need to consider if we're evaluating AND or OR
* condition when merging the results.
*/
for (i = 0; i < mcvlist->nitems; i++)
{
/* Is this OR or AND clause? */
if (is_or)
matches[i] = Max(matches[i], bool_matches[i]);
else
matches[i] = Min(matches[i], bool_matches[i]);
}
pfree(bool_matches);
}
else if (is_notclause(clause))
{
/* NOT clause, with all subclauses compatible */
int i;
BoolExpr *not_clause = ((BoolExpr *) clause);
List *not_args = not_clause->args;
/* match/mismatch bitmap for each MCV item */
bool *not_matches = NULL;
Assert(not_args != NIL);
Assert(list_length(not_args) == 1);
/* build the match bitmap for the NOT-clause */
not_matches = mcv_get_match_bitmap(root, not_args, keys,
mcvlist, false);
/*
* Merge the bitmap produced by mcv_get_match_bitmap into the
* current one.
*/
for (i = 0; i < mcvlist->nitems; i++)
{
/*
* When handling a NOT clause, we need to invert the result
* before merging it into the global result.
*/
if (not_matches[i] == false)
not_matches[i] = true;
else
not_matches[i] = false;
/* Is this OR or AND clause? */
if (is_or)
matches[i] = Max(matches[i], not_matches[i]);
else
matches[i] = Min(matches[i], not_matches[i]);
}
pfree(not_matches);
}
else if (IsA(clause, Var))
{
/* Var (has to be a boolean Var, possibly from below NOT) */
Var *var = (Var *) (clause);
/* match the attribute to a dimension of the statistic */
int idx = bms_member_index(keys, var->varattno);
Assert(var->vartype == BOOLOID);
/*
* Walk through the MCV items and evaluate the current clause. We
* can skip items that were already ruled out, and terminate if
* there are no remaining MCV items that might possibly match.
*/
for (i = 0; i < mcvlist->nitems; i++)
{
MCVItem *item = mcvlist->items[i];
bool match = false;
/* if the item is NULL, it's a mismatch */
if (!item->isnull[idx] && DatumGetBool(item->values[idx]))
match = true;
/* now, update the match bitmap, depending on OR/AND type */
if (is_or)
matches[i] = Max(matches[i], match);
else
matches[i] = Min(matches[i], match);
}
}
else
{
elog(ERROR, "unknown clause type: %d", clause->type);
}
}
return matches;
}
/*
* mcv_clauselist_selectivity
* Return the selectivity estimate computed using an MCV list.
*
* First builds a bitmap of MCV items matching the clauses, and then sums
* the frequencies of matching items.
*
* It also produces two additional interesting selectivities - total
* selectivity of all the MCV items (not just the matching ones), and the
* base frequency computed on the assumption of independence.
*/
Selectivity
mcv_clauselist_selectivity(PlannerInfo *root, StatisticExtInfo *stat,
List *clauses, int varRelid,
JoinType jointype, SpecialJoinInfo *sjinfo,
RelOptInfo *rel,
Selectivity *basesel, Selectivity *totalsel)
{
int i;
MCVList *mcv;
Selectivity s = 0.0;
/* match/mismatch bitmap for each MCV item */
bool *matches = NULL;
/* load the MCV list stored in the statistics object */
mcv = statext_mcv_load(stat->statOid);
/* build a match bitmap for the clauses */
matches = mcv_get_match_bitmap(root, clauses, stat->keys, mcv, false);
/* sum frequencies for all the matching MCV items */
*basesel = 0.0;
*totalsel = 0.0;
for (i = 0; i < mcv->nitems; i++)
{
*totalsel += mcv->items[i]->frequency;
if (matches[i] != false)
{
/* XXX Shouldn't the basesel be outside the if condition? */
*basesel += mcv->items[i]->base_frequency;
s += mcv->items[i]->frequency;
}
}
return s;
}
......@@ -1509,6 +1509,7 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
bool isnull;
bool ndistinct_enabled;
bool dependencies_enabled;
bool mcv_enabled;
int i;
statexttup = SearchSysCache1(STATEXTOID, ObjectIdGetDatum(statextid));
......@@ -1544,6 +1545,7 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
ndistinct_enabled = false;
dependencies_enabled = false;
mcv_enabled = false;
for (i = 0; i < ARR_DIMS(arr)[0]; i++)
{
......@@ -1551,6 +1553,8 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
ndistinct_enabled = true;
if (enabled[i] == STATS_EXT_DEPENDENCIES)
dependencies_enabled = true;
if (enabled[i] == STATS_EXT_MCV)
mcv_enabled = true;
}
/*
......@@ -1560,13 +1564,27 @@ pg_get_statisticsobj_worker(Oid statextid, bool missing_ok)
* statistics types on a newer postgres version, if the statistics had all
* options enabled on the original version.
*/
if (!ndistinct_enabled || !dependencies_enabled)
if (!ndistinct_enabled || !dependencies_enabled || !mcv_enabled)
{
bool gotone = false;
appendStringInfoString(&buf, " (");
if (ndistinct_enabled)
{
appendStringInfoString(&buf, "ndistinct");
else if (dependencies_enabled)
appendStringInfoString(&buf, "dependencies");
gotone = true;
}
if (dependencies_enabled)
{
appendStringInfo(&buf, "%sdependencies", gotone ? ", " : "");
gotone = true;
}
if (mcv_enabled)
appendStringInfo(&buf, "%smcv", gotone ? ", " : "");
appendStringInfoChar(&buf, ')');
}
......
......@@ -2629,7 +2629,8 @@ describeOneTableDetails(const char *schemaname,
" JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND\n"
" a.attnum = s.attnum AND NOT attisdropped)) AS columns,\n"
" 'd' = any(stxkind) AS ndist_enabled,\n"
" 'f' = any(stxkind) AS deps_enabled\n"
" 'f' = any(stxkind) AS deps_enabled,\n"
" 'm' = any(stxkind) AS mcv_enabled\n"
"FROM pg_catalog.pg_statistic_ext stat "
"WHERE stxrelid = '%s'\n"
"ORDER BY 1;",
......@@ -2666,6 +2667,12 @@ describeOneTableDetails(const char *schemaname,
if (strcmp(PQgetvalue(result, i, 6), "t") == 0)
{
appendPQExpBuffer(&buf, "%sdependencies", gotone ? ", " : "");
gotone = true;
}
if (strcmp(PQgetvalue(result, i, 7), "t") == 0)
{
appendPQExpBuffer(&buf, "%smcv", gotone ? ", " : "");
}
appendPQExpBuffer(&buf, ") ON %s FROM %s",
......
......@@ -2432,7 +2432,7 @@ psql_completion(const char *text, int start, int end)
else if (Matches("CREATE", "STATISTICS", MatchAny))
COMPLETE_WITH("(", "ON");
else if (Matches("CREATE", "STATISTICS", MatchAny, "("))
COMPLETE_WITH("ndistinct", "dependencies");
COMPLETE_WITH("ndistinct", "dependencies", "mcv");
else if (Matches("CREATE", "STATISTICS", MatchAny, "(*)"))
COMPLETE_WITH("ON");
else if (HeadMatches("CREATE", "STATISTICS", MatchAny) &&
......
......@@ -324,6 +324,12 @@
{ castsource => 'pg_dependencies', casttarget => 'text', castfunc => '0',
castcontext => 'i', castmethod => 'i' },
# pg_mcv_list can be coerced to, but not from, bytea and text
{ castsource => 'pg_mcv_list', casttarget => 'bytea', castfunc => '0',
castcontext => 'i', castmethod => 'b' },
{ castsource => 'pg_mcv_list', casttarget => 'text', castfunc => '0',
castcontext => 'i', castmethod => 'i' },
# Datetime category
{ castsource => 'date', casttarget => 'timestamp',
castfunc => 'timestamp(date)', castcontext => 'i', castmethod => 'f' },
......
......@@ -4999,6 +4999,30 @@
proname => 'pg_dependencies_send', provolatile => 's', prorettype => 'bytea',
proargtypes => 'pg_dependencies', prosrc => 'pg_dependencies_send' },
{ oid => '5018', descr => 'I/O',
proname => 'pg_mcv_list_in', prorettype => 'pg_mcv_list',
proargtypes => 'cstring', prosrc => 'pg_mcv_list_in' },
{ oid => '5019', descr => 'I/O',
proname => 'pg_mcv_list_out', prorettype => 'cstring',
proargtypes => 'pg_mcv_list', prosrc => 'pg_mcv_list_out' },
{ oid => '5020', descr => 'I/O',
proname => 'pg_mcv_list_recv', provolatile => 's',
prorettype => 'pg_mcv_list', proargtypes => 'internal',
prosrc => 'pg_mcv_list_recv' },
{ oid => '5021', descr => 'I/O',
proname => 'pg_mcv_list_send', provolatile => 's', prorettype => 'bytea',
proargtypes => 'pg_mcv_list', prosrc => 'pg_mcv_list_send' },
{ oid => '3427',
descr => 'details about MCV list items',
proname => 'pg_mcv_list_items', prorows => '1000', proisstrict => 't',
proretset => 't', provolatile => 's', prorettype => 'record',
proargtypes => 'pg_mcv_list',
proallargtypes => '{pg_mcv_list,int4,text,_bool,float8,float8}',
proargmodes => '{i,o,o,o,o,o}',
proargnames => '{mcv_list,index,values,nulls,frequency,base_frequency}',
prosrc => 'pg_stats_ext_mcvlist_items' },
{ oid => '1928', descr => 'statistics: number of scans done for table/index',
proname => 'pg_stat_get_numscans', provolatile => 's', proparallel => 'r',
prorettype => 'int8', proargtypes => 'oid',
......
......@@ -49,6 +49,7 @@ CATALOG(pg_statistic_ext,3381,StatisticExtRelationId)
* to build */
pg_ndistinct stxndistinct; /* ndistinct coefficients (serialized) */
pg_dependencies stxdependencies; /* dependencies (serialized) */
pg_mcv_list stxmcv; /* MCV (serialized) */
#endif
} FormData_pg_statistic_ext;
......@@ -64,6 +65,7 @@ typedef FormData_pg_statistic_ext *Form_pg_statistic_ext;
#define STATS_EXT_NDISTINCT 'd'
#define STATS_EXT_DEPENDENCIES 'f'
#define STATS_EXT_MCV 'm'
#endif /* EXPOSE_TO_CLIENT_CODE */
......
......@@ -165,6 +165,13 @@
typoutput => 'pg_dependencies_out', typreceive => 'pg_dependencies_recv',
typsend => 'pg_dependencies_send', typalign => 'i', typstorage => 'x',
typcollation => 'default' },
{ oid => '5017', oid_symbol => 'PGMCVLISTOID',
descr => 'multivariate MCV list',
typname => 'pg_mcv_list', typlen => '-1', typbyval => 'f',
typcategory => 'S', typinput => 'pg_mcv_list_in',
typoutput => 'pg_mcv_list_out', typreceive => 'pg_mcv_list_recv',
typsend => 'pg_mcv_list_send', typalign => 'i', typstorage => 'x',
typcollation => 'default' },
{ oid => '32', oid_symbol => 'PGDDLCOMMANDOID',
descr => 'internal type for passing CollectedCommand',
typname => 'pg_ddl_command', typlen => 'SIZEOF_POINTER', typbyval => 't',
......
......@@ -87,6 +87,7 @@ extern Bitmapset *bms_difference(const Bitmapset *a, const Bitmapset *b);
extern bool bms_is_subset(const Bitmapset *a, const Bitmapset *b);
extern BMS_Comparison bms_subset_compare(const Bitmapset *a, const Bitmapset *b);
extern bool bms_is_member(int x, const Bitmapset *a);
extern int bms_member_index(Bitmapset *a, int x);
extern bool bms_overlap(const Bitmapset *a, const Bitmapset *b);
extern bool bms_overlap_list(const Bitmapset *a, const struct List *b);
extern bool bms_nonempty_difference(const Bitmapset *a, const Bitmapset *b);
......
......@@ -58,6 +58,12 @@ extern Selectivity clause_selectivity(PlannerInfo *root,
int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo);
extern Selectivity clauselist_selectivity_simple(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo,
Bitmapset *estimatedclauses);
extern Selectivity clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
......
......@@ -31,6 +31,15 @@ typedef struct
int tupno; /* position index for tuple it came from */
} ScalarItem;
/* (de)serialization info */
typedef struct DimensionInfo
{
int nvalues; /* number of deduplicated values */
int nbytes; /* number of bytes (serialized) */
int typlen; /* pg_type.typlen */
bool typbyval; /* pg_type.typbyval */
} DimensionInfo;
/* multi-sort */
typedef struct MultiSortSupportData
{
......@@ -44,6 +53,7 @@ typedef struct SortItem
{
Datum *values;
bool *isnull;
int count;
} SortItem;
extern MVNDistinct *statext_ndistinct_build(double totalrows,
......@@ -57,6 +67,12 @@ extern MVDependencies *statext_dependencies_build(int numrows, HeapTuple *rows,
extern bytea *statext_dependencies_serialize(MVDependencies *dependencies);
extern MVDependencies *statext_dependencies_deserialize(bytea *data);
extern MCVList *statext_mcv_build(int numrows, HeapTuple *rows,
Bitmapset *attrs, VacAttrStats **stats,
double totalrows);
extern bytea *statext_mcv_serialize(MCVList * mcv, VacAttrStats **stats);
extern MCVList * statext_mcv_deserialize(bytea *data);
extern MultiSortSupport multi_sort_init(int ndims);
extern void multi_sort_add_dimension(MultiSortSupport mss, int sortdim,
Oid oper, Oid collation);
......@@ -65,5 +81,29 @@ extern int multi_sort_compare_dim(int dim, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
extern int multi_sort_compare_dims(int start, int end, const SortItem *a,
const SortItem *b, MultiSortSupport mss);
extern int compare_scalars_simple(const void *a, const void *b, void *arg);
extern int compare_datums_simple(Datum a, Datum b, SortSupport ssup);
extern void *bsearch_arg(const void *key, const void *base,
size_t nmemb, size_t size,
int (*compar) (const void *, const void *, void *),
void *arg);
extern AttrNumber *build_attnums_array(Bitmapset *attrs, int *numattrs);
extern SortItem *build_sorted_items(int numrows, int *nitems, HeapTuple *rows,
TupleDesc tdesc, MultiSortSupport mss,
int numattrs, AttrNumber *attnums);
extern Selectivity mcv_clauselist_selectivity(PlannerInfo *root,
StatisticExtInfo *stat,
List *clauses,
int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo,
RelOptInfo *rel,
Selectivity *basesel,
Selectivity *totalsel);
#endif /* EXTENDED_STATS_INTERNAL_H */
......@@ -78,8 +78,41 @@ typedef struct MVDependencies
/* size of the struct excluding the deps array */
#define SizeOfDependencies (offsetof(MVDependencies, ndeps) + sizeof(uint32))
/* used to flag stats serialized to bytea */
#define STATS_MCV_MAGIC 0xE1A651C2 /* marks serialized bytea */
#define STATS_MCV_TYPE_BASIC 1 /* basic MCV list type */
/* max items in MCV list (mostly arbitrary number) */
#define STATS_MCVLIST_MAX_ITEMS 8192
/*
* Multivariate MCV (most-common value) lists
*
* A straightforward extension of MCV items - i.e. a list (array) of
* combinations of attribute values, together with a frequency and null flags.
*/
typedef struct MCVItem
{
double frequency; /* frequency of this combination */
double base_frequency; /* frequency if independent */
bool *isnull; /* NULL flags */
Datum *values; /* item values */
} MCVItem;
/* multivariate MCV list - essentally an array of MCV items */
typedef struct MCVList
{
uint32 magic; /* magic constant marker */
uint32 type; /* type of MCV list (BASIC) */
uint32 nitems; /* number of MCV items in the array */
AttrNumber ndimensions; /* number of dimensions */
Oid types[STATS_MAX_DIMENSIONS]; /* OIDs of data types */
MCVItem **items; /* array of MCV items */
} MCVList;
extern MVNDistinct *statext_ndistinct_load(Oid mvoid);
extern MVDependencies *statext_dependencies_load(Oid mvoid);
extern MCVList *statext_mcv_load(Oid mvoid);
extern void BuildRelationExtStatistics(Relation onerel, double totalrows,
int numrows, HeapTuple *rows,
......@@ -92,6 +125,13 @@ extern Selectivity dependencies_clauselist_selectivity(PlannerInfo *root,
SpecialJoinInfo *sjinfo,
RelOptInfo *rel,
Bitmapset **estimatedclauses);
extern Selectivity statext_clauselist_selectivity(PlannerInfo *root,
List *clauses,
int varRelid,
JoinType jointype,
SpecialJoinInfo *sjinfo,
RelOptInfo *rel,
Bitmapset **estimatedclauses);
extern bool has_stats_of_kind(List *stats, char requiredkind);
extern StatisticExtInfo *choose_best_statistics(List *stats,
Bitmapset *attnums, char requiredkind);
......
......@@ -243,7 +243,7 @@ Indexes:
Check constraints:
"ctlt1_a_check" CHECK (length(a) > 2)
Statistics objects:
"public"."ctlt_all_a_b_stat" (ndistinct, dependencies) ON a, b FROM ctlt_all
"public"."ctlt_all_a_b_stat" (ndistinct, dependencies, mcv) ON a, b FROM ctlt_all
SELECT c.relname, objsubid, description FROM pg_description, pg_index i, pg_class c WHERE classoid = 'pg_class'::regclass AND objoid = i.indexrelid AND c.oid = i.indexrelid AND i.indrelid = 'ctlt_all'::regclass ORDER BY c.relname, objsubid;
relname | objsubid | description
......
......@@ -892,11 +892,12 @@ WHERE c.castmethod = 'b' AND
pg_node_tree | text | 0 | i
pg_ndistinct | bytea | 0 | i
pg_dependencies | bytea | 0 | i
pg_mcv_list | bytea | 0 | i
cidr | inet | 0 | i
xml | text | 0 | a
xml | character varying | 0 | a
xml | character | 0 | a
(9 rows)
(10 rows)
-- **************** pg_conversion ****************
-- Look for illegal values in pg_conversion fields.
......
......@@ -58,7 +58,7 @@ ALTER TABLE ab1 DROP COLUMN a;
b | integer | | |
c | integer | | |
Statistics objects:
"public"."ab1_b_c_stats" (ndistinct, dependencies) ON b, c FROM ab1
"public"."ab1_b_c_stats" (ndistinct, dependencies, mcv) ON b, c FROM ab1
-- Ensure statistics are dropped when table is
SELECT stxname FROM pg_statistic_ext WHERE stxname LIKE 'ab1%';
......@@ -211,7 +211,7 @@ SELECT stxkind, stxndistinct
FROM pg_statistic_ext WHERE stxrelid = 'ndistinct'::regclass;
stxkind | stxndistinct
---------+---------------------------------------------------------
{d,f} | {"3, 4": 301, "3, 6": 301, "4, 6": 301, "3, 4, 6": 301}
{d,f,m} | {"3, 4": 301, "3, 6": 301, "4, 6": 301, "3, 4, 6": 301}
(1 row)
-- Hash Aggregate, thanks to estimates improved by the statistic
......@@ -277,7 +277,7 @@ SELECT stxkind, stxndistinct
FROM pg_statistic_ext WHERE stxrelid = 'ndistinct'::regclass;
stxkind | stxndistinct
---------+-------------------------------------------------------------
{d,f} | {"3, 4": 2550, "3, 6": 800, "4, 6": 1632, "3, 4, 6": 10000}
{d,f,m} | {"3, 4": 2550, "3, 6": 800, "4, 6": 1632, "3, 4, 6": 10000}
(1 row)
-- plans using Group Aggregate, thanks to using correct esimates
......@@ -514,3 +514,284 @@ EXPLAIN (COSTS OFF)
(5 rows)
RESET random_page_cost;
-- check the number of estimated/actual rows in the top node
create function check_estimated_rows(text) returns table (estimated int, actual int)
language plpgsql as
$$
declare
ln text;
tmp text[];
first_row bool := true;
begin
for ln in
execute format('explain analyze %s', $1)
loop
if first_row then
first_row := false;
tmp := regexp_match(ln, 'rows=(\d*) .* rows=(\d*)');
return query select tmp[1]::int, tmp[2]::int;
end if;
end loop;
end;
$$;
-- MCV lists
CREATE TABLE mcv_lists (
filler1 TEXT,
filler2 NUMERIC,
a INT,
b VARCHAR,
filler3 DATE,
c INT,
d TEXT
);
-- random data (no MCV list)
INSERT INTO mcv_lists (a, b, c, filler1)
SELECT mod(i,37), mod(i,41), mod(i,43), mod(i,47) FROM generate_series(1,5000) s(i);
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
estimated | actual
-----------+--------
3 | 4
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1'' AND c = 1');
estimated | actual
-----------+--------
1 | 1
(1 row)
-- create statistics
CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
estimated | actual
-----------+--------
3 | 4
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1'' AND c = 1');
estimated | actual
-----------+--------
1 | 1
(1 row)
-- 100 distinct combinations, all in the MCV list
TRUNCATE mcv_lists;
DROP STATISTICS mcv_lists_stats;
INSERT INTO mcv_lists (a, b, c, filler1)
SELECT mod(i,100), mod(i,50), mod(i,25), i FROM generate_series(1,5000) s(i);
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
estimated | actual
-----------+--------
1 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < 1 AND b < ''1''');
estimated | actual
-----------+--------
1 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a <= 0 AND b <= ''0''');
estimated | actual
-----------+--------
1 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1'' AND c = 1');
estimated | actual
-----------+--------
1 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < 5 AND b < ''1'' AND c < 5');
estimated | actual
-----------+--------
1 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a <= 4 AND b <= ''0'' AND c <= 4');
estimated | actual
-----------+--------
1 | 50
(1 row)
-- create statistics
CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
estimated | actual
-----------+--------
50 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < 1 AND b < ''1''');
estimated | actual
-----------+--------
50 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a <= 0 AND b <= ''0''');
estimated | actual
-----------+--------
50 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1'' AND c = 1');
estimated | actual
-----------+--------
50 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < 5 AND b < ''1'' AND c < 5');
estimated | actual
-----------+--------
50 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a <= 4 AND b <= ''0'' AND c <= 4');
estimated | actual
-----------+--------
50 | 50
(1 row)
-- check change of unrelated column type does not reset the MCV statistics
ALTER TABLE mcv_lists ALTER COLUMN d TYPE VARCHAR(64);
SELECT stxmcv IS NOT NULL FROM pg_statistic_ext WHERE stxname = 'mcv_lists_stats';
?column?
----------
t
(1 row)
-- check change of column type resets the MCV statistics
ALTER TABLE mcv_lists ALTER COLUMN c TYPE numeric;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
estimated | actual
-----------+--------
1 | 50
(1 row)
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
estimated | actual
-----------+--------
50 | 50
(1 row)
-- 100 distinct combinations with NULL values, all in the MCV list
TRUNCATE mcv_lists;
DROP STATISTICS mcv_lists_stats;
INSERT INTO mcv_lists (a, b, c, filler1)
SELECT
(CASE WHEN mod(i,100) = 1 THEN NULL ELSE mod(i,100) END),
(CASE WHEN mod(i,50) = 1 THEN NULL ELSE mod(i,50) END),
(CASE WHEN mod(i,25) = 1 THEN NULL ELSE mod(i,25) END),
i
FROM generate_series(1,5000) s(i);
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL');
estimated | actual
-----------+--------
1 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL AND c IS NULL');
estimated | actual
-----------+--------
1 | 50
(1 row)
-- create statistics
CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL');
estimated | actual
-----------+--------
50 | 50
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL AND c IS NULL');
estimated | actual
-----------+--------
50 | 50
(1 row)
RESET random_page_cost;
-- mcv with arrays
CREATE TABLE mcv_lists_arrays (
a TEXT[],
b NUMERIC[],
c INT[]
);
INSERT INTO mcv_lists_arrays (a, b, c)
SELECT
ARRAY[md5((i/100)::text), md5((i/100-1)::text), md5((i/100+1)::text)],
ARRAY[(i/100-1)::numeric/1000, (i/100)::numeric/1000, (i/100+1)::numeric/1000],
ARRAY[(i/100-1), i/100, (i/100+1)]
FROM generate_series(1,5000) s(i);
CREATE STATISTICS mcv_lists_arrays_stats (mcv) ON a, b, c
FROM mcv_lists_arrays;
ANALYZE mcv_lists_arrays;
-- mcv with bool
CREATE TABLE mcv_lists_bool (
a BOOL,
b BOOL,
c BOOL
);
INSERT INTO mcv_lists_bool (a, b, c)
SELECT
(mod(i,2) = 0), (mod(i,4) = 0), (mod(i,8) = 0)
FROM generate_series(1,10000) s(i);
ANALYZE mcv_lists_bool;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE a AND b AND c');
estimated | actual
-----------+--------
156 | 1250
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND b AND c');
estimated | actual
-----------+--------
156 | 0
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND NOT b AND c');
estimated | actual
-----------+--------
469 | 0
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND b AND NOT c');
estimated | actual
-----------+--------
1094 | 0
(1 row)
CREATE STATISTICS mcv_lists_bool_stats (mcv) ON a, b, c
FROM mcv_lists_bool;
ANALYZE mcv_lists_bool;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE a AND b AND c');
estimated | actual
-----------+--------
1250 | 1250
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND b AND c');
estimated | actual
-----------+--------
1 | 0
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND NOT b AND c');
estimated | actual
-----------+--------
1 | 0
(1 row)
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND b AND NOT c');
estimated | actual
-----------+--------
1 | 0
(1 row)
......@@ -72,7 +72,8 @@ WHERE p1.typtype not in ('c','d','p') AND p1.typname NOT LIKE E'\\_%'
194 | pg_node_tree
3361 | pg_ndistinct
3402 | pg_dependencies
(3 rows)
5017 | pg_mcv_list
(4 rows)
-- Make sure typarray points to a varlena array type of our own base
SELECT p1.oid, p1.typname as basetype, p2.typname as arraytype,
......
......@@ -280,3 +280,189 @@ EXPLAIN (COSTS OFF)
SELECT * FROM functional_dependencies WHERE a = 1 AND b = '1' AND c = 1;
RESET random_page_cost;
-- check the number of estimated/actual rows in the top node
create function check_estimated_rows(text) returns table (estimated int, actual int)
language plpgsql as
$$
declare
ln text;
tmp text[];
first_row bool := true;
begin
for ln in
execute format('explain analyze %s', $1)
loop
if first_row then
first_row := false;
tmp := regexp_match(ln, 'rows=(\d*) .* rows=(\d*)');
return query select tmp[1]::int, tmp[2]::int;
end if;
end loop;
end;
$$;
-- MCV lists
CREATE TABLE mcv_lists (
filler1 TEXT,
filler2 NUMERIC,
a INT,
b VARCHAR,
filler3 DATE,
c INT,
d TEXT
);
-- random data (no MCV list)
INSERT INTO mcv_lists (a, b, c, filler1)
SELECT mod(i,37), mod(i,41), mod(i,43), mod(i,47) FROM generate_series(1,5000) s(i);
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1'' AND c = 1');
-- create statistics
CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1'' AND c = 1');
-- 100 distinct combinations, all in the MCV list
TRUNCATE mcv_lists;
DROP STATISTICS mcv_lists_stats;
INSERT INTO mcv_lists (a, b, c, filler1)
SELECT mod(i,100), mod(i,50), mod(i,25), i FROM generate_series(1,5000) s(i);
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < 1 AND b < ''1''');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a <= 0 AND b <= ''0''');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1'' AND c = 1');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < 5 AND b < ''1'' AND c < 5');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a <= 4 AND b <= ''0'' AND c <= 4');
-- create statistics
CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < 1 AND b < ''1''');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a <= 0 AND b <= ''0''');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1'' AND c = 1');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a < 5 AND b < ''1'' AND c < 5');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a <= 4 AND b <= ''0'' AND c <= 4');
-- check change of unrelated column type does not reset the MCV statistics
ALTER TABLE mcv_lists ALTER COLUMN d TYPE VARCHAR(64);
SELECT stxmcv IS NOT NULL FROM pg_statistic_ext WHERE stxname = 'mcv_lists_stats';
-- check change of column type resets the MCV statistics
ALTER TABLE mcv_lists ALTER COLUMN c TYPE numeric;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a = 1 AND b = ''1''');
-- 100 distinct combinations with NULL values, all in the MCV list
TRUNCATE mcv_lists;
DROP STATISTICS mcv_lists_stats;
INSERT INTO mcv_lists (a, b, c, filler1)
SELECT
(CASE WHEN mod(i,100) = 1 THEN NULL ELSE mod(i,100) END),
(CASE WHEN mod(i,50) = 1 THEN NULL ELSE mod(i,50) END),
(CASE WHEN mod(i,25) = 1 THEN NULL ELSE mod(i,25) END),
i
FROM generate_series(1,5000) s(i);
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL AND c IS NULL');
-- create statistics
CREATE STATISTICS mcv_lists_stats (mcv) ON a, b, c FROM mcv_lists;
ANALYZE mcv_lists;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists WHERE a IS NULL AND b IS NULL AND c IS NULL');
RESET random_page_cost;
-- mcv with arrays
CREATE TABLE mcv_lists_arrays (
a TEXT[],
b NUMERIC[],
c INT[]
);
INSERT INTO mcv_lists_arrays (a, b, c)
SELECT
ARRAY[md5((i/100)::text), md5((i/100-1)::text), md5((i/100+1)::text)],
ARRAY[(i/100-1)::numeric/1000, (i/100)::numeric/1000, (i/100+1)::numeric/1000],
ARRAY[(i/100-1), i/100, (i/100+1)]
FROM generate_series(1,5000) s(i);
CREATE STATISTICS mcv_lists_arrays_stats (mcv) ON a, b, c
FROM mcv_lists_arrays;
ANALYZE mcv_lists_arrays;
-- mcv with bool
CREATE TABLE mcv_lists_bool (
a BOOL,
b BOOL,
c BOOL
);
INSERT INTO mcv_lists_bool (a, b, c)
SELECT
(mod(i,2) = 0), (mod(i,4) = 0), (mod(i,8) = 0)
FROM generate_series(1,10000) s(i);
ANALYZE mcv_lists_bool;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE a AND b AND c');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND b AND c');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND NOT b AND c');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND b AND NOT c');
CREATE STATISTICS mcv_lists_bool_stats (mcv) ON a, b, c
FROM mcv_lists_bool;
ANALYZE mcv_lists_bool;
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE a AND b AND c');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND b AND c');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND NOT b AND c');
SELECT * FROM check_estimated_rows('SELECT * FROM mcv_lists_bool WHERE NOT a AND b AND NOT c');
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment