Commit b3d24cc0 authored by Alvaro Herrera

Revert analyze support for partitioned tables

This reverts the following commits:
1b5617eb Describe (auto-)analyze behavior for partitioned tables
0e69f705 Set pg_class.reltuples for partitioned tables
41badeab Document ANALYZE storage parameters for partitioned tables
0827e8af autovacuum: handle analyze for partitioned tables

There are efficiency issues in this code when handling databases with
large numbers of partitions, and it doesn't look like there is any
trivial way to handle them.  There are some other issues as well.  It's
now too late in the cycle for nontrivial fixes, so we'll have to let
Postgres 14 users continue to run ANALYZE on their partitioned tables
manually, and hopefully we can fix the issues for Postgres 15.

I kept [most of] be280cda ("Don't reset relhasindex for partitioned
tables on ANALYZE") because while we added it due to 0827e8af, it is
a good bugfix in its own right, since it affects manual analyze as well
as autovacuum-induced analyze, and there's no reason to revert it.

I retained the addition of relkind 'p' to tables included by
pg_stat_user_tables, because reverting that would require a catversion
bump.
Also, in pg14 only, I kept a struct member that was added to
PgStat_StatTabEntry to avoid breaking compatibility with existing stats
files.

Backpatch to 14.

Discussion: https://postgr.es/m/20210722205458.f2bug3z6qzxzpx2s@alap3.anarazel.de
parent f83d80ea
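
With this revert in place, keeping a partitioned table's statistics current on
Postgres 14 again falls to the user. A minimal sketch of that manual workflow
(the table and partition names below are hypothetical, not part of this commit):

    -- Assumed example schema, for illustration only.
    CREATE TABLE measurements (logdate date, reading numeric)
        PARTITION BY RANGE (logdate);
    CREATE TABLE measurements_2021 PARTITION OF measurements
        FOR VALUES FROM ('2021-01-01') TO ('2022-01-01');

    -- Autovacuum no longer schedules ANALYZE for the partitioned parent, so
    -- refresh the hierarchy's statistics explicitly after large data changes:
    ANALYZE measurements;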
@@ -817,12 +817,6 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
 </programlisting>
 is compared to the total number of tuples inserted, updated, or deleted
 since the last <command>ANALYZE</command>.
-For partitioned tables, inserts, updates and deletes on partitions
-are counted towards this threshold; however, DDL
-operations such as <literal>ATTACH</literal>, <literal>DETACH</literal>
-and <literal>DROP</literal> are not, so running a manual
-<command>ANALYZE</command> is recommended if the partition added or
-removed contains a statistically significant volume of data.
 </para>
 <para>
...
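
To make the threshold formula in the documentation hunk above concrete: with the
default settings (assumed here, not part of the patch) autovacuum_analyze_threshold = 50
and autovacuum_analyze_scale_factor = 0.1, a table whose pg_class.reltuples is 10000
has an analyze threshold of 50 + 0.1 * 10000 = 1050, so autovacuum triggers an
automatic ANALYZE once roughly 1050 tuples have been inserted, updated, or deleted
since the last ANALYZE. After this revert, that calculation is simply never applied
to partitioned tables themselves.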
@@ -1767,8 +1767,7 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
 <para>
 Whenever you have significantly altered the distribution of data
 within a table, running <link linkend="sql-analyze"><command>ANALYZE</command></link> is strongly recommended. This
-includes bulk loading large amounts of data into the table as well as
-attaching, detaching or dropping partitions. Running
+includes bulk loading large amounts of data into the table. Running
 <command>ANALYZE</command> (or <command>VACUUM ANALYZE</command>)
 ensures that the planner has up-to-date statistics about the
 table. With no statistics or obsolete statistics, the planner might
...
@@ -250,38 +250,20 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
 </para>
 <para>
-If the table being analyzed is partitioned, <command>ANALYZE</command>
-will gather statistics by sampling blocks randomly from its partitions;
-in addition, it will recurse into each partition and update its statistics.
-(However, in multi-level partitioning scenarios, each leaf partition
-will only be analyzed once.)
-By contrast, if the table being analyzed has inheritance children,
-<command>ANALYZE</command> will gather statistics for it twice:
-once on the rows of the parent table only, and a second time on the
-rows of the parent table with all of its children. This second set of
-statistics is needed when planning queries that traverse the entire
-inheritance tree. The child tables themselves are not individually
-analyzed in this case.
+If the table being analyzed has one or more children,
+<command>ANALYZE</command> will gather statistics twice: once on the
+rows of the parent table only, and a second time on the rows of the
+parent table with all of its children. This second set of statistics
+is needed when planning queries that traverse the entire inheritance
+tree. The autovacuum daemon, however, will only consider inserts or
+updates on the parent table itself when deciding whether to trigger an
+automatic analyze for that table. If that table is rarely inserted into
+or updated, the inheritance statistics will not be up to date unless you
+run <command>ANALYZE</command> manually.
 </para>
 <para>
-The autovacuum daemon counts inserts, updates and deletes in the
-partitions to determine if auto-analyze is needed. However, adding
-or removing partitions does not affect autovacuum daemon decisions,
-so triggering a manual <command>ANALYZE</command> is recommended
-when this occurs.
-</para>
-<para>
-Tuples changed in inheritance children do not count towards analyze
-on the parent table. If the parent table is empty or rarely modified,
-it may never be processed by autovacuum. It's necessary to
-periodically run a manual <command>ANALYZE</command> to keep the
-statistics of the table hierarchy up to date.
-</para>
-<para>
-If any of the child tables or partitions are foreign tables whose foreign data wrappers
+If any of the child tables are foreign tables whose foreign data wrappers
 do not support <command>ANALYZE</command>, those child tables are ignored while
 gathering inheritance statistics.
 </para>
...
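
A small illustration of the restored wording above, using a hypothetical parent
table named parent_tbl with children (names are not from this commit):

    -- Gathers both statistics sets: the parent's own rows, and the parent
    -- together with all of its children.
    ANALYZE parent_tbl;

    -- The two sets are distinguished in pg_stats by the inherited column:
    -- inherited = true is the whole-tree sample, false is the parent only.
    SELECT attname, inherited, n_distinct
    FROM pg_stats
    WHERE tablename = 'parent_tbl';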
@@ -1374,8 +1374,8 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
 If a table parameter value is set and the
 equivalent <literal>toast.</literal> parameter is not, the TOAST table
 will use the table's parameter value.
-Except where noted, these parameters are not supported on partitioned
-tables; however, you can specify them on individual leaf partitions.
+Specifying these parameters for partitioned tables is not supported,
+but you may specify them for individual leaf partitions.
 </para>
 <variablelist>
@@ -1457,8 +1457,6 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
 If true, the autovacuum daemon will perform automatic <command>VACUUM</command>
 and/or <command>ANALYZE</command> operations on this table following the rules
 discussed in <xref linkend="autovacuum"/>.
-This parameter can be set for partitioned tables to prevent autovacuum
-from running <command>ANALYZE</command> on them.
 If false, this table will not be autovacuumed, except to prevent
 transaction ID wraparound. See <xref linkend="vacuum-for-wraparound"/> for
 more about wraparound prevention.
@@ -1590,7 +1588,6 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
 <para>
 Per-table value for <xref linkend="guc-autovacuum-analyze-threshold"/>
 parameter.
-This parameter can be set for partitioned tables.
 </para>
 </listitem>
 </varlistentry>
@@ -1606,7 +1603,6 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
 <para>
 Per-table value for <xref linkend="guc-autovacuum-analyze-scale-factor"/>
 parameter.
-This parameter can be set for partitioned tables.
 </para>
 </listitem>
 </varlistentry>
...
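
To illustrate the restored create_table wording above: the autovacuum analyze
parameters can still be set on an individual leaf partition, while the
partitioned parent itself no longer accepts them after this revert. A sketch
with the same hypothetical names as before:

    -- Allowed: a leaf partition is an ordinary heap and accepts these reloptions.
    ALTER TABLE measurements_2021
        SET (autovacuum_analyze_threshold = 100,
             autovacuum_analyze_scale_factor = 0.02);

    -- Expected to be rejected after this revert: the partitioned parent
    -- no longer recognizes these parameters.
    -- ALTER TABLE measurements SET (autovacuum_analyze_scale_factor = 0.02);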
@@ -922,10 +922,8 @@ CREATE DATABASE foo WITH TEMPLATE template0;
 <para>
 Once restored, it is wise to run <command>ANALYZE</command> on each
-restored table so the optimizer has useful statistics.
-If the table is a partition or an inheritance child, it may also be useful
-to analyze the parent to update statistics for the table hierarchy.
-See <xref linkend="vacuum-for-statistics"/> and
+restored table so the optimizer has useful statistics; see
+<xref linkend="vacuum-for-statistics"/> and
 <xref linkend="autovacuum"/> for more information.
 </para>
...
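
As a usage note for the restored backup advice above, statistics can be refreshed
per restored table or for the whole database at once (the table name is hypothetical):

    -- Analyze a single restored table...
    ANALYZE customers;

    -- ...or every table in the current database in one pass.
    ANALYZE;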
@@ -108,7 +108,7 @@ static relopt_bool boolRelOpts[] =
 {
 "autovacuum_enabled",
 "Enables autovacuum in this relation",
-RELOPT_KIND_HEAP | RELOPT_KIND_TOAST | RELOPT_KIND_PARTITIONED,
+RELOPT_KIND_HEAP | RELOPT_KIND_TOAST,
 ShareUpdateExclusiveLock
 },
 true
@@ -237,7 +237,7 @@ static relopt_int intRelOpts[] =
 {
 "autovacuum_analyze_threshold",
 "Minimum number of tuple inserts, updates or deletes prior to analyze",
-RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+RELOPT_KIND_HEAP,
 ShareUpdateExclusiveLock
 },
 -1, 0, INT_MAX
@@ -411,7 +411,7 @@ static relopt_real realRelOpts[] =
 {
 "autovacuum_analyze_scale_factor",
 "Number of tuple inserts, updates or deletes prior to analyze as a fraction of reltuples",
-RELOPT_KIND_HEAP | RELOPT_KIND_PARTITIONED,
+RELOPT_KIND_HEAP,
 ShareUpdateExclusiveLock
 },
 -1, 0.0, 100.0
@@ -1979,11 +1979,12 @@ bytea *
 partitioned_table_reloptions(Datum reloptions, bool validate)
 {
 /*
-* autovacuum_enabled, autovacuum_analyze_threshold and
-* autovacuum_analyze_scale_factor are supported for partitioned tables.
+* There are no options for partitioned tables yet, but this is able to do
+* some validation.
 */
-return default_reloptions(reloptions, validate, RELOPT_KIND_PARTITIONED);
+return (bytea *) build_reloptions(reloptions, validate,
+RELOPT_KIND_PARTITIONED,
+0, NULL, 0);
 }
 /*
...
@@ -626,8 +626,8 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 PROGRESS_ANALYZE_PHASE_FINALIZE_ANALYZE);
 /*
-* Update pages/tuples stats in pg_class ... but not if we're doing
-* inherited stats.
+* Update pages/tuples stats in pg_class, and report ANALYZE to the stats
+* collector ... but not if we're doing inherited stats.
 *
 * We assume that VACUUM hasn't set pg_class.reltuples already, even
 * during a VACUUM ANALYZE. Although VACUUM often updates pg_class,
@@ -668,47 +668,19 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 InvalidMultiXactId,
 in_outer_xact);
 }
-}
-else if (onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
-{
 /*
-* Partitioned tables don't have storage, so we don't set any fields
-* in their pg_class entries except for reltuples, which is necessary
-* for auto-analyze to work properly, and relhasindex.
+* Now report ANALYZE to the stats collector.
+*
+* We deliberately don't report to the stats collector when doing
+* inherited stats, because the stats collector only tracks per-table
+* stats.
+*
+* Reset the changes_since_analyze counter only if we analyzed all
+* columns; otherwise, there is still work for auto-analyze to do.
 */
-vac_update_relstats(onerel, -1, totalrows,
-0, hasindex, InvalidTransactionId,
-InvalidMultiXactId,
-in_outer_xact);
-}
-/*
-* Now report ANALYZE to the stats collector. For regular tables, we do
-* it only if not doing inherited stats. For partitioned tables, we only
-* do it for inherited stats. (We're never called for not-inherited stats
-* on partitioned tables anyway.)
-*
-* Reset the changes_since_analyze counter only if we analyzed all
-* columns; otherwise, there is still work for auto-analyze to do.
-*/
-if (!inh || onerel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
 pgstat_report_analyze(onerel, totalrows, totaldeadrows,
 (va_cols == NIL));
-/*
-* If this is a manual analyze of all columns of a permanent leaf
-* partition, and not doing inherited stats, also let the collector know
-* about the ancestor tables of this partition. Autovacuum does the
-* equivalent of this at the start of its run, so there's no reason to do
-* it there.
-*/
-if (!inh && !IsAutoVacuumWorkerProcess() &&
-(va_cols == NIL) &&
-onerel->rd_rel->relispartition &&
-onerel->rd_rel->relkind == RELKIND_RELATION &&
-onerel->rd_rel->relpersistence == RELPERSISTENCE_PERMANENT)
-{
-pgstat_report_anl_ancestors(RelationGetRelid(onerel));
 }
 /*
...
@@ -335,7 +335,6 @@ typedef struct ForeignTruncateInfo
 static void truncate_check_rel(Oid relid, Form_pg_class reltuple);
 static void truncate_check_perms(Oid relid, Form_pg_class reltuple);
 static void truncate_check_activity(Relation rel);
-static void truncate_update_partedrel_stats(List *parted_rels);
 static void RangeVarCallbackForTruncate(const RangeVar *relation,
 Oid relId, Oid oldRelId, void *arg);
 static List *MergeAttributes(List *schema, List *supers, char relpersistence,
@@ -1739,7 +1738,6 @@ ExecuteTruncateGuts(List *explicit_rels,
 {
 List *rels;
 List *seq_relids = NIL;
-List *parted_rels = NIL;
 HTAB *ft_htab = NULL;
 EState *estate;
 ResultRelInfo *resultRelInfos;
@@ -1888,15 +1886,9 @@ ExecuteTruncateGuts(List *explicit_rels,
 {
 Relation rel = (Relation) lfirst(cell);
-/*
-* Save OID of partitioned tables for later; nothing else to do for
-* them here.
-*/
+/* Skip partitioned tables as there is nothing to do */
 if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
-{
-parted_rels = lappend_oid(parted_rels, RelationGetRelid(rel));
 continue;
-}
 /*
 * Build the lists of foreign tables belonging to each foreign server
@@ -2044,9 +2036,6 @@ ExecuteTruncateGuts(List *explicit_rels,
 ResetSequence(seq_relid);
 }
-/* Reset partitioned tables' pg_class.reltuples */
-truncate_update_partedrel_stats(parted_rels);
 /*
 * Write a WAL record to allow this set of actions to be logically
 * decoded.
@@ -2193,40 +2182,6 @@ truncate_check_activity(Relation rel)
 CheckTableNotInUse(rel, "TRUNCATE");
 }
-/*
-* Update pg_class.reltuples for all the given partitioned tables to 0.
-*/
-static void
-truncate_update_partedrel_stats(List *parted_rels)
-{
-Relation pg_class;
-ListCell *lc;
-pg_class = table_open(RelationRelationId, RowExclusiveLock);
-foreach(lc, parted_rels)
-{
-Oid relid = lfirst_oid(lc);
-HeapTuple tuple;
-Form_pg_class rd_rel;
-tuple = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relid));
-if (!HeapTupleIsValid(tuple))
-elog(ERROR, "could not find tuple for relation %u", relid);
-rd_rel = (Form_pg_class) GETSTRUCT(tuple);
-if (rd_rel->reltuples != (float4) 0)
-{
-rd_rel->reltuples = (float4) 0;
-heap_inplace_update(pg_class, tuple);
-}
-heap_freetuple(tuple);
-}
-table_close(pg_class, RowExclusiveLock);
-}
 /*
 * storage_name
 * returns the name corresponding to a typstorage/attstorage enum value
...
@@ -75,7 +75,6 @@
 #include "catalog/dependency.h"
 #include "catalog/namespace.h"
 #include "catalog/pg_database.h"
-#include "catalog/pg_inherits.h"
 #include "commands/dbcommands.h"
 #include "commands/vacuum.h"
 #include "lib/ilist.h"
@@ -1970,7 +1969,6 @@ do_autovacuum(void)
 int effective_multixact_freeze_max_age;
 bool did_vacuum = false;
 bool found_concurrent_worker = false;
-bool updated = false;
 int i;
 /*
@@ -2056,19 +2054,12 @@ do_autovacuum(void)
 /*
 * Scan pg_class to determine which tables to vacuum.
 *
-* We do this in three passes: First we let pgstat collector know about
-* the partitioned table ancestors of all partitions that have recently
-* acquired rows for analyze. This informs the second pass about the
-* total number of tuple count in partitioning hierarchies.
-*
-* On the second pass, we collect the list of plain relations,
-* materialized views and partitioned tables. On the third one we collect
-* TOAST tables.
-*
-* The reason for doing the third pass is that during it we want to use
-* the main relation's pg_class.reloptions entry if the TOAST table does
-* not have any, and we cannot obtain it unless we know beforehand what's
-* the main table OID.
+* We do this in two passes: on the first one we collect the list of plain
+* relations and materialized views, and on the second one we collect
+* TOAST tables. The reason for doing the second pass is that during it we
+* want to use the main relation's pg_class.reloptions entry if the TOAST
+* table does not have any, and we cannot obtain it unless we know
+* beforehand what's the main table OID.
 *
 * We need to check TOAST tables separately because in cases with short,
 * wide tables there might be proportionally much more activity in the
@@ -2077,44 +2068,7 @@ do_autovacuum(void)
 relScan = table_beginscan_catalog(classRel, 0, NULL);
 /*
-* First pass: before collecting the list of tables to vacuum, let stat
-* collector know about partitioned-table ancestors of each partition.
-*/
-while ((tuple = heap_getnext(relScan, ForwardScanDirection)) != NULL)
-{
-Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tuple);
-Oid relid = classForm->oid;
-PgStat_StatTabEntry *tabentry;
-/* Only consider permanent leaf partitions */
-if (!classForm->relispartition ||
-classForm->relkind == RELKIND_PARTITIONED_TABLE ||
-classForm->relpersistence == RELPERSISTENCE_TEMP)
-continue;
-/*
-* No need to do this for partitions that haven't acquired any rows.
-*/
-tabentry = pgstat_fetch_stat_tabentry(relid);
-if (tabentry &&
-tabentry->changes_since_analyze -
-tabentry->changes_since_analyze_reported > 0)
-{
-pgstat_report_anl_ancestors(relid);
-updated = true;
-}
-}
-/* Acquire fresh stats for the next passes, if needed */
-if (updated)
-{
-autovac_refresh_stats();
-dbentry = pgstat_fetch_stat_dbentry(MyDatabaseId);
-shared = pgstat_fetch_stat_dbentry(InvalidOid);
-}
-/*
-* On the second pass, we collect main tables to vacuum, and also the main
+* On the first pass, we collect main tables to vacuum, and also the main
 * table relid to TOAST relid mapping.
 */
 while ((tuple = heap_getnext(relScan, ForwardScanDirection)) != NULL)
@@ -2128,8 +2082,7 @@ do_autovacuum(void)
 bool wraparound;
 if (classForm->relkind != RELKIND_RELATION &&
-classForm->relkind != RELKIND_MATVIEW &&
-classForm->relkind != RELKIND_PARTITIONED_TABLE)
+classForm->relkind != RELKIND_MATVIEW)
 continue;
 relid = classForm->oid;
@@ -2204,7 +2157,7 @@ do_autovacuum(void)
 table_endscan(relScan);
-/* third pass: check TOAST tables */
+/* second pass: check TOAST tables */
 ScanKeyInit(&key,
 Anum_pg_class_relkind,
 BTEqualStrategyNumber, F_CHAREQ,
@@ -2797,7 +2750,6 @@ extract_autovac_opts(HeapTuple tup, TupleDesc pg_class_desc)
 Assert(((Form_pg_class) GETSTRUCT(tup))->relkind == RELKIND_RELATION ||
 ((Form_pg_class) GETSTRUCT(tup))->relkind == RELKIND_MATVIEW ||
-((Form_pg_class) GETSTRUCT(tup))->relkind == RELKIND_PARTITIONED_TABLE ||
 ((Form_pg_class) GETSTRUCT(tup))->relkind == RELKIND_TOASTVALUE);
 relopts = extractRelOptions(tup, pg_class_desc, NULL);
...
@@ -38,7 +38,6 @@
 #include "access/transam.h"
 #include "access/twophase_rmgr.h"
 #include "access/xact.h"
-#include "catalog/partition.h"
 #include "catalog/pg_database.h"
 #include "catalog/pg_proc.h"
 #include "common/ip.h"
@@ -345,7 +344,6 @@ static void pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg
 static void pgstat_recv_autovac(PgStat_MsgAutovacStart *msg, int len);
 static void pgstat_recv_vacuum(PgStat_MsgVacuum *msg, int len);
 static void pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len);
-static void pgstat_recv_anl_ancestors(PgStat_MsgAnlAncestors *msg, int len);
 static void pgstat_recv_archiver(PgStat_MsgArchiver *msg, int len);
 static void pgstat_recv_bgwriter(PgStat_MsgBgWriter *msg, int len);
 static void pgstat_recv_wal(PgStat_MsgWal *msg, int len);
@@ -1599,9 +1597,6 @@ pgstat_report_vacuum(Oid tableoid, bool shared,
 *
 * Caller must provide new live- and dead-tuples estimates, as well as a
 * flag indicating whether to reset the changes_since_analyze counter.
-* Exceptional support only changes_since_analyze for partitioned tables,
-* though they don't have any data. This counter will tell us whether
-* partitioned tables need autoanalyze or not.
 * --------
 */
 void
@@ -1623,31 +1618,21 @@ pgstat_report_analyze(Relation rel,
 * be double-counted after commit. (This approach also ensures that the
 * collector ends up with the right numbers if we abort instead of
 * committing.)
-*
-* For partitioned tables, we don't report live and dead tuples, because
-* such tables don't have any data.
 */
 if (rel->pgstat_info != NULL)
 {
 PgStat_TableXactStatus *trans;
-if (rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
-/* If this rel is partitioned, skip modifying */
-livetuples = deadtuples = 0;
-else
+for (trans = rel->pgstat_info->trans; trans; trans = trans->upper)
 {
-for (trans = rel->pgstat_info->trans; trans; trans = trans->upper)
-{
-livetuples -= trans->tuples_inserted - trans->tuples_deleted;
-deadtuples -= trans->tuples_updated + trans->tuples_deleted;
-}
-/* count stuff inserted by already-aborted subxacts, too */
-deadtuples -= rel->pgstat_info->t_counts.t_delta_dead_tuples;
-/* Since ANALYZE's counts are estimates, we could have underflowed */
-livetuples = Max(livetuples, 0);
-deadtuples = Max(deadtuples, 0);
+livetuples -= trans->tuples_inserted - trans->tuples_deleted;
+deadtuples -= trans->tuples_updated + trans->tuples_deleted;
 }
+/* count stuff inserted by already-aborted subxacts, too */
+deadtuples -= rel->pgstat_info->t_counts.t_delta_dead_tuples;
+/* Since ANALYZE's counts are estimates, we could have underflowed */
+livetuples = Max(livetuples, 0);
+deadtuples = Max(deadtuples, 0);
 }
 pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_ANALYZE);
@@ -1659,48 +1644,6 @@ pgstat_report_analyze(Relation rel,
 msg.m_live_tuples = livetuples;
 msg.m_dead_tuples = deadtuples;
 pgstat_send(&msg, sizeof(msg));
-}
-/*
-* pgstat_report_anl_ancestors
-*
-* Send list of partitioned table ancestors of the given partition to the
-* collector. The collector is in charge of propagating the analyze tuple
-* counts from the partition to its ancestors. This is necessary so that
-* other processes can decide whether to analyze the partitioned tables.
-*/
-void
-pgstat_report_anl_ancestors(Oid relid)
-{
-PgStat_MsgAnlAncestors msg;
-List *ancestors;
-ListCell *lc;
-pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_ANL_ANCESTORS);
-msg.m_databaseid = MyDatabaseId;
-msg.m_tableoid = relid;
-msg.m_nancestors = 0;
-ancestors = get_partition_ancestors(relid);
-foreach(lc, ancestors)
-{
-Oid ancestor = lfirst_oid(lc);
-msg.m_ancestors[msg.m_nancestors] = ancestor;
-if (++msg.m_nancestors >= PGSTAT_NUM_ANCESTORENTRIES)
-{
-pgstat_send(&msg, offsetof(PgStat_MsgAnlAncestors, m_ancestors[0]) +
-msg.m_nancestors * sizeof(Oid));
-msg.m_nancestors = 0;
-}
-}
-if (msg.m_nancestors > 0)
-pgstat_send(&msg, offsetof(PgStat_MsgAnlAncestors, m_ancestors[0]) +
-msg.m_nancestors * sizeof(Oid));
-list_free(ancestors);
 }
 /* --------
@@ -2039,8 +1982,7 @@ pgstat_initstats(Relation rel)
 char relkind = rel->rd_rel->relkind;
 /* We only count stats for things that have storage */
-if (!RELKIND_HAS_STORAGE(relkind) &&
-relkind != RELKIND_PARTITIONED_TABLE)
+if (!RELKIND_HAS_STORAGE(relkind))
 {
 rel->pgstat_info = NULL;
 return;
@@ -3370,10 +3312,6 @@ PgstatCollectorMain(int argc, char *argv[])
 pgstat_recv_analyze(&msg.msg_analyze, len);
 break;
-case PGSTAT_MTYPE_ANL_ANCESTORS:
-pgstat_recv_anl_ancestors(&msg.msg_anl_ancestors, len);
-break;
 case PGSTAT_MTYPE_ARCHIVER:
 pgstat_recv_archiver(&msg.msg_archiver, len);
 break;
@@ -3588,7 +3526,6 @@ pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry, Oid tableoid, bool create)
 result->n_live_tuples = 0;
 result->n_dead_tuples = 0;
 result->changes_since_analyze = 0;
-result->changes_since_analyze_reported = 0;
 result->inserts_since_vacuum = 0;
 result->blocks_fetched = 0;
 result->blocks_hit = 0;
@@ -4870,7 +4807,6 @@ pgstat_recv_tabstat(PgStat_MsgTabstat *msg, int len)
 tabentry->n_live_tuples = tabmsg->t_counts.t_delta_live_tuples;
 tabentry->n_dead_tuples = tabmsg->t_counts.t_delta_dead_tuples;
 tabentry->changes_since_analyze = tabmsg->t_counts.t_changed_tuples;
-tabentry->changes_since_analyze_reported = 0;
 tabentry->inserts_since_vacuum = tabmsg->t_counts.t_tuples_inserted;
 tabentry->blocks_fetched = tabmsg->t_counts.t_blocks_fetched;
 tabentry->blocks_hit = tabmsg->t_counts.t_blocks_hit;
@@ -5268,10 +5204,7 @@ pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len)
 * have no good way to estimate how many of those there were.
 */
 if (msg->m_resetcounter)
-{
 tabentry->changes_since_analyze = 0;
-tabentry->changes_since_analyze_reported = 0;
-}
 if (msg->m_autovacuum)
 {
@@ -5285,29 +5218,6 @@ pgstat_recv_analyze(PgStat_MsgAnalyze *msg, int len)
 }
 }
-static void
-pgstat_recv_anl_ancestors(PgStat_MsgAnlAncestors *msg, int len)
-{
-PgStat_StatDBEntry *dbentry;
-PgStat_StatTabEntry *tabentry;
-dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
-tabentry = pgstat_get_tab_entry(dbentry, msg->m_tableoid, true);
-for (int i = 0; i < msg->m_nancestors; i++)
-{
-Oid ancestor_relid = msg->m_ancestors[i];
-PgStat_StatTabEntry *ancestor;
-ancestor = pgstat_get_tab_entry(dbentry, ancestor_relid, true);
-ancestor->changes_since_analyze +=
-tabentry->changes_since_analyze - tabentry->changes_since_analyze_reported;
-}
-tabentry->changes_since_analyze_reported = tabentry->changes_since_analyze;
-}
 /* ----------
 * pgstat_recv_archiver() -
...
@@ -69,7 +69,6 @@ typedef enum StatMsgType
 PGSTAT_MTYPE_AUTOVAC_START,
 PGSTAT_MTYPE_VACUUM,
 PGSTAT_MTYPE_ANALYZE,
-PGSTAT_MTYPE_ANL_ANCESTORS,
 PGSTAT_MTYPE_ARCHIVER,
 PGSTAT_MTYPE_BGWRITER,
 PGSTAT_MTYPE_WAL,
@@ -107,7 +106,7 @@ typedef int64 PgStat_Counter;
 *
 * tuples_inserted/updated/deleted/hot_updated count attempted actions,
 * regardless of whether the transaction committed. delta_live_tuples,
-* delta_dead_tuples, changed_tuples are set depending on commit or abort.
+* delta_dead_tuples, and changed_tuples are set depending on commit or abort.
 * Note that delta_live_tuples and delta_dead_tuples can be negative!
 * ----------
 */
@@ -430,25 +429,6 @@ typedef struct PgStat_MsgAnalyze
 PgStat_Counter m_dead_tuples;
 } PgStat_MsgAnalyze;
-/* ----------
-* PgStat_MsgAnlAncestors Sent by the backend or autovacuum daemon
-* to inform partitioned tables that are
-* ancestors of a partition, to propagate
-* analyze counters
-* ----------
-*/
-#define PGSTAT_NUM_ANCESTORENTRIES \
-((PGSTAT_MSG_PAYLOAD - sizeof(Oid) - sizeof(Oid) - sizeof(int)) \
-/ sizeof(Oid))
-typedef struct PgStat_MsgAnlAncestors
-{
-PgStat_MsgHdr m_hdr;
-Oid m_databaseid;
-Oid m_tableoid;
-int m_nancestors;
-Oid m_ancestors[PGSTAT_NUM_ANCESTORENTRIES];
-} PgStat_MsgAnlAncestors;
 /* ----------
 * PgStat_MsgArchiver Sent by the archiver to update statistics.
@@ -697,7 +677,6 @@ typedef union PgStat_Msg
 PgStat_MsgAutovacStart msg_autovacuum_start;
 PgStat_MsgVacuum msg_vacuum;
 PgStat_MsgAnalyze msg_analyze;
-PgStat_MsgAnlAncestors msg_anl_ancestors;
 PgStat_MsgArchiver msg_archiver;
 PgStat_MsgBgWriter msg_bgwriter;
 PgStat_MsgWal msg_wal;
@@ -793,7 +772,7 @@ typedef struct PgStat_StatTabEntry
 PgStat_Counter n_live_tuples;
 PgStat_Counter n_dead_tuples;
 PgStat_Counter changes_since_analyze;
-PgStat_Counter changes_since_analyze_reported;
+PgStat_Counter unused_counter; /* kept for ABI compatibility */
 PgStat_Counter inserts_since_vacuum;
 PgStat_Counter blocks_fetched;
@@ -1002,7 +981,6 @@ extern void pgstat_report_vacuum(Oid tableoid, bool shared,
 extern void pgstat_report_analyze(Relation rel,
 PgStat_Counter livetuples, PgStat_Counter deadtuples,
 bool resetcounter);
-extern void pgstat_report_anl_ancestors(Oid relid);
 extern void pgstat_report_recovery_conflict(int reason);
 extern void pgstat_report_deadlock(void);
...
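
Because the commit message retains relkind 'p' in the tables covered by
pg_stat_user_tables, that view can still be used to check when a partitioned
table was last analyzed, manually or by autovacuum; a hypothetical query using
the same example table name as above:

    SELECT relname, last_analyze, last_autoanalyze
    FROM pg_stat_user_tables
    WHERE relname = 'measurements';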