Commit fd31cd26 authored by Robert Haas

Don't vacuum all-frozen pages.

Commit a892234f gave us enough
infrastructure to avoid vacuuming pages where every tuple on the
page is already frozen.  So, replace the notion of a scan_all or
whole-table vacuum with the less onerous notion of an "aggressive"
vacuum, which will scan pages that are all-visible, but still skip those
that are all-frozen.

This should greatly reduce the cost of anti-wraparound vacuuming
on large clusters where the majority of data is never touched
between one cycle and the next, because we'll no longer have to
read all of those pages only to find out that we don't need to
do anything with them.

Patch by me, reviewed by Masahiko Sawada.
parent 364a9f47
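
The skipping rule described in the commit message can be sketched in isolation. This is a minimal illustration, not the code from the patch: it models the visibility map as a plain byte array, and the names `VM_ALL_VISIBLE_BIT`, `VM_ALL_FROZEN_BIT`, `page_is_skippable`, and `next_unskippable_block` (as a function) are hypothetical helpers introduced here. The bit values are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; the numeric values are assumed. */
#define VM_ALL_VISIBLE_BIT 0x01
#define VM_ALL_FROZEN_BIT  0x02

/*
 * A regular vacuum may skip any all-visible page; an aggressive vacuum may
 * skip only pages that are all-frozen.
 */
static bool
page_is_skippable(uint8_t vmstatus, bool aggressive)
{
    if (aggressive)
        return (vmstatus & VM_ALL_FROZEN_BIT) != 0;
    return (vmstatus & VM_ALL_VISIBLE_BIT) != 0;
}

/*
 * Return the first block number >= start that cannot be skipped, or
 * nblocks if every remaining block is skippable.
 */
static size_t
next_unskippable_block(const uint8_t *vm, size_t nblocks, size_t start,
                       bool aggressive)
{
    size_t blk;

    assert(start <= nblocks);
    for (blk = start; blk < nblocks; blk++)
    {
        if (!page_is_skippable(vm[blk], aggressive))
            break;
    }
    return blk;
}
```

With a map such as `{0x03, 0x03, 0x01, 0x00, 0x03}`, a regular scan would treat block 3 as the first unskippable block, while an aggressive scan would already stop at block 2, which is all-visible but not all-frozen.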
...@@ -5984,12 +5984,15 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; ...@@ -5984,12 +5984,15 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</term> </term>
<listitem> <listitem>
<para> <para>
<command>VACUUM</> performs a whole-table scan if the table's <command>VACUUM</> performs an aggressive scan if the table's
<structname>pg_class</>.<structfield>relfrozenxid</> field has reached <structname>pg_class</>.<structfield>relfrozenxid</> field has reached
the age specified by this setting. The default is 150 million the age specified by this setting. An aggressive scan differs from
transactions. Although users can set this value anywhere from zero to a regular <command>VACUUM</> in that it visits every page that might
two billions, <command>VACUUM</> will silently limit the effective value contain unfrozen XIDs or MXIDs, not just those that might contain dead
to 95% of <xref linkend="guc-autovacuum-freeze-max-age">, so that a tuples. The default is 150 million transactions. Although users can
set this value anywhere from zero to two billions, <command>VACUUM</>
will silently limit the effective value to 95% of
<xref linkend="guc-autovacuum-freeze-max-age">, so that a
periodical manual <command>VACUUM</> has a chance to run before an periodical manual <command>VACUUM</> has a chance to run before an
anti-wraparound autovacuum is launched for the table. For more anti-wraparound autovacuum is launched for the table. For more
information see information see
...@@ -6028,9 +6031,12 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; ...@@ -6028,9 +6031,12 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</term> </term>
<listitem> <listitem>
<para> <para>
<command>VACUUM</> performs a whole-table scan if the table's <command>VACUUM</> performs an aggressive scan if the table's
<structname>pg_class</>.<structfield>relminmxid</> field has reached <structname>pg_class</>.<structfield>relminmxid</> field has reached
the age specified by this setting. The default is 150 million multixacts. the age specified by this setting. An aggressive scan differs from
a regular <command>VACUUM</> in that it visits every page that might
contain unfrozen XIDs or MXIDs, not just those that might contain dead
tuples. The default is 150 million multixacts.
Although users can set this value anywhere from zero to two billions, Although users can set this value anywhere from zero to two billions,
<command>VACUUM</> will silently limit the effective value to 95% of <command>VACUUM</> will silently limit the effective value to 95% of
<xref linkend="guc-autovacuum-multixact-freeze-max-age">, so that a <xref linkend="guc-autovacuum-multixact-freeze-max-age">, so that a
......
...@@ -438,22 +438,27 @@ ...@@ -438,22 +438,27 @@
</para> </para>
<para> <para>
<command>VACUUM</> normally skips pages that don't have any dead row <command>VACUUM</> uses the <link linkend="storage-vm">visibility map</>
versions, but those pages might still have row versions with old XID to determine which pages of a relation must be scanned. Normally, it
values. To ensure all old row versions have been frozen, a will skip pages that don't have any dead row versions even if those pages
scan of the whole table is needed. might still have row versions with old XID values. Therefore, normal
<xref linkend="guc-vacuum-freeze-table-age"> controls when scans won't succeed in freezing every row version in the table.
<command>VACUUM</> does that: a whole table sweep is forced if Periodically, <command>VACUUM</> will perform an <firstterm>aggressive
the table hasn't been fully scanned for <varname>vacuum_freeze_table_age</> vacuum</>, skipping only those pages which contain neither dead rows nor
minus <varname>vacuum_freeze_min_age</> transactions. Setting it to 0 any unfrozen XID or MXID values.
forces <command>VACUUM</> to always scan all pages, effectively ignoring <xref linkend="guc-vacuum-freeze-table-age">
the visibility map. controls when <command>VACUUM</> does that: all-visible but not all-frozen
pages are scanned if the number of transactions that have passed since the
last such scan is greater than <varname>vacuum_freeze_table_age</> minus
<varname>vacuum_freeze_min_age</>. Setting
<varname>vacuum_freeze_table_age</> to 0 forces <command>VACUUM</> to
use this more aggressive strategy for all scans.
</para> </para>
<para> <para>
The maximum time that a table can go unvacuumed is two billion The maximum time that a table can go unvacuumed is two billion
transactions minus the <varname>vacuum_freeze_min_age</> value at transactions minus the <varname>vacuum_freeze_min_age</> value at
the time <command>VACUUM</> last scanned the whole table. If it were to go the time of the last aggressive vacuum. If it were to go
unvacuumed for longer than unvacuumed for longer than
that, data loss could result. To ensure that this does not happen, that, data loss could result. To ensure that this does not happen,
autovacuum is invoked on any table that might contain unfrozen rows with autovacuum is invoked on any table that might contain unfrozen rows with
...@@ -491,7 +496,7 @@ ...@@ -491,7 +496,7 @@
normal delete and update activity is run in that window. Setting it too normal delete and update activity is run in that window. Setting it too
close could lead to anti-wraparound autovacuums, even though the table close could lead to anti-wraparound autovacuums, even though the table
was recently vacuumed to reclaim space, whereas lower values lead to more was recently vacuumed to reclaim space, whereas lower values lead to more
frequent whole-table scans. frequent aggressive vacuuming.
</para> </para>
<para> <para>
...@@ -527,7 +532,7 @@ ...@@ -527,7 +532,7 @@
<structname>pg_database</>. In particular, <structname>pg_database</>. In particular,
the <structfield>relfrozenxid</> column of a table's the <structfield>relfrozenxid</> column of a table's
<structname>pg_class</> row contains the freeze cutoff XID that was used <structname>pg_class</> row contains the freeze cutoff XID that was used
by the last whole-table <command>VACUUM</> for that table. All rows by the last aggressive <command>VACUUM</> for that table. All rows
inserted by transactions with XIDs older than this cutoff XID are inserted by transactions with XIDs older than this cutoff XID are
guaranteed to have been frozen. Similarly, guaranteed to have been frozen. Similarly,
the <structfield>datfrozenxid</> column of a database's the <structfield>datfrozenxid</> column of a database's
...@@ -552,20 +557,23 @@ SELECT datname, age(datfrozenxid) FROM pg_database; ...@@ -552,20 +557,23 @@ SELECT datname, age(datfrozenxid) FROM pg_database;
</para> </para>
<para> <para>
<command>VACUUM</> normally <command>VACUUM</> normally only scans pages that have been modified
only scans pages that have been modified since the last vacuum, but since the last vacuum, but <structfield>relfrozenxid</> can only be
<structfield>relfrozenxid</> can only be advanced when the whole table is advanced when every page of the table
scanned. The whole table is scanned when <structfield>relfrozenxid</> is that might contain unfrozen XIDs is scanned. This happens when
more than <varname>vacuum_freeze_table_age</> transactions old, when <structfield>relfrozenxid</> is more than
<command>VACUUM</>'s <literal>FREEZE</> option is used, or when all pages <varname>vacuum_freeze_table_age</> transactions old, when
happen to <command>VACUUM</>'s <literal>FREEZE</> option is used, or when all
pages that are not already all-frozen happen to
require vacuuming to remove dead row versions. When <command>VACUUM</> require vacuuming to remove dead row versions. When <command>VACUUM</>
scans the whole table, after it's finished <literal>age(relfrozenxid)</> scans every page in the table that is not already all-frozen, it should
should be a little more than the <varname>vacuum_freeze_min_age</> setting set <literal>age(relfrozenxid)</> to a value just a little more than the
that was used (more by the number of transactions started since the <varname>vacuum_freeze_min_age</> setting
<command>VACUUM</> started). If no whole-table-scanning <command>VACUUM</> that was used (more by the number of transactions started since the
is issued on the table until <varname>autovacuum_freeze_max_age</> is <command>VACUUM</> started). If no <structfield>relfrozenxid</>-advancing
reached, an autovacuum will soon be forced for the table. <command>VACUUM</> is issued on the table until
<varname>autovacuum_freeze_max_age</> is reached, an autovacuum will soon
be forced for the table.
</para> </para>
<para> <para>
...@@ -634,21 +642,23 @@ HINT: Stop the postmaster and vacuum that database in single-user mode. ...@@ -634,21 +642,23 @@ HINT: Stop the postmaster and vacuum that database in single-user mode.
</para> </para>
<para> <para>
During a <command>VACUUM</> table scan, either partial or of the whole Whenever <command>VACUUM</> scans any part of a table, it will replace
table, any multixact ID older than any multixact ID it encounters which is older than
<xref linkend="guc-vacuum-multixact-freeze-min-age"> <xref linkend="guc-vacuum-multixact-freeze-min-age">
is replaced by a different value, which can be the zero value, a single by a different value, which can be the zero value, a single
transaction ID, or a newer multixact ID. For each table, transaction ID, or a newer multixact ID. For each table,
<structname>pg_class</>.<structfield>relminmxid</> stores the oldest <structname>pg_class</>.<structfield>relminmxid</> stores the oldest
possible multixact ID still appearing in any tuple of that table. possible multixact ID still appearing in any tuple of that table.
If this value is older than If this value is older than
<xref linkend="guc-vacuum-multixact-freeze-table-age">, a whole-table <xref linkend="guc-vacuum-multixact-freeze-table-age">, an aggressive
scan is forced. <function>mxid_age()</> can be used on vacuum is forced. As discussed in the previous section, an aggressive
vacuum means that only those pages which are known to be all-frozen will
be skipped. <function>mxid_age()</> can be used on
<structname>pg_class</>.<structfield>relminmxid</> to find its age. <structname>pg_class</>.<structfield>relminmxid</> to find its age.
</para> </para>
<para> <para>
Whole-table <command>VACUUM</> scans, regardless of Aggressive <command>VACUUM</> scans, regardless of
what causes them, enable advancing the value for that table. what causes them, enable advancing the value for that table.
Eventually, as all tables in all databases are scanned and their Eventually, as all tables in all databases are scanned and their
oldest multixact values are advanced, on-disk storage for older oldest multixact values are advanced, on-disk storage for older
...@@ -656,13 +666,13 @@ HINT: Stop the postmaster and vacuum that database in single-user mode. ...@@ -656,13 +666,13 @@ HINT: Stop the postmaster and vacuum that database in single-user mode.
</para> </para>
<para> <para>
As a safety device, a whole-table vacuum scan will occur for any table As a safety device, an aggressive vacuum scan will occur for any table
whose multixact-age is greater than whose multixact-age is greater than
<xref linkend="guc-autovacuum-multixact-freeze-max-age">. Whole-table <xref linkend="guc-autovacuum-multixact-freeze-max-age">. Aggressive
vacuum scans will also occur progressively for all tables, starting with vacuum scans will also occur progressively for all tables, starting with
those that have the oldest multixact-age, if the amount of used member those that have the oldest multixact-age, if the amount of used member
storage space exceeds the amount 50% of the addressable storage space. storage space exceeds the amount 50% of the addressable storage space.
Both of these kinds of whole-table scans will occur even if autovacuum is Both of these kinds of aggressive scans will occur even if autovacuum is
nominally disabled. nominally disabled.
</para> </para>
</sect3> </sect3>
...@@ -743,9 +753,9 @@ vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuple ...@@ -743,9 +753,9 @@ vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuple
<command>UPDATE</command> and <command>DELETE</command> operation. (It <command>UPDATE</command> and <command>DELETE</command> operation. (It
is only semi-accurate because some information might be lost under heavy is only semi-accurate because some information might be lost under heavy
load.) If the <structfield>relfrozenxid</> value of the table is more load.) If the <structfield>relfrozenxid</> value of the table is more
than <varname>vacuum_freeze_table_age</> transactions old, the whole than <varname>vacuum_freeze_table_age</> transactions old, an aggressive
table is scanned to freeze old tuples and advance vacuum is performed to freeze old tuples and advance
<structfield>relfrozenxid</>, otherwise only pages that have been modified <structfield>relfrozenxid</>; otherwise, only pages that have been modified
since the last vacuum are scanned. since the last vacuum are scanned.
</para> </para>
......
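
The documentation changes above describe when an aggressive vacuum is chosen: the table's `relfrozenxid` age has reached `vacuum_freeze_table_age`, with the effective value capped at 95% of `autovacuum_freeze_max_age`. A rough sketch of that decision follows. It assumes ages are already available as plain integers, which sidesteps the wraparound-aware XID comparison the real code performs, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Effective table age: the configured vacuum_freeze_table_age, silently
 * limited to 95% of autovacuum_freeze_max_age. Integer arithmetic is an
 * assumption of this sketch.
 */
static uint32_t
effective_freeze_table_age(uint32_t freeze_table_age,
                           uint32_t autovacuum_freeze_max_age)
{
    uint32_t cap = (uint32_t) ((uint64_t) autovacuum_freeze_max_age * 95 / 100);

    return freeze_table_age < cap ? freeze_table_age : cap;
}

/*
 * Decide whether a vacuum should be aggressive, given the age of the
 * table's relfrozenxid in transactions. "Reached" is modeled as >=.
 */
static bool
needs_aggressive_vacuum(uint32_t relfrozenxid_age,
                        uint32_t freeze_table_age,
                        uint32_t autovacuum_freeze_max_age)
{
    assert(autovacuum_freeze_max_age > 0);
    return relfrozenxid_age >=
        effective_freeze_table_age(freeze_table_age, autovacuum_freeze_max_age);
}
```

Under this model, setting `freeze_table_age` to 0 makes every vacuum aggressive, which matches the behavior the documentation describes for `vacuum_freeze_table_age = 0`. The multixact variant would follow the same shape with the corresponding multixact settings.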
...@@ -106,6 +106,7 @@ typedef struct LVRelStats ...@@ -106,6 +106,7 @@ typedef struct LVRelStats
BlockNumber rel_pages; /* total number of pages */ BlockNumber rel_pages; /* total number of pages */
BlockNumber scanned_pages; /* number of pages we examined */ BlockNumber scanned_pages; /* number of pages we examined */
BlockNumber pinskipped_pages; /* # of pages we skipped due to a pin */ BlockNumber pinskipped_pages; /* # of pages we skipped due to a pin */
BlockNumber frozenskipped_pages; /* # of frozen pages we skipped */
double scanned_tuples; /* counts only tuples on scanned pages */ double scanned_tuples; /* counts only tuples on scanned pages */
double old_rel_tuples; /* previous value of pg_class.reltuples */ double old_rel_tuples; /* previous value of pg_class.reltuples */
double new_rel_tuples; /* new estimated total # of tuples */ double new_rel_tuples; /* new estimated total # of tuples */
...@@ -136,7 +137,7 @@ static BufferAccessStrategy vac_strategy; ...@@ -136,7 +137,7 @@ static BufferAccessStrategy vac_strategy;
/* non-export function prototypes */ /* non-export function prototypes */
static void lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, static void lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
Relation *Irel, int nindexes, bool scan_all); Relation *Irel, int nindexes, bool aggressive);
static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats); static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
static bool lazy_check_needs_freeze(Buffer buf, bool *hastup); static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
static void lazy_vacuum_index(Relation indrel, static void lazy_vacuum_index(Relation indrel,
...@@ -182,8 +183,8 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, ...@@ -182,8 +183,8 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
int usecs; int usecs;
double read_rate, double read_rate,
write_rate; write_rate;
bool scan_all; /* should we scan all pages? */ bool aggressive; /* should we scan all unfrozen pages? */
bool scanned_all; /* did we actually scan all pages? */ bool scanned_all_unfrozen; /* actually scanned all such pages? */
TransactionId xidFullScanLimit; TransactionId xidFullScanLimit;
MultiXactId mxactFullScanLimit; MultiXactId mxactFullScanLimit;
BlockNumber new_rel_pages; BlockNumber new_rel_pages;
...@@ -221,14 +222,14 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, ...@@ -221,14 +222,14 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
&MultiXactCutoff, &mxactFullScanLimit); &MultiXactCutoff, &mxactFullScanLimit);
/* /*
* We request a full scan if either the table's frozen Xid is now older * We request an aggressive scan if either the table's frozen Xid is now
* than or equal to the requested Xid full-table scan limit; or if the * older than or equal to the requested Xid full-table scan limit; or if
* table's minimum MultiXactId is older than or equal to the requested * the table's minimum MultiXactId is older than or equal to the requested
* mxid full-table scan limit. * mxid full-table scan limit.
*/ */
scan_all = TransactionIdPrecedesOrEquals(onerel->rd_rel->relfrozenxid, aggressive = TransactionIdPrecedesOrEquals(onerel->rd_rel->relfrozenxid,
xidFullScanLimit); xidFullScanLimit);
scan_all |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid, aggressive |= MultiXactIdPrecedesOrEquals(onerel->rd_rel->relminmxid,
mxactFullScanLimit); mxactFullScanLimit);
vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats)); vacrelstats = (LVRelStats *) palloc0(sizeof(LVRelStats));
...@@ -244,7 +245,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, ...@@ -244,7 +245,7 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
vacrelstats->hasindex = (nindexes > 0); vacrelstats->hasindex = (nindexes > 0);
/* Do the vacuuming */ /* Do the vacuuming */
lazy_scan_heap(onerel, vacrelstats, Irel, nindexes, scan_all); lazy_scan_heap(onerel, vacrelstats, Irel, nindexes, aggressive);
/* Done with indexes */ /* Done with indexes */
vac_close_indexes(nindexes, Irel, NoLock); vac_close_indexes(nindexes, Irel, NoLock);
...@@ -256,13 +257,14 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, ...@@ -256,13 +257,14 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
* NB: We need to check this before truncating the relation, because that * NB: We need to check this before truncating the relation, because that
* will change ->rel_pages. * will change ->rel_pages.
*/ */
if (vacrelstats->scanned_pages < vacrelstats->rel_pages) if ((vacrelstats->scanned_pages + vacrelstats->frozenskipped_pages)
< vacrelstats->rel_pages)
{ {
Assert(!scan_all); Assert(!aggressive);
scanned_all = false; scanned_all_unfrozen = false;
} }
else else
scanned_all = true; scanned_all_unfrozen = true;
/* /*
* Optionally truncate the relation. * Optionally truncate the relation.
...@@ -302,8 +304,8 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, ...@@ -302,8 +304,8 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
if (new_rel_allvisible > new_rel_pages) if (new_rel_allvisible > new_rel_pages)
new_rel_allvisible = new_rel_pages; new_rel_allvisible = new_rel_pages;
new_frozen_xid = scanned_all ? FreezeLimit : InvalidTransactionId; new_frozen_xid = scanned_all_unfrozen ? FreezeLimit : InvalidTransactionId;
new_min_multi = scanned_all ? MultiXactCutoff : InvalidMultiXactId; new_min_multi = scanned_all_unfrozen ? MultiXactCutoff : InvalidMultiXactId;
vac_update_relstats(onerel, vac_update_relstats(onerel,
new_rel_pages, new_rel_pages,
...@@ -358,10 +360,11 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params, ...@@ -358,10 +360,11 @@ lazy_vacuum_rel(Relation onerel, int options, VacuumParams *params,
get_namespace_name(RelationGetNamespace(onerel)), get_namespace_name(RelationGetNamespace(onerel)),
RelationGetRelationName(onerel), RelationGetRelationName(onerel),
vacrelstats->num_index_scans); vacrelstats->num_index_scans);
appendStringInfo(&buf, _("pages: %u removed, %u remain, %u skipped due to pins\n"), appendStringInfo(&buf, _("pages: %u removed, %u remain, %u skipped due to pins, %u skipped frozen\n"),
vacrelstats->pages_removed, vacrelstats->pages_removed,
vacrelstats->rel_pages, vacrelstats->rel_pages,
vacrelstats->pinskipped_pages); vacrelstats->pinskipped_pages,
vacrelstats->frozenskipped_pages);
appendStringInfo(&buf, appendStringInfo(&buf,
_("tuples: %.0f removed, %.0f remain, %.0f are dead but not yet removable\n"), _("tuples: %.0f removed, %.0f remain, %.0f are dead but not yet removable\n"),
vacrelstats->tuples_deleted, vacrelstats->tuples_deleted,
...@@ -434,7 +437,7 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats) ...@@ -434,7 +437,7 @@ vacuum_log_cleanup_info(Relation rel, LVRelStats *vacrelstats)
*/ */
static void static void
lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
Relation *Irel, int nindexes, bool scan_all) Relation *Irel, int nindexes, bool aggressive)
{ {
BlockNumber nblocks, BlockNumber nblocks,
blkno; blkno;
...@@ -450,8 +453,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, ...@@ -450,8 +453,8 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
int i; int i;
PGRUsage ru0; PGRUsage ru0;
Buffer vmbuffer = InvalidBuffer; Buffer vmbuffer = InvalidBuffer;
BlockNumber next_not_all_visible_block; BlockNumber next_unskippable_block;
bool skipping_all_visible_blocks; bool skipping_blocks;
xl_heap_freeze_tuple *frozen; xl_heap_freeze_tuple *frozen;
StringInfoData buf; StringInfoData buf;
...@@ -479,35 +482,39 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, ...@@ -479,35 +482,39 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage); frozen = palloc(sizeof(xl_heap_freeze_tuple) * MaxHeapTuplesPerPage);
/* /*
* We want to skip pages that don't require vacuuming according to the * Except when aggressive is set, we want to skip pages that are
* visibility map, but only when we can skip at least SKIP_PAGES_THRESHOLD * all-visible according to the visibility map, but only when we can skip
* consecutive pages. Since we're reading sequentially, the OS should be * at least SKIP_PAGES_THRESHOLD consecutive pages. Since we're reading
* doing readahead for us, so there's no gain in skipping a page now and * sequentially, the OS should be doing readahead for us, so there's no
* then; that's likely to disable readahead and so be counterproductive. * gain in skipping a page now and then; that's likely to disable
* Also, skipping even a single page means that we can't update * readahead and so be counterproductive. Also, skipping even a single
* relfrozenxid, so we only want to do it if we can skip a goodly number * page means that we can't update relfrozenxid, so we only want to do it
* of pages. * if we can skip a goodly number of pages.
* *
* Before entering the main loop, establish the invariant that * When aggressive is set, we can't skip pages just because they are
* next_not_all_visible_block is the next block number >= blkno that's not * all-visible, but we can still skip pages that are all-frozen, since
* all-visible according to the visibility map, or nblocks if there's no * such pages do not need freezing and do not affect the value that we can
* such block. Also, we set up the skipping_all_visible_blocks flag, * safely set for relfrozenxid or relminmxid.
* which is needed because we need hysteresis in the decision: once we've
* started skipping blocks, we may as well skip everything up to the next
* not-all-visible block.
* *
* Note: if scan_all is true, we won't actually skip any pages; but we * Before entering the main loop, establish the invariant that
* maintain next_not_all_visible_block anyway, so as to set up the * next_unskippable_block is the next block number >= blkno that we
* all_visible_according_to_vm flag correctly for each page. * can't skip based on the visibility map, either all-visible for a
* regular scan or all-frozen for an aggressive scan. We set it to
* nblocks if there's no such block. We also set up the skipping_blocks
* flag correctly at this stage.
* *
* Note: The value returned by visibilitymap_get_status could be slightly * Note: The value returned by visibilitymap_get_status could be slightly
* out-of-date, since we make this test before reading the corresponding * out-of-date, since we make this test before reading the corresponding
* heap page or locking the buffer. This is OK. If we mistakenly think * heap page or locking the buffer. This is OK. If we mistakenly think
* that the page is all-visible when in fact the flag's just been cleared, * that the page is all-visible or all-frozen when in fact the flag's just
* we might fail to vacuum the page. But it's OK to skip pages when * been cleared, we might fail to vacuum the page. It's easy to see that
* scan_all is not set, so no great harm done; the next vacuum will find * skipping a page when aggressive is not set is not a very big deal; we
* them. If we make the reverse mistake and vacuum a page unnecessarily, * might leave some dead tuples lying around, but the next vacuum will
* it'll just be a no-op. * find them. But even when aggressive *is* set, it's still OK if we miss
* a page whose all-frozen marking has just been cleared. Any new XIDs
* just added to that page are necessarily newer than the GlobalXmin we
* computed, so they'll have no effect on the value to which we can safely
* set relfrozenxid. A similar argument applies for MXIDs and relminmxid.
* *
* We will scan the table's last page, at least to the extent of * We will scan the table's last page, at least to the extent of
* determining whether it has tuples or not, even if it should be skipped * determining whether it has tuples or not, even if it should be skipped
...@@ -518,18 +525,31 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, ...@@ -518,18 +525,31 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
* the last page. This is worth avoiding mainly because such a lock must * the last page. This is worth avoiding mainly because such a lock must
* be replayed on any hot standby, where it can be disruptive. * be replayed on any hot standby, where it can be disruptive.
*/ */
for (next_not_all_visible_block = 0; for (next_unskippable_block = 0;
next_not_all_visible_block < nblocks; next_unskippable_block < nblocks;
next_not_all_visible_block++) next_unskippable_block++)
{
uint8 vmstatus;
vmstatus = visibilitymap_get_status(onerel, next_unskippable_block,
&vmbuffer);
if (aggressive)
{ {
if (!VM_ALL_VISIBLE(onerel, next_not_all_visible_block, &vmbuffer)) if ((vmstatus & VISIBILITYMAP_ALL_FROZEN) == 0)
break; break;
}
else
{
if ((vmstatus & VISIBILITYMAP_ALL_VISIBLE) == 0)
break;
}
vacuum_delay_point(); vacuum_delay_point();
} }
if (next_not_all_visible_block >= SKIP_PAGES_THRESHOLD)
skipping_all_visible_blocks = true; if (next_unskippable_block >= SKIP_PAGES_THRESHOLD)
skipping_blocks = true;
else else
skipping_all_visible_blocks = false; skipping_blocks = false;
for (blkno = 0; blkno < nblocks; blkno++) for (blkno = 0; blkno < nblocks; blkno++)
{ {
...@@ -542,7 +562,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, ...@@ -542,7 +562,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
int prev_dead_count; int prev_dead_count;
int nfrozen; int nfrozen;
Size freespace; Size freespace;
bool all_visible_according_to_vm; bool all_visible_according_to_vm = false;
bool all_visible; bool all_visible;
bool all_frozen = true; /* provided all_visible is also true */ bool all_frozen = true; /* provided all_visible is also true */
bool has_dead_tuples; bool has_dead_tuples;
...@@ -552,15 +572,28 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, ...@@ -552,15 +572,28 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
#define FORCE_CHECK_PAGE() \ #define FORCE_CHECK_PAGE() \
(blkno == nblocks - 1 && should_attempt_truncation(vacrelstats)) (blkno == nblocks - 1 && should_attempt_truncation(vacrelstats))
if (blkno == next_not_all_visible_block) if (blkno == next_unskippable_block)
{
/* Time to advance next_unskippable_block */
for (next_unskippable_block++;
next_unskippable_block < nblocks;
next_unskippable_block++)
{
uint8 vmskipflags;
vmskipflags = visibilitymap_get_status(onerel,
next_unskippable_block,
&vmbuffer);
if (aggressive)
{ {
/* Time to advance next_not_all_visible_block */ if ((vmskipflags & VISIBILITYMAP_ALL_FROZEN) == 0)
for (next_not_all_visible_block++; break;
next_not_all_visible_block < nblocks; }
next_not_all_visible_block++) else
{ {
if (!VM_ALL_VISIBLE(onerel, next_not_all_visible_block, &vmbuffer)) if ((vmskipflags & VISIBILITYMAP_ALL_VISIBLE) == 0)
break; break;
}
vacuum_delay_point(); vacuum_delay_point();
} }
...@@ -569,17 +602,44 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, ...@@ -569,17 +602,44 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
* skipping_all_visible_blocks to do the right thing at the * skipping_all_visible_blocks to do the right thing at the
* following blocks. * following blocks.
*/ */
if (next_not_all_visible_block - blkno > SKIP_PAGES_THRESHOLD) if (next_unskippable_block - blkno > SKIP_PAGES_THRESHOLD)
skipping_all_visible_blocks = true; skipping_blocks = true;
else else
skipping_all_visible_blocks = false; skipping_blocks = false;
all_visible_according_to_vm = false;
/*
* Normally, the fact that we can't skip this block must mean that
* it's not all-visible. But in an aggressive vacuum we know only
* that it's not all-frozen, so it might still be all-visible.
*/
if (aggressive && VM_ALL_VISIBLE(onerel, blkno, &vmbuffer))
all_visible_according_to_vm = true;
} }
else else
{ {
/* Current block is all-visible */ /*
if (skipping_all_visible_blocks && !scan_all && !FORCE_CHECK_PAGE()) * The current block is potentially skippable; if we've seen a
* long enough run of skippable blocks to justify skipping it, and
* we're not forced to check it, then go ahead and skip.
* Otherwise, the page must be at least all-visible if not
* all-frozen, so we can set all_visible_according_to_vm = true.
*/
if (skipping_blocks && !FORCE_CHECK_PAGE())
{
/*
* Tricky, tricky. If this is in aggressive vacuum, the page
* must have been all-frozen at the time we checked whether it
* was skippable, but it might not be any more. We must be
* careful to count it as a skipped all-frozen page in that
* case, or else we'll think we can't update relfrozenxid and
* relminmxid. If it's not an aggressive vacuum, we don't
* know whether it was all-frozen, so we have to recheck; but
* in this case an approximate answer is OK.
*/
if (aggressive || VM_ALL_FROZEN(onerel, blkno, &vmbuffer))
vacrelstats->frozenskipped_pages++;
continue; continue;
}
all_visible_according_to_vm = true; all_visible_according_to_vm = true;
} }
...@@ -628,9 +688,10 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, ...@@ -628,9 +688,10 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
* Pin the visibility map page in case we need to mark the page * Pin the visibility map page in case we need to mark the page
* all-visible. In most cases this will be very cheap, because we'll * all-visible. In most cases this will be very cheap, because we'll
* already have the correct page pinned anyway. However, it's * already have the correct page pinned anyway. However, it's
* possible that (a) next_not_all_visible_block is covered by a * possible that (a) next_unskippable_block is covered by a different
* different VM page than the current block or (b) we released our pin * VM page than the current block or (b) we released our pin and did a
* and did a cycle of index vacuuming. * cycle of index vacuuming.
*
*/ */
visibilitymap_pin(onerel, blkno, &vmbuffer); visibilitymap_pin(onerel, blkno, &vmbuffer);
...@@ -641,12 +702,12 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, ...@@ -641,12 +702,12 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
if (!ConditionalLockBufferForCleanup(buf)) if (!ConditionalLockBufferForCleanup(buf))
{ {
/* /*
* If we're not scanning the whole relation to guard against XID * If we're not performing an aggressive scan to guard against XID
* wraparound, and we don't want to forcibly check the page, then * wraparound, and we don't want to forcibly check the page, then
* it's OK to skip vacuuming pages we get a lock conflict on. They * it's OK to skip vacuuming pages we get a lock conflict on. They
* will be dealt with in some future vacuum. * will be dealt with in some future vacuum.
*/ */
if (!scan_all && !FORCE_CHECK_PAGE()) if (!aggressive && !FORCE_CHECK_PAGE())
{ {
ReleaseBuffer(buf); ReleaseBuffer(buf);
vacrelstats->pinskipped_pages++; vacrelstats->pinskipped_pages++;
...@@ -663,7 +724,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, ...@@ -663,7 +724,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
* ourselves for multiple buffers and then service whichever one * ourselves for multiple buffers and then service whichever one
* is received first. For now, this seems good enough. * is received first. For now, this seems good enough.
* *
* If we get here with scan_all false, then we're just forcibly * If we get here with aggressive false, then we're just forcibly
* checking the page, and so we don't want to insist on getting * checking the page, and so we don't want to insist on getting
* the lock; we only need to know if the page contains tuples, so * the lock; we only need to know if the page contains tuples, so
* that we can update nonempty_pages correctly. It's convenient * that we can update nonempty_pages correctly. It's convenient
...@@ -679,7 +740,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats, ...@@ -679,7 +740,7 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
vacrelstats->nonempty_pages = blkno + 1; vacrelstats->nonempty_pages = blkno + 1;
continue; continue;
} }
if (!scan_all) if (!aggressive)
{ {
/* /*
* Here, we must not advance scanned_pages; that would amount * Here, we must not advance scanned_pages; that would amount
......