    Defend against bad relfrozenxid/relminmxid/datfrozenxid/datminmxid values. · 78db307b
    Tom Lane authored
    In commit a61daa14, we fixed pg_upgrade so
    that it would install sane relminmxid and datminmxid values, but that does
    not cure the problem for installations that were already pg_upgraded to
    9.3; they'll initially have "1" in those fields.  This is not a big problem
    so long as 1 is "in the past" compared to the current nextMultiXact
    counter.  But if an installation were more than halfway to the MXID wrap
    point at the time of upgrade, 1 would appear to be "in the future" and
    that would effectively disable tracking of oldest MXIDs in those
    tables/databases, until such time as the counter wrapped around.
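
To illustrate why "1" can look like it is "in the future": MXIDs are 32-bit counters compared modulo 2^32, so a value counts as older than another only if it lies within the half of the circle behind it. The following is a minimal sketch of that comparison, not the code from this commit; `mxid_t`, `mxid_precedes`, and `mxid_in_future` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit multixact ID. */
typedef uint32_t mxid_t;

static_assert(sizeof(mxid_t) == sizeof(int32_t),
              "the signed-difference trick needs a 32-bit counter");

/*
 * Circular comparison: a precedes b if the signed 32-bit difference
 * (a - b) is negative.  The unsigned subtraction wraps modulo 2^32; the
 * conversion to int32_t is implementation-defined in C but yields the
 * two's-complement value on common compilers.
 */
static bool
mxid_precedes(mxid_t a, mxid_t b)
{
    return (int32_t) (a - b) < 0;
}

/* A stored value is "in the future" if the next-MXID counter precedes it. */
static bool
mxid_in_future(mxid_t stored, mxid_t next_mxid)
{
    return mxid_precedes(next_mxid, stored);
}
```

With a counter still in its first half, for instance `mxid_in_future(1, 1000)`, the stored "1" is in the past. Once the counter is more than halfway to the wrap point, for instance `mxid_in_future(1, 0x90000000u)`, the same "1" compares as being in the future.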
    
    While in itself this isn't worse than the situation pre-9.3, where we did
    not manage MXID wraparound risk at all, the consequences of premature
    truncation of pg_multixact are worse now; so we ought to make some effort
    to cope with this.  We discussed advising users to fix the tracking values
    manually, but that seems both very tedious and very error-prone.
    
    Instead, this patch adopts two amelioration rules.  First, a relminmxid
    value that is "in the future" is allowed to be overwritten with a
    full-table VACUUM's actual freeze cutoff, ignoring the normal rule that
    relminmxid should never go backwards.  (This essentially assumes that we
    have enough defenses in place that wraparound can never occur anymore,
    and thus that a value "in the future" must be corrupt.)  Second, if we see
    any "in the future" values then we refrain from truncating pg_clog and
    pg_multixact.  This prevents loss of clog data until we have cleaned up
    all the broken tracking data.  In the worst case that could result in
    considerable clog bloat, but in practice we expect that relfrozenxid-driven
    freezing will happen soon enough to fix the problem before clog bloat
    becomes intolerable.  (Users could do manual VACUUM FREEZEs if not.)
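
The two rules above can be sketched as a pair of predicates. This is an illustrative sketch under stated assumptions, not the logic as written in vacuum.c: all names (`should_update_relminmxid`, `truncation_is_safe`, `mxid_t`, `mxid_precedes`) are hypothetical, and treating 0 as the invalid MXID is an assumption of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit multixact ID; 0 is assumed invalid. */
typedef uint32_t mxid_t;

static_assert(sizeof(mxid_t) == sizeof(int32_t),
              "the signed-difference trick needs a 32-bit counter");

/* Circular comparison: a precedes b if the signed difference is negative. */
static bool
mxid_precedes(mxid_t a, mxid_t b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Rule 1: overwrite the stored relminmxid with VACUUM's freeze cutoff if
 * the stored value is older than the cutoff (the normal forward-only case)
 * or if the stored value is "in the future" relative to the next-MXID
 * counter, in which case it is presumed corrupt and may move backwards.
 */
static bool
should_update_relminmxid(mxid_t stored, mxid_t cutoff, mxid_t next_mxid)
{
    if (cutoff == 0 || stored == cutoff)
        return false;           /* nothing valid or nothing new to store */
    return mxid_precedes(stored, cutoff) ||
           mxid_precedes(next_mxid, stored);
}

/*
 * Rule 2: truncation is allowed only if no tracked value is "in the
 * future"; a single bogus entry blocks truncation until it is repaired.
 */
static bool
truncation_is_safe(const mxid_t *relminmxids, size_t n, mxid_t next_mxid)
{
    for (size_t i = 0; i < n; i++)
    {
        if (mxid_precedes(next_mxid, relminmxids[i]))
            return false;
    }
    return true;
}
```

For example, with the counter at `0x90000000u` and a stored value of 1, `should_update_relminmxid(1, 0x8FFFF000u, 0x90000000u)` accepts the cutoff even though 1 does not precede it, while `truncation_is_safe` reports false for any set of tables that still contains that 1.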
    
    Note that this mechanism cannot save us if there are already-wrapped or
    already-truncated-away MXIDs in the table; it's only capable of dealing
    with corrupt tracking values.  But that's the situation we have with the
    pg_upgrade bug.
    
    For consistency, apply the same rules to relfrozenxid/datfrozenxid.  There
    are no known mechanisms for these to get messed up, but if they were, the
    same tactics seem appropriate for fixing them.
vacuum.c 44.1 KB