Commit 0408e1ed authored by Tom Lane's avatar Tom Lane

Do not select new object OIDs that match recently-dead entries.

When selecting a new OID, we take care to avoid picking one that's already
in use in the target table, so as not to create duplicates after the OID
counter has wrapped around.  However, up to now we used SnapshotDirty when
scanning for pre-existing entries.  That ignores committed-dead rows, so
that we could select an OID matching a deleted-but-not-yet-vacuumed row.
While that mostly worked, it has two problems:

* If recently deleted, the dead row might still be visible to MVCC
snapshots, creating a risk for duplicate OIDs when examining the catalogs
within our own transaction.  Such duplication couldn't be visible outside
the object-creating transaction, though, and we've heard few if any field
reports corresponding to such a symptom.

* When selecting a TOAST OID, deleted toast rows definitely *are* visible
to SnapshotToast, and will remain so until vacuumed away.  This leads to
a conflict that will manifest in errors like "unexpected chunk number 0
(expected 1) for toast value nnnnn".  We've been seeing reports of such
errors from the field for years, but the cause was unclear before.

The fix is simple: just use SnapshotAny to search for conflicting rows.
This results in a slightly longer window before object OIDs can be
recycled, but that seems unlikely to create any large problems.

Pavan Deolasee

Discussion: https://postgr.es/m/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-=h4OLosXHf9w@mail.gmail.com
parent 811969b2
......@@ -1794,7 +1794,9 @@ toast_delete_datum(Relation rel, Datum value, bool is_speculative)
/* ----------
* toastrel_valueid_exists -
*
* Test whether a toast value with the given ID exists in the toast relation
* Test whether a toast value with the given ID exists in the toast relation.
* For safety, we consider a value to exist if there are either live or dead
* toast rows with that ID; see notes for GetNewOid().
* ----------
*/
static bool
......@@ -1806,7 +1808,6 @@ toastrel_valueid_exists(Relation toastrel, Oid valueid)
int num_indexes;
int validIndex;
Relation *toastidxs;
SnapshotData SnapshotToast;
/* Fetch a valid index relation */
validIndex = toast_open_indexes(toastrel,
......@@ -1825,10 +1826,9 @@ toastrel_valueid_exists(Relation toastrel, Oid valueid)
/*
* Is there any such chunk?
*/
init_toast_snapshot(&SnapshotToast);
toastscan = systable_beginscan(toastrel,
RelationGetRelid(toastidxs[validIndex]),
true, &SnapshotToast, 1, &toastkey);
true, SnapshotAny, 1, &toastkey);
if (systable_getnext(toastscan) != NULL)
result = true;
......
......@@ -282,8 +282,12 @@ IsSharedRelation(Oid relationId)
* managed to cycle through 2^32 OIDs and generate the same OID before we
* finish inserting our row. This seems unlikely to be a problem. Note
* that if we had to *commit* the row to end the race condition, the risk
* would be rather higher; therefore we use SnapshotDirty in the test,
* so that we will see uncommitted rows.
* would be rather higher; therefore we use SnapshotAny in the test, so that
* we will see uncommitted rows. (We used to use SnapshotDirty, but that has
* the disadvantage that it ignores recently-deleted rows, creating a risk
* of transient conflicts for as long as our own MVCC snapshots think a
* recently-deleted row is live. The risk is far higher when selecting TOAST
* OIDs, because SnapshotToast considers dead rows as active indefinitely.)
*/
Oid
GetNewOid(Relation relation)
......@@ -336,7 +340,6 @@ Oid
GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
{
Oid newOid;
SnapshotData SnapshotDirty;
SysScanDesc scan;
ScanKeyData key;
bool collides;
......@@ -349,8 +352,6 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
*/
Assert(!IsBinaryUpgrade || RelationGetRelid(relation) != TypeRelationId);
InitDirtySnapshot(SnapshotDirty);
/* Generate new OIDs until we find one not in the table */
do
{
......@@ -363,9 +364,9 @@ GetNewOidWithIndex(Relation relation, Oid indexId, AttrNumber oidcolumn)
BTEqualStrategyNumber, F_OIDEQ,
ObjectIdGetDatum(newOid));
/* see notes above about using SnapshotDirty */
/* see notes above about using SnapshotAny */
scan = systable_beginscan(relation, indexId, true,
&SnapshotDirty, 1, &key);
SnapshotAny, 1, &key);
collides = HeapTupleIsValid(systable_getnext(scan));
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment