Commit 3103f9a7 authored by Heikki Linnakangas

The row-version chaining in Serializable Snapshot Isolation was still wrong.

On further analysis, it turns out that predicate locks do not need to be
duplicated onto the new row version at update; the lock on the version that
the transaction saw as visible is enough. However, there was a different bug
in the code that checks for dangerous structures when a new rw-conflict
happens. Fix that bug, and remove all the row-version chaining related code.

Kevin Grittner & Dan Ports, with some comment editorialization by me.
parent 5177dfef
@@ -1529,7 +1529,6 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
     OffsetNumber offnum;
     bool at_chain_start;
     bool valid;
-    bool match_found;
     if (all_dead)
         *all_dead = true;
@@ -1539,7 +1538,6 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
     Assert(ItemPointerGetBlockNumber(tid) == BufferGetBlockNumber(buffer));
     offnum = ItemPointerGetOffsetNumber(tid);
     at_chain_start = true;
-    match_found = false;
     /* Scan through possible multiple members of HOT-chain */
     for (;;)
@@ -1597,9 +1595,6 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
             PredicateLockTuple(relation, &heapTuple);
             if (all_dead)
                 *all_dead = false;
-            if (IsolationIsSerializable())
-                match_found = true;
-            else
-                return true;
+            return true;
         }
@@ -1629,7 +1624,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
             break; /* end of chain */
     }
-    return match_found;
+    return false;
 }
 /*
@@ -2855,12 +2850,6 @@ l2:
     END_CRIT_SECTION();
-    /*
-     * Any existing SIREAD locks on the old tuple must be linked to the new
-     * tuple for conflict detection purposes.
-     */
-    PredicateLockTupleRowVersionLink(relation, &oldtup, heaptup);
     if (newbuf != buffer)
         LockBuffer(newbuf, BUFFER_LOCK_UNLOCK);
     LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
......
@@ -612,8 +612,7 @@ index_getnext(IndexScanDesc scan, ScanDirection direction)
          * any more members. Otherwise, check for continuation of the
          * HOT-chain, and set state for next time.
          */
-        if (IsMVCCSnapshot(scan->xs_snapshot)
-            && !IsolationIsSerializable())
+        if (IsMVCCSnapshot(scan->xs_snapshot))
             scan->xs_next_hot = InvalidOffsetNumber;
         else if (HeapTupleIsHotUpdated(heapTuple))
         {
......
@@ -402,6 +402,54 @@ is based on the top level xid. When looking at an xid that comes
 from a tuple's xmin or xmax, for example, we always call
 SubTransGetTopmostTransaction() before doing much else with it.
+    * PostgreSQL does not use "update in place" with a rollback log
+for its MVCC implementation. Where possible it uses "HOT" updates on
+the same page (if there is room and no indexed value is changed).
+For non-HOT updates the old tuple is expired in place and a new tuple
+is inserted at a new location. Because of this difference, a tuple
+lock in PostgreSQL doesn't automatically lock any other versions of a
+row. We don't try to copy or expand a tuple lock to any other
+versions of the row, based on the following proof that any additional
+serialization failures we would get from that would be false
+positives:
+        o If transaction T1 reads a row (thus acquiring a predicate
+lock on it) and a second transaction T2 updates that row, must a
+third transaction T3 which updates the new version of the row have a
+rw-conflict in from T1 to prevent anomalies? In other words, does it
+matter whether this edge T1 -> T3 is there?
+        o If T1 has a conflict in, it certainly doesn't. Adding the
+edge T1 -> T3 would create a dangerous structure, but we already had
+one from the edge T1 -> T2, so we would have aborted something
+anyway.
+        o Now let's consider the case where T1 doesn't have a
+conflict in. If that's the case, for this edge T1 -> T3 to make a
+difference, T3 must have a rw-conflict out that induces a cycle in
+the dependency graph, i.e. a conflict out to some transaction
+preceding T1 in the serial order. (A conflict out to T1 would work
+too, but that would mean T1 has a conflict in and we would have
+rolled back.)
+        o So now we're trying to figure out if there can be an
+rw-conflict edge T3 -> T0, where T0 is some transaction that precedes
+T1. For T0 to precede T1, there has to be some edge, or
+sequence of edges, from T0 to T1. At least the last edge has to be a
+wr-dependency or ww-dependency rather than a rw-conflict, because T1
+doesn't have a rw-conflict in. And that gives us enough information
+about the order of transactions to see that T3 can't have a
+rw-dependency to T0:
+            - T0 committed before T1 started (the wr/ww-dependency implies this)
+            - T1 started before T2 committed (the T1->T2 rw-conflict implies this)
+            - T2 committed before T3 started (otherwise, T3 would be aborted
+              because of an update conflict)
+        o That means T0 committed before T3 started, and therefore
+there can't be a rw-conflict from T3 to T0.
+        o In both cases, we didn't need the T1 -> T3 edge.
     * Predicate locking in PostgreSQL will start at the tuple level
 when possible, with automatic conversion of multiple fine-grained
 locks to coarser granularity as needed to avoid resource exhaustion.
......
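The timing argument in the README hunk above can be checked with a small model. What follows is a minimal sketch in C, assuming each transaction is reduced to two logical timestamps (snapshot start and commit). The names `Txn`, `rw_conflict_possible`, `proof_premises_hold` and `check_example` are hypothetical and are not part of PostgreSQL; the sketch only illustrates why the three ordering facts rule out an rw-conflict from T3 to T0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a transaction reduced to two logical timestamps. */
typedef struct
{
    long start;     /* time the snapshot was taken */
    long commit;    /* time the transaction committed */
} Txn;

/*
 * An rw-conflict reader -> writer requires the reader to miss the writer's
 * change, so the writer must not have committed before the reader's
 * snapshot was taken.
 */
static bool
rw_conflict_possible(const Txn *reader, const Txn *writer)
{
    return writer->commit > reader->start;
}

/* The three ordering facts listed in the proof. */
static bool
proof_premises_hold(const Txn *t0, const Txn *t1, const Txn *t2, const Txn *t3)
{
    return t0->commit < t1->start      /* wr/ww-dependency T0 -> T1 */
        && t1->start < t2->commit      /* rw-conflict T1 -> T2 */
        && t2->commit < t3->start;     /* T3 updated T2's version, no update conflict */
}

/* One schedule (timestamps chosen arbitrarily) that satisfies the premises. */
static void
check_example(void)
{
    Txn t0 = {1, 10};
    Txn t1 = {20, 60};
    Txn t2 = {15, 30};
    Txn t3 = {40, 70};

    assert(proof_premises_hold(&t0, &t1, &t2, &t3));
    assert(rw_conflict_possible(&t1, &t2));     /* the T1 -> T2 edge exists */
    assert(!rw_conflict_possible(&t3, &t0));    /* no T3 -> T0 edge */
}
```

Because the premises chain into t0.commit < t3.start, `rw_conflict_possible(&t3, &t0)` is false for any timestamps that satisfy `proof_premises_hold`, not only for the values used in `check_example`.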
This diff is collapsed.
@@ -47,7 +47,6 @@ extern void RegisterPredicateLockingXid(const TransactionId xid);
 extern void PredicateLockRelation(const Relation relation);
 extern void PredicateLockPage(const Relation relation, const BlockNumber blkno);
 extern void PredicateLockTuple(const Relation relation, const HeapTuple tuple);
-extern void PredicateLockTupleRowVersionLink(const Relation relation, const HeapTuple oldTuple, const HeapTuple newTuple);
 extern void PredicateLockPageSplit(const Relation relation, const BlockNumber oldblkno, const BlockNumber newblkno);
 extern void PredicateLockPageCombine(const Relation relation, const BlockNumber oldblkno, const BlockNumber newblkno);
 extern void ReleasePredicateLocks(const bool isCommit);
......
@@ -19,6 +19,6 @@ id txt
 1
 step c4: COMMIT;
 step c3: COMMIT;
+ERROR: could not serialize access due to read/write dependencies among transactions
 step wz1: UPDATE t SET txt = 'a' WHERE id = 1;
-ERROR: could not serialize access due to read/write dependencies among transactions
 step c1: COMMIT;
 # Multiple Row Versions test
 #
-# This test is designed to ensure that predicate locks taken on one version
-# of a row are detected as conflicts when a later version of the row is
-# updated or deleted by a transaction concurrent to the reader.
+# This test is designed to cover some code paths which only occur with
+# four or more transactions interacting with particular timings.
 #
 # Due to long permutation setup time, we are only testing one specific
 # permutation, which should get a serialization error.
......