Commit 5a2f154a authored by Peter Geoghegan

Improve nbtree README's LP_DEAD section.

The description of how LP_DEAD bit setting by index scans works
following commit 2ed5b87f was rather unclear.  Clean that up a bit.

Also refer to LP_DEAD bit setting within _bt_check_unique() at the start
of the same section.  This mechanism may actually be more important than
the generic kill_prior_tuple mechanism that the section focuses on, so
it at least deserves to be mentioned in passing.
parent 52eec1c5
@@ -429,7 +429,10 @@ allowing subsequent index scans to skip visiting the heap tuple. The
 "known dead" marking works by setting the index item's lp_flags state
 to LP_DEAD. This is currently only done in plain indexscans, not bitmap
 scans, because only plain scans visit the heap and index "in sync" and so
-there's not a convenient way to do it for bitmap scans.
+there's not a convenient way to do it for bitmap scans. Note also that
+LP_DEAD bits are often set when checking a unique index for conflicts on
+insert (this is simpler because it takes place when we hold an exclusive
+lock on the leaf page).

 Once an index tuple has been marked LP_DEAD it can actually be removed
 from the index immediately; since index scans only stop "between" pages,
@@ -456,12 +459,15 @@ that this breaks the interlock between VACUUM and indexscans, but that is
 not so: as long as an indexscanning process has a pin on the page where
 the index item used to be, VACUUM cannot complete its btbulkdelete scan
 and so cannot remove the heap tuple. This is another reason why
-btbulkdelete has to get a super-exclusive lock on every leaf page, not
-only the ones where it actually sees items to delete. So that we can
-handle the cases where we attempt LP_DEAD flagging for a page after we
-have released its pin, we remember the LSN of the index page when we read
-the index tuples from it; we do not attempt to flag index tuples as dead
-if the we didn't hold the pin the entire time and the LSN has changed.
+btbulkdelete has to get a super-exclusive lock on every leaf page, not only
+the ones where it actually sees items to delete.
+
+LP_DEAD setting by index scans cannot be sure that a TID whose index tuple
+it had planned on LP_DEAD-setting has not been recycled by VACUUM if it
+drops its pin in the meantime. It must conservatively also remember the
+LSN of the page, and only act to set LP_DEAD bits when the LSN has not
+changed at all. (Avoiding dropping the pin entirely also makes it safe, of
+course.)

 WAL Considerations
 ------------------
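
The LSN rule added by the second hunk can be illustrated with a small, self-contained C model. This is only a sketch under stated assumptions: ToyPage, ToyScanPos and toy_kill_items are hypothetical names invented for illustration, not nbtree structures or functions, and the model ignores buffer locking and WAL entirely. It shows only the decision itself: set LP_DEAD bits if the pin was held the whole time, or if the page LSN is unchanged since the items were read, and otherwise give up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ITEMS 8

/* Hypothetical stand-in for a leaf page: an LSN plus one flag per item. */
typedef struct ToyPage
{
    uint64_t lsn;               /* advances whenever the page is modified */
    bool     dead[MAX_ITEMS];   /* models each item's LP_DEAD state */
} ToyPage;

/* Hypothetical record of what a scan remembers after reading a page. */
typedef struct ToyScanPos
{
    uint64_t lsn_at_read;       /* page LSN when the items were read */
    bool     pin_held;          /* true if the pin was never dropped */
    size_t   killed[MAX_ITEMS]; /* item offsets found dead in the heap */
    size_t   nkilled;           /* number of valid entries in killed[] */
} ToyScanPos;

/*
 * Flag the remembered items as dead, but only when that is known to be safe.
 * Returns the number of items newly flagged; 0 means nothing was flagged,
 * either because the scan gave up or because there was nothing left to do.
 */
static size_t
toy_kill_items(ToyPage *page, const ToyScanPos *pos)
{
    size_t nset = 0;

    assert(pos->nkilled <= MAX_ITEMS);

    /*
     * If the pin was dropped and the LSN moved, the page may have been
     * changed underneath us (for example, a TID may have been recycled),
     * so conservatively do nothing.
     */
    if (!pos->pin_held && page->lsn != pos->lsn_at_read)
        return 0;

    for (size_t i = 0; i < pos->nkilled; i++)
    {
        size_t off = pos->killed[i];

        if (off < MAX_ITEMS && !page->dead[off])
        {
            page->dead[off] = true;
            nset++;
        }
    }
    return nset;
}
```

In this model a scan that dropped its pin and then sees a different LSN flags nothing, while a scan that either kept its pin or sees the same LSN flags every remembered item. Giving up is always safe because LP_DEAD marking is only an optimization.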