Commit b150a767 authored by Peter Geoghegan

Fix nbtree deduplication README commentary.

Descriptions of some aspects of how deduplication works were unclear in
a couple of places.
parent 112b006f
@@ -780,20 +780,21 @@ order. Delaying deduplication minimizes page level fragmentation.
 Deduplication in unique indexes
 -------------------------------
 
-Very often, the range of values that can be placed on a given leaf page in
-a unique index is fixed and permanent. For example, a primary key on an
-identity column will usually only have page splits caused by the insertion
-of new logical rows within the rightmost leaf page. If there is a split
-of a non-rightmost leaf page, then the split must have been triggered by
-inserts associated with an UPDATE of an existing logical row. Splitting a
-leaf page purely to store multiple versions should be considered
-pathological, since it permanently degrades the index structure in order
-to absorb a temporary burst of duplicates. Deduplication in unique
-indexes helps to prevent these pathological page splits. Storing
-duplicates in a space efficient manner is not the goal, since in the long
-run there won't be any duplicates anyway. Rather, we're buying time for
-standard garbage collection mechanisms to run before a page split is
-needed.
+Very often, the number of distinct values that can ever be placed on
+almost any given leaf page in a unique index is fixed and permanent. For
+example, a primary key on an identity column will usually only have leaf
+page splits caused by the insertion of new logical rows within the
+rightmost leaf page. If there is a split of a non-rightmost leaf page,
+then the split must have been triggered by inserts associated with UPDATEs
+of existing logical rows. Splitting a leaf page purely to store multiple
+versions is a false economy. In effect, we're permanently degrading the
+index structure just to absorb a temporary burst of duplicates.
+
+Deduplication in unique indexes helps to prevent these pathological page
+splits. Storing duplicates in a space efficient manner is not the goal,
+since in the long run there won't be any duplicates anyway. Rather, we're
+buying time for standard garbage collection mechanisms to run before a
+page split is needed.
 
 Unique index leaf pages only get a deduplication pass when an insertion
 (that might have to split the page) observed an existing duplicate on the
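
The gate described in this hunk can be illustrated with a short C sketch. This is a minimal model of the decision only, not PostgreSQL's actual code: LeafPageState, should_attempt_dedup, and all of their fields are invented for illustration, and the real implementation tracks this state quite differently.

/*
 * Hypothetical sketch; names and types are invented and do not appear
 * in PostgreSQL.  Models when a leaf page that is about to split
 * should first get a deduplication pass.
 */
#include <stdbool.h>

typedef struct LeafPageState
{
    bool    is_unique_index;    /* does the index enforce uniqueness? */
    bool    saw_duplicate;      /* did this insertion observe an existing
                                 * duplicate of the incoming key? */
    int     free_space;         /* usable bytes left on the page */
} LeafPageState;

static bool
should_attempt_dedup(const LeafPageState *page, int new_item_size)
{
    /* Page isn't full: insert normally, delay deduplication */
    if (new_item_size <= page->free_space)
        return false;

    /*
     * In a unique index, duplicates can only be extra versions of the
     * same logical row.  If the insertion saw no duplicate, there is
     * nothing to merge, so go straight to a page split.
     */
    if (page->is_unique_index && !page->saw_duplicate)
        return false;

    /* Try to buy time for garbage collection before splitting */
    return true;
}

The unique-index special case is the second test: without an observed duplicate there can be no extra row versions to merge, so a deduplication pass could not free any space.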
@@ -838,13 +839,15 @@ list splits.
 
 Only a few isolated extra steps are required to preserve the illusion that
 the new item never overlapped with an existing posting list in the first
-place: the heap TID of the incoming tuple is swapped with the rightmost/max
-heap TID from the existing/originally overlapping posting list. Also, the
-posting-split-with-page-split case must generate a new high key based on
-an imaginary version of the original page that has both the final new item
-and the after-list-split posting tuple (page splits usually just operate
-against an imaginary version that contains the new item/item that won't
-fit).
+place: the heap TID of the incoming tuple has its TID replaced with the
+rightmost/max heap TID from the existing/originally overlapping posting
+list. Similarly, the original incoming item's TID is relocated to the
+appropriate offset in the posting list (we usually shift TIDs out of the
+way to make a hole for it). Finally, the posting-split-with-page-split
+case must generate a new high key based on an imaginary version of the
+original page that has both the final new item and the after-list-split
+posting tuple (page splits usually just operate against an imaginary
+version that contains the new item/item that won't fit).
 
 This approach avoids inventing an "eager" atomic posting split operation
 that splits the posting list without simultaneously finishing the insert
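
The TID shuffle in the changed paragraph can be sketched in a few lines of C. The sketch below is a simplified model under stated assumptions, not the real implementation (in PostgreSQL the logic operates on whole IndexTuples; see _bt_swap_posting() in nbtdedup.c): the posting list is modeled as a plain sorted array of bare TIDs, the incoming tuple takes over the list's rightmost/max TID, and the original incoming TID is shifted into its ordered slot.

/*
 * Simplified illustration; the Tid struct and both functions are
 * invented for this sketch and are not PostgreSQL's types.
 */
#include <string.h>

typedef struct Tid
{
    unsigned int    block;      /* heap block number */
    unsigned short  offset;     /* item offset within block */
} Tid;

static int
tid_cmp(Tid a, Tid b)
{
    if (a.block != b.block)
        return a.block < b.block ? -1 : 1;
    if (a.offset != b.offset)
        return a.offset < b.offset ? -1 : 1;
    return 0;
}

/*
 * Precondition: posting[] holds ntids TIDs in ascending order, and
 * incoming falls strictly below posting[ntids - 1], i.e. the new item
 * overlaps the posting list's TID range.
 *
 * Returns the posting list's rightmost/max TID (which becomes the
 * final new item's heap TID), shifts the larger TIDs right to make a
 * hole, and relocates the original incoming TID to its ordered offset.
 */
static Tid
posting_split_swap(Tid *posting, int ntids, Tid incoming)
{
    Tid max_tid = posting[ntids - 1];
    int pos = 0;

    while (pos < ntids - 1 && tid_cmp(posting[pos], incoming) < 0)
        pos++;

    /* shift TIDs out of the way to make a hole for the incoming TID */
    memmove(&posting[pos + 1], &posting[pos],
            (size_t) (ntids - 1 - pos) * sizeof(Tid));
    posting[pos] = incoming;

    return max_tid;
}

After the swap the posting list holds the same number of TIDs, still in ascending order, and the "new" item carries a TID greater than all of them, as if it had never overlapped in the first place.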