Commit 6312c08a authored by Peter Geoghegan

nbtree: Use raw PageAddItem() for retail inserts.

Only internal page splits need to call _bt_pgaddtup() instead of
PageAddItem(), and only for data items, one of which will end up at the
first offset (or first offset after the high key offset) on the new
right page.  This data item alone will need to be truncated in
_bt_pgaddtup().

Since there is no reason why retail inserts ever need to truncate the
incoming item, use a raw PageAddItem() call there instead.  Even
_bt_split() uses raw PageAddItem() calls for left page and right page
high keys.  Clearly the _bt_pgaddtup() shim function wasn't really
encapsulating anything.  _bt_pgaddtup() should now be thought of as a
_bt_split() helper function.

Note that the assertions from commit d1e241c2 verify that retail inserts
never insert an item at an internal page's negative infinity offset.
This invariant could only ever be violated as a result of a basic logic
error in nbtinsert.c.
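
The division of labor described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyPage`, `ToyItem`, and every `toy_*` function are hypothetical stand-ins invented for illustration, not the PostgreSQL `Page`, `IndexTuple`, `PageAddItem()`, or `_bt_pgaddtup()` definitions. The toy raw add only appends in offset order, whereas the real `PageAddItem()` can also shift existing items.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy types; NOT the PostgreSQL Page/IndexTuple structures. */
#define TOY_MAX_ITEMS 8
#define TOY_HIKEY 1				/* offset of the high key on non-rightmost pages */

typedef struct ToyItem
{
	int			nkeyatts;		/* key attributes kept; 0 means "minus infinity" */
	int			key;			/* separator key (ignored when nkeyatts == 0) */
	int			downlink;		/* child block number */
} ToyItem;

typedef struct ToyPage
{
	bool		is_leaf;
	bool		is_rightmost;
	int			nitems;
	ToyItem		items[TOY_MAX_ITEMS + 1];	/* 1-based offsets; slot 0 unused */
} ToyPage;

/* First data offset: right after the high key, unless the page is rightmost. */
static int
toy_first_data_offset(const ToyPage *page)
{
	return page->is_rightmost ? TOY_HIKEY : TOY_HIKEY + 1;
}

/* "Raw" add: stores the item unchanged.  Toy restriction: append-only. */
static bool
toy_page_add_item(ToyPage *page, ToyItem item, int off)
{
	if (page->nitems >= TOY_MAX_ITEMS || off != page->nitems + 1)
		return false;
	page->items[off] = item;
	page->nitems++;
	return true;
}

/* Split helper: truncates the first data item on an internal page. */
static bool
toy_pgaddtup(ToyPage *page, ToyItem item, int off)
{
	if (!page->is_leaf && off == toy_first_data_offset(page))
	{
		item.nkeyatts = 0;		/* becomes the negative infinity item */
		item.key = 0;
	}
	return toy_page_add_item(page, item, off);
}

/* Retail insert: never targets the negative infinity offset, so no truncation. */
static bool
toy_retail_insert(ToyPage *page, ToyItem item, int off)
{
	assert(page->is_leaf || off > toy_first_data_offset(page));
	return toy_page_add_item(page, item, off);
}
```

In this model only `toy_pgaddtup()` ever changes the incoming item, and only at the first data offset of an internal page. `toy_retail_insert()` merely asserts that it is not writing to that offset and then stores the item as-is.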
parent d41202f3
@@ -1249,7 +1249,8 @@ _bt_insertonpg(Relation rel,
 		if (postingoff != 0)
 			memcpy(oposting, nposting, MAXALIGN(IndexTupleSize(nposting)));
 
-		if (!_bt_pgaddtup(page, itemsz, itup, newitemoff))
+		if (PageAddItem(page, (Item) itup, itemsz, newitemoff, false,
+						false) == InvalidOffsetNumber)
 			elog(PANIC, "failed to add new item to block %u in index \"%s\"",
 				 BufferGetBlockNumber(buf), RelationGetRelationName(rel));
 
@@ -2528,19 +2529,30 @@ _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf)
 }
 
 /*
- *	_bt_pgaddtup() -- add a tuple to a particular page in the index.
+ *	_bt_pgaddtup() -- add a data item to a particular page during split.
  *
  *		This routine adds the tuple to the page as requested.  It does
  *		not affect pin/lock status, but you'd better have a write lock
  *		and pin on the target buffer!  Don't forget to write and release
  *		the buffer afterwards, either.
  *
- *		The main difference between this routine and a bare PageAddItem call
- *		is that this code knows that the leftmost index tuple on a non-leaf
- *		btree page has a key that must be treated as minus infinity.
- *		Therefore, it truncates away all attributes.  CAUTION: this works
- *		ONLY if we insert the tuples in order, so that the given itup_off
- *		does represent the final position of the tuple!
+ *		The difference between this routine and a bare PageAddItem call is
+ *		that this code knows that the leftmost data item on an internal
+ *		btree page has a key that must be treated as minus infinity.
+ *		Therefore, it truncates away all attributes.  This extra step is
+ *		only needed during internal page splits.
+ *
+ *		Truncation of an internal page data item can be thought of as one
+ *		of the steps used to "move" a boundary separator key during an
+ *		internal page split.  Conceptually, _bt_split() caller splits
+ *		internal pages "inside" the firstright data item: firstright's
+ *		separator key is used as the high key for the left page, while its
+ *		downlink is used within the first data item (also the negative
+ *		infinity item) for the right page.  Each distinct separator key
+ *		should appear no more than once per level of the tree.
+ *
+ *		CAUTION: this works ONLY if we insert the tuples in order, so that
+ *		the given itup_off does represent the final position of the tuple!
  */
 static bool
 _bt_pgaddtup(Page page,
...
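
The new comment's description of splitting "inside" the firstright data item can be sketched in isolation. This is a minimal illustration only: `SketchTuple`, `SketchSplit`, and `sketch_split_inside()` are hypothetical names that do not exist in nbtree, and using `-1` for an unused downlink is an assumption of the sketch rather than anything taken from the source.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an internal page tuple: separator key + downlink. */
typedef struct SketchTuple
{
	bool		has_key;		/* false models a fully truncated (minus infinity) key */
	int			key;			/* separator key, meaningful only when has_key */
	int			downlink;		/* child block number; -1 means "unused" here */
} SketchTuple;

/* The two pieces that firstright contributes to an internal page split. */
typedef struct SketchSplit
{
	SketchTuple left_highkey;	/* goes to the left page as its high key */
	SketchTuple right_first;	/* goes to the right page as first data item */
} SketchSplit;

/*
 * Split "inside" firstright: its separator key moves to the left page's
 * high key, and its downlink stays behind in the right page's first data
 * item, whose key is truncated away.  The separator key therefore ends up
 * in exactly one place on this level.
 */
static SketchSplit
sketch_split_inside(SketchTuple firstright)
{
	SketchSplit out;

	out.left_highkey.has_key = true;
	out.left_highkey.key = firstright.key;
	out.left_highkey.downlink = -1;		/* assumption: unused in this sketch */

	out.right_first.has_key = false;	/* negative infinity item */
	out.right_first.key = 0;
	out.right_first.downlink = firstright.downlink;

	return out;
}
```

The point of the sketch is that the key and the downlink of one tuple part ways: after the split, no data item on the right page repeats the separator key that now serves as the left page's high key.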