Commit 4e514c61 authored by Amit Kapila

Delete empty pages in each pass during GIST VACUUM.

Earlier, we used to postpone deleting empty pages till the second stage of
vacuum to amortize the cost of scanning internal pages.  However, that can
sometimes (say, when vacuum is canceled or errors out between the first and
second stage) delay the recycling of those pages.

Another thing is that to facilitate deleting empty pages in the second
stage, we need to share the information about internal and empty pages
between different stages of vacuum.  It would be quite tricky to share this
information via DSM, which is required for the upcoming parallel vacuum
patch.

Also, it will bring the logic to reclaim deleted pages closer to nbtree
where we delete empty pages in each pass.

Overall, the advantages of deleting empty pages in each pass outweigh the
advantages of postponing the same.

Author: Dilip Kumar, with changes by Amit Kapila
Reviewed-by: Sawada Masahiko and Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1LGr+MN0xHZpJ2dfS8QNQ1a_aROKowZB+MPNep8FVtwAA@mail.gmail.com
parent eae056c1
@@ -429,18 +429,17 @@ splits during searches, we don't need a "vacuum cycle ID" concept for that
 like B-tree does.
 
 While we scan all the pages, we also make note of any completely empty leaf
-pages. We will try to unlink them from the tree in the second stage. We also
-record the block numbers of all internal pages; they are needed in the second
-stage, to locate parents of the empty pages.
+pages. We will try to unlink them from the tree after the scan. We also record
+the block numbers of all internal pages; they are needed to locate parents of
+the empty pages while unlinking them.
 
-In the second stage, we try to unlink any empty leaf pages from the tree, so
-that their space can be reused. In order to delete an empty page, its
-downlink must be removed from the parent. We scan all the internal pages,
-whose block numbers we memorized in the first stage, and look for downlinks
-to pages that we have memorized as being empty. Whenever we find one, we
-acquire a lock on the parent and child page, re-check that the child page is
-still empty. Then, we remove the downlink and mark the child as deleted, and
-release the locks.
+We try to unlink any empty leaf pages from the tree, so that their space can
+be reused. In order to delete an empty page, its downlink must be removed from
+the parent. We scan all the internal pages, whose block numbers we memorized
+in the first stage, and look for downlinks to pages that we have memorized as
+being empty. Whenever we find one, we acquire a lock on the parent and child
+page, re-check that the child page is still empty. Then, we remove the
+downlink and mark the child as deleted, and release the locks.
 
 The insertion algorithm would get confused, if an internal page was completely
 empty. So we never delete the last child of an internal page, even if it's
......
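
The README text in the hunk above describes one pass: scan all pages while
remembering empty leaves and internal pages, then immediately unlink the empty
leaves from their parents. The following is a minimal sketch of that flow on a
toy in-memory model, not the code from this commit. ToyPage, MAX_PAGES,
MAX_DOWNLINKS and toy_vacuum_pass are hypothetical names invented for
illustration, and buffer locking, WAL logging and page recycling are left out
entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PAGES     64   /* hypothetical limit for this toy model */
#define MAX_DOWNLINKS 8    /* hypothetical fan-out of an internal page */

/* Hypothetical toy page; this is not the on-disk GiST page layout. */
typedef struct ToyPage
{
    bool is_leaf;
    bool deleted;
    int  ntuples;                   /* leaf pages: number of index tuples */
    int  ndownlinks;                /* internal pages: number of children */
    int  downlinks[MAX_DOWNLINKS];  /* internal pages: child block numbers */
} ToyPage;

/*
 * One vacuum pass over the toy index.  Returns the number of leaf pages
 * that were unlinked and marked deleted.
 */
static int
toy_vacuum_pass(ToyPage *pages, int npages)
{
    bool empty_leaf[MAX_PAGES] = {false};
    int  internal[MAX_PAGES];
    int  ninternal = 0;
    int  ndeleted = 0;

    assert(npages >= 0 && npages <= MAX_PAGES);

    /* Scan every page, remembering empty leaves and internal pages. */
    for (int blk = 0; blk < npages; blk++)
    {
        ToyPage *p = &pages[blk];

        if (p->deleted)
            continue;
        if (p->is_leaf)
            empty_leaf[blk] = (p->ntuples == 0);
        else
            internal[ninternal++] = blk;
    }

    /* Right after the scan, unlink the empty leaves from their parents. */
    for (int i = 0; i < ninternal; i++)
    {
        ToyPage *parent = &pages[internal[i]];
        int      j = 0;

        while (j < parent->ndownlinks)
        {
            int      child_blk = parent->downlinks[j];
            ToyPage *child;

            assert(child_blk >= 0 && child_blk < npages);
            child = &pages[child_blk];

            /*
             * A real implementation would lock parent and child here.
             * Re-check that the child is still empty, and never remove
             * the last child of an internal page.
             */
            if (parent->ndownlinks > 1 && empty_leaf[child_blk] &&
                child->is_leaf && !child->deleted && child->ntuples == 0)
            {
                /* Remove downlink j by shifting the later ones left. */
                for (int k = j; k < parent->ndownlinks - 1; k++)
                    parent->downlinks[k] = parent->downlinks[k + 1];
                parent->ndownlinks--;
                child->deleted = true;
                ndeleted++;
                /* Do not advance j: slot j now holds the next downlink. */
            }
            else
                j++;
        }
    }

    return ndeleted;
}
```

Because both steps run inside the same call, the lists of empty leaves and
internal pages are plain local state in this sketch and never need to outlive
the pass, which mirrors the commit message's point about not having to share
that information between vacuum stages.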