Commit 5b9c1e6d authored by Robert Haas

Doc updates for index-only scans.

Document that routine vacuuming is now also important for the purpose
of index-only scans; and mention in the section that describes the
visibility map that it is used to implement index-only scans.

Marti Raudsepp, with some changes by me.
parent f70f095c
@@ -101,6 +101,11 @@
 <productname>PostgreSQL</productname> query planner.</simpara>
 </listitem>
+<listitem>
+<simpara>To update the visibility map, which speeds up index-only
+scans.</simpara>
+</listitem>
 <listitem>
 <simpara>To protect against loss of very old data due to
 <firstterm>transaction ID wraparound</>.</simpara>
@@ -329,6 +334,33 @@
 </tip>
 </sect2>
+<sect2 id="vacuum-for-visibility-map">
+<title>Updating The Visibility Map</title>
+<para>
+Vacuum maintains a <link linkend="storage-vm">visibility map</> for each
+table to keep track of which pages contain only tuples that are known to be
+visible to all active transactions (and all future transactions, until the
+page is again modified). This has two purposes. First, vacuum
+itself can skip such pages on the next run, since there is nothing to
+clean up.
+</para>
+<para>
+Second, it allows <productname>PostgreSQL</productname> to answer some
+queries using only the index, without reference to the underlying table.
+Since <productname>PostgreSQL</productname> indexes don't contain tuple
+visibility information, a normal index scan fetches the heap tuple for each
+matching index entry, to check whether it should be seen by the current
+transaction. An <firstterm>index-only scan</>, on the other hand, checks
+the visibility map first. If it's known that all tuples on the page are
+visible, the heap fetch can be skipped. This is most noticeable on
+large data sets where the visibility map can prevent disk accesses.
+The visibility map is vastly smaller than the heap, so it can easily be
+cached even when the heap is very large.
+</para>
+</sect2>
 <sect2 id="vacuum-for-wraparound">
 <title>Preventing Transaction ID Wraparound Failures</title>
......
@@ -494,11 +494,16 @@ Note that indexes do not have VMs.
 <para>
 The visibility map simply stores one bit per heap page. A set bit means
 that all tuples on the page are known to be visible to all transactions.
-This means that the page does not contain any tuples that need to be vacuumed;
-in future it might also be used to avoid visiting the page for visibility
-checks. The map is conservative in the sense that we
-make sure that whenever a bit is set, we know the condition is true, but if
-a bit is not set, it might or might not be true.
+This means that the page does not contain any tuples that need to be vacuumed.
+This information can also be used by <firstterm>index-only scans</> to answer
+queries using only the index tuple.
+</para>
+<para>
+The map is conservative in the sense that we make sure that whenever a bit is
+set, we know the condition is true, but if a bit is not set, it might or
+might not be true. Visibility map bits are only set by vacuum, but are
+cleared by any data-modifying operations on a page.
 </para>
 </sect1>
......