Commit 9bd7ed82 authored by Tom Lane

Clean up some obsolete statements about GiST indexes, and add a section
documenting GiST crash recovery procedures, as requested some time ago
by Teodor.  (The GiST chapter doesn't seem quite the right place for
the latter, but I'm not sure what else to do with it.)
parent d1959f9f
<!-- <!--
$PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.21 2005/07/02 20:08:27 momjian Exp $ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.22 2005/10/21 01:41:28 tgl Exp $
--> -->
<chapter id="GiST"> <chapter id="GiST">
<title>GiST Indexes</title> <title>GiST Indexes</title>
<sect1 id="intro"> <sect1 id="gist-intro">
<title>Introduction</title> <title>Introduction</title>
<para> <para>
...@@ -44,7 +44,7 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.21 2005/07/02 20:08:27 momjian Exp ...@@ -44,7 +44,7 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.21 2005/07/02 20:08:27 momjian Exp
</sect1> </sect1>
<sect1 id="extensibility"> <sect1 id="gist-extensibility">
<title>Extensibility</title> <title>Extensibility</title>
<para> <para>
...@@ -92,7 +92,7 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.21 2005/07/02 20:08:27 momjian Exp ...@@ -92,7 +92,7 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.21 2005/07/02 20:08:27 momjian Exp
</sect1> </sect1>
<sect1 id="implementation"> <sect1 id="gist-implementation">
<title>Implementation</title> <title>Implementation</title>
<para> <para>
...@@ -180,19 +180,24 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.21 2005/07/02 20:08:27 momjian Exp ...@@ -180,19 +180,24 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.21 2005/07/02 20:08:27 momjian Exp
</sect1> </sect1>
<sect1 id="examples"> <sect1 id="gist-examples">
<title>Examples</title> <title>Examples</title>
<para> <para>
To see example implementations of index methods implemented using The <productname>PostgreSQL</productname> source distribution includes
<acronym>GiST</acronym>, examine the following contrib modules: several examples of index methods implemented using
<acronym>GiST</acronym>. The core system currently provides R-Tree
equivalent functionality for some of the built-in geometric datatypes
(see <filename>src/backend/access/gist/gistproc.c</>). The following
<filename>contrib</> modules also contain <acronym>GiST</acronym>
operator classes:
</para> </para>
<variablelist> <variablelist>
<varlistentry> <varlistentry>
<term>btree_gist</term> <term>btree_gist</term>
<listitem> <listitem>
<para>B-Tree</para> <para>B-Tree equivalent functionality for several datatypes</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
...@@ -213,26 +218,26 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.21 2005/07/02 20:08:27 momjian Exp ...@@ -213,26 +218,26 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.21 2005/07/02 20:08:27 momjian Exp
<varlistentry> <varlistentry>
<term>ltree</term> <term>ltree</term>
<listitem> <listitem>
<para>Indexing for tree-like stuctures</para> <para>Indexing for tree-like structures</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term>rtree_gist</term> <term>pg_trgm</term>
<listitem> <listitem>
<para>R-Tree</para> <para>Text similarity using trigram matching</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term>seg</term> <term>seg</term>
<listitem> <listitem>
<para>Storage and indexed access for <quote>float ranges</quote></para> <para>Indexing for <quote>float ranges</quote></para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term>tsearch and tsearch2</term> <term>tsearch2</term>
<listitem> <listitem>
<para>Full text indexing</para> <para>Full text indexing</para>
</listitem> </listitem>
...@@ -241,4 +246,33 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.21 2005/07/02 20:08:27 momjian Exp ...@@ -241,4 +246,33 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.21 2005/07/02 20:08:27 momjian Exp
</sect1> </sect1>
<sect1 id="gist-recovery">
<title>Crash Recovery</title>
<para>
Usually, replay of the WAL log is sufficient to restore the integrity
of a GiST index following a database crash. However, there are some
corner cases in which the index state is not fully rebuilt. The index
will still be functionally correct, but there may be some performance
degradation. When this occurs, the index can be repaired by
<command>VACUUM</>ing its table, or by rebuilding the index using
<command>REINDEX</>. In some cases a plain <command>VACUUM</> is
not sufficient, and either <command>VACUUM FULL</> or <command>REINDEX</>
is needed. The need for one of these procedures is indicated by occurrence
of this log message during crash recovery:
<programlisting>
LOG: index NNN/NNN/NNN needs VACUUM or REINDEX to finish crash recovery
</programlisting>
or this log message during routine index insertions:
<programlisting>
LOG: index "FOO" needs VACUUM or REINDEX to finish crash recovery
</programlisting>
If a plain <command>VACUUM</> finds itself unable to complete recovery
fully, it will return a notice:
<programlisting>
NOTICE: index "FOO" needs VACUUM FULL or REINDEX to finish crash recovery
</programlisting>
</para>
</sect1>
</chapter> </chapter>
<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.52 2005/09/12 19:17:45 tgl Exp $ --> <!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.53 2005/10/21 01:41:28 tgl Exp $ -->
<chapter id="indexes"> <chapter id="indexes">
<title id="indexes-title">Indexes</title> <title id="indexes-title">Indexes</title>
...@@ -206,14 +206,6 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> ...@@ -206,14 +206,6 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
<synopsis> <synopsis>
CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING hash (<replaceable>column</replaceable>); CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING hash (<replaceable>column</replaceable>);
</synopsis> </synopsis>
<note>
<para>
Testing has shown <productname>PostgreSQL</productname>'s hash
indexes to perform no better than B-tree indexes, and the
index size and build time for hash indexes is much worse. For
these reasons, hash index use is presently discouraged.
</para>
</note>
</para> </para>
<para> <para>
...@@ -226,15 +218,33 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> ...@@ -226,15 +218,33 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
equivalent to the R-tree operator classes, and many other GiST operator equivalent to the R-tree operator classes, and many other GiST operator
classes are available in the <literal>contrib</> collection or as separate classes are available in the <literal>contrib</> collection or as separate
projects. For more information see <xref linkend="GiST">. projects. For more information see <xref linkend="GiST">.
<note>
<para>
It is likely that the R-tree index type will be retired in a future
release, as GiST indexes appear to do everything R-trees can do with
similar or better performance. Users are encouraged to migrate
applications that use R-tree indexes to GiST indexes.
</para>
</note>
</para> </para>
<note>
<para>
Testing has shown <productname>PostgreSQL</productname>'s hash
indexes to perform no better than B-tree indexes, and the
index size and build time for hash indexes is much worse.
Furthermore, hash index operations are not presently WAL-logged,
so hash indexes may need to be rebuilt with <command>REINDEX</>
after a database crash.
For these reasons, hash index use is presently discouraged.
</para>
<para>
Similarly, R-tree indexes do not seem to have any performance
advantages compared to the equivalent operations of GiST indexes.
Like hash indexes, they are not WAL-logged and may need
<command>REINDEX</>ing after a database crash.
</para>
<para>
While the problems with hash indexes may be fixed eventually,
it is likely that the R-tree index type will be retired in a future
release. Users are encouraged to migrate applications that use R-tree
indexes to GiST indexes.
</para>
</note>
</sect1> </sect1>
...@@ -300,9 +310,12 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor); ...@@ -300,9 +310,12 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
<para> <para>
A multicolumn GiST index can only be used when there is a query condition A multicolumn GiST index can only be used when there is a query condition
on its leading column. As with B-trees, conditions on additional columns on its leading column. Conditions on additional columns restrict the
restrict the entries returned by the index, but do not in themselves aid entries returned by the index, but the condition on the first column is the
the index search. most important one for determining how much of the index needs to be
scanned. A GiST index will be relatively ineffective if its first column
has only a few distinct values, even if there are many distinct values in
additional columns.
</para> </para>
<para> <para>
......
<!-- <!--
$PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.51 2005/06/13 02:40:05 neilc Exp $ $PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.52 2005/10/21 01:41:28 tgl Exp $
--> -->
<chapter id="mvcc"> <chapter id="mvcc">
...@@ -965,41 +965,41 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222; ...@@ -965,41 +965,41 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
<variablelist> <variablelist>
<varlistentry> <varlistentry>
<term> <term>
B-tree indexes B-tree and <acronym>GiST</acronym> indexes
</term> </term>
<listitem> <listitem>
<para> <para>
Short-term share/exclusive page-level locks are used for Short-term share/exclusive page-level locks are used for
read/write access. Locks are released immediately after each read/write access. Locks are released immediately after each
index row is fetched or inserted. B-tree indexes provide index row is fetched or inserted. These index types provide
the highest concurrency without deadlock conditions. the highest concurrency without deadlock conditions.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term> <term>
<acronym>GiST</acronym> and R-tree indexes Hash indexes
</term> </term>
<listitem> <listitem>
<para> <para>
Share/exclusive index-level locks are used for read/write access. Share/exclusive hash-bucket-level locks are used for read/write
Locks are released after the command is done. access. Locks are released after the whole bucket is processed.
Bucket-level locks provide better concurrency than index-level
ones, but deadlock is possible since the locks are held longer
than one index operation.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term> <term>
Hash indexes R-tree indexes
</term> </term>
<listitem> <listitem>
<para> <para>
Share/exclusive hash-bucket-level locks are used for read/write Share/exclusive index-level locks are used for read/write access.
access. Locks are released after the whole bucket is processed. Locks are released after the entire command is done.
Bucket-level locks provide better concurrency than index-level
ones, but deadlock is possible since the locks are held longer
than one index operation.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
...@@ -1007,14 +1007,13 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222; ...@@ -1007,14 +1007,13 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
</para> </para>
<para> <para>
In short, B-tree indexes offer the best performance for concurrent Currently, B-tree indexes offer the best performance for concurrent
applications; since they also have more features than hash applications; since they also have more features than hash
indexes, they are the recommended index type for concurrent indexes, they are the recommended index type for concurrent
applications that need to index scalar data. When dealing with applications that need to index scalar data. When dealing with
non-scalar data, B-trees obviously cannot be used; in that non-scalar data, B-trees are not useful, and GiST indexes should
situation, application developers should be aware of the be used instead. R-tree indexes are deprecated and are likely
relatively poor concurrent performance of GiST and R-tree to disappear entirely in a future release.
indexes.
</para> </para>
</sect1> </sect1>
</chapter> </chapter>
......
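
The recovery procedure described in the new Crash Recovery section can be sketched in SQL roughly as follows. This is an illustration only: the table `places` and the index `places_loc_idx` are hypothetical names that do not appear in the commit, and the sketch assumes a server version in which the core system provides a GiST operator class for `box`, as the revised Examples text states.

```sql
-- Hypothetical table and GiST index, used only for illustration.
CREATE TABLE places (loc box);
CREATE INDEX places_loc_idx ON places USING gist (loc);

-- After a crash, if the server log reports that the index
-- "needs VACUUM or REINDEX to finish crash recovery",
-- first try a plain VACUUM of the table that owns the index.
VACUUM places;

-- If that VACUUM returns the NOTICE asking for VACUUM FULL or REINDEX,
-- rebuild the index instead. (VACUUM FULL places; is the other
-- option the documentation names.)
REINDEX INDEX places_loc_idx;
```

According to the added documentation, the index remains functionally correct until one of these steps is run, so the repair addresses possible performance degradation rather than wrong query results.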