Commit 6d8b2aa8 authored Oct 05, 2015 by Bruce Momjian
docs: update guidelines on when to use GIN and GiST indexes
Report by Tomas Vondra
Backpatch through 9.5
parent f8a5e579
Showing 1 changed file with 19 additions and 61 deletions
doc/src/sgml/textsearch.sgml
...
...
@@ -3192,7 +3192,7 @@ SELECT plainto_tsquery('supernovae stars');
</sect1>
<sect1 id="textsearch-indexes">
-<title>GiST and GIN Index Types</title>
+<title>GIN and GiST Index Types</title>
<indexterm zone="textsearch-indexes">
<primary>text search</primary>
...
...
@@ -3213,18 +3213,17 @@ SELECT plainto_tsquery('supernovae stars');
<term>
<indexterm zone="textsearch-indexes">
<primary>index</primary>
-<secondary>GiST</secondary>
+<secondary>GIN</secondary>
<tertiary>text search</tertiary>
</indexterm>
-<literal>CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING GIST (<replaceable>column</replaceable>);</literal>
+<literal>CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING GIN (<replaceable>column</replaceable>);</literal>
</term>
<listitem>
<para>
-Creates a GiST (Generalized Search Tree)-based index.
-The <replaceable>column</replaceable> can be of <type>tsvector</> or
-<type>tsquery</> type.
+Creates a GIN (Generalized Inverted Index)-based index.
+The <replaceable>column</replaceable> must be of <type>tsvector</> type.
</para>
</listitem>
</varlistentry>
...
...
@@ -3234,17 +3233,18 @@ SELECT plainto_tsquery('supernovae stars');
<term>
<indexterm zone="textsearch-indexes">
<primary>index</primary>
-<secondary>GIN</secondary>
+<secondary>GiST</secondary>
<tertiary>text search</tertiary>
</indexterm>
-<literal>CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING GIN (<replaceable>column</replaceable>);</literal>
+<literal>CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING GIST (<replaceable>column</replaceable>);</literal>
</term>
<listitem>
<para>
-Creates a GIN (Generalized Inverted Index)-based index.
-The <replaceable>column</replaceable> must be of <type>tsvector</> type.
+Creates a GiST (Generalized Search Tree)-based index.
+The <replaceable>column</replaceable> can be of <type>tsvector</> or
+<type>tsquery</> type.
</para>
</listitem>
</varlistentry>
...
...
@@ -3253,13 +3253,18 @@ SELECT plainto_tsquery('supernovae stars');
</para>
<para>
-There are substantial performance differences between the two index types,
-so it is important to understand their characteristics.
+GIN indexes are the preferred text search index type. As inverted
+indexes, they contain an index entry for each word (lexeme), with a
+compressed list of matching locations. Multi-word searches can find
+the first match, then use the index to remove rows that are lacking
+additional words. GIN indexes store only the words (lexemes) of
+<type>tsvector</> values, and not their weight labels. Thus a table
+row recheck is needed when using a query that involves weights.
</para>
<para>
A GiST index is <firstterm>lossy</firstterm>, meaning that the index
-may produce false matches, and it is necessary
+might produce false matches, and it is necessary
to check the actual table row to eliminate such false matches.
(<productname>PostgreSQL</productname> does this automatically when needed.)
GiST indexes are lossy because each document is represented in the
...
...
@@ -3280,53 +3285,6 @@ SELECT plainto_tsquery('supernovae stars');
recommended.
</para>
-<para>
-GIN indexes are not lossy for standard queries, but their performance
-depends logarithmically on the number of unique words.
-(However, GIN indexes store only the words (lexemes) of <type>tsvector</>
-values, and not their weight labels. Thus a table row recheck is needed
-when using a query that involves weights.)
-</para>
-<para>
-In choosing which index type to use, GiST or GIN, consider these
-performance differences:
-<itemizedlist spacing="compact" mark="bullet">
-<listitem>
-<para>
-GIN index lookups are about three times faster than GiST
-</para>
-</listitem>
-<listitem>
-<para>
-GIN indexes take about three times longer to build than GiST
-</para>
-</listitem>
-<listitem>
-<para>
-GIN indexes are moderately slower to update than GiST indexes, but
-about 10 times slower if fast-update support was disabled
-(see <xref linkend="gin-fast-update"> for details)
-</para>
-</listitem>
-<listitem>
-<para>
-GIN indexes are two-to-three times larger than GiST indexes
-</para>
-</listitem>
-</itemizedlist>
-</para>
-<para>
-As a rule of thumb, <acronym>GIN</acronym> indexes are best for static data
-because lookups are faster. For dynamic data, GiST indexes are
-faster to update. Specifically, <acronym>GiST</acronym> indexes are very
-good for dynamic data and fast if the number of unique words (lexemes) is
-under 100,000, while <acronym>GIN</acronym> indexes will handle 100,000+
-lexemes better but are slower to update.
-</para>
<para>
Note that <acronym>GIN</acronym> index build time can often be improved
by increasing <xref linkend="guc-maintenance-work-mem">, while
...
...
@@ -3335,7 +3293,7 @@ SELECT plainto_tsquery('supernovae stars');
</para>
<para>
-Partitioning of big collections and the proper use of GiST and GIN indexes
+Partitioning of big collections and the proper use of GIN and GiST indexes
allows the implementation of very fast searches with online update.
Partitioning can be done at the database level using table inheritance,
or by distributing documents over
...
...
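Note on the new GIN paragraph: the wording about "an index entry for each word (lexeme)" and about multi-word searches removing rows that lack additional words may be easier to follow with a toy model. The Python sketch below is only an illustration under stated assumptions, not how PostgreSQL implements GIN. The sample rows, the weight labels, and the function names `build_inverted_index`, `search_all`, and `search_weighted` are all hypothetical. It shows an inverted index that keeps lexemes but drops weights, so a weighted query has to recheck the row itself.

```python
from collections import defaultdict

# Hypothetical sample data: row id -> list of (lexeme, weight label) pairs.
ROWS = {
    1: [("supernova", "A"), ("star", "B")],
    2: [("star", "A"), ("galaxy", "B")],
    3: [("supernova", "B"), ("star", "B"), ("galaxy", "A")],
}


def build_inverted_index(rows):
    """Map each lexeme to the sorted row ids containing it; weights are dropped."""
    index = defaultdict(set)
    for row_id, lexemes in rows.items():
        for lexeme, _weight in lexemes:
            index[lexeme].add(row_id)
    return {lexeme: sorted(ids) for lexeme, ids in index.items()}


def search_all(index, words):
    """Return rows containing every word, using only the index."""
    if not words:
        return []
    # Start from the rows matching the first word ...
    candidates = set(index.get(words[0], []))
    # ... then remove rows that lack any of the additional words.
    for word in words[1:]:
        candidates &= set(index.get(word, []))
    return sorted(candidates)


def search_weighted(index, rows, word, weight):
    """The index only yields candidates; the weight is rechecked against the row."""
    return [r for r in index.get(word, []) if (word, weight) in rows[r]]


INDEX = build_inverted_index(ROWS)
```

In this model, `search_all(INDEX, ["supernova", "star"])` never touches `ROWS`, while `search_weighted(INDEX, ROWS, "star", "A")` must read each candidate row, which mirrors the "table row recheck" the new text mentions for queries involving weights.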