Commit 89a8e156 authored by Teodor Sigaev

Update docs from Andrew J. Kopciuch <akopciuch@bddf.ca>

parent 02409a48
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title>tsearch-v2-intro</title>
</head>
<body class="content">
<div class="content">
<h2>Tsearch2 - Introduction</h2>
<p><a href="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html">[Online version]</a>
of this document is available.</p>
<p>The tsearch2 module is available to add as an extension to the
PostgreSQL database to allow for Full Text Indexing. This document
is an introduction to installing, configuring, using and
maintaining the database with the tsearch2 module activated.</p>
<p>Please note that the tsearch2 module is fully incompatible with the
old tsearch module, which is deprecated in 7.4 and will be obsolete
in 7.5.</p>
<h3>USING TSEARCH2 AND POSTGRESQL FOR A WEB BASED SEARCH ENGINE</h3>
<p>This documentation is provided as a short guide on how to
quickly get up and running with tsearch2 and PostgreSQL, for those
who want to implement a full text index based search engine. It
is not meant to be a complete in-depth guide to the full ins and
outs of the contrib/tsearch2 module, and is primarily aimed at
beginners who want to speed up searching of large text fields, or
those migrating from other database systems such as MS-SQL.</p>
<p>The README.tsearch2 file included in the contrib/tsearch2
directory contains a brief overview and history behind tsearch.
This can also be found online
<a href="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/">[right here]</a>.</p>
<p>Further in-depth documentation, such as a full function
reference and user guide, can be found online at the
<a href="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/">[tsearch documentation home]</a>.</p>
<h3>ACKNOWLEDGEMENTS</h3>
<p>Robert John Shepherd originally wrote this documentation for the
previous version of the tsearch module (v1) included with the
PostgreSQL release. I took his documentation and updated it to
comply with the tsearch2 modifications.</p>
<p>Robert's original acknowledgements:</p>
<p>"Thanks to Oleg Bartunov for taking the time to answer many of
my questions regarding this module, and also to Teodor Sigaev for
clearing up the process of making your own dictionaries. Plus of
course a big thanks to the pair of them for writing this module in
the first place!"</p>
<p>I would also like to extend my thanks to the developers, and to
Oleg Bartunov for all of his direction and help with the new
features of tsearch2.</p>
<h3>OVERVIEW</h3>
<p>MS-SQL provides a full text indexing (FTI) system which enables
the fast searching of text based fields, very useful for websites
(and other applications) that require a results set based on key
words. PostgreSQL ships with a contributed module called tsearch2,
which implements a special type of index that can also be used for
full text indexing. Furthermore, unlike MS' offering, which
requires regular incremental rebuilds of the text indexes
themselves, tsearch2 indexes are always up-to-date and keeping them
so induces very little overhead.</p>
<p>Before we get into the details, it is recommended that you have
installed and tested PostgreSQL, are reasonably familiar with
databases, the SQL query language and also understand the basics of
connecting to PostgreSQL from the local shell. This document isn't
intended for the complete PostgreSQL newbie, but anyone with a
reasonable grasp of the basics should be able to follow it.</p>
<h3>INSTALLATION</h3>
<p>Starting with PostgreSQL version 7.4, tsearch2 is included in
the contrib directory of the PostgreSQL sources. contrib/tsearch2
is where you will find everything needed to install and use
tsearch2. Please note that tsearch2 will also work with PostgreSQL
version 7.3.x, but it is not the module included with that source
distribution. You will have to download the module separately and
install it in the same fashion.</p>
<p>I installed the tsearch2 module to a PostgreSQL 7.3 database
from the contrib directory without squashing the original (old)
tsearch module. What I did was move the module's tsearch src
directory into the contrib tree under the name tsearchV2.</p>
<p>Step one is to download the tsearch V2 module:</p>
<p><a href="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/">[http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/]</a>
(check Development History for the latest stable version!)</p>
<pre>
tar -zxvf tsearch-v2.tar.gz
mv tsearch2 $PGSQL_SRC/contrib/
cd $PGSQL_SRC/contrib/tsearch2
</pre>
<p>If you are installing from PostgreSQL version 7.4 or higher, you
can skip those steps and just change to the contrib/tsearch2
directory in the source tree and continue from there.</p>
<p>As of May 9, 2004 there is a source patch available for
tsearch2. The patch provides changes to the pg_ts_ configuration
tables to allow for easy dump and restore of a database containing
tsearch2. The patch is available here:
<a href="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/regprocedure_7.4.patch.gz">[http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/regprocedure_7.4.patch.gz]</a></p>
<p>To apply this patch, download the mentioned file and place it in
your PostgreSQL source tree ($PGSQL_SRC). This patch is not
required for tsearch2 to work. I would, however, highly recommend
it, as it makes the backup and restore procedures very simple.</p>
<pre>
cd $PGSQL_SRC
gunzip regprocedure_7.4.patch.gz
patch -b -p1 &lt; regprocedure_7.4.patch
</pre>
<p>If you have a working version of tsearch2 in your database, you
do not need to re-install the tsearch2 module. Just apply the patch
and run make. This patch only affects the tsearch2.sql file. You
can run the SQL script found
<a href="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/regprocedure_update.sql">[right here]</a>.
This script will make the modifications found in the patch, and
update the fields from the existing data. From this point on, you
can dump and restore the database in a normal fashion. Without this
patch, you must follow the instructions later in this document for
backup and restore.</p>
<p>This patch is only needed for tsearch2 in PostgreSQL versions
7.3.x and 7.4.x. The patch has been applied to the sources for
7.5.x.</p>
<p>When you have your source tree for tsearch2 ready, you can
continue with the regular building and installation process:</p>
<pre>
gmake
gmake install
gmake installcheck
</pre>
<p>That is pretty much all you have to do, unless of course you get
errors. However, if you get those, you had better go check with the
mailing lists over at
<a href="http://www.postgresql.org">http://www.postgresql.org</a> or
<a href="http://openfts.sourceforge.net/">http://openfts.sourceforge.net/</a>,
since it has never failed for me.</p>
<p>If you ever need to revert this patch, and go back to the
unpatched version of tsearch2, it is simple if you followed the
above patch command. The -b option creates a backup of the original
file, so we can just copy it back.</p>
<pre>
cd $PGSQL_SRC/contrib/tsearch2
cp tsearch.sql.in.orig tsearch.sql.in
make
</pre>
<p>If you need the patched version again, just follow the patch
instructions again.</p>
<p>The directory in contrib/ and the directory from the archive
are both called tsearch2. Tsearch2 is completely incompatible with
the previous version of tsearch. This means that both versions can
be installed into a single database, and migration to the new
version may be much easier.</p>
<p>NOTE: the previous version of tsearch found in the
contrib/tsearch directory is deprecated. Although it is still
available and included within PostgreSQL version 7.4, it will be
removed in version 7.5.</p>
<h3>ADDING TSEARCH2 FUNCTIONALITY TO A DATABASE</h3>
<p>We should create a database to use as an example for the
remainder of this file. We can call the database "ftstest". You can
create it from the command line like this:</p>
<pre>
#createdb ftstest
</pre>
<p>If you thought installation was easy, this next bit is even
easier. Change to the $PGSQL_SRC/contrib/tsearch2 directory and
type:</p>
<pre>
psql ftstest &lt; tsearch2.sql
</pre>
<p>The file "tsearch2.sql" holds all the wonderful little goodies
you need to do full text indexing. It defines numerous functions
and operators, and creates the needed tables in the database. There
will be 4 new tables created after running the tsearch2.sql file:
pg_ts_dict, pg_ts_parser, pg_ts_cfg and pg_ts_cfgmap.</p>
<p>You can check out the tables if you like:</p>
<pre>
#psql ftstest
ftstest=# \d
         List of relations
 Schema |     Name      | Type  |  Owner
--------+---------------+-------+----------
 public | pg_ts_cfg     | table | kopciuch
 public | pg_ts_cfgmap  | table | kopciuch
 public | pg_ts_dict    | table | kopciuch
 public | pg_ts_parser  | table | kopciuch
(4 rows)
</pre>
<h3>TYPES AND FUNCTIONS PROVIDED BY TSEARCH2</h3>
<p>The first thing we can do is try out some of the types that are
provided for us. Let's look at the tsvector type provided for
us:</p>
<pre>
SELECT 'Our first string used today'::tsvector;
                tsvector
---------------------------------------
 'Our' 'used' 'first' 'today' 'string'
(1 row)
</pre>
<p>The results are the words used within our string. Notice they
are not in any particular order. The tsvector type returns a string
of space separated words.</p>
<pre>
SELECT 'Our first string used today first string'::tsvector;
                tsvector
-----------------------------------------------
 'Our' 'used' 'first' 'today' 'string'
(1 row)
</pre>
<p>Notice the results string has each unique word ('first' and
'string' only appear once in the tsvector value). Which of course
makes sense if you are searching the full text ... you only need to
know each unique word in the text.</p>
<p>Those examples were just casting a text field to that of type
tsvector. Let's check out one of the new functions created by the
tsearch2 module.</p>
<p>The function to_tsvector has 3 possible signatures:</p>
<pre>
to_tsvector(oid, text);
to_tsvector(text, text);
to_tsvector(text);
</pre>
<p>We will use the second method, using two text fields. The
overloaded methods provide us with a way to specify the way the
searchable text is broken up into words (the stemming process).
Right now we will specify the 'default' configuration. See the
section on TSEARCH2 CONFIGURATION to learn more about this.</p>
<pre>
SELECT to_tsvector('default',
   'Our first string used today first string');
                to_tsvector
--------------------------------------------
 'use':4 'first':2,6 'today':5 'string':3,7
(1 row)
</pre>
<p>The result returned from this function is of type tsvector. The
results came about by this reasoning: All of the words in the text
passed in are stemmed, or not used because they are stop words
defined in our configuration. Each lower case morphed word is
returned with all of its positions in the text.</p>
<p>In this case the word "Our" is a stop word in the default
configuration. That means it will not be included in the result.
The word "first" is found at positions 2 and 6 (although "Our" is a
stop word, its position is maintained). The word positioning is
maintained exactly as in the original string. The word "used" is
morphed to the word "use" based on the default configuration for
word stemming, and is found at position 4. The rest of the results
follow the same logic. Just a reminder again ... the order of the
'word' positions in the output is not in any kind of order (ie
'use':4 appears first).</p>
<p>If you want to view the output of the tsvector fields without
their positions, you can do so with the function
"strip(tsvector)".</p>
<pre>
SELECT strip(to_tsvector('default',
   'Our first string used today first string'));
             strip
--------------------------------
 'use' 'first' 'today' 'string'
</pre>
<p>If you wish to know the number of unique words returned in the
tsvector you can do so by using the function "length(tsvector)".</p>
<pre>
SELECT length(to_tsvector('default',
   'Our first string used today first string'));
 length
--------
      4
(1 row)
</pre>
<p>Let's take a look at the function to_tsquery. It also has 3
signatures, which follow the same rationale as the to_tsvector
function:</p>
<pre>
to_tsquery(oid, text);
to_tsquery(text, text);
to_tsquery(text);
</pre>
<p>Let's try using the function with a single word:</p>
<pre>
SELECT to_tsquery('default', 'word');
 to_tsquery
-----------
 'word'
(1 row)
</pre>
<p>I call the function the same way I would a to_tsvector function,
specifying the 'default' configuration for morphing, and the result
is the stemmed output 'word'.</p>
<p>Let's attempt to use the function with a string of multiple
words:</p>
<pre>
SELECT to_tsquery('default', 'this is many words');
ERROR:  Syntax error
</pre>
<p>The function can not accept a space separated string. The
intention of the to_tsquery function is to return a type of
"tsquery" used for searching a tsvector field. What we need to do
is search for one to many words with some kind of logic (for now,
simple boolean).</p>
<pre>
SELECT to_tsquery('default', 'searching|sentence');
      to_tsquery
----------------------
 'search' | 'sentenc'
(1 row)
</pre>
<p>Notice that the words are separated by the boolean logic "OR";
the text could contain the boolean operators &amp;, |, ! and ()
with their usual meaning.</p>
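<p>For example (a quick sketch using arbitrary words; output
omitted, and the exact stemmed lexemes depend on your
configuration), the operators can be combined and grouped in a
single query string:</p>
<pre>
SELECT to_tsquery('default', 'searching &amp; (sentence | !paragraph)');
</pre>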
<p>You can not use words defined as being a stop word in your
configuration. The function will not fail ... you will just get no
result, and a NOTICE like this:</p>
<pre>
SELECT to_tsquery('default', 'a|is&amp;not|!the');
NOTICE:  Query contains only stopword(s)
         or doesn't contain lexem(s), ignored
 to_tsquery
-----------
(1 row)
</pre>
<p>That is a beginning to using the types and functions defined in
the tsearch2 module. There are numerous more functions that I have
not touched on. You can read through the tsearch2.sql file built
when compiling to get more familiar with what is included.</p>
<h3>INDEXING FIELDS IN A TABLE</h3>
<p>The next stage is to add a full text index to an existing table.
In this example we already have a table defined as follows:</p>
<pre>
CREATE TABLE tblMessages
(
   intIndex   int4,
   strTopic   varchar(100),
   ...
);
</pre>
<p>We are assuming there are several rows with some kind of data in
them. Any data will do, just do several inserts with test strings
for a topic, and a message. Here is some test data I inserted (yes,
I know it's completely useless stuff ;-) but it will serve our
purpose right now).</p>
<pre>
INSERT INTO tblMessages
   VALUES ('1', 'Testing Topic', 'Testing message data input');
INSERT INTO tblMessages
   VALUES ('2', 'Movie', 'Breakfast at Tiffany\'s');
...
   'My computer is a pentium III 400 mHz'
   ' with 192 megabytes of RAM');
</pre>
<p>The next stage is to create a special text index which we will
use for FTI, so we can search our table of messages for words or a
phrase. We do this using the SQL command:</p>
<pre>
ALTER TABLE tblMessages ADD COLUMN idxFTI tsvector;
</pre>
<p>Note that unlike traditional indexes, this is actually a new
field in the same table, which is then used (through the magic of
the tsearch2 operators and functions) by a special index we will
create in a moment.</p>
<p>The general rule for the initial insertion of data will follow
four steps:</p>
<pre>
1. update table
2. vacuum full analyze
3. create index
4. vacuum full analyze
</pre>
<p>The data can be updated into the table, and the vacuum full
analyze will reclaim unused space. The index can be created on the
table after the data has been inserted. Having the index created
prior to the update will slow down the process. It can be done in
that manner; this way is just more efficient. After the index has
been created on the table, vacuum full analyze is run again to
update postgres's statistics (ie having the index take effect).</p>
<pre>
UPDATE tblMessages SET idxFTI=to_tsvector('default', strMessage);
VACUUM FULL ANALYZE;
</pre>
<p>Note that this only inserts the field strMessage as a tsvector,
so if you want to also add strTopic to the information stored, you
should instead do the following, which effectively concatenates the
two fields into one before being inserted into the table:</p>
<pre>
UPDATE tblMessages
   SET idxFTI=to_tsvector('default',coalesce(strTopic,'') ||' '|| coalesce(strMessage,''));
VACUUM FULL ANALYZE;
</pre>
<p><strong>Using the coalesce function makes sure this
concatenation also works with NULL fields.</strong></p>
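<p>As a minimal illustration of why the coalesce matters (a sketch;
any text will do, and the 'default' configuration is assumed to
exist), compare the two statements below. In the first, the NULL
makes the whole concatenation NULL; in the second, the NULL is
replaced by an empty string first:</p>
<pre>
SELECT to_tsvector('default', NULL || ' ' || 'some text');
   -- the NULL makes the whole argument NULL, so nothing is indexed
SELECT to_tsvector('default', coalesce(NULL,'') || ' ' || 'some text');
   -- returns a normal tsvector for the non-NULL text
</pre>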
<p>We need to create the index on the column idxFTI. Keep in mind
that the database will update the index when some action is taken.
In this case we _need_ the index (the whole point of Full Text
INDEXING ;-)), so don't worry about any indexing overhead. We will
create an index based on the gist function. GiST is an index
structure for Generalized Search Tree.</p>
<pre>
CREATE INDEX idxFTI_idx ON tblMessages USING gist(idxFTI);
VACUUM FULL ANALYZE;
</pre>
<p>After you have converted all of your data and indexed the
column, you can select some rows to see what actually happened. I
will not display output here, but you can play around yourselves
and see what happened.</p>
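<p>If you want to check whether the planner will actually consider
the new GiST index, you can look at the query plan for a full text
search using the @@ operator described in the QUERYING A TABLE
section below (a sketch only; the plan you get depends on your
data, your statistics and your PostgreSQL version):</p>
<pre>
EXPLAIN SELECT intindex, strtopic FROM tblmessages
   WHERE idxfti @@ to_tsquery('default', 'test');
</pre>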
<p>The last thing to do is set up a trigger so every time a row in
this table is changed, the text index is automatically updated.
This is easily done using:</p>
<pre>
CREATE TRIGGER tsvectorupdate BEFORE UPDATE OR INSERT ON tblMessages
FOR EACH ROW EXECUTE PROCEDURE tsearch2(idxFTI, strMessage);
</pre>
<p>Or if you are indexing both strMessage and strTopic you should
instead do:</p>
<pre>
CREATE TRIGGER tsvectorupdate BEFORE UPDATE OR INSERT ON tblMessages
FOR EACH ROW EXECUTE PROCEDURE tsearch2(idxFTI, strTopic, strMessage);
</pre>
<p>Before you ask, the tsearch2 function accepts multiple fields as
arguments, so there is no need to concatenate the two into one like
we did before.</p>
<p>If you want to do something specific with columns, you may write
your very own trigger function using plpgsql or other procedural
languages (but not SQL, unfortunately) and use it instead of the
<em>tsearch2</em> trigger.</p>
<p>You could however call other stored procedures from within the
tsearch2 function. Let's say we want to create a function to remove
certain characters (like the @ symbol) from all text.</p>
<pre>
CREATE FUNCTION dropatsymbol(text)
RETURNS text AS 'select replace($1, \'@\', \' \');' LANGUAGE SQL;
</pre>
<p>Now we can use this function within the tsearch2 function on the
trigger.</p>
<pre>
DROP TRIGGER tsvectorupdate ON tblmessages;
CREATE TRIGGER tsvectorupdate BEFORE UPDATE OR INSERT ON tblMessages
FOR EACH ROW EXECUTE PROCEDURE tsearch2(idxFTI, dropatsymbol, strMessage);
INSERT INTO tblmessages VALUES (69, 'Attempt for dropatsymbol', 'Test@test.com');
</pre>
<p>If at this point you receive an error stating: ERROR: Can't find
tsearch config by locale</p>
<p>Do not worry. You have done nothing wrong, and tsearch2 is not
broken. All that has happened here is that the configuration is set
up to use a configuration based on the locale of the server. All
you have to do is change your default configuration, or add a new
one for your specific locale. See the section on TSEARCH2
CONFIGURATION.</p>
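<p>If you do run into that error, it can help to check which locale
your server is running under and which configurations tsearch2
currently knows about (a sketch; depending on your PostgreSQL
version the SHOW command below may not be available, in which case
the pg_ts_cfg table alone tells you what is configured):</p>
<pre>
SHOW lc_ctype;
SELECT * FROM pg_ts_cfg;
</pre>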
<pre class="real">
SELECT * FROM tblmessages WHERE intindex = 69;
 intindex |         strtopic         |  strmessage   |        idxfti
----------+--------------------------+---------------+-----------------------
       69 | Attempt for dropatsymbol | Test@test.com | 'test':1 'test.com':2
(1 row)
</pre>
<p>Notice that the string content was passed through the stored
procedure dropatsymbol. The '@' character was replaced with a
single space ... and the output from the procedure was then stored
in the tsvector column.</p>
<p>This could be useful for removing other characters from indexed
text, or any kind of preprocessing needed to be done on the text
prior to insertion into the index.</p>
<h3>QUERYING A TABLE</h3>
<p>There are some examples in the README.tsearch2 file for querying
a table. One major difference between tsearch and tsearch2 is that
the operator ## is no longer available. Only the operator @@ is
defined, using the types tsvector on one side and tsquery on the
other side.</p>
<p>Let's search the indexed data for the word "Test". I indexed
based on the concatenation of the strTopic and strMessage
fields:</p>
<pre>
SELECT intindex, strtopic FROM tblmessages
   WHERE idxfti @@ 'test'::tsquery;
 intindex |   strtopic
----------+---------------
        1 | Testing Topic
(1 row)
</pre>
<p>The only result that matched was the row with a topic "Testing
Topic". Notice that the word I searched for was all lowercase.
Let's see what happens when I query for uppercase "Test".</p>
<pre>
SELECT intindex, strtopic FROM tblmessages
   WHERE idxfti @@ 'Test'::tsquery;
 intindex | strtopic
----------+----------
(0 rows)
</pre>
<p>We get zero rows returned. The reason is that when the text was
inserted, it was morphed to my default configuration (because of
the call to to_tsvector in the UPDATE statement). If there had been
no morphing done, and the tsvector field(s) contained the word
'Test', a match would have been found.</p>
<p>Most likely the best way to query the field is to use the
to_tsquery function on the right hand side of the @@ operator, like
this:</p>
<pre>
SELECT intindex, strtopic FROM tblmessages
   WHERE idxfti @@ to_tsquery('default', 'Test | Zeppelin');
 intindex |      strtopic
----------+--------------------
        1 | Testing Topic
        7 | Classic Rock Bands
(2 rows)
</pre>
<p>That query searched for all instances of "Test" OR "Zeppelin".
It returned two rows: the "Testing Topic" row, and the "Classic
Rock Bands" row. The to_tsquery function performed the correct
morphology upon the parameters, and searched the tsvector field
appropriately.</p>
<p>The last example here relates to searching for a phrase, for
example "minority report". This poses a problem with regard to
tsearch2, as it doesn't index phrases, only words. But there is a
way around which doesn't appear to have a significant impact on
query time, and that is to use a query such as the following:</p>
<pre>
SELECT intindex, strTopic FROM tblmessages
   WHERE idxfti @@ to_tsquery('default', 'gettysburg &amp; address')
   AND strMessage ~* '.*men are created equal.*';
 intindex | strtopic
...
----------+----------
(0 rows)
</pre>
<p>Of course if you are indexing both strTopic and strMessage, and
want to search for this phrase on both, then you will have to get
out the brackets and extend this query a little more.</p>
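<p>One possible way to extend it (an untested sketch that simply
applies the same regular expression to both columns) would be:</p>
<pre>
SELECT intindex, strTopic FROM tblmessages
   WHERE idxfti @@ to_tsquery('default', 'gettysburg &amp; address')
   AND (strTopic ~* '.*men are created equal.*'
        OR strMessage ~* '.*men are created equal.*');
</pre>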
<h3>TSEARCH2 CONFIGURATION</h3>
<p>Some words such as "and", "the", and "who" are automatically not
indexed, since they belong to a pre-existing dictionary of "stop
words" which tsearch2 does not perform indexing on. If someone
needs to search for "The Who" in your database, they are going to
have a tough time coming up with any results, since both are
ignored in the indexes. But there is a solution.</p>
<p>Let's say we want to add a word into the stop word list for
english stemming. We could edit the file
'/usr/local/pgsql/share/english.stop' and add a word to the list.
I edited mine to exclude my name from indexing:</p>
<pre>
- Edit /usr/local/pgsql/share/english.stop
- Add 'andy' to the list
- Save the file.
</pre>
<p>When you connect to the database, the dict_init procedure is run
during initialization, and in my configuration it will read the
stop words from the file I just edited. If you were connected to
the DB while editing the stop words, you will need to end the
current session and re-connect. When you re-connect to the
database, 'andy' is no longer indexed:</p>
<pre>
SELECT to_tsvector('default', 'Andy');
 to_tsvector
------------
(1 row)
</pre>
<p>Originally I would get the result:</p>
<pre>
SELECT to_tsvector('default', 'Andy');
 to_tsvector
------------
 'andi':1
(1 row)
</pre>
<p>But since I added it as a stop word, it is ignored during
indexing. The stop word added was used in the dictionary "en_stem".
If I were to use a different configuration such as 'simple', the
results would be different. There are no stop words for the simple
dictionary. It will just convert to lower case, and index every
unique word.</p>
<pre>
SELECT to_tsvector('simple', 'Andy andy The the in out');
             to_tsvector
-------------------------------------
 'in':5 'out':6 'the':3,4 'andy':1,2
(1 row)
</pre>
<p>All this talk about which configuration to use is leading us
into the actual configuration of tsearch2. In the examples in this
document the configuration has always been specified when using the
tsearch2 functions:</p>
<pre>
SELECT to_tsvector('default', 'Testing the default config');
SELECT to_tsvector('simple', 'Example of simple Config');
</pre>
<p>The pg_ts_cfg table holds each configuration you can use with
the tsearch2 functions. As you can see, the ts_name column contains
both the 'default' configuration, which is based on the 'C' locale,
and the 'simple' configuration, which is not based on any
locale.</p>
<pre>
SELECT * from pg_ts_cfg;
     ts_name     | prs_name |    locale
-----------------+----------+--------------
 default         | default  | C
...
 simple          | default  |
(3 rows)
</pre>
<p>Each row in the pg_ts_cfg table contains the name of the
tsearch2 configuration, the name of the parser to use, and the
locale mapped to the configuration. There is only one parser to
choose from in the table pg_ts_parser, called 'default'. More
parsers could be written, but for our needs we will use the
default.</p>
<p>There are 3 configurations installed by tsearch2 initially. If
your locale is set to 'en_US' for example (like my laptop), then as
you can see there is currently no dictionary configured to use with
that locale. You can either set up a new configuration or just use
one that already exists. If I do not specify which configuration to
use in the to_tsvector function, I receive the following error:</p>
<pre>
SELECT to_tsvector('learning tsearch is like going to school');
ERROR:  Can't find tsearch config by locale
</pre>
<p>We will create a new configuration for use with the server
encoding 'en_US'. The first step is to add a new configuration into
the pg_ts_cfg table. We will call the configuration
'default_english', with the default parser, and use the locale
'en_US'.</p>
<pre>
INSERT INTO pg_ts_cfg (ts_name, prs_name, locale)
   VALUES ('default_english', 'default', 'en_US');
</pre>
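<p>At this point nothing has changed except that the new row
exists; a quick sanity check (a sketch, output not shown) is simply
to select it back and make sure the locale matches what your server
reports:</p>
<pre>
SELECT * FROM pg_ts_cfg WHERE ts_name = 'default_english';
</pre>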
<p>We have only declared that there is a configuration called
'default_english'. We need to set the configuration of how
'default_english' will work. The next step is creating a new
dictionary to use. The configuration of the dictionary is
completely different in tsearch2. In the prior version, to make
changes you would have to re-compile them into tsearch.so. All of
the configuration has now been moved into the system tables created
by executing the SQL code from tsearch2.sql.</p>
<p>Let's take a first look at the pg_ts_dict table:</p>
<pre>
ftstest=# \d pg_ts_dict
         Table "public.pg_ts_dict"
     Column      |     Type     | Modifiers
-----------------+--------------+-----------
 dict_name       | text         | not null
 dict_init       | regprocedure |
 dict_initoption | text         |
 dict_lexize     | regprocedure | not null
 dict_comment    | text         |
Indexes: pg_ts_dict_pkey primary key btree (dict_name)
</pre>
<p>The dict_name column is the name of the dictionary, for example
'simple', 'en_stem' or 'ru_stem'. The dict_init column is a text
representation of a stored procedure to run for initialization of
that dictionary, for example 'snb_en_init(text)' or
'snb_ru_init(text)'. The initial configuration of tsearch2 had the
dict_init and dict_lexize columns as type oid. The patch mentioned
in the Installation notes changes these types to regprocedure. The
data inserted, or updated, can still be the oid of the stored
procedure; the representation is just different. This makes backup
and restore procedures much easier for tsearch2. The
dict_initoption column is used for options passed to the init
function of the stored procedure. In the cases of 'en_stem' or
'ru_stem' it is a path to a stopword file for that dictionary, for
example '/usr/local/pgsql/share/english.stop'. This is however
dictated by the dictionary; ISpell dictionaries may require
different options. The dict_lexize column is another stored
procedure reference, the function used to lexize, for example
'snb_lexize(internal, internal, integer)'. The dict_comment column
is just a comment.</p>
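<p>To get a feel for what these columns typically contain, you can
look at one of the dictionaries that tsearch2 installs by default
(a sketch; the exact values depend on your installation paths and
on whether you applied the regprocedure patch):</p>
<pre>
SELECT dict_name, dict_init, dict_initoption, dict_lexize
   FROM pg_ts_dict
   WHERE dict_name = 'en_stem';
</pre>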
<p>There has been some confusion in the past as to which files <p>Next we will configure the use of a new dictionary based on
are used from ISpell. ISpell operates using a hash file. This ISpell. We will assume you have ISpell installed on you machine.
is a binary file created by the ISpell command line utility (in /usr/local/lib)</p>
"buildhash". This utility accepts a file containing the words <p>There has been some confusion in the past as to which files are
from the dictionary, and the affixes file and the output is the used from ISpell. ISpell operates using a hash file. This is a
hash file. The default installation of ISPell installs the binary file created by the ISpell command line utility "buildhash".
english hash file english.hash, which is the exact same file as This utility accepts a file containing the words from the
american.hash. ISpell uses this as the fallback dictionary to dictionary, and the affixes file and the output is the hash file.
use.</p> The default installation of ISPell installs the english hash file
english.hash, which is the exact same file as american.hash. ISpell
<p>This hash file is not what tsearch2 requires as the ISpell uses this as the fallback dictionary to use.</p>
interface. The file(s) needed are those used to create the <p>This hash file is not what tsearch2 requires as the ISpell
hash. Tsearch uses the dictionary words for morphology, so the interface. The file(s) needed are those used to create the hash.
listing is needed not spellchecking. Regardless, these files Tsearch uses the dictionary words for morphology, so the listing is
are included in the ISpell sources, and you can use them to needed not spellchecking. Regardless, these files are included in
integrate into tsearch2. This is not complicated, but is not the ISpell sources, and you can use them to integrate into
very obvious to begin with. The tsearch2 ISpell interface needs tsearch2. This is not complicated, but is not very obvious to begin
only the listing of dictionary words, it will parse and load with. The tsearch2 ISpell interface needs only the listing of
those words, and use the ISpell dictionary for lexem dictionary words, it will parse and load those words, and use the
processing.</p> ISpell dictionary for lexem processing.</p>
<p>I found the ISpell make system to be very finicky. Their
documentation actually states this to be the case. So I just did
things the command line way. In the ISpell source tree under
languages/english there are several files. For a complete
description, please read the ISpell README. Basically, for the
english dictionary there is the option to create the small, medium,
large and extra large dictionaries. The medium dictionary is
recommended. If the make system is configured correctly, it would
build and install the english.hash file from the medium size
dictionary. Since we are only concerned with the dictionary word
listing, it can be created from the languages/english directory
with the following command:</p>

<pre>
 sort -u -t/ +0f -1 +0 -T /usr/tmp -o english.med english.0 english.1
</pre>
<p>This will create a file called english.med. You can copy this
file to wherever you like. I placed mine in /usr/local/lib so it
coincides with the ISpell hash files. You can now add the tsearch2
configuration entry for the ISpell english dictionary. We will also
continue to use the english word stop file that was installed for
the en_stem dictionary. You could use a different one if you like.
The ISpell configuration is based on the "ispell_template"
dictionary installed by default with tsearch2. We will use the OIDs
to the stored procedures from the row where the dict_name =
'ispell_template'.</p>

<pre>
 INSERT INTO pg_ts_dict
        (SELECT 'en_ispell',
                dict_init,
                'DictFile="/usr/local/lib/english.med",'
                'AffFile="/usr/local/lib/english.aff",'
                'StopFile="/usr/local/pgsql/share/contrib/english.stop"',
                dict_lexize
           FROM pg_ts_dict
          WHERE dict_name = 'ispell_template');
</pre>
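
<p>As an optional sanity check (not part of the original
instructions), you can select the new row back using the same
pg_ts_dict columns:</p>

<pre>
 SELECT dict_name, dict_initoption
   FROM pg_ts_dict
  WHERE dict_name = 'en_ispell';
</pre>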
<p>Now that we have a dictionary we can specify its use in a query
to get a lexem. For this we will use the lexize function. The
lexize function takes the name of the dictionary to use as an
argument, just as the other tsearch2 functions do. You will need to
stop your psql session and start it again in order for this
modification to take place.</p>

<pre>
 SELECT lexize('en_ispell', 'program');
   lexize
 -----------
  {program}
 (1 row)
</pre>
<p>If you wanted to always use the ISpell english dictionary you
have installed, you can configure tsearch2 to always use a specific
dictionary.</p>

<pre>
 SELECT set_curdict('en_ispell');
</pre>

<p>Lexize is meant to turn a word into a lexem. It is possible to
receive more than one lexem returned for a single word.</p>

<pre>
 SELECT lexize('en_ispell', 'conditionally');
            lexize
 -----------------------------
  {conditionally,conditional}
 (1 row)
</pre>
<p>The lexize function is not meant to take a full string as an
argument to return lexems for. If you pass in an entire sentence,
it attempts to find that entire sentence in the dictionary. Since
the dictionary contains only words, you will receive an empty
result set back.</p>

<pre>
 SELECT lexize('en_ispell', 'This is a sentence to lexize');
 lexize
--------

(1 row)
</pre>

<p>If you parse a lexem from a word not in the dictionary, then you
will receive an empty result. This makes sense because the word
"tsearch" is not in the english dictionary. You can create your own
additions to the dictionary if you like. This may be useful for
scientific or technical glossaries that need to be indexed.</p>

<pre>
 SELECT lexize('en_ispell', 'tsearch');
 lexize
--------

(1 row)
</pre>

<p>This is not to say that tsearch will be ignored when adding text
information to the tsvector index column. This will be explained in
greater detail with the table pg_ts_cfgmap.</p>
<p>Next we need to set up the configuration for mapping the
dictionary use to the lexem parsings. This will be done by altering
the pg_ts_cfgmap table. We will insert several rows, specifying to
use the new dictionary we installed and configured for lexizing
within tsearch2. There are several types of lexems we would be
concerned with forcing the use of the ISpell dictionary.</p>

<pre>
 INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name)
    VALUES ('default_english', 'lhword', '{en_ispell,en_stem}');
 INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name)
    VALUES ('default_english', 'lpart_hword', '{en_ispell,en_stem}');
 INSERT INTO pg_ts_cfgmap (ts_name, tok_alias, dict_name)
    VALUES ('default_english', 'lword', '{en_ispell,en_stem}');
</pre>
<p>We have just inserted 3 records into the configuration mapping,
specifying that the lexem types "lhword", "lpart_hword" and "lword"
are to be stemmed using the 'en_ispell' dictionary we added into
pg_ts_dict, when using the configuration 'default_english' which we
added to pg_ts_cfg.</p>

<p>There are several other lexem types used that we do not need to
specify as using the ISpell dictionary. We can simply insert values
using the 'simple' stemming process dictionary.</p>

<pre>
 INSERT INTO pg_ts_cfgmap
    VALUES ('default_english', 'url', '{simple}');
 INSERT INTO pg_ts_cfgmap
    VALUES ('default_english', 'host', '{simple}');
 ...
 INSERT INTO pg_ts_cfgmap
    VALUES ('default_english', 'version', '{simple}');
</pre>
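
<p>If you want to confirm the mapping (an optional check, not
required by the setup), you can list what the 'default_english'
configuration now maps each token type to:</p>

<pre>
 SELECT tok_alias, dict_name
   FROM pg_ts_cfgmap
  WHERE ts_name = 'default_english';
</pre>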
<p>Our addition of a configuration for 'default_english' is now
complete. We have successfully created a new tsearch2
configuration. At the same time we have also set the new
configuration to be our default for the en_US locale.</p>

<pre>
 SELECT to_tsvector('default_english',
                    'learning tsearch is like going to school');
                    to_tsvector
 --------------------------------------------------
  'go':5 'like':4 'learn':1 'school':7 'tsearch':2
 (1 row)
</pre>
<p>Notice here that words like "tsearch" are still parsed and
indexed in the tsvector column. There is a lexem returned for the
word because in the configuration mapping table we specify words to
be used from the 'en_ispell' dictionary first, but as a fallback to
use the 'en_stem' dictionary. Therefore a lexem is not returned
from en_ispell, but is returned from en_stem, and added to the
tsvector.</p>

<pre>
 SELECT to_tsvector('learning tsearch is like going to computer school');
                                 to_tsvector
 ---------------------------------------------------------------------------
  'go':5 'like':4 'learn':1 'school':8 'compute':7 'tsearch':2 'computer':7
 (1 row)
</pre>
<p>Notice in this last example I added the word "computer" to the
text to be converted into a tsvector. Because we have set up our
default configuration to use the ISpell english dictionary, the
words are lexized, and computer returns 2 lexems at the same
position. 'compute':7 and 'computer':7 are now both indexed for the
word computer.</p>
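
<p>As a rough sketch of why the extra lexem matters (this example
is an addition to the original text, and assumes the
'default_english' configuration created above), a query written
against the stemmed form of the word should now match the same
text:</p>

<pre>
 SELECT to_tsvector('default_english',
            'learning tsearch is like going to computer school')
        @@ to_tsquery('default_english', 'compute &amp; school');
</pre>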
<p>You can create additional dictionary lists, or use the extra
large dictionary from ISpell. You can read through the ISpell
documents, and source tree, to make modifications as you see
fit.</p>

<p>In the case that you already have a configuration set for the
locale, and you are changing it to your new dictionary
configuration, you will have to set the old locale to NULL. If we
are using the 'C' locale then we would do this:</p>

<pre>
 UPDATE pg_ts_cfg SET locale=NULL WHERE locale = 'C';
</pre>
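
<p>After clearing the old row, the new configuration can take over
that locale (a hypothetical follow-up step, assuming your server
locale is 'C' and that 'default_english' is the configuration that
should now own it):</p>

<pre>
 UPDATE pg_ts_cfg SET locale = 'C' WHERE ts_name = 'default_english';
</pre>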
<p>That about wraps up the configuration of tsearch2. There is much
more you can do with the tables provided. This was just an
introduction to get things working rather quickly.</p>

<h3>ADDING NEW DICTIONARIES TO TSEARCH2</h3>

<p>To aid in the addition of new dictionaries to the tsearch2
module you can use another additional module in combination with
tsearch2. The gendict module is included in the tsearch2
distribution and is available in the gendict/ subdirectory.</p>

<p>I will not go into detail about installation and instructions on
how to use gendict to its fullest extent right now. You can read
the README.gendict; it has all of the instructions and information
you will need.</p>

<h3>BACKING UP AND RESTORING DATABASES THAT FEATURE TSEARCH2</h3>

<p><strong>Never rely on anyone else's instructions to back up and
restore a database system; always develop and understand your own
methodology, and test it numerous times before you need to do it
for real.</strong></p>

<p>The backup and restore procedure has changed over time. This is
not meant to be the bible for tsearch2 backup and restore. Please
read all sections so you have a complete understanding of some
backup and restore issues. Please test your own procedures, and do
not rely on these instructions solely.</p>

<p>If you come across some issues in your own procedures, please
feel free to bring the question up on the Open-FTS and PostgreSQL
mailing lists.</p>

<h3>ORIGINAL BACKUP PROCEDURES</h3>

<p>Originally, tsearch2 had problems when using the pg_dump and/or
pg_dumpall utilities. The problem lies in the original use of OIDs
for column types. Since OIDs are not consistent across pg_dumps,
when you reload the data values into the pg_ts_dict table, for
example, those OIDs no longer point to anything. You would then end
up trying to use a "broken" tsearch2 configuration.</p>

<p>The solution was to back up and restore a database using the
tsearch2 module in small unique parts, and then load them in the
correct order. You would have to edit the schema and remove the
tsearch stored procedure references in the sql file. You would have
to load your global objects, then the tsearch2 objects. You had to
re-create the tsearch module before restoring your schema so no
conflicts would arise. Then you could restore your data (all
schemas, and types needed for the data, were now available).</p>

<p><strong>The original backup instructions were as
follows</strong></p>

<p>1) Backup any global database objects such as users and groups
(this step is usually only necessary when you will be restoring to
a virgin system)</p>

<pre>
 pg_dumpall -g &gt; GLOBALobjects.sql
</pre>

<p>2) Backup the full database schema using pg_dump</p>

<pre>
 pg_dump -s DATABASE &gt; DATABASEschema.sql
</pre>

<p>3) Backup the full database using pg_dump</p>

<pre>
 pg_dump -Fc DATABASE &gt; DATABASEdata.tar
</pre>
<p><strong>The original restore procedures were as
follows</strong></p>

<p>1) Create the blank database</p>

<pre>
 createdb DATABASE
</pre>

<p>2) Restore any global database objects such as users and groups
(this step is usually only necessary when you will be restoring to
a virgin system)</p>

<pre>
 psql DATABASE &lt; GLOBALobjects.sql
</pre>

<p>3) Create the tsearch2 objects, functions and operators</p>

<pre>
 psql DATABASE &lt; tsearch2.sql
</pre>

<p>4) Edit the backed up database schema and delete all SQL
commands which create tsearch2 related functions, operators and
data types, BUT NOT fields in table definitions that specify
tsvector types. If you're not sure what these are, they are the
ones listed in tsearch2.sql. Then restore the edited schema to the
database</p>

<pre>
 psql DATABASE &lt; DATABASEschema.sql
</pre>

<p>5) Restore the data for the database</p>

<pre>
 pg_restore -N -a -d DATABASE DATABASEdata.tar
</pre>
<p>If you get any errors in step 4, it will most likely be because
you forgot to remove an object that was created in tsearch2.sql.
Any errors in step 5 will mean the database schema was probably
restored wrongly.</p>

<p><strong>Issues with this procedure</strong></p>

<p>As I mentioned before, it is vital that you test out your own
backup and restore procedures. These procedures were originally
adopted from this document's original author, Robert John Shepherd.
They make use of the pg_dump custom archive functionality. I am not
that familiar with the formatting output of pg_dump and the use of
pg_restore. I have always had the luxury of using text files
(everything is DATABASE.sql).</p>
<p>One issue not foreseen in the case of using a binary dump is
when you have added more than the default tsearch2 configurations.
Upon reload of the data it will fail due to duplicate primary keys.
If you load the tsearch2 module, and then delete the data loaded by
tsearch2 into the configuration tables, the data will restore. The
configurations are incorrect because you cannot remove the data
using OID references from the custom archive.</p>
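
<p>A minimal sketch of that work-around (an addition to the
original text, assuming you are willing to wipe the rows installed
by tsearch2.sql before restoring the dump that contains your own
configuration data):</p>

<pre>
 -- run after loading tsearch2.sql into the new database,
 -- and before running pg_restore
 DELETE FROM pg_ts_cfgmap;
 DELETE FROM pg_ts_cfg;
 DELETE FROM pg_ts_dict;
 DELETE FROM pg_ts_parser;
</pre>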
<p>It would be very simple to fix this problem if the data was not
in an archive format. I do believe all of your data would have been
restored properly and you could get things working fairly easily.
All one would have to do is re-create the configurations as in the
tsearch2.sql file, and then create your custom configurations
again.</p>
<p>I have read in the pg_dump man page that if the tar archive
format is used, it is possible to limit which data is restored
using pg_restore. If anyone has more experience with pg_dump
archives and pg_restore, please feel free to test and contribute
your procedure(s).</p>
<h3>CURRENT BACKUP AND RESTORE PROCEDURES</h3>
<p>Currently a backup and restore of a database using the tsearch2
module can be quite simple, provided you have applied the patch
mentioned in the installation instructions prior to tsearch2
installation. This patch removes the use of the oid column. The
text representation for the stored procedures used is dumped with
the data, and the restoration of the data works seamlessly.</p>
<p>1) Backup the database</p>
<pre>
pg_dump DATABASE &gt; DATABASE.sql
</pre>
<p>2) Restore the database</p>
<pre>
createdb DATABASE
psql -d DATABASE -f DATABASE.sql
</pre>
<p>This procedure is now like any normal backup and restore
procedure. I cannot say whether this has been proven using the
pg_dump archive and restoring with pg_restore. In theory there
should be no problems with any format after the patch is
applied.</p>

<p>This restoration procedure should never be an issue with version
7.5 of PostgreSQL, where the patch is already applied. Only
versions 7.3 and 7.4 are affected. You can avoid any troubles by
applying the patch prior to installation, or by running the SQL
script provided against a live database before the backup and
restore is done.</p>
</div>
</body>
</html>
<p>Each parser is defined by a record in the <tt>pg_ts_parser</tt>
table:</p>

<pre>create table pg_ts_parser (
	prs_name	text not null,
	prs_start	regprocedure not null,
	prs_nexttoken	regprocedure not null,
	prs_end		regprocedure not null,
	prs_headline	regprocedure not null,
	prs_lextype	regprocedure not null,
	prs_comment	text
);</pre>
<p>Each dictionary is defined by an entry in the <tt>pg_ts_dict</tt>
table:</p>

<pre>CREATE TABLE pg_ts_dict (
	dict_name	text not null,
	dict_init	regprocedure,
	dict_initoption	text,
	dict_lexize	regprocedure not null,
	dict_comment	text
);</pre>