Commit 53e99f57 authored by Tom Lane's avatar Tom Lane

Make an editorial pass over the newly SGML-ified contrib documentation.

Fix lots of bad markup, bad English, bad explanations.

This commit covers only about half the contrib modules, but I grow weary...
parent a37a0a41
<!-- $PostgreSQL: pgsql/doc/src/sgml/adminpack.sgml,v 1.3 2007/12/06 04:12:09 tgl Exp $ -->
<sect1 id="adminpack"> <sect1 id="adminpack">
<title>adminpack</title> <title>adminpack</title>
<indexterm zone="adminpack"> <indexterm zone="adminpack">
<primary>adminpack</primary> <primary>adminpack</primary>
</indexterm> </indexterm>
<para> <para>
adminpack is a PostgreSQL standard module that implements a number of <filename>adminpack</> provides a number of support functions which
support functions which pgAdmin and other administration and management tools <application>pgAdmin</> and other administration and management tools can
can use to provide additional functionality if installed on a server. use to provide additional functionality, such as remote management
of server log files.
</para> </para>
<sect2> <sect2>
<title>Functions implemented</title> <title>Functions implemented</title>
<para> <para>
Functions implemented by adminpack can only be run by a superuser. Here's a The functions implemented by <filename>adminpack</> can only be run by a
list of these functions: superuser. Here's a list of these functions:
</para>
<para> <programlisting>
<programlisting> int8 pg_catalog.pg_file_write(fname text, data text, append bool)
int8 pg_catalog.pg_file_write(fname text, data text, append bool) bool pg_catalog.pg_file_rename(oldname text, newname text, archivename text)
bool pg_catalog.pg_file_rename(oldname text, newname text, archivname text) bool pg_catalog.pg_file_rename(oldname text, newname text)
bool pg_catalog.pg_file_rename(oldname text, newname text) bool pg_catalog.pg_file_unlink(fname text)
bool pg_catalog.pg_file_unlink(fname text) setof record pg_catalog.pg_logdir_ls()
setof record pg_catalog.pg_logdir_ls()
/* Renaming of existing backend functions for pgAdmin compatibility */
/* Renaming of existing backend functions for pgAdmin compatibility */ int8 pg_catalog.pg_file_read(fname text, data text, append bool)
int8 pg_catalog.pg_file_read(fname text, data text, append bool) bigint pg_catalog.pg_file_length(text)
bigint pg_catalog.pg_file_length(text) int4 pg_catalog.pg_logfile_rotate()
int4 pg_catalog.pg_logfile_rotate() </programlisting>
</programlisting>
</para> </para>
</sect2> </sect2>
</sect1> </sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/btree-gist.sgml,v 1.4 2007/12/06 04:12:09 tgl Exp $ -->
<sect1 id="btree-gist"> <sect1 id="btree-gist">
<title>btree_gist</title> <title>btree_gist</title>
<indexterm zone="btree-gist"> <indexterm zone="btree-gist">
<primary>btree_gist</primary> <primary>btree_gist</primary>
</indexterm> </indexterm>
<para> <para>
btree_gist is a B-Tree implementation using GiST that supports the int2, int4, <filename>btree_gist</> provides sample GiST operator classes that
int8, float4, float8 timestamp with/without time zone, time implement B-Tree equivalent behavior for the data types
with/without time zone, date, interval, oid, money, macaddr, char, <type>int2</>, <type>int4</>, <type>int8</>, <type>float4</>,
varchar/text, bytea, numeric, bit, varbit and inet/cidr types. <type>float8</>, <type>numeric</>, <type>timestamp with time zone</>,
<type>timestamp without time zone</>, <type>time with time zone</>,
<type>time without time zone</>, <type>date</>, <type>interval</>,
<type>oid</>, <type>money</>, <type>char</>,
<type>varchar</>, <type>text</>, <type>bytea</>, <type>bit</>,
<type>varbit</>, <type>macaddr</>, <type>inet</>, and <type>cidr</>.
</para>
<para>
In general, these operator classes will not outperform the equivalent
standard btree index methods, and they lack one major feature of the
standard btree code: the ability to enforce uniqueness. However,
they are useful for GiST testing and as a base for developing other
GiST operator classes.
</para> </para>
<sect2> <sect2>
<title>Example usage</title> <title>Example usage</title>
<programlisting>
CREATE TABLE test (a int4); <programlisting>
-- create index CREATE TABLE test (a int4);
CREATE INDEX testidx ON test USING gist (a); -- create index
-- query CREATE INDEX testidx ON test USING gist (a);
SELECT * FROM test WHERE a &lt; 10; -- query
</programlisting> SELECT * FROM test WHERE a &lt; 10;
</programlisting>
</sect2> </sect2>
<sect2> <sect2>
<title>Authors</title> <title>Authors</title>
<para> <para>
All work was done by Teodor Sigaev (<email>teodor@stack.net</email>) , Teodor Sigaev (<email>teodor@stack.net</email>) ,
Oleg Bartunov (<email>oleg@sai.msu.su</email>), Janko Richter Oleg Bartunov (<email>oleg@sai.msu.su</email>), and
(<email>jankorichter@yahoo.de</email>). See Janko Richter (<email>jankorichter@yahoo.de</email>). See
<ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink> for additional <ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink>
information. for additional information.
</para> </para>
</sect2> </sect2>
</sect1> </sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/chkpass.sgml,v 1.2 2007/12/06 04:12:09 tgl Exp $ -->
<sect1 id="chkpass"> <sect1 id="chkpass">
<title>chkpass</title> <title>chkpass</title>
<!--
<indexterm zone="chkpass"> <indexterm zone="chkpass">
<primary>chkpass</primary> <primary>chkpass</primary>
</indexterm> </indexterm>
-->
<para> <para>
chkpass is a password type that is automatically checked and converted upon This module implements a data type <type>chkpass</> that is
entry. It is stored encrypted. To compare, simply compare against a clear designed for storing encrypted passwords.
Each password is automatically converted to encrypted form upon entry,
and is always stored encrypted. To compare, simply compare against a clear
text password and the comparison function will encrypt it before comparing. text password and the comparison function will encrypt it before comparing.
It also returns an error if the code determines that the password is easily
crackable. This is currently a stub that does nothing.
</para> </para>
<para> <para>
Note that the chkpass data type is not indexable. There are provisions in the code to report an error if the password is
<!-- determined to be easily crackable. However, this is currently just
I haven't worried about making this type indexable. I doubt that anyone a stub that does nothing.
would ever need to sort a file in order of encrypted password.
-->
</para> </para>
<para> <para>
If you precede the string with a colon, the encryption and checking are If you precede an input string with a colon, it is assumed to be an
skipped so that you can enter existing passwords into the field. already-encrypted password, and is stored without further encryption.
This allows entry of previously-encrypted passwords.
</para> </para>
<para> <para>
On output, a colon is prepended. This makes it possible to dump and reload On output, a colon is prepended. This makes it possible to dump and reload
passwords without re-encrypting them. If you want the password (encrypted) passwords without re-encrypting them. If you want the encrypted password
without the colon then use the raw() function. This allows you to use the without the colon then use the <function>raw()</> function.
This allows you to use the
type with things like Apache's Auth_PostgreSQL module. type with things like Apache's Auth_PostgreSQL module.
</para> </para>
<para> <para>
The encryption uses the standard Unix function crypt(), and so it suffers The encryption uses the standard Unix function <function>crypt()</>,
and so it suffers
from all the usual limitations of that function; notably that only the from all the usual limitations of that function; notably that only the
first eight characters of a password are considered. first eight characters of a password are considered.
</para> </para>
<para> <para>
Here is some sample usage: Note that the chkpass data type is not indexable.
<!--
I haven't worried about making this type indexable. I doubt that anyone
would ever need to sort a file in order of encrypted password.
-->
</para> </para>
<programlisting> <para>
Sample usage:
</para>
<programlisting>
test=# create table test (p chkpass); test=# create table test (p chkpass);
CREATE TABLE CREATE TABLE
test=# insert into test values ('hello'); test=# insert into test values ('hello');
...@@ -72,13 +82,14 @@ test=# select p = 'goodbye' from test; ...@@ -72,13 +82,14 @@ test=# select p = 'goodbye' from test;
---------- ----------
f f
(1 row) (1 row)
</programlisting> </programlisting>
<sect2> <sect2>
<title>Author</title> <title>Author</title>
<para> <para>
D'Arcy J.M. Cain <email>darcy@druid.net</email> D'Arcy J.M. Cain (<email>darcy@druid.net</email>)
</para> </para>
</sect2> </sect2>
</sect1>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib-spi.sgml,v 1.1 2007/12/03 04:18:47 tgl Exp $ --> <!-- $PostgreSQL: pgsql/doc/src/sgml/contrib-spi.sgml,v 1.2 2007/12/06 04:12:09 tgl Exp $ -->
<sect1 id="contrib-spi"> <sect1 id="contrib-spi">
<title>spi</title> <title>spi</title>
...@@ -29,27 +29,28 @@ ...@@ -29,27 +29,28 @@
<para> <para>
<function>check_primary_key()</> checks the referencing table. <function>check_primary_key()</> checks the referencing table.
To use, create a BEFORE INSERT OR UPDATE trigger using this To use, create a <literal>BEFORE INSERT OR UPDATE</> trigger using this
function on a table referencing another table. You are to specify function on a table referencing another table. Specify as the trigger
as trigger arguments: triggered table column names which correspond arguments: the referencing table's column name(s) which form the foreign
to foreign key, referenced table name and column names in referenced key, the referenced table name, and the column names in the referenced table
table which correspond to primary/unique key. To handle multiple which form the primary/unique key. To handle multiple foreign
foreign keys, create a trigger for each reference. keys, create a trigger for each reference.
</para> </para>
<para> <para>
<function>check_foreign_key()</> checks the referenced table. <function>check_foreign_key()</> checks the referenced table.
To use, create a BEFORE DELETE OR UPDATE trigger using this To use, create a <literal>BEFORE DELETE OR UPDATE</> trigger using this
function on a table referenced by other table(s). You are to specify function on a table referenced by other table(s). Specify as the trigger
as trigger arguments: number of references for which the function has to arguments: the number of referencing tables for which the function has to
perform checking, action if referencing key found ('cascade' &mdash; to delete perform checking, the action if a referencing key is found
corresponding foreign key, 'restrict' &mdash; to abort transaction if foreign keys (<literal>cascade</> &mdash; to delete the referencing row,
exist, 'setnull' &mdash; to set foreign key referencing primary/unique key <literal>restrict</> &mdash; to abort transaction if referencing keys
being deleted to null), triggered table column names which correspond exist, <literal>setnull</> &mdash; to set referencing key fields to null),
to primary/unique key, then referencing table name and column names the triggered table's column names which form the primary/unique key, then
corresponding to foreign key (repeated for as many referencing tables/keys the referencing table name and column names (repeated for as many
as were specified by first argument). Note that the primary/unique key referencing tables as were specified by first argument). Note that the
columns should be marked NOT NULL and should have a unique index. primary/unique key columns should be marked NOT NULL and should have a
unique index.
</para> </para>
<para> <para>
...@@ -64,60 +65,65 @@ ...@@ -64,60 +65,65 @@
Long ago, <productname>PostgreSQL</> had a built-in time travel feature Long ago, <productname>PostgreSQL</> had a built-in time travel feature
that kept the insert and delete times for each tuple. This can be that kept the insert and delete times for each tuple. This can be
emulated using these functions. To use these functions, emulated using these functions. To use these functions,
you are to add to a table two columns of <type>abstime</> type to store you must add to a table two columns of <type>abstime</> type to store
the date when a tuple was inserted (start_date) and changed/deleted the date when a tuple was inserted (start_date) and changed/deleted
(stop_date): (stop_date):
<programlisting> <programlisting>
CREATE TABLE mytab ( CREATE TABLE mytab (
... ... ... ...
start_date abstime default now(), start_date abstime,
stop_date abstime default 'infinity' stop_date abstime
... ... ... ...
); );
</programlisting> </programlisting>
So, tuples being inserted with unspecified start_date/stop_date will get The columns can be named whatever you like, but in this discussion
the current time in start_date and <literal>infinity</> in we'll call them start_date and stop_date.
stop_date. </para>
<para>
When a new row is inserted, start_date should normally be set to
current time, and stop_date to <literal>infinity</>. The trigger
will automatically substitute these values if the inserted data
contains nulls in these columns. Generally, inserting explicit
non-null data in these columns should only be done when re-loading
dumped data.
</para> </para>
<para> <para>
Tuples with stop_date equal to <literal>infinity</> are <quote>valid Tuples with stop_date equal to <literal>infinity</> are <quote>valid
now</quote>: when trigger will be fired for UPDATE/DELETE of a tuple with now</quote>, and can be modified. Tuples with a finite stop_date cannot
stop_date NOT equal to <literal>infinity</> then be modified anymore &mdash; the trigger will prevent it. (If you need
this tuple will not be changed/deleted! to do that, you can turn off time travel as shown below.)
</para> </para>
<para> <para>
If stop_date is equal to <literal>infinity</> then on For a modifiable row, on update only the stop_date in the tuple being
update only the stop_date in the tuple being updated will be changed (to updated will be changed (to current time) and a new tuple with the modified
current time) and a new tuple with new data (coming from SET ... in UPDATE) data will be inserted. Start_date in this new tuple will be set to current
will be inserted. Start_date in this new tuple will be set to current time time and stop_date to <literal>infinity</>.
and stop_date to <literal>infinity</>.
</para> </para>
<para> <para>
A delete does not actually remove the tuple but only set its stop_date A delete does not actually remove the tuple but only sets its stop_date
to current time. to current time.
</para> </para>
<para> <para>
To query for tuples <quote>valid now</quote>, include To query for tuples <quote>valid now</quote>, include
<literal>stop_date = 'infinity'</> in the query's WHERE condition. <literal>stop_date = 'infinity'</> in the query's WHERE condition.
(You might wish to incorporate that in a view.) (You might wish to incorporate that in a view.) Similarly, you can
</para> query for tuples valid at any past time with suitable conditions on
start_date and stop_date.
<para>
You can't change start/stop date columns with UPDATE!
Use set_timetravel (below) if you need this.
</para> </para>
<para> <para>
<function>timetravel()</> is the general trigger function that supports <function>timetravel()</> is the general trigger function that supports
this behavior. Create a BEFORE INSERT OR UPDATE OR DELETE trigger using this this behavior. Create a <literal>BEFORE INSERT OR UPDATE OR DELETE</>
function on each time-traveled table. You are to specify two trigger arguments: trigger using this function on each time-traveled table. Specify two
name of start_date column and name of stop_date column in triggered table. trigger arguments: the actual
names of the start_date and stop_date columns.
Optionally, you can specify one to three more arguments, which must refer Optionally, you can specify one to three more arguments, which must refer
to columns of type <type>text</>. The trigger will store the name of to columns of type <type>text</>. The trigger will store the name of
the current user into the first of these columns during INSERT, the the current user into the first of these columns during INSERT, the
...@@ -130,7 +136,9 @@ CREATE TABLE mytab ( ...@@ -130,7 +136,9 @@ CREATE TABLE mytab (
<literal>set_timetravel('mytab', 1)</> will turn TT ON for table mytab. <literal>set_timetravel('mytab', 1)</> will turn TT ON for table mytab.
<literal>set_timetravel('mytab', 0)</> will turn TT OFF for table mytab. <literal>set_timetravel('mytab', 0)</> will turn TT OFF for table mytab.
In both cases the old status is reported. While TT is off, you can modify In both cases the old status is reported. While TT is off, you can modify
the start_date and stop_date columns freely. the start_date and stop_date columns freely. Note that the on/off status
is local to the current database session &mdash; fresh sessions will
always start out with TT ON for all tables.
</para> </para>
<para> <para>
...@@ -156,9 +164,9 @@ CREATE TABLE mytab ( ...@@ -156,9 +164,9 @@ CREATE TABLE mytab (
</para> </para>
<para> <para>
To use, create a BEFORE INSERT (or optionally BEFORE INSERT OR UPDATE) To use, create a <literal>BEFORE INSERT</> (or optionally <literal>BEFORE
trigger using this function. You are to specify INSERT OR UPDATE</>) trigger using this function. Specify two
as trigger arguments: the name of the integer column to be modified, trigger arguments: the name of the integer column to be modified,
and the name of the sequence object that will supply values. and the name of the sequence object that will supply values.
(Actually, you can specify any number of pairs of such names, if (Actually, you can specify any number of pairs of such names, if
you'd like to update more than one autoincrementing column.) you'd like to update more than one autoincrementing column.)
...@@ -180,8 +188,8 @@ CREATE TABLE mytab ( ...@@ -180,8 +188,8 @@ CREATE TABLE mytab (
</para> </para>
<para> <para>
To use, create a BEFORE INSERT and/or UPDATE To use, create a <literal>BEFORE INSERT</> and/or <literal>UPDATE</>
trigger using this function. You are to specify a single trigger trigger using this function. Specify a single trigger
argument: the name of the text column to be modified. argument: the name of the text column to be modified.
</para> </para>
...@@ -201,8 +209,8 @@ CREATE TABLE mytab ( ...@@ -201,8 +209,8 @@ CREATE TABLE mytab (
</para> </para>
<para> <para>
To use, create a BEFORE UPDATE To use, create a <literal>BEFORE UPDATE</>
trigger using this function. You are to specify a single trigger trigger using this function. Specify a single trigger
argument: the name of the <type>timestamp</> column to be modified. argument: the name of the <type>timestamp</> column to be modified.
</para> </para>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib.sgml,v 1.7 2007/12/03 04:18:47 tgl Exp $ --> <!-- $PostgreSQL: pgsql/doc/src/sgml/contrib.sgml,v 1.8 2007/12/06 04:12:09 tgl Exp $ -->
<appendix id="contrib"> <appendix id="contrib">
<title>Additional Supplied Modules</title> <title>Additional Supplied Modules</title>
...@@ -44,7 +44,7 @@ ...@@ -44,7 +44,7 @@
<para> <para>
Many modules supply new user-defined functions, operators, or types. Many modules supply new user-defined functions, operators, or types.
To make use of one of these modules, after you have installed the code To make use of one of these modules, after you have installed the code
you need to register the new objects in the database you need to register the new objects in the database
system by running the SQL commands in the <literal>.sql</> file system by running the SQL commands in the <literal>.sql</> file
supplied by the module. For example, supplied by the module. For example,
...@@ -54,6 +54,7 @@ psql -d dbname -f <replaceable>SHAREDIR</>/contrib/<replaceable>module</>.sql ...@@ -54,6 +54,7 @@ psql -d dbname -f <replaceable>SHAREDIR</>/contrib/<replaceable>module</>.sql
Here, <replaceable>SHAREDIR</> means the installation's <quote>share</> Here, <replaceable>SHAREDIR</> means the installation's <quote>share</>
directory (<literal>pg_config --sharedir</> will tell you what this is). directory (<literal>pg_config --sharedir</> will tell you what this is).
In most cases the script must be run by a database superuser.
</para> </para>
<para> <para>
......
This diff is collapsed.
This diff is collapsed.
<!-- $PostgreSQL: pgsql/doc/src/sgml/dict-int.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="dict-int"> <sect1 id="dict-int">
<title>dict_int</title> <title>dict_int</title>
<indexterm zone="dict-int"> <indexterm zone="dict-int">
<primary>dict_int</primary> <primary>dict_int</primary>
</indexterm> </indexterm>
<para> <para>
The motivation for this example dictionary is to control the indexing of <filename>dict_int</> is an example of an add-on dictionary template
integers (signed and unsigned), and, consequently, to minimize the number of for full-text search. The motivation for this example dictionary is to
unique words which greatly affect the performance of searching. control the indexing of integers (signed and unsigned), allowing such
numbers to be indexed while preventing excessive growth in the number of
unique words, which greatly affects the performance of searching.
</para> </para>
<sect2> <sect2>
<title>Configuration</title> <title>Configuration</title>
<para> <para>
The dictionary accepts two options: The dictionary accepts two options:
</para> </para>
<itemizedlist> <itemizedlist>
<listitem> <listitem>
<para> <para>
The MAXLEN parameter specifies the maximum length (number of digits) The <literal>maxlen</> parameter specifies the maximum number of
allowed in an integer word. The default value is 6. digits allowed in an integer word. The default value is 6.
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
<para> <para>
The REJECTLONG parameter specifies if an overlength integer should be The <literal>rejectlong</> parameter specifies whether an overlength
truncated or ignored. If REJECTLONG=FALSE (default), the dictionary returns integer should be truncated or ignored. If <literal>rejectlong</> is
the first MAXLEN digits of the integer. If REJECTLONG=TRUE, the <literal>false</> (the default), the dictionary returns the first
dictionary treats an overlength integer as a stop word, so that it will <literal>maxlen</> digits of the integer. If <literal>rejectlong</> is
not be indexed. <literal>true</>, the dictionary treats an overlength integer as a stop
word, so that it will not be indexed. Note that this also means that
such an integer cannot be searched for.
</para> </para>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/dict-xsyn.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="dict-xsyn"> <sect1 id="dict-xsyn">
<title>dict_xsyn</title> <title>dict_xsyn</title>
<indexterm zone="dict-xsyn"> <indexterm zone="dict-xsyn">
<primary>dict_xsyn</primary> <primary>dict_xsyn</primary>
</indexterm> </indexterm>
<para> <para>
The Extended Synonym Dictionary module replaces words with groups of their <filename>dict_xsyn</> (Extended Synonym Dictionary) is an example of an
synonyms, and so makes it possible to search for a word using any of its add-on dictionary template for full-text search. This dictionary type
synonyms. replaces words with groups of their synonyms, and so makes it possible to
search for a word using any of its synonyms.
</para> </para>
<sect2> <sect2>
<title>Configuration</title> <title>Configuration</title>
<para> <para>
A <literal>dict_xsyn</> dictionary accepts the following options: A <literal>dict_xsyn</> dictionary accepts the following options:
</para> </para>
<itemizedlist> <itemizedlist>
<listitem> <listitem>
<para> <para>
KEEPORIG controls whether the original word is included, or only its <literal>keeporig</> controls whether the original word is included (if
synonyms. Default is 'true'. <literal>true</>), or only its synonyms (if <literal>false</>). Default
is <literal>true</>.
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
<para> <para>
RULES is the base name of the file containing the list of synonyms. <literal>rules</> is the base name of the file containing the list of
This file must be in $(prefix)/share/tsearch_data/, and its name must synonyms. This file must be stored in
end in ".rules" (which is not included in the RULES parameter). <filename>$SHAREDIR/tsearch_data/</> (where <literal>$SHAREDIR</> means
the <productname>PostgreSQL</> installation's shared-data directory).
Its name must end in <literal>.rules</> (which is not to be included in
the <literal>rules</> parameter).
</para> </para>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
...@@ -38,41 +46,63 @@ ...@@ -38,41 +46,63 @@
<listitem> <listitem>
<para> <para>
Each line represents a group of synonyms for a single word, which is Each line represents a group of synonyms for a single word, which is
given first on the line. Synonyms are separated by whitespace: given first on the line. Synonyms are separated by whitespace, thus:
</para>
<programlisting> <programlisting>
word syn1 syn2 syn3 word syn1 syn2 syn3
</programlisting> </programlisting>
</para>
</listitem> </listitem>
<listitem> <listitem>
<para> <para>
Sharp ('#') sign is a comment delimiter. It may appear at any position The sharp (<literal>#</>) sign is a comment delimiter. It may appear at
inside the line. The rest of the line will be skipped. any position in a line. The rest of the line will be skipped.
</para> </para>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
<para> <para>
Look at xsyn_sample.rules, which is installed in $(prefix)/share/tsearch_data/, Look at <filename>xsyn_sample.rules</>, which is installed in
for an example. <filename>$SHAREDIR/tsearch_data/</>, for an example.
</para> </para>
</sect2> </sect2>
<sect2> <sect2>
<title>Usage</title> <title>Usage</title>
<programlisting>
mydb=# SELECT ts_lexize('xsyn','word');
ts_lexize
----------------
{word,syn1,syn2,syn3)
</programlisting>
<para> <para>
Change dictionary options: Running the installation script creates a text search template
</para> <literal>xsyn_template</> and a dictionary <literal>xsyn</>
<programlisting> based on it, with default parameters. You can alter the
mydb# ALTER TEXT SEARCH DICTIONARY xsyn (KEEPORIG=false); parameters, for example
<programlisting>
mydb# ALTER TEXT SEARCH DICTIONARY xsyn (RULES='my_rules', KEEPORIG=false);
ALTER TEXT SEARCH DICTIONARY ALTER TEXT SEARCH DICTIONARY
</programlisting> </programlisting>
or create new dictionaries based on the template.
</para>
<para>
To test the dictionary, you can try
<programlisting>
mydb=# SELECT ts_lexize('xsyn', 'word');
ts_lexize
-----------------------
{word,syn1,syn2,syn3}
</programlisting>
but real-world usage will involve including it in a text search
configuration as described in <xref linkend="textsearch">.
That might look like this:
<programlisting>
ALTER TEXT SEARCH CONFIGURATION english
ALTER MAPPING FOR word, asciiword WITH xsyn, english_stem;
</programlisting>
</para>
</sect2> </sect2>
</sect1> </sect1>
This diff is collapsed.
<!-- $PostgreSQL: pgsql/doc/src/sgml/fuzzystrmatch.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="fuzzystrmatch"> <sect1 id="fuzzystrmatch">
<title>fuzzystrmatch</title> <title>fuzzystrmatch</title>
<indexterm zone="fuzzystrmatch">
<primary>fuzzystrmatch</primary>
</indexterm>
<para> <para>
This section describes the fuzzystrmatch module which provides different The <filename>fuzzystrmatch</> module provides several
functions to determine similarities and distance between strings. functions to determine similarities and distance between strings.
</para> </para>
<sect2> <sect2>
<title>Soundex</title> <title>Soundex</title>
<para> <para>
The Soundex system is a method of matching similar sounding names The Soundex system is a method of matching similar-sounding names
(or any words) to the same code. It was initially used by the by converting them to the same code. It was initially used by the
United States Census in 1880, 1900, and 1910, but it has little use United States Census in 1880, 1900, and 1910. Note that Soundex
beyond English names (or the English pronunciation of names), and is not very useful for non-English names.
it is not a linguistic tool.
</para> </para>
<para> <para>
When comparing two soundex values to determine similarity, the The <filename>fuzzystrmatch</> module provides two functions
difference function reports how close the match is on a scale for working with Soundex codes:
from zero to four, with zero being no match and four being an
exact match.
</para> </para>
<programlisting>
soundex(text) returns text
difference(text, text) returns int
</programlisting>
<para>
The <function>soundex</> function converts a string to its Soundex code.
The <function>difference</> function converts two strings to their Soundex
codes and then reports the number of matching code positions. Since
Soundex codes have four characters, the result ranges from zero to four,
with zero being no match and four being an exact match. (Thus, the
function is misnamed &mdash; <function>similarity</> would have been
a better name.)
</para>
<para> <para>
The following are some usage examples: Here are some usage examples:
</para> </para>
<programlisting> <programlisting>
SELECT soundex('hello world!'); SELECT soundex('hello world!');
...@@ -41,81 +62,106 @@ INSERT INTO s VALUES ('jack'); ...@@ -41,81 +62,106 @@ INSERT INTO s VALUES ('jack');
SELECT * FROM s WHERE soundex(nm) = soundex('john'); SELECT * FROM s WHERE soundex(nm) = soundex('john');
SELECT a.nm, b.nm FROM s a, s b WHERE soundex(a.nm) = soundex(b.nm) AND a.oid &lt;&gt; b.oid; SELECT * FROM s WHERE difference(s.nm, 'john') &gt; 2;
</programlisting>
CREATE FUNCTION text_sx_eq(text, text) RETURNS boolean AS </sect2>
'select soundex($1) = soundex($2)'
LANGUAGE SQL;
CREATE FUNCTION text_sx_lt(text, text) RETURNS boolean AS
'select soundex($1) &lt; soundex($2)'
LANGUAGE SQL;
CREATE FUNCTION text_sx_gt(text, text) RETURNS boolean AS
'select soundex($1) &gt; soundex($2)'
LANGUAGE SQL;
CREATE FUNCTION text_sx_le(text, text) RETURNS boolean AS
'select soundex($1) &lt;= soundex($2)'
LANGUAGE SQL;
CREATE FUNCTION text_sx_ge(text, text) RETURNS boolean AS
'select soundex($1) &gt;= soundex($2)'
LANGUAGE SQL;
CREATE FUNCTION text_sx_ne(text, text) RETURNS boolean AS <sect2>
'select soundex($1) &lt;&gt; soundex($2)' <title>Levenshtein</title>
LANGUAGE SQL;
DROP OPERATOR #= (text, text); <para>
This function calculates the Levenshtein distance between two strings:
</para>
CREATE OPERATOR #= (leftarg=text, rightarg=text, procedure=text_sx_eq, commutator = #=); <programlisting>
levenshtein(text source, text target) returns int
</programlisting>
SELECT * FROM s WHERE text_sx_eq(nm, 'john'); <para>
Both <literal>source</literal> and <literal>target</literal> can be any
non-null string, with a maximum of 255 characters.
</para>
SELECT * FROM s WHERE s.nm #= 'john'; <para>
Example:
</para>
SELECT * FROM s WHERE difference(s.nm, 'john') &gt; 2; <programlisting>
test=# SELECT levenshtein('GUMBO', 'GAMBOL');
levenshtein
-------------
2
(1 row)
</programlisting> </programlisting>
</sect2> </sect2>
<sect2> <sect2>
<title>levenshtein</title> <title>Metaphone</title>
<para>
Metaphone, like Soundex, is based on the idea of constructing a
representative code for an input string. Two strings are then
deemed similar if they have the same codes.
</para>
<para> <para>
This function calculates the levenshtein distance between two strings: This function calculates the metaphone code of an input string:
</para> </para>
<programlisting> <programlisting>
int levenshtein(text source, text target) metaphone(text source, int max_output_length) returns text
</programlisting> </programlisting>
<para> <para>
Both <literal>source</literal> and <literal>target</literal> can be any <literal>source</literal> has to be a non-null string with a maximum of
NOT NULL string with a maximum of 255 characters. 255 characters. <literal>max_output_length</literal> sets the maximum
length of the output metaphone code; if longer, the output is truncated
to this length.
</para> </para>
<para> <para>
Example: Example:
</para> </para>
<programlisting> <programlisting>
SELECT levenshtein('GUMBO','GAMBOL'); test=# SELECT metaphone('GUMBO', 4);
metaphone
-----------
KM
(1 row)
</programlisting> </programlisting>
</sect2> </sect2>
<sect2> <sect2>
<title>metaphone</title> <title>Double Metaphone</title>
<para> <para>
This function calculates and returns the metaphone code of an input string: The Double Metaphone system computes two <quote>sounds like</> strings
for a given input string &mdash; a <quote>primary</> and an
<quote>alternate</>. In most cases they are the same, but for non-English
names especially they can be a bit different, depending on pronunciation.
These functions compute the primary and alternate codes:
</para> </para>
<programlisting> <programlisting>
text metahpone(text source, int max_output_length) dmetaphone(text source) returns text
dmetaphone_alt(text source) returns text
</programlisting> </programlisting>
<para> <para>
<literal>source</literal> has to be a NOT NULL string with a maximum of There is no length limit on the input strings.
255 characters. <literal>max_output_length</literal> fixes the maximum </para>
length of the output metaphone code; if longer, the output is truncated
to this length. <para>
Example:
</para> </para>
<para>Example</para>
<programlisting> <programlisting>
SELECT metaphone('GUMBO',4); test=# select dmetaphone('gumbo');
dmetaphone
------------
KMP
(1 row)
</programlisting> </programlisting>
</sect2> </sect2>
......
This diff is collapsed.
<!-- $PostgreSQL: pgsql/doc/src/sgml/lo.sgml,v 1.3 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="lo"> <sect1 id="lo">
<title>lo</title> <title>lo</title>
<indexterm zone="lo"> <indexterm zone="lo">
<primary>lo</primary> <primary>lo</primary>
</indexterm> </indexterm>
<para> <para>
PostgreSQL type extension for managing Large Objects The <filename>lo</> module provides support for managing Large Objects
(also called LOs or BLOBs). This includes a data type <type>lo</>
and a trigger <function>lo_manage</>.
</para> </para>
<sect2> <sect2>
<title>Overview</title> <title>Rationale</title>
<para> <para>
One of the problems with the JDBC driver (and this affects the ODBC driver One of the problems with the JDBC driver (and this affects the ODBC driver
also), is that the specification assumes that references to BLOBS (Binary also), is that the specification assumes that references to BLOBs (Binary
Large OBjectS) are stored within a table, and if that entry is changed, the Large OBjects) are stored within a table, and if that entry is changed, the
associated BLOB is deleted from the database. associated BLOB is deleted from the database.
</para> </para>
<para> <para>
As PostgreSQL stands, this doesn't occur. Large objects are treated as As <productname>PostgreSQL</> stands, this doesn't occur. Large objects
objects in their own right; a table entry can reference a large object by are treated as objects in their own right; a table entry can reference a
OID, but there can be multiple table entries referencing the same large large object by OID, but there can be multiple table entries referencing
object OID, so the system doesn't delete the large object just because you the same large object OID, so the system doesn't delete the large object
change or remove one such entry. just because you change or remove one such entry.
</para>
<para>
Now this is fine for new PostgreSQL-specific applications, but existing ones
using JDBC or ODBC won't delete the objects, resulting in orphaning - objects
that are not referenced by anything, and simply occupy disk space.
</para> </para>
</sect2>
<sect2>
<title>The Fix</title>
<para> <para>
I've fixed this by creating a new data type 'lo', some support functions, and Now this is fine for <productname>PostgreSQL</>-specific applications, but
a Trigger which handles the orphaning problem. The trigger essentially just standard code using JDBC or ODBC won't delete the objects, resulting in
does a 'lo_unlink' whenever you delete or modify a value referencing a large orphan objects &mdash; objects that are not referenced by anything, and
object. When you use this trigger, you are assuming that there is only one simply occupy disk space.
database reference to any large object that is referenced in a
trigger-controlled column!
</para> </para>
<para> <para>
The 'lo' type was created because we needed to differentiate between plain The <filename>lo</> module allows fixing this by attaching a trigger
OIDs and Large Objects. Currently the JDBC driver handles this dilemma easily, to tables that contain LO reference columns. The trigger essentially just
but (after talking to Byron), the ODBC driver needed a unique type. They had does a <function>lo_unlink</> whenever you delete or modify a value
created an 'lo' type, but not the solution to orphaning. referencing a large object. When you use this trigger, you are assuming
that there is only one database reference to any large object that is
referenced in a trigger-controlled column!
</para> </para>
<para> <para>
You don't actually have to use the 'lo' type to use the trigger, but it may be The module also provides a data type <type>lo</>, which is really just
convenient to use it to keep track of which columns in your database represent a domain of the <type>oid</> type. This is useful for differentiating
large objects that you are managing with the trigger. database columns that hold large object references from those that are
OIDs of other things. You don't have to use the <type>lo</> type to
use the trigger, but it may be convenient to use it to keep track of which
columns in your database represent large objects that you are managing with
the trigger. It is also rumored that the ODBC driver gets confused if you
don't use <type>lo</> for BLOB columns.
</para> </para>
</sect2> </sect2>
<sect2> <sect2>
<title>How to Use</title> <title>How to Use It</title>
<para> <para>
The easiest way is by an example: Here's a simple example of usage:
</para> </para>
<programlisting> <programlisting>
CREATE TABLE image (title TEXT, raster lo); CREATE TABLE image (title TEXT, raster lo);
CREATE TRIGGER t_raster BEFORE UPDATE OR DELETE ON image CREATE TRIGGER t_raster BEFORE UPDATE OR DELETE ON image
FOR EACH ROW EXECUTE PROCEDURE lo_manage(raster); FOR EACH ROW EXECUTE PROCEDURE lo_manage(raster);
</programlisting> </programlisting>
<para> <para>
Create a trigger for each column that contains a lo type, and give the column For each column that will contain unique references to large objects,
name as the trigger procedure argument. You can have more than one trigger on create a <literal>BEFORE UPDATE OR DELETE</> trigger, and give the column
a table if you need multiple lo columns in the same table, but don't forget to name as the sole trigger argument. If you need multiple <type>lo</>
give a different name to each trigger. columns in the same table, create a separate trigger for each one,
remembering to give a different name to each trigger on the same table.
</para> </para>
</sect2> </sect2>
<sect2> <sect2>
<title>Issues</title> <title>Limitations</title>
<itemizedlist> <itemizedlist>
<listitem> <listitem>
<para> <para>
Dropping a table will still orphan any objects it contains, as the trigger Dropping a table will still orphan any objects it contains, as the trigger
is not executed. is not executed. You can avoid this by preceding the <command>DROP
TABLE</> with <command>DELETE FROM <replaceable>table</></command>.
</para> </para>
<para> <para>
Avoid this by preceding the 'drop table' with 'delete from {table}'. <command>TRUNCATE</> has the same hazard.
</para> </para>
<para> <para>
If you already have, or suspect you have, orphaned large objects, see If you already have, or suspect you have, orphaned large objects, see the
the contrib/vacuumlo module to help you clean them up. It's a good idea <filename>contrib/vacuumlo</> module (<xref linkend="vacuumlo">) to help
to run contrib/vacuumlo occasionally as a back-stop to the lo_manage you clean them up. It's a good idea to run <application>vacuumlo</>
trigger. occasionally as a back-stop to the <function>lo_manage</> trigger.
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
<para> <para>
Some frontends may create their own tables, and will not create the Some frontends may create their own tables, and will not create the
associated trigger(s). Also, users may not remember (or know) to create associated trigger(s). Also, users may not remember (or know) to create
the triggers. the triggers.
</para> </para>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
<para>
As the ODBC driver needs a permanent lo type (&amp; JDBC could be optimised to
use it if it's Oid is fixed), and as the above issues can only be fixed by
some internal changes, I feel it should become a permanent built-in type.
</para>
</sect2> </sect2>
<sect2> <sect2>
<title>Author</title> <title>Author</title>
<para> <para>
Peter Mount <email>peter@retep.org.uk</email> June 13 1998 Peter Mount <email>peter@retep.org.uk</email>
</para> </para>
</sect2> </sect2>
</sect1>
</sect1>
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
<!-- $PostgreSQL: pgsql/doc/src/sgml/test-parser.sgml,v 1.1 2007/12/03 04:18:47 tgl Exp $ --> <!-- $PostgreSQL: pgsql/doc/src/sgml/test-parser.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="test-parser"> <sect1 id="test-parser">
<title>test_parser</title> <title>test_parser</title>
...@@ -8,15 +8,18 @@ ...@@ -8,15 +8,18 @@
</indexterm> </indexterm>
<para> <para>
This is an example of a custom parser for full text search. <filename>test_parser</> is an example of a custom parser for full-text
search. It doesn't do anything especially useful, but can serve as
a starting point for developing your own parser.
</para> </para>
<para> <para>
It recognizes space-delimited words and returns just two token types: <filename>test_parser</> recognizes words separated by white space,
and returns just two token types:
<programlisting> <programlisting>
mydb=# SELECT * FROM ts_token_type('testparser'); mydb=# SELECT * FROM ts_token_type('testparser');
tokid | alias | description tokid | alias | description
-------+-------+--------------- -------+-------+---------------
3 | word | Word 3 | word | Word
12 | blank | Space symbols 12 | blank | Space symbols
...@@ -41,16 +44,16 @@ mydb=# SELECT * FROM ts_token_type('testparser'); ...@@ -41,16 +44,16 @@ mydb=# SELECT * FROM ts_token_type('testparser');
<programlisting> <programlisting>
mydb=# SELECT * FROM ts_parse('testparser', 'That''s my first own parser'); mydb=# SELECT * FROM ts_parse('testparser', 'That''s my first own parser');
tokid | token tokid | token
-------+-------- -------+--------
3 | That's 3 | That's
12 | 12 |
3 | my 3 | my
12 | 12 |
3 | first 3 | first
12 | 12 |
3 | own 3 | own
12 | 12 |
3 | parser 3 | parser
</programlisting> </programlisting>
</para> </para>
...@@ -68,14 +71,14 @@ mydb-# ADD MAPPING FOR word WITH english_stem; ...@@ -68,14 +71,14 @@ mydb-# ADD MAPPING FOR word WITH english_stem;
ALTER TEXT SEARCH CONFIGURATION ALTER TEXT SEARCH CONFIGURATION
mydb=# SELECT to_tsvector('testcfg', 'That''s my first own parser'); mydb=# SELECT to_tsvector('testcfg', 'That''s my first own parser');
to_tsvector to_tsvector
------------------------------- -------------------------------
'that':1 'first':3 'parser':5 'that':1 'first':3 'parser':5
(1 row) (1 row)
mydb=# SELECT ts_headline('testcfg', 'Supernovae stars are the brightest phenomena in galaxies', mydb=# SELECT ts_headline('testcfg', 'Supernovae stars are the brightest phenomena in galaxies',
mydb(# to_tsquery('testcfg', 'star')); mydb(# to_tsquery('testcfg', 'star'));
ts_headline ts_headline
----------------------------------------------------------------- -----------------------------------------------------------------
Supernovae &lt;b&gt;stars&lt;/b&gt; are the brightest phenomena in galaxies Supernovae &lt;b&gt;stars&lt;/b&gt; are the brightest phenomena in galaxies
(1 row) (1 row)
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/tsearch2.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="tsearch2"> <sect1 id="tsearch2">
<title>tsearch2</title> <title>tsearch2</title>
<indexterm zone="tsearch2"> <indexterm zone="tsearch2">
<primary>tsearch2</primary> <primary>tsearch2</primary>
</indexterm> </indexterm>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/uuid-ossp.sgml,v 1.2 2007/12/06 04:12:10 tgl Exp $ -->
<sect1 id="uuid-ossp"> <sect1 id="uuid-ossp">
<title>uuid-ossp</title> <title>uuid-ossp</title>
<indexterm zone="uuid-ossp"> <indexterm zone="uuid-ossp">
<primary>uuid-ossp</primary> <primary>uuid-ossp</primary>
</indexterm> </indexterm>
<para> <para>
This module provides functions to generate universally unique The <filename>uuid-ossp</> module provides functions to generate universally
identifiers (UUIDs) using one of the several standard algorithms, as unique identifiers (UUIDs) using one of several standard algorithms. There
well as functions to produce certain special UUID constants. are also functions to produce certain special UUID constants.
</para>
<para>
This module depends on the OSSP UUID library, which can be found at
<ulink url="http://www.ossp.org/pkg/lib/uuid/"></ulink>.
</para> </para>
<sect2> <sect2>
<title>UUID Generation</title> <title><literal>uuid-ossp</literal> Functions</title>
<para> <para>
The relevant standards ITU-T Rec. X.667, ISO/IEC 9834-8:2005, and RFC The relevant standards ITU-T Rec. X.667, ISO/IEC 9834-8:2005, and RFC
4122 specify four algorithms for generating UUIDs, identified by the 4122 specify four algorithms for generating UUIDs, identified by the
...@@ -23,7 +30,7 @@ ...@@ -23,7 +30,7 @@
</para> </para>
<table> <table>
<title><literal>uuid-ossp</literal> functions</title> <title>Functions for UUID Generation</title>
<tgroup cols="2"> <tgroup cols="2">
<thead> <thead>
<row> <row>
...@@ -59,22 +66,9 @@ ...@@ -59,22 +66,9 @@
<para> <para>
This function generates a version 3 UUID in the given namespace using This function generates a version 3 UUID in the given namespace using
the specified input name. The namespace should be one of the special the specified input name. The namespace should be one of the special
constants produced by the uuid_ns_*() functions shown below. (It constants produced by the <function>uuid_ns_*()</> functions shown
could be any UUID in theory.) The name is an identifier in the below. (It could be any UUID in theory.) The name is an identifier
selected namespace. For example: in the selected namespace.
</para>
</entry>
</row>
<row>
<entry><literal>uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org')</literal></entry>
<entry>
<para>
The name parameter will be MD5-hashed, so the cleartext cannot be
derived from the generated UUID.
</para>
<para>
The generation of UUIDs by this method has no random or
environment-dependent element and is therefore reproducible.
</para> </para>
</entry> </entry>
</row> </row>
...@@ -102,15 +96,28 @@ ...@@ -102,15 +96,28 @@
</tgroup> </tgroup>
</table> </table>
<para>
For example:
<programlisting>
SELECT uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org');
</programlisting>
The name parameter will be MD5-hashed, so the cleartext cannot be
derived from the generated UUID.
The generation of UUIDs by this method has no random or
environment-dependent element and is therefore reproducible.
</para>
<table> <table>
<title>UUID Constants</title> <title>Functions Returning UUID Constants</title>
<tgroup cols="2"> <tgroup cols="2">
<tbody> <tbody>
<row> <row>
<entry><literal>uuid_nil()</literal></entry> <entry><literal>uuid_nil()</literal></entry>
<entry> <entry>
<para> <para>
A "nil" UUID constant, which does not occur as a real UUID. A <quote>nil</> UUID constant, which does not occur as a real UUID.
</para> </para>
</entry> </entry>
</row> </row>
...@@ -135,8 +142,8 @@ ...@@ -135,8 +142,8 @@
<entry> <entry>
<para> <para>
Constant designating the ISO object identifier (OID) namespace for Constant designating the ISO object identifier (OID) namespace for
UUIDs. (This pertains to ASN.1 OIDs, unrelated to the OIDs used in UUIDs. (This pertains to ASN.1 OIDs, which are unrelated to the OIDs
PostgreSQL.) used in <productname>PostgreSQL</>.)
</para> </para>
</entry> </entry>
</row> </row>
...@@ -153,11 +160,14 @@ ...@@ -153,11 +160,14 @@
</tgroup> </tgroup>
</table> </table>
</sect2> </sect2>
<sect2> <sect2>
<title>Author</title> <title>Author</title>
<para> <para>
Peter Eisentraut <email>peter_e@gmx.net</email> Peter Eisentraut <email>peter_e@gmx.net</email>
</para> </para>
</sect2> </sect2>
</sect1>
</sect1>
This diff is collapsed.
This diff is collapsed.
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment