Commit 8828689a authored by Tom Lane's avatar Tom Lane

Make an editorial pass over the newly SGML-ified contrib documentation.

Fix lots of bad markup, bad English, bad explanations.

Second round of commits.  pgcrypto and pgstandby still to go...
parent a3102ce1
<!-- $PostgreSQL: pgsql/doc/src/sgml/intagg.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="intagg">
<title>intagg</title>
<indexterm zone="intagg">
<primary>intagg</primary>
</indexterm>
<para>
This section describes the <literal>intagg</literal> module which provides an integer aggregator and an enumerator.
The <filename>intagg</filename> module provides an integer aggregator and an
enumerator.
</para>
<sect2>
<title>Functions</title>
<para>
The aggregator is an aggregate function
<function>int_array_aggregate(integer)</>
that produces an integer array
containing exactly the integers it is fed.
Here is a not-tremendously-useful example:
</para>
<programlisting>
test=# select int_array_aggregate(i) from
test-# generate_series(1,10,2) i;
int_array_aggregate
---------------------
{1,3,5,7,9}
(1 row)
</programlisting>
<para>
Many database systems have the notion of a one to many table. Such a table usually sits between two indexed tables, as:
The enumerator is a function
<function>int_array_enum(integer[])</>
that returns <type>setof integer</>. It is essentially the reverse
operation of the aggregator: given an array of integers, expand it
into a set of rows. For example,
</para>
<programlisting>
test=# select * from int_array_enum(array[1,3,5,7,9]);
int_array_enum
----------------
1
3
5
7
9
(5 rows)
</programlisting>
</sect2>
<sect2>
<title>Sample Uses</title>
<para>
Many database systems have the notion of a one to many table. Such a table
usually sits between two indexed tables, for example:
</para>
<programlisting>
CREATE TABLE one_to_many(left INT, right INT) ;
CREATE TABLE left (id INT PRIMARY KEY, ...);
CREATE TABLE right (id INT PRIMARY KEY, ...);
CREATE TABLE one_to_many(left INT REFERENCES left, right INT REFERENCES right);
</programlisting>
<para>
And it is used like this:
It is typically used like this:
</para>
<programlisting>
SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right)
WHERE one_to_many.left = item;
SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right)
WHERE one_to_many.left = <replaceable>item</>;
</programlisting>
<para>
This will return all the items in the right hand table for an entry
This will return all the items in the right hand table for an entry
in the left hand table. This is a very common construct in SQL.
</para>
<para>
Now, this methodology can be cumbersome with a very large number of
entries in the one_to_many table. Depending on the order in which
data was entered, a join like this could result in an index scan
entries in the <structname>one_to_many</> table. Often,
a join like this would result in an index scan
and a fetch for each right hand entry in the table for a particular
left hand entry. If you have a very dynamic system, there is not much you
left hand entry. If you have a very dynamic system, there is not much you
can do. However, if you have some data which is fairly static, you can
create a summary table with the aggregator.
</para>
<programlisting>
CREATE TABLE summary as SELECT left, int_array_aggregate(right)
AS right FROM one_to_many GROUP BY left;
CREATE TABLE summary as
SELECT left, int_array_aggregate(right) AS right
FROM one_to_many
GROUP BY left;
</programlisting>
<para>
This will create a table with one row per left item, and an array
of right items. Now this is pretty useless without some way of using
the array, thats why there is an array enumerator.
the array; that's why there is an array enumerator. You can do
</para>
<programlisting>
SELECT left, int_array_enum(right) FROM summary WHERE left = item;
SELECT left, int_array_enum(right) FROM summary WHERE left = <replaceable>item</>;
</programlisting>
<para>
The above query using int_array_enum, produces the same results as:
The above query using <function>int_array_enum</> produces the same results
as
</para>
<programlisting>
SELECT left, right FROM one_to_many WHERE left = item;
SELECT left, right FROM one_to_many WHERE left = <replaceable>item</>;
</programlisting>
<para>
The difference is that the query against the summary table has to get
only one row from the table, where as the query against "one_to_many"
must index scan and fetch a row for each entry.
only one row from the table, whereas the direct query against
<structname>one_to_many</> must index scan and fetch a row for each entry.
</para>
<para>
On our system, an EXPLAIN shows a query with a cost of 8488 gets reduced
to a cost of 329. The query is a join between the one_to_many table,
On one system, an <command>EXPLAIN</> showed a query with a cost of 8488 was
reduced to a cost of 329. The original query was a join involving the
<structname>one_to_many</> table, which was replaced by:
</para>
<programlisting>
SELECT right, count(right) FROM
(
SELECT left, int_array_enum(right) AS right FROM summary JOIN
(SELECT left FROM left_table WHERE left = item) AS lefts
ON (summary.left = lefts.left )
) AS list GROUP BY right ORDER BY count DESC ;
SELECT right, count(right) FROM
( SELECT left, int_array_enum(right) AS right
FROM summary JOIN (SELECT left FROM left_table WHERE left = <replaceable>item</>) AS lefts
ON (summary.left = lefts.left)
) AS list
GROUP BY right
ORDER BY count DESC;
</programlisting>
</sect1>
</sect2>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/intarray.sgml,v 1.5 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="intarray">
<title>intarray</title>
<indexterm zone="intarray">
<primary>intarray</primary>
</indexterm>
<para>
This is an implementation of RD-tree data structure using GiST interface
of PostgreSQL. It has built-in lossy compression.
</para>
<para>
Current implementation provides index support for one-dimensional array of
integers: gist__int_ops, suitable for small and medium size of arrays (used by
default), and gist__intbig_ops for indexing large arrays (we use superimposed
signature with length of 4096 bits to represent sets). There is also a
non-default gin__int_ops for GIN indexes on integer arrays.
The <filename>intarray</> module provides a number of useful functions
and operators for manipulating one-dimensional arrays of integers.
There is also support for indexed searches using some of the operators.
</para>
<sect2>
<title>Functions</title>
<itemizedlist>
<listitem>
<para>
<literal>int icount(int[])</literal> - the number of elements in intarray
</para>
<programlisting>
test=# select icount('{1,2,3}'::int[]);
icount
--------
3
(1 row)
</programlisting>
</listitem>
<listitem>
<para>
<literal>int[] sort(int[], 'asc' | 'desc')</literal> - sort intarray
</para>
<programlisting>
test=# select sort('{1,2,3}'::int[],'desc');
sort
---------
{3,2,1}
(1 row)
</programlisting>
</listitem>
<listitem>
<para>
<literal>int[] sort(int[])</literal> - sort in ascending order
</para>
</listitem>
<listitem>
<para>
<literal>int[] sort_asc(int[]),sort_desc(int[])</literal> - shortcuts for sort
</para>
</listitem>
<listitem>
<para>
<literal>int[] uniq(int[])</literal> - returns unique elements
</para>
<programlisting>
test=# select uniq(sort('{1,2,3,2,1}'::int[]));
uniq
---------
{1,2,3}
(1 row)
</programlisting>
</listitem>
<listitem>
<para>
<literal>int idx(int[], int item)</literal> - returns index of first
intarray matching element to item, or '0' if matching failed.
</para>
<programlisting>
test=# select idx('{1,2,3,2,1}'::int[],2);
idx
-----
2
(1 row)
</programlisting>
</listitem>
<listitem>
<para>
<literal>int[] subarray(int[],int START [, int LEN])</literal> - returns
part of intarray starting from element number START (from 1) and length LEN.
</para>
<programlisting>
test=# select subarray('{1,2,3,2,1}'::int[],2,3);
subarray
----------
{2,3,2}
(1 row)
</programlisting>
</listitem>
<listitem>
<para>
<literal>int[] intset(int4)</literal> - casting int4 to int[]
</para>
<programlisting>
test=# select intset(1);
intset
--------
{1}
(1 row)
</programlisting>
</listitem>
</itemizedlist>
</sect2>
<title><filename>intarray</> Functions and Operators</title>
<sect2>
<title>Operations</title>
<table>
<title>Operations</title>
<tgroup cols="2">
<table id="intarray-func-table">
<title><filename>intarray</> Functions</title>
<tgroup cols="5">
<thead>
<row>
<entry>Function</entry>
<entry>Return Type</entry>
<entry>Description</entry>
<entry>Example</entry>
<entry>Result</entry>
</row>
</thead>
<tbody>
<row>
<entry><function>icount(int[])</function></entry>
<entry><type>int</type></entry>
<entry>number of elements in array</entry>
<entry><literal>icount('{1,2,3}'::int[])</literal></entry>
<entry><literal>3</literal></entry>
</row>
<row>
<entry><function>sort(int[], text dir)</function></entry>
<entry><type>int[]</type></entry>
<entry>sort array &mdash; <parameter>dir</> must be <literal>asc</> or <literal>desc</></entry>
<entry><literal>sort('{1,2,3}'::int[], 'desc')</literal></entry>
<entry><literal>{3,2,1}</literal></entry>
</row>
<row>
<entry><function>sort(int[])</function></entry>
<entry><type>int[]</type></entry>
<entry>sort in ascending order</entry>
<entry><literal>sort(array[11,77,44])</literal></entry>
<entry><literal>{11,44,77}</literal></entry>
</row>
<row>
<entry><function>sort_asc(int[])</function></entry>
<entry><type>int[]</type></entry>
<entry>sort in ascending order</entry>
<entry><literal></literal></entry>
<entry><literal></literal></entry>
</row>
<row>
<entry><function>sort_desc(int[])</function></entry>
<entry><type>int[]</type></entry>
<entry>sort in descending order</entry>
<entry><literal></literal></entry>
<entry><literal></literal></entry>
</row>
<row>
<entry><function>uniq(int[])</function></entry>
<entry><type>int[]</type></entry>
<entry>remove adjacent duplicates</entry>
<entry><literal>uniq(sort('{1,2,3,2,1}'::int[]))</literal></entry>
<entry><literal>{1,2,3}</literal></entry>
</row>
<row>
<entry><function>idx(int[], int item)</function></entry>
<entry><type>int</type></entry>
<entry>index of first element matching <parameter>item</> (0 if none)</entry>
<entry><literal>idx(array[11,22,33,22,11], 22)</literal></entry>
<entry><literal>2</literal></entry>
</row>
<row>
<entry><function>subarray(int[], int start, int len)</function></entry>
<entry><type>int[]</type></entry>
<entry>portion of array starting at position <parameter>start</>, len <parameter>elements</></entry>
<entry><literal>subarray('{1,2,3,2,1}'::int[], 2, 3)</literal></entry>
<entry><literal>{2,3,2}</literal></entry>
</row>
<row>
<entry><function>subarray(int[], int start)</function></entry>
<entry><type>int[]</type></entry>
<entry>portion of array starting at position <parameter>start</></entry>
<entry><literal>subarray('{1,2,3,2,1}'::int[], 2)</literal></entry>
<entry><literal>{2,3,2,1}</literal></entry>
</row>
<row>
<entry><function>intset(int)</function></entry>
<entry><type>int[]</type></entry>
<entry>make single-element array</entry>
<entry><literal>intset(42)</literal></entry>
<entry><literal>{42}</literal></entry>
</row>
</tbody>
</tgroup>
</table>
<table id="intarray-op-table">
<title><filename>intarray</> Operators</title>
<tgroup cols="3">
<thead>
<row>
<entry>Operator</entry>
<entry>Returns</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>int[] &amp;&amp; int[]</literal></entry>
<entry>overlap - returns TRUE if arrays have at least one common element</entry>
<entry><type>boolean</type></entry>
<entry>overlap &mdash; <literal>true</> if arrays have at least one common element</entry>
</row>
<row>
<entry><literal>int[] @&gt; int[]</literal></entry>
<entry>contains - returns TRUE if left array contains right array</entry>
<entry><type>boolean</type></entry>
<entry>contains &mdash; <literal>true</> if left array contains right array</entry>
</row>
<row>
<entry><literal>int[] &lt;@ int[]</literal></entry>
<entry>contained - returns TRUE if left array is contained in right array</entry>
<entry><type>boolean</type></entry>
<entry>contained &mdash; <literal>true</> if left array is contained in right array</entry>
</row>
<row>
<entry><literal># int[]</literal></entry>
<entry>returns the number of elements in array</entry>
<entry><type>int</type></entry>
<entry>number of elements in array</entry>
</row>
<row>
<entry><literal>int[] # int</literal></entry>
<entry><type>int</type></entry>
<entry>index (same as <function>idx</> function)</entry>
</row>
<row>
<entry><literal>int[] + int</literal></entry>
<entry>push element to array ( add to end of array)</entry>
<entry><type>int[]</type></entry>
<entry>push element onto array (add it to end of array)</entry>
</row>
<row>
<entry><literal>int[] + int[] </literal></entry>
<entry>merge of arrays (right array added to the end of left one)</entry>
<entry><type>int[]</type></entry>
<entry>array concatenation (right array added to the end of left one)</entry>
</row>
<row>
<entry><literal>int[] - int</literal></entry>
<entry>remove entries matched by right argument from array</entry>
<entry><type>int[]</type></entry>
<entry>remove entries matching right argument from array</entry>
</row>
<row>
<entry><literal>int[] - int[]</literal></entry>
<entry>remove right array from left</entry>
<entry><type>int[]</type></entry>
<entry>remove elements of right array from left</entry>
</row>
<row>
<entry><literal>int[] | int</literal></entry>
<entry>returns intarray - union of arguments</entry>
<entry><type>int[]</type></entry>
<entry>union of arguments</entry>
</row>
<row>
<entry><literal>int[] | int[]</literal></entry>
<entry>returns intarray as a union of two arrays</entry>
<entry><type>int[]</type></entry>
<entry>union of arrays</entry>
</row>
<row>
<entry><literal>int[] &amp; int[]</literal></entry>
<entry>returns intersection of arrays</entry>
<entry><type>int[]</type></entry>
<entry>intersection of arrays</entry>
</row>
<row>
<entry><literal>int[] @@ query_int</literal></entry>
<entry>
returns TRUE if array satisfies query (like
<literal>'1&amp;(2|3)'</literal>)
</entry>
<entry><type>boolean</type></entry>
<entry><literal>true</> if array satisfies query (see below)</entry>
</row>
<row>
<entry><literal>query_int ~~ int[]</literal></entry>
<entry>returns TRUE if array satisfies query (commutator of @@)</entry>
<entry><type>boolean</type></entry>
<entry><literal>true</> if array satisfies query (commutator of <literal>@@</>)</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
(Before PostgreSQL 8.2, the containment operators @&gt; and &lt;@ were
respectively called @ and ~. These names are still available, but are
......@@ -198,85 +209,102 @@ test=# select intset(1);
are reversed from the convention formerly followed by the core geometric
datatypes!)
</para>
<para>
The <literal>@@</> and <literal>~~</> operators test whether an array
satisfies a <firstterm>query</>, which is expressed as a value of a
specialized data type <type>query_int</>. A <firstterm>query</>
consists of integer values that are checked against the elements of
the array, possibly combined using the operators <literal>&amp;</>
(AND), <literal>|</> (OR), and <literal>!</> (NOT). Parentheses
can be used as needed. For example,
the query <literal>1&amp;(2|3)</> matches arrays that contain 1
and also contain either 2 or 3.
</para>
</sect2>
<sect2>
<title>Index Support</title>
<para>
<filename>intarray</> provides index support for the
<literal>&amp;&amp;</>, <literal>@&gt;</>, <literal>&lt;@</>,
and <literal>@@</> operators, as well as regular array equality.
The implementation uses an RD-tree data structure with
built-in lossy compression.
</para>
<para>
Two GiST index operator classes are provided:
<literal>gist__int_ops</> (used by default) is suitable for
small and medium-size arrays, while
<literal>gist__intbig_ops</> uses a larger signature and is more
suitable for indexing large arrays.
</para>
<para>
There is also a non-default GIN operator class
<literal>gin__int_ops</>.
</para>
<para>
The choice between GiST and GIN indexing depends on the relative
performance characteristics of GiST and GIN, which are discussed elsewhere.
As a rule of thumb, a GIN index is faster to search than a GiST index, but
slower to build or update; so GIN is better suited for static data and GiST
for often-updated data.
</para>
</sect2>
<sect2>
<title>Example</title>
<programlisting>
CREATE TABLE message (mid INT NOT NULL,sections INT[]);
CREATE TABLE message_section_map (mid INT NOT NULL,sid INT NOT NULL);
-- a message can be in one or more <quote>sections</>
CREATE TABLE message (mid INT PRIMARY KEY, sections INT[], ...);
-- create indices
CREATE unique index message_key ON message ( mid );
CREATE unique index message_section_map_key2 ON message_section_map (sid, mid );
CREATE INDEX message_rdtree_idx ON message USING GIST ( sections gist__int_ops);
-- create specialized index
CREATE INDEX message_rdtree_idx ON message USING GIST (sections gist__int_ops);
-- select some messages with section in 1 OR 2 - OVERLAP operator
SELECT message.mid FROM message WHERE message.sections &amp;&amp; '{1,2}';
-- select messages in section 1 OR 2 - OVERLAP operator
SELECT message.mid FROM message WHERE message.sections &amp;&amp; '{1,2}';
-- select messages contains in sections 1 AND 2 - CONTAINS operator
-- select messages in sections 1 AND 2 - CONTAINS operator
SELECT message.mid FROM message WHERE message.sections @&gt; '{1,2}';
-- the same, CONTAINED operator
SELECT message.mid FROM message WHERE '{1,2}' &lt;@ message.sections;
-- the same, using QUERY operator
SELECT message.mid FROM message WHERE message.sections @@ '1&amp;2'::query_int;
</programlisting>
</sect2>
<sect2>
<title>Benchmark</title>
<para>
subdirectory bench contains benchmark suite.
The source directory <filename>contrib/intarray/bench</> contains a
benchmark test suite. To run:
</para>
<programlisting>
cd ./bench
1. createdb TEST
2. psql TEST &lt; ../_int.sql
3. ./create_test.pl | psql TEST
4. ./bench.pl - perl script to benchmark queries, supports OR, AND queries
with/without RD-Tree. Run script without arguments to
see availbale options.
a)test without RD-Tree (OR)
./bench.pl -d TEST -c -s 1,2 -v
b)test with RD-Tree
./bench.pl -d TEST -c -s 1,2 -v -r
BENCHMARKS:
Size of table &lt;message>: 200000
Size of table &lt;message_section_map>: 269133
Distribution of messages by sections:
section 0: 74377 messages
section 1: 16284 messages
section 50: 1229 messages
section 99: 683 messages
old - without RD-Tree support,
new - with RD-Tree
+----------+---------------+----------------+
|Search set|OR, time in sec|AND, time in sec|
| +-------+-------+--------+-------+
| | old | new | old | new |
+----------+-------+-------+--------+-------+
| 1| 0.625| 0.101| -| -|
+----------+-------+-------+--------+-------+
| 99| 0.018| 0.017| -| -|
+----------+-------+-------+--------+-------+
| 1,2| 0.766| 0.133| 0.628| 0.045|
+----------+-------+-------+--------+-------+
| 1,2,50,65| 0.794| 0.141| 0.030| 0.006|
+----------+-------+-------+--------+-------+
cd .../bench
createdb TEST
psql TEST &lt; ../_int.sql
./create_test.pl | psql TEST
./bench.pl
</programlisting>
<para>
The <filename>bench.pl</> script has numerous options, which
are displayed when it is run without any arguments.
</para>
</sect2>
<sect2>
<title>Authors</title>
<para>
All work was done by Teodor Sigaev (<email>teodor@stack.net</email>) and Oleg
Bartunov (<email>oleg@sai.msu.su</email>). See
All work was done by Teodor Sigaev (<email>teodor@stack.net</email>) and
Oleg Bartunov (<email>oleg@sai.msu.su</email>). See
<ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink> for
additional information. Andrey Oktyabrski did a great work on adding new
functions and operations.
......@@ -284,4 +312,3 @@ SELECT message.mid FROM message WHERE '{1,2}' &lt;@ message.sections;
</sect2>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/isn.sgml,v 1.4 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="isn">
<title>isn</title>
<indexterm zone="isn">
<primary>isn</primary>
</indexterm>
<para>
The <literal>isn</literal> module adds data types for the following
international-standard namespaces: EAN13, UPC, ISBN (books), ISMN (music),
and ISSN (serials). This module is inspired by Garrett A. Wollman's
isbn_issn code.
</para>
<para>
This module validates, and automatically adds the correct
hyphenations to the numbers. Also, it supports the new ISBN-13
numbers to be used starting in January 2007.
The <filename>isn</filename> module provides data types for the following
international product numbering standards: EAN13, UPC, ISBN (books), ISMN
(music), and ISSN (serials). Numbers are validated on input, and correctly
hyphenated on output.
</para>
<para>
Premises:
</para>
<orderedlist>
<listitem>
<para>ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers</para>
</listitem>
<listitem>
<para>EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some are)</para>
</listitem>
<listitem>
<para>some ISBN13 numbers can be displayed as ISBN</para>
</listitem>
<listitem>
<para>some ISMN13 numbers can be displayed as ISMN</para>
</listitem>
<listitem>
<para>some ISSN13 numbers can be displayed as ISSN</para>
</listitem>
<listitem>
<para>all UPC, ISBN, ISMN and ISSN can be represented as EAN13 numbers</para>
</listitem>
</orderedlist>
<note>
<para>
All types are internally represented as 64 bit integers,
and internally all are consistently interchangeable.
</para>
</note>
<note>
<para>
We have two operator classes (for btree and for hash) so each data type
can be indexed for faster access.
</para>
</note>
<sect2>
<title>Data types</title>
<para>
We have the following data types:
</para>
<table>
<title>Data types</title>
<title><filename>isn</filename> data types</title>
<tgroup cols="2">
<thead>
<row>
<entry><para>Data type</para></entry>
<entry><para>Description</para></entry>
<entry>Data type</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>EAN13</literal></para></entry>
<entry><type>EAN13</type></entry>
<entry>
<para>
European Article Numbers. This type will always show the EAN13-display
format. Te output function for this is <literal>ean13_out()</literal>
</para>
European Article Numbers, always displayed in the EAN13 display format
</entry>
</row>
<row>
<entry><para><literal>ISBN13</literal></para></entry>
<entry><type>ISBN13</type></entry>
<entry>
<para>
For International Standard Book Numbers to be displayed in
the new EAN13-display format.
</para>
International Standard Book Numbers to be displayed in
the new EAN13 display format
</entry>
</row>
<row>
<entry><para><literal>ISMN13</literal></para></entry>
<entry><type>ISMN13</type></entry>
<entry>
<para>
For International Standard Music Numbers to be displayed in
the new EAN13-display format.
</para>
International Standard Music Numbers to be displayed in
the new EAN13 display format
</entry>
</row>
<row>
<entry><para><literal>ISSN13</literal></para></entry>
<entry><type>ISSN13</type></entry>
<entry>
<para>
For International Standard Serial Numbers to be displayed in the new
EAN13-display format.
</para>
International Standard Serial Numbers to be displayed in the new
EAN13 display format
</entry>
</row>
<row>
<entry><para><literal>ISBN</literal></para></entry>
<entry><type>ISBN</type></entry>
<entry>
<para>
For International Standard Book Numbers to be displayed in the current
short-display format.
</para>
International Standard Book Numbers to be displayed in the old
short display format
</entry>
</row>
<row>
<entry><para><literal>ISMN</literal></para></entry>
<entry><type>ISMN</type></entry>
<entry>
<para>
For International Standard Music Numbers to be displayed in the
current short-display format.
</para>
International Standard Music Numbers to be displayed in the
old short display format
</entry>
</row>
<row>
<entry><para><literal>ISSN</literal></para></entry>
<entry><type>ISSN</type></entry>
<entry>
<para>
For International Standard Serial Numbers to be displayed in the
current short-display format. These types will display the short
version of the ISxN (ISxN 10) whenever it's possible, and it will
show ISxN 13 when it's impossible to show the short version. The
output function to do this is <literal>isn_out()</literal>
</para>
International Standard Serial Numbers to be displayed in the
old short display format
</entry>
</row>
<row>
<entry><para><literal>UPC</literal></para></entry>
<entry><type>UPC</type></entry>
<entry>
<para>
For Universal Product Codes. UPC numbers are a subset of the EAN13
numbers (they are basically EAN13 without the first '0' digit.)
The output function to do this is also <literal>isn_out()</literal>
</para>
Universal Product Codes
</entry>
</row>
</tbody>
</tgroup>
</table>
<note>
<para>
<literal>EAN13</literal>, <literal>ISBN13</literal>,
<literal>ISMN13</literal> and <literal>ISSN13</literal> types will always
display the long version of the ISxN (EAN13). The output function to do
this is <literal>ean13_out()</literal>.
</para>
<para>
The need for these types is just for displaying in different ways the same
data: <literal>ISBN13</literal> is actually the same as
<literal>ISBN</literal>, <literal>ISMN13=ISMN</literal> and
<literal>ISSN13=ISSN</literal>.
</para>
</note>
</sect2>
<sect2>
<title>Input functions</title>
<para>
We have the following input functions:
Some notes:
</para>
<table>
<title>Input functions</title>
<tgroup cols="2">
<thead>
<row>
<entry>Function</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>ean13_in()</literal></para></entry>
<entry>
<para>
To take a string and return an EAN13.
</para>
</entry>
</row>
<row>
<entry><para><literal>isbn_in()</literal></para></entry>
<entry>
<para>
To take a string and return valid ISBN or ISBN13 numbers.
</para>
</entry>
</row>
<row>
<entry><para><literal>ismn_in()</literal></para></entry>
<entry>
<para>
To take a string and return valid ISMN or ISMN13 numbers.
</para>
</entry>
</row>
<row>
<entry><para><literal>issn_in()</literal></para></entry>
<entry>
<para>
To take a string and return valid ISSN or ISSN13 numbers.
</para>
</entry>
</row>
<row>
<entry><para><literal>upc_in()</literal></para></entry>
<entry>
<para>
To take a string and return an UPC codes.
</para>
</entry>
</row>
</tbody>
</tgroup>
</table>
</sect2>
<sect2>
<title>Casts</title>
<para>
We are able to cast from:
</para>
<itemizedlist>
<orderedlist>
<listitem>
<para>
ISBN13 -&gt; EAN13
</para>
<para>ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers.</para>
</listitem>
<listitem>
<para>
ISMN13 -&gt; EAN13
</para>
<para>EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some
are).</para>
</listitem>
<listitem>
<para>
ISSN13 -&gt; EAN13
</para>
<para>Some ISBN13 numbers can be displayed as ISBN.</para>
</listitem>
<listitem>
<para>
ISBN -&gt; EAN13
</para>
<para>Some ISMN13 numbers can be displayed as ISMN.</para>
</listitem>
<listitem>
<para>
ISMN -&gt; EAN13
</para>
<para>Some ISSN13 numbers can be displayed as ISSN.</para>
</listitem>
<listitem>
<para>
ISSN -&gt; EAN13
</para>
<para>UPC numbers are a subset of the EAN13 numbers (they are basically
EAN13 without the first <literal>0</> digit).</para>
</listitem>
<listitem>
<para>
UPC -&gt; EAN13
</para>
<para>All UPC, ISBN, ISMN and ISSN numbers can be represented as EAN13
numbers.</para>
</listitem>
</orderedlist>
<para>
Internally, all these types use the same representation (a 64-bit
integer), and all are interchangeable. Multiple types are provided
to control display formatting and to permit tighter validity checking
of input that is supposed to denote one particular type of number.
</para>
<para>
The <type>ISBN</>, <type>ISMN</>, and <type>ISSN</> types will display the
short version of the number (ISxN 10) whenever it's possible, and will show
ISxN 13 format for numbers that do not fit in the short version.
The <type>EAN13</type>, <type>ISBN13</type>, <type>ISMN13</type> and
<type>ISSN13</type> types will always display the long version of the ISxN
(EAN13).
</para>
</sect2>
<sect2>
<title>Casts</title>
<para>
The <filename>isn</> module provides the following pairs of type casts:
</para>
<itemizedlist>
<listitem>
<para>
ISBN &lt;-&gt; ISBN13
ISBN13 &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
ISMN &lt;-&gt; ISMN13
ISMN13 &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
ISSN &lt;-&gt; ISSN13
ISSN13 &lt;=&gt; EAN13
</para>
</listitem>
</itemizedlist>
</sect2>
<sect2>
<title>C API</title>
<para>
The C API is implemented as:
</para>
<programlisting>
extern Datum isn_out(PG_FUNCTION_ARGS);
extern Datum ean13_out(PG_FUNCTION_ARGS);
extern Datum ean13_in(PG_FUNCTION_ARGS);
extern Datum isbn_in(PG_FUNCTION_ARGS);
extern Datum ismn_in(PG_FUNCTION_ARGS);
extern Datum issn_in(PG_FUNCTION_ARGS);
extern Datum upc_in(PG_FUNCTION_ARGS);
</programlisting>
<para>
On success:
</para>
<itemizedlist>
<listitem>
<para>
<literal>isn_out()</literal> takes any of our types and returns a string containing
the shortes possible representation of the number.
ISBN &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
<literal>ean13_out()</literal> takes any of our types and returns the
EAN13 (long) representation of the number.
ISMN &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
<literal>ean13_in()</literal> takes a string and return a EAN13. Which, as stated in (2)
could or could not be any of our types, but it certainly is an EAN13
number. Only if the string is a valid EAN13 number, otherwise it fails.
ISSN &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
<literal>isbn_in()</literal> takes a string and return an ISBN/ISBN13. Only if the string
is really a ISBN/ISBN13, otherwise it fails.
UPC &lt;=&gt; EAN13
</para>
</listitem>
<listitem>
<para>
<literal>ismn_in()</literal> takes a string and return an ISMN/ISMN13. Only if the string
is really a ISMN/ISMN13, otherwise it fails.
ISBN &lt;=&gt; ISBN13
</para>
</listitem>
<listitem>
<para>
<literal>issn_in()</literal> takes a string and return an ISSN/ISSN13. Only if the string
is really a ISSN/ISSN13, otherwise it fails.
ISMN &lt;=&gt; ISMN13
</para>
</listitem>
<listitem>
<para>
<literal>upc_in()</literal> takes a string and return an UPC. Only if the string is
really a UPC, otherwise it fails.
ISSN &lt;=&gt; ISSN13
</para>
</listitem>
</itemizedlist>
<para>
(on failure, the functions 'ereport' the error)
When casting from <type>EAN13</> to another type, there is a run-time
check that the value is within the domain of the other type, and an error
is thrown if not. The other casts are simply relabelings that will
always succeed.
</para>
</sect2>
<sect2>
<title>Testing functions</title>
<title>Functions and Operators</title>
<para>
The <filename>isn</> module provides the standard comparison operators,
plus btree and hash indexing support for all these datatypes. In
addition there are several specialized functions. In this table,
<type>isn</> means any one of the module's data types:
</para>
<table>
<title>Testing functions</title>
<tgroup cols="2">
<title><filename>isn</> functions</title>
<tgroup cols="3">
<thead>
<row>
<entry><para>Function</para></entry>
<entry><para>Description</para></entry>
<entry>Function</entry>
<entry>Returns</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>isn_weak(boolean)</literal></para></entry>
<entry><para>Sets the weak input mode.</para></entry>
<entry><function>isn_weak(boolean)</function></entry>
<entry><type>boolean</type></entry>
<entry>Sets the weak input mode (returns new setting)</entry>
</row>
<row>
<entry><para><literal>isn_weak()</literal></para></entry>
<entry><para>Gets the current status of the weak mode.</para></entry>
<entry><function>isn_weak()</function></entry>
<entry><type>boolean</type></entry>
<entry>Gets the current status of the weak mode</entry>
</row>
<row>
<entry><para><literal>make_valid()</literal></para></entry>
<entry><para>Validates an invalid number (deleting the invalid flag).</para></entry>
<entry><function>make_valid(isn)</function></entry>
<entry><type>isn</type></entry>
<entry>Validates an invalid number (clears the invalid flag)</entry>
</row>
<row>
<entry><para><literal>is_valid()</literal></para></entry>
<entry><para>Checks for the invalid flag prsence.</para></entry>
<entry><function>is_valid(isn)</function></entry>
<entry><type>boolean</type></entry>
<entry>Checks for the presence of the invalid flag</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
<literal>Weak</literal> mode is used to be able to insert invalid data to
a table. Invalid as in the check digit being wrong, not missing numbers.
<firstterm>Weak</firstterm> mode is used to be able to insert invalid data
into a table. Invalid means the check digit is wrong, not that there are
missing numbers.
</para>
<para>
Why would you want to use the weak mode? Well, it could be that
you have a huge collection of ISBN numbers, and that there are so many of
them that for weird reasons some have the wrong check digit (perhaps the
numbers where scanned from a printed list and the OCR got the numbers wrong,
perhaps the numbers were manually captured... who knows.) Anyway, the thing
is you might want to clean the mess up, but you still want to be able to have
all the numbers in your database and maybe use an external tool to access
the invalid numbers in the database so you can verify the information and
validate it more easily; as selecting all the invalid numbers in the table.
numbers were scanned from a printed list and the OCR got the numbers wrong,
perhaps the numbers were manually captured... who knows). Anyway, the point
is you might want to clean the mess up, but you still want to be able to
have all the numbers in your database and maybe use an external tool to
locate the invalid numbers in the database so you can verify the
information and validate it more easily; so for example you'd want to
select all the invalid numbers in the table.
</para>
<para>
When you insert invalid numbers in a table using the weak mode, the number
will be inserted with the corrected check digit, but it will be flagged
with an exclamation mark ('!') at the end (i.e. 0-11-000322-5!)
will be inserted with the corrected check digit, but it will be displayed
with an exclamation mark (<literal>!</>) at the end, for example
<literal>0-11-000322-5!</>. This invalid marker can be checked with
the <function>is_valid</> function and cleared with the
<function>make_valid</> function.
</para>
<para>
You can also force the insertion of invalid numbers even when not in the
weak mode, by appending the <literal>!</> character at the end of the
number.
</para>
<para>
You can also force the insertion of invalid numbers even not in the weak mode,
appending the '!' character at the end of the number.
Another special feature is that during input, you can write
<literal>?</> in place of the check digit, and the correct check digit
will be inserted automatically.
</para>
</sect2>
<sect2>
<title>Examples</title>
<programlisting>
--Using the types directly:
SELECT isbn('978-0-393-04002-9');
......@@ -431,18 +301,18 @@ SELECT isbn13('0901690546');
SELECT issn('1436-4522');
--Casting types:
-- note that you can only cast from ean13 to other type when the casted
-- number would be valid in the realm of the casted type;
-- thus, the following will NOT work: select isbn(ean13('0220356483481'));
-- but these will:
SELECT upc(ean13('0220356483481'));
SELECT ean13(upc('220356483481'));
-- note that you can only cast from ean13 to another type when the
-- number would be valid in the realm of the target type;
-- thus, the following will NOT work: select isbn(ean13('0220356483481'));
-- but these will:
SELECT upc(ean13('0220356483481'));
SELECT ean13(upc('220356483481'));
--Create a table with a single column to hold ISBN numbers:
CREATE TABLE test ( id isbn );
CREATE TABLE test (id isbn);
INSERT INTO test VALUES('9780393040029');
--Automatically calculating check digits (observe the '?'):
--Automatically calculate check digits (observe the '?'):
INSERT INTO test VALUES('220500896?');
INSERT INTO test VALUES('978055215372?');
......@@ -455,10 +325,10 @@ INSERT INTO test VALUES('978-0-11-000533-4');
INSERT INTO test VALUES('9780141219307');
INSERT INTO test VALUES('2-205-00876-X');
SELECT isn_weak(false);
SELECT id FROM test WHERE NOT is_valid(id);
UPDATE test SET id=make_valid(id) WHERE id = '2-205-00876-X!';
UPDATE test SET id = make_valid(id) WHERE id = '2-205-00876-X!';
SELECT * FROM test;
SELECT isbn13(id) FROM test;
......@@ -467,8 +337,9 @@ SELECT isbn13(id) FROM test;
<sect2>
<title>Bibliography</title>
<para>
The information to implement this module was collected through
The information to implement this module was collected from
several sites, including:
</para>
<programlisting>
......@@ -477,26 +348,33 @@ SELECT isbn13(id) FROM test;
http://www.ismn-international.org/
http://www.wikipedia.org/
</programlisting>
<para>
the prefixes used for hyphenation where also compiled from:
The prefixes used for hyphenation were also compiled from:
</para>
<programlisting>
http://www.gs1.org/productssolutions/idkeys/support/prefix_list.html
http://www.isbn-international.org/en/identifiers.html
http://www.ismn-international.org/ranges.html
</programlisting>
<para>
Care was taken during the creation of the algorithms and they
were meticulously verified against the suggested algorithms
in the official ISBN, ISMN, ISSN User Manuals.
</para>
</sect2>
<sect2>
<title>Author</title>
<para>
Germán Méndez Bravo (Kronuz), 2004 - 2006
</para>
<para>
This module was inspired by Garrett A. Wollman's
isbn_issn code.
</para>
</sect2>
</sect1>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/ltree.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="ltree">
<title>ltree</title>
<indexterm zone="ltree">
<primary>ltree</primary>
</indexterm>
<para>
<literal>ltree</literal> is a PostgreSQL module that contains implementation
of data types, indexed access methods and queries for data organized as a
tree-like structures.
This module implements a data type <type>ltree</> for representing
labels of data stored in a hierarchical tree-like structure.
Extensive facilities for searching through label trees are provided.
</para>
<sect2>
<title>Definitions</title>
<para>
A <firstterm>label</firstterm> of a node is a sequence of one or more words
separated by blank character '_' and containing letters and digits ( for
example, [a-zA-Z0-9] for C locale). The length of a label is limited by 256
bytes.
A <firstterm>label</firstterm> is a sequence of alphanumeric characters
and underscores (for example, in C locale the characters
<literal>A-Za-z0-9_</> are allowed). Labels must be less than 256 bytes
long.
</para>
<para>
Example: 'Countries', 'Personal_Services'
Examples: <literal>42</>, <literal>Personal_Services</>
</para>
<para>
A <firstterm>label path</firstterm> of a node is a sequence of one or more
dot-separated labels l1.l2...ln, represents path from root to the node. The
length of a label path is limited by 65Kb, but size &lt;= 2Kb is preferrable.
We consider it's not a strict limitation (maximal size of label path for
DMOZ catalogue - <ulink url="http://www.dmoz.org"></ulink>, is about 240
bytes!)
A <firstterm>label path</firstterm> is a sequence of zero or more
labels separated by dots, for example <literal>L1.L2.L3</>, representing
a path from the root of a hierarchical tree to a particular node. The
length of a label path must be less than 65Kb, but keeping it under 2Kb is
preferable. In practice this is not a major limitation; for example,
the longest label path in the DMOZ catalogue (<ulink
url="http://www.dmoz.org"></ulink>) is about 240 bytes.
</para>
<para>
Example: <literal>'Top.Countries.Europe.Russia'</literal>
Example: <literal>Top.Countries.Europe.Russia</literal>
</para>
<para>
We introduce several datatypes:
The <filename>ltree</> module provides several datatypes:
</para>
<itemizedlist>
<listitem>
<para>
<literal>ltree</literal> - is a datatype for label path.
</para>
</listitem>
<listitem>
<para>
<literal>ltree[]</literal> - is a datatype for arrays of ltree.
<type>ltree</type> stores a label path.
</para>
</listitem>
<listitem>
<para>
<literal>lquery</literal>
- is a path expression that has regular expression in the label path and
used for ltree matching. Star symbol (*) is used to specify any number of
labels (levels) and could be used at the beginning and the end of lquery,
for example, '*.Europe.*'.
<type>lquery</type> represents a regular-expression-like pattern
for matching <type>ltree</> values. A simple word matches that
label within a path. A star symbol (<literal>*</>) matches zero
or more labels. For example:
<programlisting>
foo <lineannotation>Match the exact label path <literal>foo</></lineannotation>
*.foo.* <lineannotation>Match any label path containing the label <literal>foo</></lineannotation>
*.foo <lineannotation>Match any label path whose last label is <literal>foo</></lineannotation>
</programlisting>
</para>
<para>
The following quantifiers are recognized for '*' (like in Perl):
Star symbols can also be quantified to restrict how many labels
they can match:
<programlisting>
*{<replaceable>n</>} <lineannotation>Match exactly <replaceable>n</> labels</lineannotation>
*{<replaceable>n</>,} <lineannotation>Match at least <replaceable>n</> labels</lineannotation>
*{<replaceable>n</>,<replaceable>m</>} <lineannotation>Match at least <replaceable>n</> but not more than <replaceable>m</> labels</lineannotation>
*{,<replaceable>m</>} <lineannotation>Match at most <replaceable>m</> labels &mdash; same as </lineannotation> *{0,<replaceable>m</>}
</programlisting>
</para>
<itemizedlist>
<listitem>
<para>{n} Match exactly n levels</para>
</listitem>
<listitem>
<para>{n,} Match at least n levels</para>
</listitem>
<listitem>
<para>{n,m} Match at least n but not more than m levels</para>
</listitem>
<listitem>
<para>{,m} Match at maximum m levels (eq. to {0,m})</para>
</listitem>
</itemizedlist>
<para>
It is possible to use several modifiers at the end of a label:
There are several modifiers that can be put at the end of a non-star
label in <type>lquery</> to make it match more than just the exact match:
<programlisting>
@ <lineannotation>Match case-insensitively, for example <literal>a@</> matches <literal>A</></lineannotation>
* <lineannotation>Match any label with this prefix, for example <literal>foo*</> matches <literal>foobar</></lineannotation>
% <lineannotation>Match initial underscore-separated words</lineannotation>
</programlisting>
The behavior of <literal>%</> is a bit complicated. It tries to match
words rather than the entire label. For example
<literal>foo_bar%</> matches <literal>foo_bar_baz</> but not
<literal>foo_barbaz</>. If combined with <literal>*</>, prefix
matching applies to each word separately, for example
<literal>foo_bar%*</> matches <literal>foo1_bar2_baz</> but
not <literal>foo1_br2_baz</>.
</para>
<itemizedlist>
<listitem>
<para>@ Do case-insensitive label matching</para>
</listitem>
<listitem>
<para>* Do prefix matching for a label</para>
</listitem>
<listitem>
<para>% Don't account word separator '_' in label matching, that is
'Russian%' would match 'Russian_nations', but not 'Russian'
</para>
</listitem>
</itemizedlist>
<para>
<literal>lquery</literal> can contain logical '!' (NOT) at the beginning
of the label and '|' (OR) to specify possible alternatives for label
matching.
</para>
<para>
Example of <literal>lquery</literal>:
Also, you can write several possibly-modified labels separated with
<literal>|</> (OR) to match any of those labels, and you can put
<literal>!</> (NOT) at the start to match any label that doesn't
match any of the alternatives.
</para>
<programlisting>
<para>
Here's an annotated example of <type>lquery</type>:
<programlisting>
Top.*{0,2}.sport*@.!football|tennis.Russ*|Spain
a) b) c) d) e)
a. b. c. d. e.
</programlisting>
<para>
A label path should
This query will match any label path that:
</para>
<orderedlist numeration='loweralpha'>
<listitem>
<para>
begin from a node with label 'Top'
begins with the label <literal>Top</literal>
</para>
</listitem>
<listitem>
<para>
and following zero or 2 labels until
and next has zero to two labels before
</para>
</listitem>
<listitem>
<para>
a node with label beginning from case-insensitive prefix 'sport'
a label beginning with the case-insensitive prefix <literal>sport</literal>
</para>
</listitem>
<listitem>
<para>
following node with label not matched 'football' or 'tennis' and
then a label not matching <literal>football</literal> nor
<literal>tennis</literal>
</para>
</listitem>
<listitem>
<para>
end on node with label beginning from 'Russ' or strictly matched
'Spain'.
and then ends with a label beginning with <literal>Russ</literal> or
exactly matching <literal>Spain</literal>.
</para>
</listitem>
</orderedlist>
</listitem>
<listitem>
<para><literal>ltxtquery</literal>
- is a datatype for label searching (like type 'query' for full text
searching, see contrib/tsearch). It's possible to use modifiers @,%,* at
the end of word. The meaning of modifiers are the same as for lquery.
</para>
<para>
Example: <literal>'Europe &amp; Russia*@ &amp; !Transportation'</literal>
<para><type>ltxtquery</type> represents a full-text-search-like
pattern for matching <type>ltree</> values. An
<type>ltxtquery</type> value contains words, possibly with the
modifiers <literal>@</>, <literal>*</>, <literal>%</> at the end;
the modifiers have the same meanings as in <type>lquery</>.
Words can be combined with <literal>&amp;</> (AND),
<literal>|</> (OR), <literal>!</> (NOT), and parentheses.
The key difference from
<type>lquery</> is that <type>ltxtquery</type> matches words without
regard to their position in the label path.
</para>
<para>
Search paths contain words 'Europe' and 'Russia*' (case-insensitive) and
not 'Transportation'. Notice, the order of words as they appear in label
path is not important !
Here's an example <type>ltxtquery</type>:
<programlisting>
Europe &amp; Russia*@ &amp; !Transportation
</programlisting>
This will match paths that contain the label <literal>Europe</literal> and
any label beginning with <literal>Russia</literal> (case-insensitive),
but not paths containing the label <literal>Transportation</literal>.
The location of these words within the path is not important.
Also, when <literal>%</> is used, the word can be matched to any
underscore-separated word within a label, regardless of position.
</para>
</listitem>
</itemizedlist>
</sect2>
<sect2>
<title>Operations</title>
<para>
The following operations are defined for type ltree:
Note: <type>ltxtquery</> allows whitespace between symbols, but
<type>ltree</> and <type>lquery</> do not.
</para>
</sect2>
<sect2>
<title>Operators and Functions</title>
<itemizedlist>
<listitem>
<para>
<literal>&lt;,&gt;,&lt;=,&gt;=,=, &lt;&gt;</literal>
- Have their usual meanings. Comparison is doing in the order of direct
tree traversing, children of a node are sorted lexicographic.
</para>
</listitem>
<listitem>
<para>
<literal>ltree @&gt; ltree</literal>
- returns TRUE if left argument is an ancestor of right argument (or
equal).
</para>
</listitem>
<listitem>
<para>
<literal>ltree &lt;@ ltree </literal>
- returns TRUE if left argument is a descendant of right argument (or
equal).
</para>
</listitem>
<listitem>
<para>
<literal>ltree ~ lquery, lquery ~ ltree</literal>
- return TRUE if node represented by ltree satisfies lquery.
</para>
</listitem>
<listitem>
<para>
<literal>ltree ? lquery[], lquery ? ltree[]</literal>
- return TRUE if node represented by ltree satisfies at least one lquery
from array.
</para>
</listitem>
<listitem>
<para>
<literal>ltree @ ltxtquery, ltxtquery @ ltree</literal>
- return TRUE if node represented by ltree satisfies ltxtquery.
</para>
</listitem>
<listitem>
<para>
<literal>ltree || ltree, ltree || text, text || ltree</literal>
- return concatenated ltree.
</para>
</listitem>
</itemizedlist>
<para>
Operations for arrays of ltree (<literal>ltree[]</literal>):
Type <type>ltree</> has the usual comparison operators
<literal>=</>, <literal>&lt;&gt;</literal>,
<literal>&lt;</>, <literal>&gt;</>, <literal>&lt;=</>, <literal>&gt;=</>.
Comparison sorts in the order of a tree traversal, with the children
of a node sorted by label text. In addition, there are the following
specialized operators:
</para>
<itemizedlist>
<listitem>
<para>
<literal>ltree[] @&gt; ltree, ltree &lt;@ ltree[]</literal>
- returns TRUE if array ltree[] contains an ancestor of ltree.
</para>
</listitem>
<listitem>
<para>
<literal>ltree @&gt; ltree[], ltree[] &lt;@ ltree</literal>
- returns TRUE if array ltree[] contains a descendant of ltree.
</para>
</listitem>
<listitem>
<para>
<literal>ltree[] ~ lquery, lquery ~ ltree[]</literal>
- returns TRUE if array ltree[] contains label paths matched lquery.
</para>
</listitem>
<listitem>
<para>
<literal>ltree[] ? lquery[], lquery[] ? ltree[]</literal>
- returns TRUE if array ltree[] contains label paths matched atleaset one
lquery from array.
</para>
</listitem>
<listitem>
<para>
<literal>ltree[] @ ltxtquery, ltxtquery @ ltree[]</literal>
- returns TRUE if array ltree[] contains label paths matched ltxtquery
(full text search).
</para>
</listitem>
<listitem>
<para>
<literal>ltree[] ?@&gt; ltree, ltree ?&lt;@ ltree[], ltree[] ?~ lquery, ltree[] ?@ ltxtquery</literal>
- returns first element of array ltree[] satisfies corresponding condition
and NULL in vice versa.
</para>
</listitem>
</itemizedlist>
</sect2>
<table id="ltree-op-table">
<title><type>ltree</> Operators</title>
<sect2>
<title>Remark</title>
<tgroup cols="3">
<thead>
<row>
<entry>Operator</entry>
<entry>Returns</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>ltree</> <literal>@&gt;</> <type>ltree</></entry>
<entry><type>boolean</type></entry>
<entry>is left argument an ancestor of right (or equal)?</entry>
</row>
<row>
<entry><type>ltree</> <literal>&lt;@</> <type>ltree</></entry>
<entry><type>boolean</type></entry>
<entry>is left argument a descendant of right (or equal)?</entry>
</row>
<row>
<entry><type>ltree</> <literal>~</> <type>lquery</></entry>
<entry><type>boolean</type></entry>
<entry>does <type>ltree</> match <type>lquery</>?</entry>
</row>
<row>
<entry><type>lquery</> <literal>~</> <type>ltree</></entry>
<entry><type>boolean</type></entry>
<entry>does <type>ltree</> match <type>lquery</>?</entry>
</row>
<row>
<entry><type>ltree</> <literal>?</> <type>lquery[]</></entry>
<entry><type>boolean</type></entry>
<entry>does <type>ltree</> match any <type>lquery</> in array?</entry>
</row>
<row>
<entry><type>lquery[]</> <literal>?</> <type>ltree</></entry>
<entry><type>boolean</type></entry>
<entry>does <type>ltree</> match any <type>lquery</> in array?</entry>
</row>
<row>
<entry><type>ltree</> <literal>@</> <type>ltxtquery</></entry>
<entry><type>boolean</type></entry>
<entry>does <type>ltree</> match <type>ltxtquery</>?</entry>
</row>
<row>
<entry><type>ltxtquery</> <literal>@</> <type>ltree</></entry>
<entry><type>boolean</type></entry>
<entry>does <type>ltree</> match <type>ltxtquery</>?</entry>
</row>
<row>
<entry><type>ltree</> <literal>||</> <type>ltree</></entry>
<entry><type>ltree</type></entry>
<entry>concatenate <type>ltree</> paths</entry>
</row>
<row>
<entry><type>ltree</> <literal>||</> <type>text</></entry>
<entry><type>ltree</type></entry>
<entry>convert text to <type>ltree</> and concatenate</entry>
</row>
<row>
<entry><type>text</> <literal>||</> <type>ltree</></entry>
<entry><type>ltree</type></entry>
<entry>convert text to <type>ltree</> and concatenate</entry>
</row>
<row>
<entry><type>ltree[]</> <literal>@&gt;</> <type>ltree</></entry>
<entry><type>boolean</type></entry>
<entry>does array contain an ancestor of <type>ltree</>?</entry>
</row>
<row>
<entry><type>ltree</> <literal>&lt;@</> <type>ltree[]</></entry>
<entry><type>boolean</type></entry>
<entry>does array contain an ancestor of <type>ltree</>?</entry>
</row>
<row>
<entry><type>ltree[]</> <literal>&lt;@</> <type>ltree</></entry>
<entry><type>boolean</type></entry>
<entry>does array contain a descendant of <type>ltree</>?</entry>
</row>
<row>
<entry><type>ltree</> <literal>@&gt;</> <type>ltree[]</></entry>
<entry><type>boolean</type></entry>
<entry>does array contain a descendant of <type>ltree</>?</entry>
</row>
<row>
<entry><type>ltree[]</> <literal>~</> <type>lquery</></entry>
<entry><type>boolean</type></entry>
<entry>does array contain any path matching <type>lquery</>?</entry>
</row>
<row>
<entry><type>lquery</> <literal>~</> <type>ltree[]</></entry>
<entry><type>boolean</type></entry>
<entry>does array contain any path matching <type>lquery</>?</entry>
</row>
<row>
<entry><type>ltree[]</> <literal>?</> <type>lquery[]</></entry>
<entry><type>boolean</type></entry>
<entry>does <type>ltree</> array contain any path matching any <type>lquery</>?</entry>
</row>
<row>
<entry><type>lquery[]</> <literal>?</> <type>ltree[]</></entry>
<entry><type>boolean</type></entry>
<entry>does <type>ltree</> array contain any path matching any <type>lquery</>?</entry>
</row>
<row>
<entry><type>ltree[]</> <literal>@</> <type>ltxtquery</></entry>
<entry><type>boolean</type></entry>
<entry>does array contain any path matching <type>ltxtquery</>?</entry>
</row>
<row>
<entry><type>ltxtquery</> <literal>@</> <type>ltree[]</></entry>
<entry><type>boolean</type></entry>
<entry>does array contain any path matching <type>ltxtquery</>?</entry>
</row>
<row>
<entry><type>ltree[]</> <literal>?@&gt;</> <type>ltree</></entry>
<entry><type>ltree</type></entry>
<entry>first array entry that is an ancestor of <type>ltree</>; NULL if none</entry>
</row>
<row>
<entry><type>ltree[]</> <literal>?&lt;@</> <type>ltree</></entry>
<entry><type>ltree</type></entry>
<entry>first array entry that is a descendant of <type>ltree</>; NULL if none</entry>
</row>
<row>
<entry><type>ltree[]</> <literal>?~</> <type>lquery</></entry>
<entry><type>ltree</type></entry>
<entry>first array entry that matches <type>lquery</>; NULL if none</entry>
</row>
<row>
<entry><type>ltree[]</> <literal>?@</> <type>ltxtquery</></entry>
<entry><type>ltree</type></entry>
<entry>first array entry that matches <type>ltxtquery</>; NULL if none</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
Operations <literal>&lt;@</literal>, <literal>@&gt;</literal>, <literal>@</literal> and
<literal>~</literal> have analogues - <literal>^&lt;@, ^@&gt;, ^@, ^~,</literal> which don't use
indices!
The operators <literal>&lt;@</literal>, <literal>@&gt;</literal>,
<literal>@</literal> and <literal>~</literal> have analogues
<literal>^&lt;@</>, <literal>^@&gt;</>, <literal>^@</>,
<literal>^~</literal>, which are the same except they do not use
indexes. These are useful only for testing purposes.
</para>
<para>
The following functions are available:
</para>
<table id="ltree-func-table">
<title><type>ltree</> Functions</title>
<tgroup cols="5">
<thead>
<row>
<entry>Function</entry>
<entry>Return Type</entry>
<entry>Description</entry>
<entry>Example</entry>
<entry>Result</entry>
</row>
</thead>
<tbody>
<row>
<entry><function>subltree(ltree, int start, int end)</function></entry>
<entry><type>ltree</type></entry>
<entry>subpath of <type>ltree</> from position <parameter>start</> to
position <parameter>end</>-1 (counting from 0)</entry>
<entry><literal>subltree('Top.Child1.Child2',1,2)</literal></entry>
<entry><literal>Child1</literal></entry>
</row>
<row>
<entry><function>subpath(ltree, int offset, int len)</function></entry>
<entry><type>ltree</type></entry>
<entry>subpath of <type>ltree</> starting at position
<parameter>offset</>, length <parameter>len</>.
If <parameter>offset</> is negative, subpath starts that far from the
end of the path. If <parameter>len</> is negative, leaves that many
labels off the end of the path.</entry>
<entry><literal>subpath('Top.Child1.Child2',0,2)</literal></entry>
<entry><literal>Top.Child1</literal></entry>
</row>
<row>
<entry><function>subpath(ltree, int offset)</function></entry>
<entry><type>ltree</type></entry>
<entry>subpath of <type>ltree</> starting at position
<parameter>offset</>, extending to end of path.
If <parameter>offset</> is negative, subpath starts that far from the
end of the path.</entry>
<entry><literal>subpath('Top.Child1.Child2',1)</literal></entry>
<entry><literal>Child1.Child2</literal></entry>
</row>
<row>
<entry><function>nlevel(ltree)</function></entry>
<entry><type>integer</type></entry>
<entry>number of labels in path</entry>
<entry><literal>nlevel('Top.Child1.Child2')</literal></entry>
<entry><literal>3</literal></entry>
</row>
<row>
<entry><function>index(ltree a, ltree b)</function></entry>
<entry><type>integer</type></entry>
<entry>position of first occurrence of <parameter>b</> in
<parameter>a</>; -1 if not found</entry>
<entry><literal>index('0.1.2.3.5.4.5.6.8.5.6.8','5.6')</literal></entry>
<entry><literal>6</literal></entry>
</row>
<row>
<entry><function>index(ltree a, ltree b, int offset)</function></entry>
<entry><type>integer</type></entry>
<entry>position of first occurrence of <parameter>b</> in
<parameter>a</>, searching starting at <parameter>offset</>;
negative <parameter>offset</> means start <parameter>-offset</>
labels from the end of the path</entry>
<entry><literal>index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',-4)</literal></entry>
<entry><literal>9</literal></entry>
</row>
<row>
<entry><function>text2ltree(text)</function></entry>
<entry><type>ltree</type></entry>
<entry>cast <type>text</> to <type>ltree</></entry>
<entry><literal></literal></entry>
<entry><literal></literal></entry>
</row>
<row>
<entry><function>ltree2text(ltree)</function></entry>
<entry><type>text</type></entry>
<entry>cast <type>ltree</> to <type>text</></entry>
<entry><literal></literal></entry>
<entry><literal></literal></entry>
</row>
<row>
<entry><function>lca(ltree, ltree, ...)</function></entry>
<entry><type>ltree</type></entry>
<entry>lowest common ancestor, i.e., longest common prefix of paths
(up to 8 arguments supported)</entry>
<entry><literal>lca('1.2.2.3','1.2.3.4.5.6')</literal></entry>
<entry><literal>1.2</literal></entry>
</row>
<row>
<entry><function>lca(ltree[])</function></entry>
<entry><type>ltree</type></entry>
<entry>lowest common ancestor, i.e., longest common prefix of paths</entry>
<entry><literal>lca(array['1.2.2.3'::ltree,'1.2.3'])</literal></entry>
<entry><literal>1.2</literal></entry>
</row>
</tbody>
</tgroup>
</table>
</sect2>
<sect2>
<title>Indices</title>
<title>Indexes</title>
<para>
Various indices could be created to speed up execution of operations:
<filename>ltree</> supports several types of indexes that can speed
up the indicated operators:
</para>
<itemizedlist>
<listitem>
<para>
B-tree index over ltree: <literal>&lt;, &lt;=, =, &gt;=, &gt;</literal>
B-tree index over <type>ltree</>:
<literal>&lt;</>, <literal>&lt;=</>, <literal>=</>,
<literal>&gt;=</>, <literal>&gt;</literal>
</para>
</listitem>
<listitem>
<para>
GiST index over ltree: <literal>&lt;, &lt;=, =, &gt;=, &gt;, @&gt;, &lt;@, @, ~, ?</literal>
GiST index over <type>ltree</>:
<literal>&lt;</>, <literal>&lt;=</>, <literal>=</>,
<literal>&gt;=</>, <literal>&gt;</>,
<literal>@&gt;</>, <literal>&lt;@</>,
<literal>@</>, <literal>~</>, <literal>?</literal>
</para>
<para>
Example:
Example of creating such an index:
</para>
<programlisting>
CREATE INDEX path_gist_idx ON test USING GIST (path);
</programlisting>
</listitem>
<listitem>
<para>GiST index over ltree[]:
<literal>ltree[]&lt;@ ltree, ltree @&gt; ltree[], @, ~, ?.</literal>
</para>
<para>
Example:
GiST index over <type>ltree[]</>:
<literal>ltree[] &lt;@ ltree</>, <literal>ltree @&gt; ltree[]</>,
<literal>@</>, <literal>~</>, <literal>?</literal>
</para>
<programlisting>
CREATE INDEX path_gist_idx ON test USING GIST (array_path);
</programlisting>
<para>
Notices: This index is lossy.
</para>
</listitem>
</itemizedlist>
</sect2>
<sect2>
<title>Functions</title>
<itemizedlist>
<listitem>
<para>
<literal>ltree subltree(ltree, start, end)</literal>
returns subpath of ltree from start (inclusive) until the end.
Example of creating such an index:
</para>
<programlisting>
# select subltree('Top.Child1.Child2',1,2);
subltree
--------
Child1
CREATE INDEX path_gist_idx ON test USING GIST (array_path);
</programlisting>
</listitem>
<listitem>
<para>
<literal>ltree subpath(ltree, OFFSET,LEN)</literal> and
<literal>ltree subpath(ltree, OFFSET)</literal>
returns subpath of ltree from OFFSET (inclusive) with length LEN.
If OFFSET is negative returns subpath starts that far from the end
of the path. If LENGTH is omitted, returns everything to the end
of the path. If LENGTH is negative, leaves that many labels off
the end of the path.
Note: This index type is lossy.
</para>
<programlisting>
# select subpath('Top.Child1.Child2',1,2);
subpath
-------
Child1.Child2
# select subpath('Top.Child1.Child2',-2,1);
subpath
---------
Child1
</programlisting>
</listitem>
<listitem>
<para>
<literal>int4 nlevel(ltree)</literal> - returns level of the node.
</para>
<programlisting>
# select nlevel('Top.Child1.Child2');
nlevel
--------
3
</programlisting>
<para>
Note, that arguments start, end, OFFSET, LEN have meaning of level of the
node !
</para>
</listitem>
<listitem>
<para>
<literal>int4 index(ltree,ltree)</literal> and
<literal>int4 index(ltree,ltree,OFFSET)</literal>
returns number of level of the first occurence of second argument in first
one beginning from OFFSET. if OFFSET is negative, than search begins from |
OFFSET| levels from the end of the path.
</para>
<programlisting>
SELECT index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',3);
index
-------
6
SELECT index('0.1.2.3.5.4.5.6.8.5.6.8','5.6',-4);
index
-------
9
</programlisting>
</listitem>
<listitem>
<para>
<literal>ltree text2ltree(text)</literal> and
<literal>text ltree2text(text)</literal> cast functions for ltree and text.
</para>
</listitem>
<listitem>
<para>
<literal>ltree lca(ltree,ltree,...) (up to 8 arguments)</literal> and
<literal>ltree lca(ltree[])</literal> Returns Lowest Common Ancestor (lca).
</para>
<programlisting>
# select lca('1.2.2.3','1.2.3.4.5.6');
lca
-----
1.2
# select lca('{la.2.3,1.2.3.4.5.6}') is null;
?column?
----------
f
</programlisting>
</listitem>
</itemizedlist>
</sect2>
<sect2>
<title>Installation</title>
<programlisting>
cd contrib/ltree
make
make install
make installcheck
</programlisting>
</sect2>
<sect2>
<title>Example</title>
<para>
This example uses the following data (also available in file
<filename>contrib/ltree/ltreetest.sql</> in the source distribution):
</para>
<programlisting>
createdb ltreetest
psql ltreetest &lt; /usr/local/pgsql/share/contrib/ltree.sql
psql ltreetest &lt; ltreetest.sql
CREATE TABLE test (path ltree);
INSERT INTO test VALUES ('Top');
INSERT INTO test VALUES ('Top.Science');
INSERT INTO test VALUES ('Top.Science.Astronomy');
INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics');
INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology');
INSERT INTO test VALUES ('Top.Hobbies');
INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy');
INSERT INTO test VALUES ('Top.Collections');
INSERT INTO test VALUES ('Top.Collections.Pictures');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts');
CREATE INDEX path_gist_idx ON test USING gist(path);
CREATE INDEX path_idx ON test USING btree(path);
</programlisting>
<para>
Now, we have a database ltreetest populated with a data describing hierarchy
shown below:
Now, we have a table <structname>test</> populated with data describing
the hierarchy shown below:
</para>
<programlisting>
TOP
/ | \
Science Hobbies Collections
Top
/ | \
Science Hobbies Collections
/ | \
Astronomy Amateurs_Astronomy Pictures
/ \ |
......@@ -440,13 +570,14 @@ shown below:
/ | \
Galaxies Stars Astronauts
</programlisting>
<para>
Inheritance:
We can do inheritance:
</para>
<programlisting>
ltreetest=# select path from test where path &lt;@ 'Top.Science';
path
path
------------------------------------
Top.Science
Top.Science.Astronomy
......@@ -454,12 +585,14 @@ ltreetest=# select path from test where path &lt;@ 'Top.Science';
Top.Science.Astronomy.Cosmology
(4 rows)
</programlisting>
<para>
Matching:
Here are some examples of path matching:
</para>
<programlisting>
ltreetest=# select path from test where path ~ '*.Astronomy.*';
path
path
-----------------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
......@@ -469,20 +602,22 @@ ltreetest=# select path from test where path ~ '*.Astronomy.*';
Top.Collections.Pictures.Astronomy.Galaxies
Top.Collections.Pictures.Astronomy.Astronauts
(7 rows)
ltreetest=# select path from test where path ~ '*.!pictures@.*.Astronomy.*';
path
path
------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
(3 rows)
</programlisting>
<para>
Full text search:
Here are some examples of full text search:
</para>
<programlisting>
ltreetest=# select path from test where path @ 'Astro*% &amp; !pictures@';
path
path
------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
......@@ -491,36 +626,20 @@ ltreetest=# select path from test where path @ 'Astro*% &amp; !pictures@';
(4 rows)
ltreetest=# select path from test where path @ 'Astro* &amp; !pictures@';
path
path
------------------------------------
Top.Science.Astronomy
Top.Science.Astronomy.Astrophysics
Top.Science.Astronomy.Cosmology
(3 rows)
</programlisting>
<para>
Using Functions:
Path construction using functions:
</para>
<programlisting>
ltreetest=# select subpath(path,0,2)||'Space'||subpath(path,2) from test where path &lt;@ 'Top.Science.Astronomy';
?column?
------------------------------------------
Top.Science.Space.Astronomy
Top.Science.Space.Astronomy.Astrophysics
Top.Science.Space.Astronomy.Cosmology
(3 rows)
We could create SQL-function:
CREATE FUNCTION ins_label(ltree, int4, text) RETURNS ltree
AS 'select subpath($1,0,$2) || $3 || subpath($1,$2);'
LANGUAGE SQL IMMUTABLE;
</programlisting>
<para>
and previous select could be rewritten as:
</para>
<programlisting>
ltreetest=# select ins_label(path,2,'Space') from test where path &lt;@ 'Top.Science.Astronomy';
ins_label
?column?
------------------------------------------
Top.Science.Space.Astronomy
Top.Science.Space.Astronomy.Astrophysics
......@@ -529,16 +648,16 @@ ltreetest=# select ins_label(path,2,'Space') from test where path &lt;@ 'Top.Sci
</programlisting>
<para>
Or with another arguments:
We could simplify this by creating a SQL function that inserts a label
at a specified position in a path:
</para>
<programlisting>
CREATE FUNCTION ins_label(ltree, ltree, text) RETURNS ltree
AS 'select subpath($1,0,nlevel($2)) || $3 || subpath($1,nlevel($2));'
CREATE FUNCTION ins_label(ltree, int, text) RETURNS ltree
AS 'select subpath($1,0,$2) || $3 || subpath($1,$2);'
LANGUAGE SQL IMMUTABLE;
ltreetest=# select ins_label(path,'Top.Science'::ltree,'Space') from test where path &lt;@ 'Top.Science.Astronomy';
ins_label
ltreetest=# select ins_label(path,2,'Space') from test where path &lt;@ 'Top.Science.Astronomy';
ins_label
------------------------------------------
Top.Science.Space.Astronomy
Top.Science.Space.Astronomy.Astrophysics
......@@ -547,224 +666,16 @@ ltreetest=# select ins_label(path,'Top.Science'::ltree,'Space') from test where
</programlisting>
</sect2>
<sect2>
<title>Additional data</title>
<para>
To get more feeling from our ltree module you could download
dmozltree-eng.sql.gz (about 3Mb tar.gz archive containing 300,274 nodes),
available from
<ulink url="http://www.sai.msu.su/~megera/postgres/gist/ltree/"></ulink>
dmozltree-eng.sql.gz, which is DMOZ catalogue, prepared for use with ltree.
Setup your test database (dmoz), load ltree module and issue command:
</para>
<programlisting>
zcat dmozltree-eng.sql.gz| psql dmoz
</programlisting>
<para>
Data will be loaded into database dmoz and all indices will be created.
</para>
</sect2>
<sect2>
<title>Benchmarks</title>
<para>
All runs were performed on my IBM ThinkPad T21 (256 MB RAM, 750Mhz) using DMOZ
data, containing 300,274 nodes (see above for download link). We used some
basic queries typical for walking through catalog.
</para>
<sect3>
<title>Queries</title>
<itemizedlist>
<listitem>
<para>
Q0: Count all rows (sort of base time for comparison)
</para>
<programlisting>
select count(*) from dmoz;
count
--------
300274
(1 row)
</programlisting>
</listitem>
<listitem>
<para>
Q1: Get direct children (without inheritance)
</para>
<programlisting>
select path from dmoz where path ~ 'Top.Adult.Arts.Animation.*{1}';
path
-----------------------------------
Top.Adult.Arts.Animation.Cartoons
Top.Adult.Arts.Animation.Anime
(2 rows)
</programlisting>
</listitem>
<listitem>
<para>
Q2: The same as Q1 but with counting of successors
</para>
<programlisting>
select path as parentpath , (select count(*)-1 from dmoz where path &lt;@
p.path) as count from dmoz p where path ~ 'Top.Adult.Arts.Animation.*{1}';
parentpath | count
-----------------------------------+-------
Top.Adult.Arts.Animation.Cartoons | 2
Top.Adult.Arts.Animation.Anime | 61
(2 rows)
</programlisting>
</listitem>
<listitem>
<para>
Q3: Get all parents
</para>
<programlisting>
select path from dmoz where path @&gt; 'Top.Adult.Arts.Animation' order by
path asc;
path
--------------------------
Top
Top.Adult
Top.Adult.Arts
Top.Adult.Arts.Animation
(4 rows)
</programlisting>
</listitem>
<listitem>
<para>
Q4: Get all parents with counting of children
</para>
<programlisting>
select path, (select count(*)-1 from dmoz where path &lt;@ p.path) as count
from dmoz p where path @&gt; 'Top.Adult.Arts.Animation' order by path asc;
path | count
--------------------------+--------
Top | 300273
Top.Adult | 4913
Top.Adult.Arts | 339
Top.Adult.Arts.Animation | 65
(4 rows)
</programlisting>
</listitem>
<listitem>
<para>
Q5: Get all children with levels
</para>
<programlisting>
select path, nlevel(path) - nlevel('Top.Adult.Arts.Animation') as level
from dmoz where path ~ 'Top.Adult.Arts.Animation.*{1,2}' order by path asc;
path | level
------------------------------------------------+-------
Top.Adult.Arts.Animation.Anime | 1
Top.Adult.Arts.Animation.Anime.Fan_Works | 2
Top.Adult.Arts.Animation.Anime.Games | 2
Top.Adult.Arts.Animation.Anime.Genres | 2
Top.Adult.Arts.Animation.Anime.Image_Galleries | 2
Top.Adult.Arts.Animation.Anime.Multimedia | 2
Top.Adult.Arts.Animation.Anime.Resources | 2
Top.Adult.Arts.Animation.Anime.Titles | 2
Top.Adult.Arts.Animation.Cartoons | 1
Top.Adult.Arts.Animation.Cartoons.AVS | 2
Top.Adult.Arts.Animation.Cartoons.Members | 2
(11 rows)
</programlisting>
</listitem>
</itemizedlist>
</sect3>
<sect3>
<title>Timings</title>
<programlisting>
+---------------------------------------------+
|Query|Rows|Time (ms) index|Time (ms) no index|
|-----+----+---------------+------------------|
| Q0| 1| NA| 1453.44|
|-----+----+---------------+------------------|
| Q1| 2| 0.49| 1001.54|
|-----+----+---------------+------------------|
| Q2| 2| 1.48| 3009.39|
|-----+----+---------------+------------------|
| Q3| 4| 0.55| 906.98|
|-----+----+---------------+------------------|
| Q4| 4| 24385.07| 4951.91|
|-----+----+---------------+------------------|
| Q5| 11| 0.85| 1003.23|
+---------------------------------------------+
</programlisting>
<para>
Timings without indices were obtained using operations which doesn't use
indices (see above)
</para>
</sect3>
<sect3>
<title>Remarks</title>
<para>
We didn't run full-scale tests, also we didn't present (yet) data for
operations with arrays of ltree (ltree[]) and full text searching. We'll
appreciate your input. So far, below some (rather obvious) results:
</para>
<itemizedlist>
<listitem>
<para>
Indices does help execution of queries
</para>
</listitem>
<listitem>
<para>
Q4 performs bad because one needs to read almost all data from the HDD
</para>
</listitem>
</itemizedlist>
</sect3>
</sect2>
<sect2>
<title>Some Backgrounds</title>
<para>
The approach we use for ltree is much like one we used in our other GiST based
contrib modules (intarray, tsearch, tree, btree_gist, rtree_gist). Theoretical
background is available in papers referenced from our GiST development page
(<ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink>).
</para>
<para>
A hierarchical data structure (tree) is a set of nodes. Each node has a
signature (LPS) of a fixed size, which is a hashed label path of that node.
Traversing a tree we could *certainly* prune branches if
</para>
<programlisting>
LQS (bitwise AND) LPS != LQS
</programlisting>
<para>
where LQS is a signature of lquery or ltxtquery, obtained in the same way as
LPS.
</para>
<programlisting>
ltree[]:
</programlisting>
<para>
For array of ltree LPS is a bitwise OR-ed signatures of *ALL* children
reachable from that node. Signatures are stored in RD-tree, implemented using
GiST, which provides indexed access.
</para>
<programlisting>
ltree:
</programlisting>
<para>
For ltree we store LPS in a B-tree, implemented using GiST. Each node entry is
represented by (left_bound, signature, right_bound), so that we could speedup
operations <literal>&lt;, &lt;=, =, &gt;=, &gt;</literal> using left_bound, right_bound and prune branches of
a tree using signature.
</para>
</sect2>
<sect2>
<title>Authors</title>
<para>
All work was done by Teodor Sigaev (<email>teodor@stack.net</email>) and
Oleg Bartunov (<email>oleg@sai.msu.su</email>). See
All work was done by Teodor Sigaev (<email>teodor@stack.net</email>) and
Oleg Bartunov (<email>oleg@sai.msu.su</email>). See
<ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink> for
additional information. Authors would like to thank Eugeny Rodichev for
additional information. Authors would like to thank Eugeny Rodichev for
helpful discussions. Comments and bug reports are welcome.
</para>
</sect2>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/oid2name.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="oid2name">
<title>oid2name</title>
<indexterm zone="oid2name">
<primary>oid2name</primary>
</indexterm>
<para>
This utility allows administrators to examine the file structure used by
PostgreSQL. To make use of it, you need to be familiar with the file
structure, which is described in <xref linkend="storage">.
<application>oid2name</> is a utility program that helps administrators to
examine the file structure used by PostgreSQL. To make use of it, you need
to be familiar with the database file structure, which is described in
<xref linkend="storage">.
</para>
<sect2>
<title>Overview</title>
<para>
<literal>oid2name</literal> connects to the database and extracts OID,
filenode, and table name information. You can also have it show database
OIDs and tablespace OIDs.
</para>
<note>
<para>
When displaying specific tables, you can select which tables to show by
using -o, -f and -t. The first switch takes an OID, the second takes
a filenode, and the third takes a tablename (actually, it's a LIKE
pattern, so you can use things like "foo%"). Note that you can use as many
of these switches as you like, and the listing will include all objects
matched by any of the switches. Also note that these switches can only
show objects in the database given in -d.
The name <quote>oid2name</> is historical, and is actually rather
misleading, since most of the time when you use it, you will really
be concerned with tables' filenode numbers (which are the file names
visible in the database directories). Be sure you understand the
difference between table OIDs and table filenodes!
</para>
</note>
<sect2>
<title>Overview</title>
<para>
If you don't give any of -o, -f or -t it will dump all the tables in the
database given in -d. If you don't give -d, it will show a database
listing. Alternatively you can give -s to get a tablespace listing.
<application>oid2name</application> connects to a target database and
extracts OID, filenode, and/or table name information. You can also have
it show database OIDs or tablespace OIDs. The program is controlled by
a large number of command-line switches, as shown in
<xref linkend="oid2name-switches">.
</para>
<table>
<title>Additional switches</title>
<table id="oid2name-switches">
<title><application>oid2name</> switches</title>
<tgroup cols="2">
<thead>
<row>
<entry>Switch</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>-i</literal></entry>
<entry>include indexes and sequences in the database listing.</entry>
<entry><literal>-o</literal> <replaceable>oid</></entry>
<entry>show info for table with OID <replaceable>oid</></entry>
</row>
<row>
<entry><literal>-x</literal></entry>
<entry>display more information about each object shown: tablespace name,
schema name, OID.
</entry>
<entry><literal>-f</literal> <replaceable>filenode</></entry>
<entry>show info for table with filenode <replaceable>filenode</></entry>
</row>
<row>
<entry><literal>-t</literal> <replaceable>tablename_pattern</></entry>
<entry>show info for table(s) matching <replaceable>tablename_pattern</></entry>
</row>
<row>
<entry><literal>-s</literal></entry>
<entry>show tablespace OIDs</entry>
</row>
<row>
<entry><literal>-S</literal></entry>
<entry>also show system objects (those in information_schema, pg_toast
and pg_catalog schemas)
<entry>include system objects (those in
<literal>information_schema</literal>, <literal>pg_toast</literal>
and <literal>pg_catalog</literal> schemas)
</entry>
</row>
<row>
<entry><literal>-i</literal></entry>
<entry>include indexes and sequences in the listing</entry>
</row>
<row>
<entry><literal>-x</literal></entry>
<entry>display more information about each object shown: tablespace name,
schema name, and OID
</entry>
</row>
<row>
<entry><literal>-q</literal></entry>
<entry>don't display headers(useful for scripting)</entry>
<entry>omit headers (useful for scripting)</entry>
</row>
<row>
<entry><literal>-d</literal> <replaceable>database</></entry>
<entry>database to connect to</entry>
</row>
<row>
<entry><literal>-H</literal> <replaceable>host</></entry>
<entry>database server's host</entry>
</row>
<row>
<entry><literal>-p</literal> <replaceable>port</></entry>
<entry>database server's port</entry>
</row>
<row>
<entry><literal>-U</literal> <replaceable>username</></entry>
<entry>username to connect as</entry>
</row>
<row>
<entry><literal>-P</literal> <replaceable>password</></entry>
<entry>password (deprecated &mdash; putting this on the command line
is a security hazard)</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
To display specific tables, select which tables to show by
using <literal>-o</>, <literal>-f</> and/or <literal>-t</>.
<literal>-o</> takes an OID,
<literal>-f</> takes a filenode,
and <literal>-t</> takes a tablename (actually, it's a LIKE
pattern, so you can use things like <literal>foo%</>).
You can use as many
of these switches as you like, and the listing will include all objects
matched by any of the switches. But note that these switches can only
show objects in the database given by <literal>-d</>.
</para>
<para>
If you don't give any of <literal>-o</>, <literal>-f</> or <literal>-t</>,
but do give <literal>-d</>, it will list all tables in the database
named by <literal>-d</>. In this mode, the <literal>-S</> and
<literal>-i</> switches control what gets listed.
</para>
<para>
If you don't give <literal>-d</> either, it will show a listing of database
OIDs. Alternatively you can give <literal>-s</> to get a tablespace
listing.
</para>
</sect2>
<sect2>
<title>Examples</title>
<programlisting>
$ # what's in this database server, anyway?
$ oid2name
All databases:
Oid Database Name Tablespace
......@@ -83,7 +170,8 @@ All tablespaces:
155151 fastdisk
155152 bigdisk
$ cd $PGDATA/17228
$ # OK, let's look into database alvherre
$ cd $PGDATA/base/17228
$ # get top 10 db objects in the default tablespace, ordered by size
$ ls -lS * | head -10
......@@ -98,6 +186,7 @@ $ ls -lS * | head -10
-rw------- 1 alvherre alvherre 163840 sep 14 09:50 16699
-rw------- 1 alvherre alvherre 122880 sep 6 17:51 16751
$ # I wonder what file 155173 is ...
$ oid2name -d alvherre -f 155173
From database "alvherre":
Filenode Table Name
......@@ -112,7 +201,7 @@ From database "alvherre":
155173 accounts
1155291 accounts_pkey
$ # you can also mix the options, and have more details
$ # you can mix the options, and get more details with -x
$ oid2name -d alvherre -t accounts -f 1155291 -x
From database "alvherre":
Filenode Table Name Oid Schema Tablespace
......@@ -157,7 +246,7 @@ $ ls -d 155151/*
155151/17228/ 155151/PG_VERSION
$ # Oh, what was database 17228 again?
$ oid2name
$ oid2name
All databases:
Oid Database Name Tablespace
----------------------------------
......@@ -178,28 +267,25 @@ From database "alvherre":
Filenode Table Name
----------------------
155156 foo
$ # end of sample session.
</programlisting>
</sect2>
<sect2>
<title>Limitations</title>
<para>
You can also get approximate size data for each object using psql. For
example,
</para>
<programlisting>
SELECT relpages, relfilenode, relname FROM pg_class ORDER BY relpages DESC;
</programlisting>
<para>
Each page is typically 8k. Relpages is updated by VACUUM.
<application>oid2name</> requires a running database server with
non-corrupt system catalogs. It is therefore of only limited use
for recovering from catastrophic database corruption situations.
</para>
</sect2>
<sect2>
<title>Author</title>
<para>
b. palmer, <email>bpalmer@crimelabs.net</email>
B. Palmer <email>bpalmer@crimelabs.net</email>
</para>
</sect2>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/pageinspect.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pageinspect">
<title>pageinspect</title>
<indexterm zone="pageinspect">
<primary>pageinspect</primary>
</indexterm>
<para>
The functions in this module allow you to inspect the contents of data pages
at a low level, for debugging purposes.
The <filename>pageinspect</> module provides functions that allow you to
inspect the contents of database pages at a low level, which is useful for
debugging purposes. All of these functions may be used only by superusers.
</para>
<sect2>
<title>Functions included</title>
<itemizedlist>
<listitem>
<para>
<literal>get_raw_page</literal> reads one block of the named table and returns a copy as a
bytea field. This allows a single time-consistent copy of the block to be
made. Use of this functions is restricted to superusers.
</para>
</listitem>
<title>Functions</title>
<listitem>
<para>
<literal>page_header</literal> shows fields which are common to all PostgreSQL heap and index
pages. Use of this function is restricted to superusers.
</para>
<para>
A page image obtained with <literal>get_raw_page</literal> should be passed as argument:
</para>
<programlisting>
regression=# SELECT * FROM page_header(get_raw_page('pg_class',0));
lsn | tli | flags | lower | upper | special | pagesize | version | prune_xid
-----------+-----+-------+-------+-------+---------+----------+---------+-----------
0/24A1B50 | 1 | 1 | 232 | 368 | 8192 | 8192 | 4 | 0
</programlisting>
<para>
The returned columns correspond to the fields in the PageHeaderData struct.
See src/include/storage/bufpage.h for more details.
</para>
</listitem>
<listitem>
<para>
<literal>heap_page_items</literal> shows all line pointers on a heap page. For those line
pointers that are in use, tuple headers are also shown. All tuples are
shown, whether or not the tuples were visible to an MVCC snapshot at the
time the raw page was copied. Use of this function is restricted to
superusers.
</para>
<para>
A heap page image obtained with <literal>get_raw_page</literal> should be passed as argument:
</para>
<programlisting>
test=# SELECT * FROM heap_page_items(get_raw_page('pg_class',0));
</programlisting>
<para>
See src/include/storage/itemid.h and src/include/access/htup.h for
explanations of the fields returned.
</para>
</listitem>
<variablelist>
<varlistentry>
<term>
<function>get_raw_page(text, int) returns bytea</function>
</term>
<listitem>
<para>
<literal>bt_metap()</literal> returns information about the btree index metapage:
</para>
<programlisting>
test=> SELECT * FROM bt_metap('pg_cast_oid_index');
-[ RECORD 1 ]-----
magic | 340322
version | 2
root | 1
level | 0
fastroot | 1
fastlevel | 0
</programlisting>
</listitem>
<listitem>
<para>
<literal>bt_page_stats()</literal> shows information about single btree pages:
</para>
<programlisting>
test=> SELECT * FROM bt_page_stats('pg_cast_oid_index', 1);
-[ RECORD 1 ]-+-----
blkno | 1
type | l
live_items | 256
dead_items | 0
avg_item_size | 12
page_size | 8192
free_size | 4056
btpo_prev | 0
btpo_next | 0
btpo | 0
btpo_flags | 3
</programlisting>
</listitem>
<listitem>
<para>
<literal>bt_page_items()</literal> returns information about specific items on btree pages:
<listitem>
<para>
<function>get_raw_page</function> reads the specified block of the named
table and returns a copy as a <type>bytea</> value. This allows a
single time-consistent copy of the block to be obtained.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>page_header(bytea) returns record</function>
</term>
<listitem>
<para>
<function>page_header</function> shows fields that are common to all
<productname>PostgreSQL</> heap and index pages.
</para>
<para>
A page image obtained with <function>get_raw_page</function> should be
passed as argument. For example:
</para>
<programlisting>
test=# SELECT * FROM page_header(get_raw_page('pg_class', 0));
lsn | tli | flags | lower | upper | special | pagesize | version | prune_xid
-----------+-----+-------+-------+-------+---------+----------+---------+-----------
0/24A1B50 | 1 | 1 | 232 | 368 | 8192 | 8192 | 4 | 0
</programlisting>
<para>
The returned columns correspond to the fields in the
<structname>PageHeaderData</> struct.
See <filename>src/include/storage/bufpage.h</> for details.
</para>
<programlisting>
test=> SELECT * FROM bt_page_items('pg_cast_oid_index', 1);
itemoffset | ctid | itemlen | nulls | vars | data
------------+---------+---------+-------+------+-------------
1 | (0,1) | 12 | f | f | 23 27 00 00
2 | (0,2) | 12 | f | f | 24 27 00 00
3 | (0,3) | 12 | f | f | 25 27 00 00
4 | (0,4) | 12 | f | f | 26 27 00 00
5 | (0,5) | 12 | f | f | 27 27 00 00
6 | (0,6) | 12 | f | f | 28 27 00 00
7 | (0,7) | 12 | f | f | 29 27 00 00
8 | (0,8) | 12 | f | f | 2a 27 00 00
</programlisting>
</listitem>
</itemizedlist>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>heap_page_items(bytea) returns setof record</function>
</term>
<listitem>
<para>
<function>heap_page_items</function> shows all line pointers on a heap
page. For those line pointers that are in use, tuple headers are also
shown. All tuples are shown, whether or not the tuples were visible to
an MVCC snapshot at the time the raw page was copied.
</para>
<para>
A heap page image obtained with <function>get_raw_page</function> should
be passed as argument. For example:
</para>
<programlisting>
test=# SELECT * FROM heap_page_items(get_raw_page('pg_class', 0));
</programlisting>
<para>
See <filename>src/include/storage/itemid.h</> and
<filename>src/include/access/htup.h</> for explanations of the fields
returned.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>bt_metap(text) returns record</function>
</term>
<listitem>
<para>
<function>bt_metap</function> returns information about a btree
index's metapage. For example:
</para>
<programlisting>
test=# SELECT * FROM bt_metap('pg_cast_oid_index');
-[ RECORD 1 ]-----
magic | 340322
version | 2
root | 1
level | 0
fastroot | 1
fastlevel | 0
</programlisting>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>bt_page_stats(text, int) returns record</function>
</term>
<listitem>
<para>
<function>bt_page_stats</function> returns summary information about
single pages of btree indexes. For example:
</para>
<programlisting>
test=# SELECT * FROM bt_page_stats('pg_cast_oid_index', 1);
-[ RECORD 1 ]-+-----
blkno | 1
type | l
live_items | 256
dead_items | 0
avg_item_size | 12
page_size | 8192
free_size | 4056
btpo_prev | 0
btpo_next | 0
btpo | 0
btpo_flags | 3
</programlisting>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>bt_page_items(text, int) returns setof record</function>
</term>
<listitem>
<para>
<function>bt_page_items</function> returns detailed information about
all of the items on a btree index page. For example:
</para>
<programlisting>
test=# SELECT * FROM bt_page_items('pg_cast_oid_index', 1);
itemoffset | ctid | itemlen | nulls | vars | data
------------+---------+---------+-------+------+-------------
1 | (0,1) | 12 | f | f | 23 27 00 00
2 | (0,2) | 12 | f | f | 24 27 00 00
3 | (0,3) | 12 | f | f | 25 27 00 00
4 | (0,4) | 12 | f | f | 26 27 00 00
5 | (0,5) | 12 | f | f | 27 27 00 00
6 | (0,6) | 12 | f | f | 28 27 00 00
7 | (0,7) | 12 | f | f | 29 27 00 00
8 | (0,8) | 12 | f | f | 2a 27 00 00
</programlisting>
</listitem>
</varlistentry>
</variablelist>
</sect2>
</sect1>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgbench.sgml,v 1.4 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgbench">
<title>pgbench</title>
<indexterm zone="pgbench">
<primary>pgbench</primary>
</indexterm>
<para>
<literal>pgbench</literal> is a simple program to run a benchmark test.
<literal>pgbench</literal> is a client application of PostgreSQL and runs
with PostgreSQL only. It performs lots of small and simple transactions
including SELECT/UPDATE/INSERT operations then calculates number of
transactions successfully completed within a second (transactions
per second, tps). Targeting data includes a table with at least 100k
tuples.
<application>pgbench</application> is a simple program for running benchmark
tests on <productname>PostgreSQL</>. It runs the same sequence of SQL
commands over and over, possibly in multiple concurrent database sessions,
and then calculates the average transaction rate (transactions per second).
By default, <application>pgbench</application> tests a scenario that is
loosely based on TPC-B, involving five <command>SELECT</>,
<command>UPDATE</>, and <command>INSERT</> commands per transaction.
However, it is easy to test other cases by writing your own transaction
script files.
</para>
<para>
Example outputs from pgbench look like:
</para>
Typical output from pgbench looks like:
<programlisting>
number of clients: 4
number of transactions per client: 100
number of processed transactions: 400/400
tps = 19.875015(including connections establishing)
tps = 20.098827(excluding connections establishing)
transaction type: TPC-B (sort of)
scaling factor: 10
number of clients: 10
number of transactions per client: 1000
number of transactions actually processed: 10000/10000
tps = 85.184871 (including connections establishing)
tps = 85.296346 (excluding connections establishing)
</programlisting>
<para> Similar program called "JDBCBench" already exists, but it requires
Java that may not be available on every platform. Moreover some
people concerned about the overhead of Java that might lead
inaccurate results. So I decided to write in pure C, and named
it "pgbench."
</para>
<para>
Features of pgbench:
The first four lines just report some of the most important parameter
settings. The next line reports the number of transactions completed
and intended (the latter being just the product of number of clients
and number of transactions); these will be equal unless the run
failed before completion. The last two lines report the TPS rate,
figured with and without counting the time to start database sessions.
</para>
<itemizedlist>
<listitem>
<para>
pgbench is written in C using libpq only. So it is very portable
and easy to install.
</para>
</listitem>
<listitem>
<para>
pgbench can simulate concurrent connections using asynchronous
capability of libpq. No threading is required.
</para>
</listitem>
</itemizedlist>
<sect2>
<title>Overview</title>
<orderedlist>
<listitem>
<para>(optional)Initialize database by:</para>
<programlisting>
pgbench -i &lt;dbname&gt;
</programlisting>
<para>
where &lt;dbname&gt; is the name of database. pgbench uses four tables
accounts, branches, history and tellers. These tables will be
destroyed. Be very careful if you have tables having same
names. Default test data contains:
</para>
<programlisting>
table # of tuples
<para>
The default TPC-B-like transaction test requires specific tables to be
set up beforehand. <application>pgbench</> should be invoked with
the <literal>-i</> (initialize) option to create and populate these
tables. (When you are testing a custom script, you don't need this
step, but will instead need to do whatever setup your test needs.)
Initialization looks like:
<programlisting>
pgbench -i <optional> <replaceable>other-options</> </optional> <replaceable>dbname</>
</programlisting>
where <replaceable>dbname</> is the name of the already-created
database to test in. (You may also need <literal>-h</>,
<literal>-p</>, and/or <literal>-U</> options to specify how to
connect to the database server.)
</para>
<caution>
<para>
<literal>pgbench -i</> creates four tables <structname>accounts</>,
<structname>branches</>, <structname>history</>, and
<structname>tellers</>, destroying any existing tables of these names.
Be very careful to use another database if you have tables having these
names!
</para>
</caution>
<para>
At the default <quote>scale factor</> of 1, the tables initially
contain this many rows:
</para>
<programlisting>
table # of rows
-------------------------
branches 1
tellers 10
accounts 100000
history 0
</programlisting>
<para>
You can increase the number of tuples by using -s option. branches,
tellers and accounts tables are created with a fillfactor which is
set using -F option. See below.
</para>
</listitem>
<listitem>
<para>Run the benchmark test</para>
<programlisting>
pgbench &lt;dbname&gt;
</programlisting>
<para>
The default configuration is:
</para>
<programlisting>
number of clients: 1
number of transactions per client: 10
</programlisting>
</listitem>
</orderedlist>
</programlisting>
<para>
You can (and, for most purposes, probably should) increase the number
of rows by using the <literal>-s</> (scale factor) option. The
<literal>-F</> (fillfactor) option might also be used at this point.
</para>
<para>
Once you have done the necessary setup, you can run your benchmark
with a command that doesn't include <literal>-i</>, that is
<programlisting>
pgbench <optional> <replaceable>options</> </optional> <replaceable>dbname</>
</programlisting>
<table>
<title><literal>pgbench</literal> options</title>
In nearly all cases, you'll need some options to make a useful test.
The most important options are <literal>-c</> (number of clients),
<literal>-t</> (number of transactions), and <literal>-f</> (specify
a custom script file). See below for a full list.
</para>
<para>
<xref linkend="pgbench-init-options"> shows options that are used
during database initialization, while
<xref linkend="pgbench-run-options"> shows options that are used
while running benchmarks, and
<xref linkend="pgbench-common-options"> shows options that are useful
in both cases.
</para>
<table id="pgbench-init-options">
<title><application>pgbench</application> initialization options</title>
<tgroup cols="2">
<thead>
<row>
<entry>Parameter</entry>
<entry>Option</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>-h hostname</literal></entry>
<entry><literal>-i</literal></entry>
<entry>
<para>
hostname where the backend is running. If this option
is omitted, pgbench will connect to the localhost via
Unix domain socket.
</para>
Required to invoke initialization mode.
</entry>
</row>
<row>
<entry><literal>-p port</literal></entry>
<entry><literal>-s</literal> <replaceable>scale_factor</></entry>
<entry>
<para>
the port number that the backend is accepting. default is
libpq's default, usually 5432.
</para>
Multiply the number of rows generated by the scale factor.
For example, <literal>-s 100</> will imply 10,000,000 rows
in the <structname>accounts</> table. Default is 1.
</entry>
</row>
<row>
<entry><literal>-c number_of_clients</literal></entry>
<entry><literal>-F</literal> <replaceable>fillfactor</></entry>
<entry>
<para>
Number of clients simulated. default is 1.
</para>
Create the <structname>accounts</>, <structname>tellers</> and
<structname>branches</> tables with the given fillfactor.
Default is 100.
</entry>
</row>
</tbody>
</tgroup>
</table>
<table id="pgbench-run-options">
<title><application>pgbench</application> benchmarking options</title>
<tgroup cols="2">
<thead>
<row>
<entry>Option</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>-t number_of_transactions</literal></entry>
<entry><literal>-c</literal> <replaceable>clients</></entry>
<entry>
<para>
Number of transactions each client runs. default is 10.
</para>
Number of clients simulated, that is, number of concurrent database
sessions. Default is 1.
</entry>
</row>
<row>
<entry><literal>-s scaling_factor</literal></entry>
<entry><literal>-t</literal> <replaceable>transactions</></entry>
<entry>
<para>
this should be used with -i (initialize) option.
number of tuples generated will be multiple of the
scaling factor. For example, -s 100 will imply 10M
(10,000,000) tuples in the accounts table.
default is 1.
</para>
<para>
NOTE: scaling factor should be at least
as large as the largest number of clients you intend
to test; else you'll mostly be measuring update contention.
Regular (not initializing) runs using one of the
built-in tests will detect scale based on the number of
branches in the database. For custom (-f) runs it can
be manually specified with this parameter.
</para>
Number of transactions each client runs. Default is 10.
</entry>
</row>
<row>
<entry><literal>-D varname=value</literal></entry>
<entry><literal>-N</literal></entry>
<entry>
<para>
Define a variable. It can be refered to by a script
provided by using -f option. Multiple -D options are allowed.
</para>
Do not update <structname>tellers</> and <structname>branches</>.
This will avoid update contention on these tables, but
it makes the test case even less like TPC-B.
</entry>
</row>
<row>
<entry><literal>-U login</literal></entry>
<entry><literal>-S</literal></entry>
<entry>
<para>
Specify db user's login name if it is different from
the Unix login name.
</para>
Perform select-only transactions instead of TPC-B-like test.
</entry>
</row>
<row>
<entry><literal>-P password</literal></entry>
<entry><literal>-f</literal> <replaceable>filename</></entry>
<entry>
<para>
Specify the db password. CAUTION: using this option
might be a security hole since ps command will
show the password. Use this for TESTING PURPOSE ONLY.
</para>
Read transaction script from <replaceable>filename</>.
See below for details.
<literal>-N</literal>, <literal>-S</literal>, and <literal>-f</literal>
are mutually exclusive.
</entry>
</row>
<row>
<entry><literal>-n</literal></entry>
<entry>
<para>
No vacuuming and cleaning the history table prior to the
test is performed.
</para>
No vacuuming is performed before running the test.
This option is <emphasis>necessary</>
if you are running a custom test scenario that does not include
the standard tables <structname>accounts</>,
<structname>branches</>, <structname>history</>, and
<structname>tellers</>.
</entry>
</row>
<row>
<entry><literal>-v</literal></entry>
<entry>
<para>
Do vacuuming before testing. This will take some time.
With neither -n nor -v, pgbench will vacuum tellers and
branches tables only.
</para>
Vacuum all four standard tables before running the test.
With neither <literal>-n</> nor <literal>-v</>, pgbench will vacuum
<structname>tellers</> and <structname>branches</> tables, and
will remove all entries in <structname>history</>.
</entry>
</row>
<row>
<entry><literal>-S</literal></entry>
<entry><literal>-D</literal> <replaceable>varname</><literal>=</><replaceable>value</></entry>
<entry>
<para>
Perform select only transactions instead of TPC-B.
</para>
Define a variable for use by a custom script (see below).
Multiple <literal>-D</> options are allowed.
</entry>
</row>
<row>
<entry><literal>-N</literal></entry>
<entry><literal>-C</literal></entry>
<entry>
<para>
Do not update "branches" and "tellers". This will
avoid heavy update contention on branches and tellers,
while it will not make pgbench supporting TPC-B like
transactions.
</para>
Establish a new connection for each transaction, rather than
doing it just once per client thread.
This is useful to measure the connection overhead.
</entry>
</row>
<row>
<entry><literal>-f filename</literal></entry>
<entry><literal>-l</literal></entry>
<entry>
<para>
Read transaction script from file. Detailed
explanation will appear later.
</para>
Write the time taken by each transaction to a logfile.
See below for details.
</entry>
</row>
<row>
<entry><literal>-C</literal></entry>
<entry><literal>-s</literal> <replaceable>scale_factor</></entry>
<entry>
<para>
Establish connection for each transaction, rather than
doing it just once at beginning of pgbench in the normal
mode. This is useful to measure the connection overhead.
</para>
Report the specified scale factor in <application>pgbench</>'s
output. With the built-in tests, this is not necessary; the
correct scale factor will be detected by counting the number of
rows in the <structname>branches</> table. However, when testing
custom benchmarks (<literal>-f</> option), the scale factor
will be reported as 1 unless this option is used.
</entry>
</row>
<row>
<entry><literal>-l</literal></entry>
<entry><literal>-d</literal></entry>
<entry>
<para>
Write the time taken by each transaction to a logfile,
with the name "pgbench_log.xxx", where xxx is the PID
of the pgbench process. The format of the log is:
</para>
<programlisting>
client_id transaction_no time file_no time-epoch time-us
</programlisting>
<para>
where time is measured in microseconds, , the file_no is
which test file was used (useful when multiple were
specified with -f), and time-epoch/time-us are a
UNIX epoch format timestamp followed by an offset
in microseconds (suitable for creating a ISO 8601
timestamp with a fraction of a second) of when
the transaction completed.
</para>
<para>
Here are example outputs:
</para>
<programlisting>
0 199 2241 0 1175850568 995598
0 200 2465 0 1175850568 998079
0 201 2513 0 1175850569 608
0 202 2038 0 1175850569 2663
</programlisting>
Print debugging output.
</entry>
</row>
</tbody>
</tgroup>
</table>
<table id="pgbench-common-options">
<title><application>pgbench</application> common options</title>
<tgroup cols="2">
<thead>
<row>
<entry><literal>-F fillfactor</literal></entry>
<entry>
<para>
Create tables(accounts, tellers and branches) with the given
fillfactor. Default is 100. This should be used with -i
(initialize) option.
</para>
</entry>
<entry>Option</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>-d</literal></entry>
<entry>
<para>
debug option.
</para>
</entry>
<entry><literal>-h</literal> <replaceable>hostname</></entry>
<entry>database server's host</entry>
</row>
<row>
<entry><literal>-p</literal> <replaceable>port</></entry>
<entry>database server's port</entry>
</row>
<row>
<entry><literal>-U</literal> <replaceable>login</></entry>
<entry>username to connect as</entry>
</row>
<row>
<entry><literal>-P</literal> <replaceable>password</></entry>
<entry>password (deprecated &mdash; putting this on the command line
is a security hazard)</entry>
</row>
</tbody>
</tgroup>
......@@ -299,138 +293,252 @@ pgbench &lt;dbname&gt;
</sect2>
<sect2>
<title>What is the "transaction" actually performed in pgbench?</title>
<orderedlist>
<listitem><para><literal>begin;</literal></para></listitem>
<listitem><para><literal>update accounts set abalance = abalance + :delta where aid = :aid;</literal></para></listitem>
<title>What is the <quote>transaction</> actually performed in pgbench?</title>
<listitem><para><literal>select abalance from accounts where aid = :aid;</literal></para></listitem>
<listitem><para><literal>update tellers set tbalance = tbalance + :delta where tid = :tid;</literal></para></listitem>
<listitem><para><literal>update branches set bbalance = bbalance + :delta where bid = :bid;</literal></para></listitem>
<listitem><para><literal>insert into history(tid,bid,aid,delta) values(:tid,:bid,:aid,:delta);</literal></para></listitem>
<para>
The default transaction script issues seven commands per transaction:
</para>
<listitem><para><literal>end;</literal></para></listitem>
<orderedlist>
<listitem><para><literal>BEGIN;</literal></para></listitem>
<listitem><para><literal>UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;</literal></para></listitem>
<listitem><para><literal>SELECT abalance FROM accounts WHERE aid = :aid;</literal></para></listitem>
<listitem><para><literal>UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid;</literal></para></listitem>
<listitem><para><literal>UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid;</literal></para></listitem>
<listitem><para><literal>INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);</literal></para></listitem>
<listitem><para><literal>END;</literal></para></listitem>
</orderedlist>
<para>
If you specify -N, (4) and (5) aren't included in the transaction.
If you specify <literal>-N</>, steps 4 and 5 aren't included in the
transaction. If you specify <literal>-S</>, only the <command>SELECT</> is
issued.
</para>
</sect2>
<sect2>
<title>Script file</title>
<title>Custom Scripts</title>
<para>
<literal>pgbench</literal> has support for reading a transaction script
from a specified file (<literal>-f</literal> option). This file should
include SQL commands in each line. SQL command consists of multiple lines
are not supported. Empty lines and lines begging with "--" will be ignored.
<application>pgbench</application> has support for running custom
benchmark scenarios by replacing the default transaction script
(described above) with a transaction script read from a file
(<literal>-f</literal> option). In this case a <quote>transaction</>
counts as one execution of a script file. You can even specify
multiple scripts (multiple <literal>-f</literal> options), in which
case a random one of the scripts is chosen each time a client session
starts a new transaction.
</para>
<para>
Multiple <literal>-f</literal> options are allowed. In this case each
transaction is assigned randomly chosen script.
The format of a script file is one SQL command per line; multi-line
SQL commands are not supported. Empty lines and lines beginning with
<literal>--</> are ignored. Script file lines can also be
<quote>meta commands</>, which are interpreted by <application>pgbench</>
itself, as described below.
</para>
<para>
SQL commands can include "meta command" which begins with "\" (back
slash). A meta command takes some arguments separted by white
spaces. Currently following meta command is supported:
There is a simple variable-substitution facility for script files.
Variables can be set by the command-line <literal>-D</> option,
explained above, or by the meta commands explained below.
In addition to any variables preset by <literal>-D</> command-line options,
the variable <literal>scale</> is preset to the current scale factor.
Once set, a variable's
value can be inserted into a SQL command by writing
<literal>:</><replaceable>variablename</>. When running more than
one client session, each session has its own set of variables.
</para>
<itemizedlist>
<listitem>
<para>
<literal>\set name operand1 [ operator operand2 ]</literal>
- Sets the calculated value using "operand1" "operator"
"operand2" to variable "name". If "operator" and "operand2"
are omitted, the value of operand1 is set to variable "name".
</para>
<para>
Example:
</para>
<programlisting>
<para>
Script file meta commands begin with a backslash (<literal>\</>).
Arguments to a meta command are separated by white space.
These meta commands are supported:
</para>
<variablelist>
<varlistentry>
<term>
<literal>\set <replaceable>varname</> <replaceable>operand1</> [ <replaceable>operator</> <replaceable>operand2</> ]</literal>
</term>
<listitem>
<para>
Sets variable <replaceable>varname</> to a calculated integer value.
Each <replaceable>operand</> is either an integer constant or a
<literal>:</><replaceable>variablename</> reference to a variable
having an integer value. The <replaceable>operator</> can be
<literal>+</>, <literal>-</>, <literal>*</>, or <literal>/</>.
</para>
<para>
Example:
<programlisting>
\set ntellers 10 * :scale
</programlisting>
</listitem>
<listitem>
<para>
<literal>\setrandom name min max</literal>
- Assigns random integer to name between min and max
</para>
<para>
Example:
</para>
<programlisting>
\setrandom aid 1 100000
</programlisting>
</listitem>
<listitem>
<para>
Variables can be referred to in SQL comands by adding ":" in front
of the varible name.
</para>
<para>
Example:
</para>
<programlisting>
SELECT abalance FROM accounts WHERE aid = :aid
</programlisting>
<para>
Variables can also be defined by using -D option.
</para>
</listitem>
<listitem>
<para>
<literal>\sleep num [us|ms|s]</> - Causes script execution to sleep for the
specified duration of microseconds (us), milliseconds (ms) or the default
seconds (s).
</para>
<para>
Example:
</para>
<programlisting>
\setrandom millisec 1000 2500
\sleep : millisec ms
</programlisting>
</listitem>
</itemizedlist>
</sect2>
</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>\setrandom <replaceable>varname</> <replaceable>min</> <replaceable>max</></literal>
</term>
<listitem>
<para>
Sets variable <replaceable>varname</> to a random integer value
between the limits <replaceable>min</> and <replaceable>max</>.
Each limit can be either an integer constant or a
<literal>:</><replaceable>variablename</> reference to a variable
having an integer value.
</para>
<para>
Example:
<programlisting>
\setrandom aid 1 :naccounts
</programlisting>
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>\sleep <replaceable>number</> [ us | ms | s ]</literal>
</term>
<listitem>
<para>
Causes script execution to sleep for the specified duration in
microseconds (<literal>us</>), milliseconds (<literal>ms</>) or seconds
(<literal>s</>). If the unit is omitted then seconds are the default.
<replaceable>number</> can be either an integer constant or a
<literal>:</><replaceable>variablename</> reference to a variable
having an integer value.
</para>
<para>
Example:
<programlisting>
\sleep 10 ms
</programlisting>
</para>
</listitem>
</varlistentry>
</variablelist>
<sect2>
<title>Examples</title>
<para>
Example, TPC-B like benchmark can be defined as follows(scaling
factor = 1):
</para>
<programlisting>
As an example, the full definition of the built-in TPC-B-like
transaction is:
<programlisting>
\set nbranches :scale
\set ntellers 10 * :scale
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta 1 10000
BEGIN
UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid
SELECT abalance FROM accounts WHERE aid = :aid
UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid
UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid
INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, 'now')
END
</programlisting>
\setrandom delta -5000 5000
BEGIN;
UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM accounts WHERE aid = :aid;
UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
END;
</programlisting>
This script allows each iteration of the transaction to reference
different, randomly-chosen rows. (This example also shows why it's
important for each client session to have its own variables &mdash;
otherwise they'd not be independently touching different rows.)
</para>
</sect2>
<sect2>
<title>Per-transaction logging</title>
<para>
If you want to automatically set the scaling factor from the number of
tuples in branches table, use -s option and shell command like this:
With the <literal>-l</> option, <application>pgbench</> writes the time
taken by each transaction to a logfile. The logfile will be named
<filename>pgbench_log.<replaceable>nnn</></filename>, where
<replaceable>nnn</> is the PID of the pgbench process.
The format of the log is:
<programlisting>
<replaceable>client_id</> <replaceable>transaction_no</> <replaceable>time</> <replaceable>file_no</> <replaceable>time_epoch</> <replaceable>time_us</>
</programlisting>
where <replaceable>time</> is the elapsed transaction time in microseconds,
<replaceable>file_no</> identifies which script file was used
(useful when multiple scripts were specified with <literal>-f</>),
and <replaceable>time_epoch</>/<replaceable>time_us</> are a
UNIX epoch format timestamp and an offset
in microseconds (suitable for creating a ISO 8601
timestamp with fractional seconds) showing when
the transaction completed.
</para>
<programlisting>
pgbench -s $(psql -At -c "SELECT count(*) FROM branches") -f tpc_b.sql
</programlisting>
<para>
Notice that -f option does not execute vacuum and clearing history
table before starting benchmark.
Here are example outputs:
<programlisting>
0 199 2241 0 1175850568 995598
0 200 2465 0 1175850568 998079
0 201 2513 0 1175850569 608
0 202 2038 0 1175850569 2663
</programlisting>
</para>
</sect2>
</sect1>
<sect2>
<title>Good Practices</title>
<para>
It is very easy to use <application>pgbench</> to produce completely
meaningless numbers. Here are some guidelines to help you get useful
results.
</para>
<para>
In the first place, <emphasis>never</> believe any test that runs
for only a few seconds. Increase the <literal>-t</> setting enough
to make the run last at least a few minutes, so as to average out noise.
In some cases you could need hours to get numbers that are reproducible.
It's a good idea to try the test run a few times, to find out if your
numbers are reproducible or not.
</para>
<para>
For the default TPC-B-like test scenario, the initialization scale factor
(<literal>-s</>) should be at least as large as the largest number of
clients you intend to test (<literal>-c</>); else you'll mostly be
measuring update contention. There are only <literal>-s</> rows in
the <structname>branches</> table, and every transaction wants to
update one of them, so <literal>-c</> values in excess of <literal>-s</>
will undoubtedly result in lots of transactions blocked waiting for
other transactions.
</para>
<para>
The default test scenario is also quite sensitive to how long it's been
since the tables were initialized: accumulation of dead rows and dead space
in the tables changes the results. To understand the results you must keep
track of the total number of updates and when vacuuming happens. If
autovacuum is enabled it can result in unpredictable changes in measured
performance.
</para>
<para>
A limitation of <application>pgbench</> is that it can itself become
the bottleneck when trying to test a large number of client sessions.
This can be alleviated by running <application>pgbench</> on a different
machine from the database server, although low network latency will be
essential. It might even be useful to run several <application>pgbench</>
instances concurrently, on several client machines, against the same
database server.
</para>
</sect2>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgbuffercache.sgml,v 2.2 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgbuffercache">
<title>pg_buffercache</title>
<indexterm zone="pgbuffercache">
<primary>pg_buffercache</primary>
</indexterm>
<para>
The <literal>pg_buffercache</literal> module provides a means for examining
what's happening to the buffercache at any given time without having to
restart or rebuild the server with debugging code added. The intent is to
do for the buffercache what pg_locks does for locks.
The <filename>pg_buffercache</filename> module provides a means for
examining what's happening in the shared buffer cache in real time.
</para>
<para>
This module consists of a C function <literal>pg_buffercache_pages()</literal>
that returns a set of records, plus a view <literal>pg_buffercache</literal>
to wrapper the function.
The module provides a C function <function>pg_buffercache_pages</function>
that returns a set of records, plus a view
<structname>pg_buffercache</structname> that wraps the function for
convenient use.
</para>
<para>
By default public access is REVOKED from both of these, just in case there
By default public access is revoked from both of these, just in case there
are security issues lurking.
</para>
<sect2>
<title>Notes</title>
<title>The <structname>pg_buffercache</structname> view</title>
<para>
The definition of the columns exposed in the view is:
The definitions of the columns exposed by the view are:
</para>
<programlisting>
Column | references | Description
----------------+----------------------+------------------------------------
bufferid | | Id, 1..shared_buffers.
relfilenode | pg_class.relfilenode | Refilenode of the relation.
reltablespace | pg_tablespace.oid | Tablespace oid of the relation.
reldatabase | pg_database.oid | Database for the relation.
relblocknumber | | Offset of the page in the relation.
isdirty | | Is the page dirty?
usagecount | | Page LRU count
</programlisting>
<table>
<title><structname>pg_buffercache</> Columns</title>
<tgroup cols="4">
<thead>
<row>
<entry>Name</entry>
<entry>Type</entry>
<entry>References</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><structfield>bufferid</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>ID, in the range 1..<varname>shared_buffers</></entry>
</row>
<row>
<entry><structfield>relfilenode</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_class.relfilenode</literal></entry>
<entry>Relfilenode of the relation</entry>
</row>
<row>
<entry><structfield>reltablespace</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_tablespace.oid</literal></entry>
<entry>Tablespace OID of the relation</entry>
</row>
<row>
<entry><structfield>reldatabase</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_database.oid</literal></entry>
<entry>Database OID of the relation</entry>
</row>
<row>
<entry><structfield>relblocknumber</structfield></entry>
<entry><type>bigint</type></entry>
<entry></entry>
<entry>Page number within the relation</entry>
</row>
<row>
<entry><structfield>isdirty</structfield></entry>
<entry><type>boolean</type></entry>
<entry></entry>
<entry>Is the page dirty?</entry>
</row>
<row>
<entry><structfield>usagecount</structfield></entry>
<entry><type>smallint</type></entry>
<entry></entry>
<entry>Page LRU count</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
There is one row for each buffer in the shared cache. Unused buffers are
shown with all fields null except bufferid.
shown with all fields null except <structfield>bufferid</>. Shared system
catalogs are shown as belonging to database zero.
</para>
<para>
Because the cache is shared by all the databases, there are pages from
relations not belonging to the current database.
Because the cache is shared by all the databases, there will normally be
pages from relations not belonging to the current database. This means
that there may not be matching join rows in <structname>pg_class</> for
some rows, or that there could even be incorrect joins. If you are
trying to join against <structname>pg_class</>, it's a good idea to
restrict the join to rows having <structfield>reldatabase</> equal to
the current database's OID or zero.
</para>
<para>
When the pg_buffercache view is accessed, internal buffer manager locks are
taken, and a copy of the buffer cache data is made for the view to display.
This ensures that the view produces a consistent set of results, while not
blocking normal buffer activity longer than necessary. Nonetheless there
When the <structname>pg_buffercache</> view is accessed, internal buffer
manager locks are taken for long enough to copy all the buffer state
data that the view will display.
This ensures that the view produces a consistent set of results, while not
blocking normal buffer activity longer than necessary. Nonetheless there
could be some impact on database performance if this view is read often.
</para>
</sect2>
<sect2>
<title>Sample output</title>
<programlisting>
regression=# \d pg_buffercache;
View "public.pg_buffercache"
Column | Type | Modifiers
----------------+----------+-----------
bufferid | integer |
relfilenode | oid |
reltablespace | oid |
reldatabase | oid |
relblocknumber | bigint |
isdirty | boolean |
usagecount | smallint |
View definition:
SELECT p.bufferid, p.relfilenode, p.reltablespace, p.reldatabase,
p.relblocknumber, p.isdirty, p.usagecount
FROM pg_buffercache_pages() p(bufferid integer, relfilenode oid,
reltablespace oid, reldatabase oid, relblocknumber bigint,
isdirty boolean, usagecount smallint);
<programlisting>
regression=# SELECT c.relname, count(*) AS buffers
FROM pg_class c INNER JOIN pg_buffercache b
ON b.relfilenode = c.relfilenode INNER JOIN pg_database d
ON (b.reldatabase = d.oid AND d.datname = current_database())
FROM pg_buffercache b INNER JOIN pg_class c
ON b.relfilenode = c.relfilenode AND
b.reldatabase IN (0, (SELECT oid FROM pg_database
WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY 2 DESC LIMIT 10;
relname | buffers
......@@ -95,26 +148,23 @@
pg_depend | 22
pg_depend_reference_index | 20
(10 rows)
regression=#
</programlisting>
</sect2>
<sect2>
<title>Authors</title>
<itemizedlist>
<listitem>
<para>
Mark Kirkwood <email>markir@paradise.net.nz</email>
</para>
</listitem>
<listitem>
<para>Design suggestions: Neil Conway <email>neilc@samurai.com</email></para>
</listitem>
<listitem>
<para>Debugging advice: Tom Lane <email>tgl@sss.pgh.pa.us</email></para>
</listitem>
</itemizedlist>
<para>
Mark Kirkwood <email>markir@paradise.net.nz</email>
</para>
<para>
Design suggestions: Neil Conway <email>neilc@samurai.com</email>
</para>
<para>
Debugging advice: Tom Lane <email>tgl@sss.pgh.pa.us</email>
</para>
</sect2>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgfreespacemap.sgml,v 2.2 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgfreespacemap">
<title>pg_freespacemap</title>
<indexterm zone="pgfreespacemap">
<primary>pg_freespacemap</primary>
</indexterm>
<para>
This module provides a means for examining the free space map (FSM). It
consists of two C functions: <literal>pg_freespacemap_relations()</literal>
and <literal>pg_freespacemap_pages()</literal> that return a set
of records, plus two views <literal>pg_freespacemap_relations</literal> and
<literal>pg_freespacemap_pages</literal> for more user-friendly access to
the functions.
</para>
<para>
The module provides the ability to examine the contents of the free space
map, without having to restart or rebuild the server with additional
debugging code.
The <filename>pg_freespacemap</> module provides a means for examining the
free space map (FSM). It provides two C functions:
<function>pg_freespacemap_relations</function> and
<function>pg_freespacemap_pages</function> that each return a set of
records, plus two views <structname>pg_freespacemap_relations</structname>
and <structname>pg_freespacemap_pages</structname> that wrap the functions
for convenient use.
</para>
<para>
By default public access is REVOKED from the functions and views, just in
case there are security issues present in the code.
By default public access is revoked from the functions and views, just in
case there are security issues lurking.
</para>
<sect2>
<title>Notes</title>
<title>The <filename>pg_freespacemap</> views</title>
<para>
The definitions for the columns exposed in the views are:
The definitions of the columns exposed by the views are:
</para>
<table>
<title>pg_freespacemap_relations</title>
<tgroup cols="3">
<title><structname>pg_freespacemap_relations</> Columns</title>
<tgroup cols="4">
<thead>
<row>
<entry>Column</entry>
<entry>references</entry>
<entry>Name</entry>
<entry>Type</entry>
<entry>References</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>reltablespace</entry>
<entry>pg_tablespace.oid</entry>
<entry>Tablespace oid of the relation.</entry>
<entry><structfield>reltablespace</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_tablespace.oid</literal></entry>
<entry>Tablespace OID of the relation</entry>
</row>
<row>
<entry>reldatabase</entry>
<entry>pg_database.oid</entry>
<entry>Database oid of the relation.</entry>
<entry><structfield>reldatabase</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_database.oid</literal></entry>
<entry>Database OID of the relation</entry>
</row>
<row>
<entry>relfilenode</entry>
<entry>pg_class.relfilenode</entry>
<entry>Relfilenode of the relation.</entry>
<entry><structfield>relfilenode</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_class.relfilenode</literal></entry>
<entry>Relfilenode of the relation</entry>
</row>
<row>
<entry>avgrequest</entry>
<entry><structfield>avgrequest</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>Moving average of free space requests (NULL for indexes)</entry>
</row>
<row>
<entry>interestingpages</entry>
<entry><structfield>interestingpages</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>Count of pages last reported as containing useful free space.</entry>
<entry>Count of pages last reported as containing useful free space</entry>
</row>
<row>
<entry>storedpages</entry>
<entry><structfield>storedpages</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>Count of pages actually stored in free space map.</entry>
<entry>Count of pages actually stored in free space map</entry>
</row>
<row>
<entry>nextpage</entry>
<entry><structfield>nextpage</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>Page index (from 0) to start next search at.</entry>
<entry>Page index (from 0) to start next search at</entry>
</row>
</tbody>
</tgroup>
</table>
<table>
<title>pg_freespacemap_pages</title>
<tgroup cols="3">
<title><structname>pg_freespacemap_pages</> Columns</title>
<tgroup cols="4">
<thead>
<row>
<entry>Column</entry>
<entry> references</entry>
<entry>Name</entry>
<entry>Type</entry>
<entry>References</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>reltablespace</entry>
<entry>pg_tablespace.oid</entry>
<entry>Tablespace oid of the relation.</entry>
<entry><structfield>reltablespace</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_tablespace.oid</literal></entry>
<entry>Tablespace OID of the relation</entry>
</row>
<row>
<entry>reldatabase</entry>
<entry>pg_database.oid</entry>
<entry>Database oid of the relation.</entry>
<entry><structfield>reldatabase</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_database.oid</literal></entry>
<entry>Database OID of the relation</entry>
</row>
<row>
<entry>relfilenode</entry>
<entry>pg_class.relfilenode</entry>
<entry>Relfilenode of the relation.</entry>
<entry><structfield>relfilenode</structfield></entry>
<entry><type>oid</type></entry>
<entry><literal>pg_class.relfilenode</literal></entry>
<entry>Relfilenode of the relation</entry>
</row>
<row>
<entry>relblocknumber</entry>
<entry><structfield>relblocknumber</structfield></entry>
<entry><type>bigint</type></entry>
<entry></entry>
<entry>Page number in the relation.</entry>
<entry>Page number within the relation</entry>
</row>
<row>
<entry>bytes</entry>
<entry><structfield>bytes</structfield></entry>
<entry><type>integer</type></entry>
<entry></entry>
<entry>Free bytes in the page, or NULL for an index page (see below).</entry>
<entry>Free bytes in the page, or NULL for an index page (see below)</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
For <literal>pg_freespacemap_relations</literal>, there is one row for each
relation in the free space map. <literal>storedpages</literal> is the
number of pages actually stored in the map, while
<literal>interestingpages</literal> is the number of pages the last VACUUM
thought had useful amounts of free space.
For <structname>pg_freespacemap_relations</structname>, there is one row
for each relation in the free space map.
<structfield>storedpages</structfield> is the number of pages actually
stored in the map, while <structfield>interestingpages</structfield> is the
number of pages the last <command>VACUUM</> thought had useful amounts of
free space.
</para>
<para>
If <literal>storedpages</literal> is consistently less than interestingpages
then it'd be a good idea to increase <literal>max_fsm_pages</literal>. Also,
if the number of rows in <literal>pg_freespacemap_relations</literal> is
close to <literal>max_fsm_relations</literal>, then you should consider
increasing <literal>max_fsm_relations</literal>.
If <structfield>storedpages</structfield> is consistently less than
<structfield>interestingpages</> then it'd be a good idea to increase
<varname>max_fsm_pages</varname>. Also, if the number of rows in
<structname>pg_freespacemap_relations</structname> is close to
<varname>max_fsm_relations</varname>, then you should consider increasing
<varname>max_fsm_relations</varname>.
</para>
<para>
For <literal>pg_freespacemap_pages</literal>, there is one row for each page
in the free space map. The number of rows for a relation will match the
<literal>storedpages</literal> column in
<literal>pg_freespacemap_relations</literal>.
For <structname>pg_freespacemap_pages</structname>, there is one row for
each page in the free space map. The number of rows for a relation will
match the <structfield>storedpages</structfield> column in
<structname>pg_freespacemap_relations</structname>.
</para>
<para>
For indexes, what is tracked is entirely-unused pages, rather than free
space within pages. Therefore, the average request size and free bytes
within a page are not meaningful, and are shown as NULL.
</para>
<para>
Because the map is shared by all the databases, it will include relations
not belonging to the current database.
Because the map is shared by all the databases, there will normally be
entries for relations not belonging to the current database. This means
that there may not be matching join rows in <structname>pg_class</> for
some rows, or that there could even be incorrect joins. If you are
trying to join against <structname>pg_class</>, it's a good idea to
restrict the join to rows having <structfield>reldatabase</> equal to
the current database's OID or zero.
</para>
<para>
When either of the views are accessed, internal free space map locks are
taken, and a copy of the map data is made for them to display.
This ensures that the views produce a consistent set of results, while not
blocking normal activity longer than necessary. Nonetheless there
When either of the views is accessed, internal free space map locks are
taken for long enough to copy all the state data that the view will display.
This ensures that the views produce a consistent set of results, while not
blocking normal activity longer than necessary. Nonetheless there
could be some impact on database performance if they are read often.
</para>
</sect2>
<sect2>
<title>Sample output - pg_freespacemap_relations</title>
<programlisting>
regression=# \d pg_freespacemap_relations
View "public.pg_freespacemap_relations"
Column | Type | Modifiers
------------------+---------+-----------
reltablespace | oid |
reldatabase | oid |
relfilenode | oid |
avgrequest | integer |
interestingpages | integer |
storedpages | integer |
nextpage | integer |
View definition:
SELECT p.reltablespace, p.reldatabase, p.relfilenode, p.avgrequest, p.interestingpages, p.storedpages, p.nextpage
FROM pg_freespacemap_relations() p(reltablespace oid, reldatabase oid, relfilenode oid, avgrequest integer, interestingpages integer, storedpages integer, nextpage integer);
<title>Sample output</title>
<programlisting>
regression=# SELECT c.relname, r.avgrequest, r.interestingpages, r.storedpages
FROM pg_freespacemap_relations r INNER JOIN pg_class c
ON c.relfilenode = r.relfilenode INNER JOIN pg_database d
ON r.reldatabase = d.oid AND (d.datname = current_database())
ON r.relfilenode = c.relfilenode AND
r.reldatabase IN (0, (SELECT oid FROM pg_database
WHERE datname = current_database()))
ORDER BY r.storedpages DESC LIMIT 10;
relname | avgrequest | interestingpages | storedpages
---------------------------------+------------+------------------+-------------
......@@ -193,31 +212,14 @@ regression=# SELECT c.relname, r.avgrequest, r.interestingpages, r.storedpages
pg_class_relname_nsp_index | | 10 | 10
pg_proc | 302 | 8 | 8
(10 rows)
</programlisting>
</sect2>
<sect2>
<title>Sample output - pg_freespacemap_pages</title>
<programlisting>
regression=# \d pg_freespacemap_pages
View "public.pg_freespacemap_pages"
Column | Type | Modifiers
----------------+---------+-----------
reltablespace | oid |
reldatabase | oid |
relfilenode | oid |
relblocknumber | bigint |
bytes | integer |
View definition:
SELECT p.reltablespace, p.reldatabase, p.relfilenode, p.relblocknumber, p.bytes
FROM pg_freespacemap_pages() p(reltablespace oid, reldatabase oid, relfilenode oid, relblocknumber bigint, bytes integer);
regression=# SELECT c.relname, p.relblocknumber, p.bytes
FROM pg_freespacemap_pages p INNER JOIN pg_class c
ON c.relfilenode = p.relfilenode INNER JOIN pg_database d
ON (p.reldatabase = d.oid AND d.datname = current_database())
ON p.relfilenode = c.relfilenode AND
p.reldatabase IN (0, (SELECT oid FROM pg_database
WHERE datname = current_database()))
ORDER BY c.relname LIMIT 10;
relname | relblocknumber | bytes
relname | relblocknumber | bytes
--------------+----------------+-------
a_star | 0 | 8040
abstime_tbl | 0 | 7908
......@@ -235,8 +237,10 @@ regression=# SELECT c.relname, p.relblocknumber, p.bytes
<sect2>
<title>Author</title>
<para>
Mark Kirkwood <email>markir@paradise.net.nz</email>
</para>
</sect2>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgrowlocks.sgml,v 1.4 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgrowlocks">
<title>pgrowlocks</title>
<indexterm zone="pgrowlocks">
<primary>pgrowlocks</primary>
</indexterm>
<para>
The <literal>pgrowlocks</literal> module provides a function to show row
The <filename>pgrowlocks</filename> module provides a function to show row
locking information for a specified table.
</para>
<sect2>
<title>Overview</title>
<programlisting>
pgrowlocks(text) RETURNS pgrowlocks_type
</programlisting>
<synopsis>
pgrowlocks(text) returns setof record
</synopsis>
<para>
The parameter is a name of table. And <literal>pgrowlocks_type</literal> is
defined as:
The parameter is the name of a table. The result is a set of records,
with one row for each locked row within the table. The output columns
are:
</para>
<programlisting>
CREATE TYPE pgrowlocks_type AS (
locked_row TID, -- row TID
lock_type TEXT, -- lock type
locker XID, -- locking XID
multi bool, -- multi XID?
xids xid[], -- multi XIDs
pids INTEGER[] -- locker's process id
);
</programlisting>
<table>
<title>pgrowlocks_type</title>
<tgroup cols="2">
<title><function>pgrowlocks</> output columns</title>
<tgroup cols="3">
<thead>
<row>
<entry>Name</entry>
<entry>Type</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>locked_row</entry>
<entry>tuple ID(TID) of each locked rows</entry>
<entry><structfield>locked_row</structfield></entry>
<entry><type>tid</type></entry>
<entry>Tuple ID (TID) of locked row</entry>
</row>
<row>
<entry>lock_type</entry>
<entry>"Shared" for shared lock, "Exclusive" for exclusive lock</entry>
<entry><structfield>lock_type</structfield></entry>
<entry><type>text</type></entry>
<entry><literal>Shared</> for shared lock, or
<literal>Exclusive</> for exclusive lock</entry>
</row>
<row>
<entry>locker</entry>
<entry>transaction ID of locker (Note 1)</entry>
<entry><structfield>locker</structfield></entry>
<entry><type>xid</type></entry>
<entry>Transaction ID of locker, or multixact ID if multi-transaction</entry>
</row>
<row>
<entry>multi</entry>
<entry>"t" if locker is a multi transaction, otherwise "f"</entry>
<entry><structfield>multi</structfield></entry>
<entry><type>boolean</type></entry>
<entry>True if locker is a multi-transaction</entry>
</row>
<row>
<entry>xids</entry>
<entry>XIDs of lockers (Note 2)</entry>
<entry><structfield>xids</structfield></entry>
<entry><type>xid[]</type></entry>
<entry>Transaction IDs of lockers (more than one if multi-transaction)</entry>
</row>
<row>
<entry>pids</entry>
<entry>process ids of locking backends</entry>
<entry><structfield>pids</structfield></entry>
<entry><type>integer[]</type></entry>
<entry>Process IDs of locking backends (more than one if multi-transaction)</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
Note1: If the locker is multi transaction, it represents the multi ID.
</para>
<para>
Note2: If the locker is multi, multiple data are shown.
</para>
<para>
The calling sequence for <literal>pgrowlocks</literal> is as follows:
<literal>pgrowlocks</literal> grabs AccessShareLock for the target table and
reads each row one by one to get the row locking information. You should
notice that:
<function>pgrowlocks</function> takes <literal>AccessShareLock</> for the
target table and reads each row one by one to collect the row locking
information. This is not very speedy for a large table. Note that:
</para>
<orderedlist>
<listitem>
<para>
if the table is exclusive locked by someone else,
<literal>pgrowlocks</literal> will be blocked.
If the table as a whole is exclusive-locked by someone else,
<function>pgrowlocks</function> will be blocked.
</para>
</listitem>
<listitem>
<para>
<literal>pgrowlocks</literal> may show incorrect information if there's a
new lock or a lock is freeed while its execution.
<function>pgrowlocks</function> is not guaranteed to produce a
self-consistent snapshot. It is possible that a new row lock is taken,
or an old lock is freed, during its execution.
</para>
</listitem>
</orderedlist>
<para>
<literal>pgrowlocks</literal> does not show the contents of locked rows. If
you want to take a look at the row contents at the same time, you could do
something like this:
</para>
<function>pgrowlocks</function> does not show the contents of locked
rows. If you want to take a look at the row contents at the same time, you
could do something like this:
<programlisting>
SELECT * FROM accounts AS a, pgrowlocks('accounts') AS p WHERE p.locked_ row = a.ctid;
SELECT * FROM accounts AS a, pgrowlocks('accounts') AS p
WHERE p.locked_row = a.ctid;
</programlisting>
Be aware however that (as of <productname>PostgreSQL</> 8.3) such a
query will be very inefficient.
</para>
</sect2>
<sect2>
<title>Example</title>
<para>
<literal>pgrowlocks</literal> returns the following columns:
</para>
<para>
Here is a sample execution of pgrowlocks:
</para>
<title>Sample output</title>
<programlisting>
test=# SELECT * FROM pgrowlocks('t1');
locked_row | lock_type | locker | multi | xids | pids
locked_row | lock_type | locker | multi | xids | pids
------------+-----------+--------+-------+-----------+---------------
(0,1) | Shared | 19 | t | {804,805} | {29066,29068}
(0,2) | Shared | 19 | t | {804,805} | {29066,29068}
......@@ -117,7 +124,14 @@ test=# SELECT * FROM pgrowlocks('t1');
(0,4) | Exclusive | 804 | f | {804} | {29066}
(4 rows)
</programlisting>
</sect2>
<sect2>
<title>Author</title>
<para>
Tatsuo Ishii
</para>
</sect2>
</sect1>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgstattuple.sgml,v 1.3 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgstattuple">
<title>pgstattuple</title>
<indexterm zone="pgstattuple">
<primary>pgstattuple</primary>
</indexterm>
<para>
<literal>pgstattuple</literal> modules provides various functions to obtain
tuple statistics.
The <filename>pgstattuple</filename> module provides various functions to
obtain tuple-level statistics.
</para>
<sect2>
<title>Functions</title>
<itemizedlist>
<listitem>
<para>
<literal>pgstattuple()</literal> returns the relation length, percentage
of the "dead" tuples of a relation and other info. This may help users to
determine whether vacuum is necessary or not. Here is an example session:
</para>
<programlisting>
test=> \x
Expanded display is on.
<variablelist>
<varlistentry>
<term>
<function>pgstattuple(text) returns record</>
</term>
<listitem>
<para>
<function>pgstattuple</function> returns a relation's physical length,
percentage of <quote>dead</> tuples, and other info. This may help users
to determine whether vacuum is necessary or not. The argument is the
target relation's name (optionally schema-qualified).
For example:
</para>
<programlisting>
test=> SELECT * FROM pgstattuple('pg_catalog.pg_proc');
-[ RECORD 1 ]------+-------
table_len | 458752
......@@ -35,86 +41,111 @@ dead_tuple_len | 3157
dead_tuple_percent | 0.69
free_space | 8932
free_percent | 1.95
</programlisting>
</programlisting>
<para>
Here are explanations for each column:
The output columns are:
</para>
<table>
<title><literal>pgstattuple()</literal> column descriptions</title>
<tgroup cols="2">
<title><function>pgstattuple</function> output columns</title>
<tgroup cols="3">
<thead>
<row>
<entry>Column</entry>
<entry>Type</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry>table_len</entry>
<entry>physical relation length in bytes</entry>
<entry><structfield>table_len</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Physical relation length in bytes</entry>
</row>
<row>
<entry><structfield>tuple_count</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Number of live tuples</entry>
</row>
<row>
<entry>tuple_count</entry>
<entry>number of live tuples</entry>
<entry><structfield>tuple_len</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Total length of live tuples in bytes</entry>
</row>
<row>
<entry>tuple_len</entry>
<entry>total tuples length in bytes</entry>
<entry><structfield>tuple_percent</structfield></entry>
<entry><type>float8</type></entry>
<entry>Percentage of live tuples</entry>
</row>
<row>
<entry>tuple_percent</entry>
<entry>live tuples in %</entry>
<entry><structfield>dead_tuple_count</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Number of dead tuples</entry>
</row>
<row>
<entry>dead_tuple_len</entry>
<entry>total dead tuples length in bytes</entry>
<entry><structfield>dead_tuple_len</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Total length of dead tuples in bytes</entry>
</row>
<row>
<entry>dead_tuple_percent</entry>
<entry>dead tuples in %</entry>
<entry><structfield>dead_tuple_percent</structfield></entry>
<entry><type>float8</type></entry>
<entry>Percentage of dead tuples</entry>
</row>
<row>
<entry>free_space</entry>
<entry>free space in bytes</entry>
<entry><structfield>free_space</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Total free space in bytes</entry>
</row>
<row>
<entry>free_percent</entry>
<entry>free space in %</entry>
<entry><structfield>free_percent</structfield></entry>
<entry><type>float8</type></entry>
<entry>Percentage of free space</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
<note>
<para>
<literal>pgstattuple</literal> acquires only a read lock on the relation. So
concurrent update may affect the result.
</para>
</note>
<note>
<para>
<literal>pgstattuple</literal> judges a tuple is "dead" if HeapTupleSatisfiesNow()
returns false.
</para>
</note>
</para>
</listitem>
<listitem>
<para>
<literal>pg_relpages()</literal> returns the number of pages in the relation.
<function>pgstattuple</function> acquires only a read lock on the
relation. So the results do not reflect an instantaneous snapshot;
concurrent updates will affect them.
</para>
</listitem>
<listitem>
<para>
<literal>pgstatindex()</literal> returns an array showing the information about an index:
<function>pgstattuple</function> judges a tuple is <quote>dead</> if
<function>HeapTupleSatisfiesNow</> returns false.
</para>
<programlisting>
test=> \x
Expanded display is on.
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>pgstattuple(oid) returns record</>
</term>
<listitem>
<para>
This is the same as <function>pgstattuple(text)</function>, except
that the target relation is specified by OID.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>pgstatindex(text) returns record</>
</term>
<listitem>
<para>
<function>pgstatindex</function> returns a record showing information
about a btree index. For example:
</para>
<programlisting>
test=> SELECT * FROM pgstatindex('pg_cast_oid_index');
-[ RECORD 1 ]------+------
version | 2
......@@ -128,31 +159,116 @@ deleted_pages | 0
avg_leaf_density | 50.27
leaf_fragmentation | 0
</programlisting>
</listitem>
</itemizedlist>
<para>
The output columns are:
</para>
<table>
<title><function>pgstatindex</function> output columns</title>
<tgroup cols="3">
<thead>
<row>
<entry>Column</entry>
<entry>Type</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><structfield>version</structfield></entry>
<entry><type>integer</type></entry>
<entry>Btree version number</entry>
</row>
<row>
<entry><structfield>tree_level</structfield></entry>
<entry><type>integer</type></entry>
<entry>Tree level of the root page</entry>
</row>
<row>
<entry><structfield>index_size</structfield></entry>
<entry><type>integer</type></entry>
<entry>Total number of pages in index</entry>
</row>
<row>
<entry><structfield>root_block_no</structfield></entry>
<entry><type>integer</type></entry>
<entry>Location of root block</entry>
</row>
<row>
<entry><structfield>internal_pages</structfield></entry>
<entry><type>integer</type></entry>
<entry>Number of <quote>internal</> (upper-level) pages</entry>
</row>
<row>
<entry><structfield>leaf_pages</structfield></entry>
<entry><type>integer</type></entry>
<entry>Number of leaf pages</entry>
</row>
<row>
<entry><structfield>empty_pages</structfield></entry>
<entry><type>integer</type></entry>
<entry>Number of empty pages</entry>
</row>
<row>
<entry><structfield>deleted_pages</structfield></entry>
<entry><type>integer</type></entry>
<entry>Number of deleted pages</entry>
</row>
<row>
<entry><structfield>avg_leaf_density</structfield></entry>
<entry><type>float8</type></entry>
<entry>Average density of leaf pages</entry>
</row>
<row>
<entry><structfield>leaf_fragmentation</structfield></entry>
<entry><type>float8</type></entry>
<entry>Leaf page fragmentation</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
As with <function>pgstattuple</>, the results are accumulated
page-by-page, and should not be expected to represent an
instantaneous snapshot of the whole index.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>pg_relpages(text) returns integer</>
</term>
<listitem>
<para>
<function>pg_relpages</function> returns the number of pages in the
relation.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2>
<title>Usage</title>
<para>
<literal>pgstattuple</literal> may be called as a relation function and is
defined as follows:
</para>
<programlisting>
CREATE OR REPLACE FUNCTION pgstattuple(text) RETURNS pgstattuple_type
AS 'MODULE_PATHNAME', 'pgstattuple'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION pgstattuple(oid) RETURNS pgstattuple_type
AS 'MODULE_PATHNAME', 'pgstattuplebyid'
LANGUAGE C STRICT;
</programlisting>
<title>Author</title>
<para>
The argument is the relation name (optionally it may be qualified)
or the OID of the relation. Note that pgstattuple only returns
one row.
Tatsuo Ishii
</para>
</sect2>
</sect1>
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgtrgm.sgml,v 2.2 2007/12/10 05:32:51 tgl Exp $ -->
<sect1 id="pgtrgm">
<title>pg_trgm</title>
<indexterm zone="pgtrgm">
<primary>pg_trgm</primary>
</indexterm>
<para>
The <literal>pg_trgm</literal> module provides functions and index classes
for determining the similarity of text based on trigram matching.
The <filename>pg_trgm</filename> module provides functions and operators
for determining the similarity of text based on trigram matching, as
well as index operator classes that support fast searching for similar
strings.
</para>
<sect2>
<title>Trigram (or Trigraph)</title>
<para>
A trigram is a set of three consecutive characters taken
from a string. A string is considered to have two spaces
prefixed and one space suffixed when determining the set
of trigrams that comprise the string.
</para>
<title>Trigram (or Trigraph) Concepts</title>
<para>
eg. The set of trigrams in the word "cat" is " c", " ca",
"at " and "cat".
A trigram is a group of three consecutive characters taken
from a string. We can measure the similarity of two strings by
counting the number of trigrams they share. This simple idea
turns out to be very effective for measuring the similarity of
words in many natural languages.
</para>
<note>
<para>
A string is considered to have two spaces
prefixed and one space suffixed when determining the set
of trigrams contained in the string.
For example, the set of trigrams in the string
<quote><literal>cat</literal></quote> is
<quote><literal> c</literal></quote>,
<quote><literal> ca</literal></quote>,
<quote><literal>cat</literal></quote>, and
<quote><literal>at </literal></quote>.
</para>
</note>
</sect2>
<sect2>
<title>Public Functions</title>
<table>
<title><literal>pg_trgm</literal> functions</title>
<tgroup cols="2">
<title>Functions and Operators</title>
<table id="pgtrgm-func-table">
<title><filename>pg_trgm</filename> functions</title>
<tgroup cols="3">
<thead>
<row>
<entry>Function</entry>
<entry>Returns</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>real similarity(text, text)</literal></entry>
<entry><function>similarity(text, text)</function></entry>
<entry><type>real</type></entry>
<entry>
<para>
Returns a number that indicates how closely matches the two
arguments are. A zero result indicates that the two words
are completely dissimilar, and a result of one indicates that
the two words are identical.
</para>
Returns a number that indicates how similar the two arguments are.
The range of the result is zero (indicating that the two strings are
completely dissimilar) to one (indicating that the two strings are
identical).
</entry>
</row>
<row>
<entry><literal>real show_limit()</literal></entry>
<entry><function>show_trgm(text)</function></entry>
<entry><type>text[]</type></entry>
<entry>
<para>
Returns the current similarity threshold used by the '%'
operator. This in effect sets the minimum similarity between
two words in order that they be considered similar enough to
be misspellings of each other, for example.
</para>
Returns an array of all the trigrams in the given string.
(In practice this is seldom useful except for debugging.)
</entry>
</row>
<row>
<entry><literal>real set_limit(real)</literal></entry>
<entry><function>show_limit()</function></entry>
<entry><type>real</type></entry>
<entry>
<para>
Sets the current similarity threshold that is used by the '%'
operator, and is returned by the show_limit() function.
</para>
Returns the current similarity threshold used by the <literal>%</>
operator. This sets the minimum similarity between
two words for them to be considered similar enough to
be misspellings of each other, for example.
</entry>
</row>
<row>
<entry><literal>text[] show_trgm(text)</literal></entry>
<entry><function>set_limit(real)</function></entry>
<entry><type>real</type></entry>
<entry>
<para>
Returns an array of all the trigrams of the supplied text
parameter.
</para>
Sets the current similarity threshold that is used by the <literal>%</>
operator. The threshold must be between 0 and 1 (default is 0.3).
Returns the same value passed in.
</entry>
</row>
</tbody>
</tgroup>
</table>
<table id="pgtrgm-op-table">
<title><filename>pg_trgm</filename> operators</title>
<tgroup cols="3">
<thead>
<row>
<entry>Operator: <literal>text % text (returns boolean)</literal></entry>
<entry>Operator</entry>
<entry>Returns</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>text</> <literal>%</literal> <type>text</></entry>
<entry><type>boolean</type></entry>
<entry>
<para>
The '%' operator returns TRUE if its two arguments have a similarity
that is greater than the similarity threshold set by set_limit(). It
will return FALSE if the similarity is less than the current
threshold.
</para>
Returns <literal>true</> if its arguments have a similarity that is
greater than the current similarity threshold set by
<function>set_limit</>.
</entry>
</row>
</tbody>
......@@ -93,97 +123,111 @@
</sect2>
<sect2>
<title>Public Index Operator Class</title>
<title>Index Support</title>
<para>
The <literal>pg_trgm</literal> module comes with the
<literal>gist_trgm_ops</literal> index operator class that allows a
developer to create an index over a text column for the purpose
of very fast similarity searches.
The <filename>pg_trgm</filename> module provides GiST and GIN index
operator classes that allow you to create an index over a text column for
the purpose of very fast similarity searches. These index types support
the <literal>%</> similarity operator (and no other operators, so you may
want a regular btree index too).
</para>
<para>
To use this index, the '%' operator must be used and an appropriate
similarity threshold for the application must be set. Example:
</para>
Example:
<programlisting>
CREATE TABLE test_trgm (t text);
CREATE INDEX trgm_idx ON test_trgm USING gist (t gist_trgm_ops);
</programlisting>
or
<programlisting>
CREATE INDEX trgm_idx ON test_trgm USING gin (t gin_trgm_ops);
</programlisting>
</para>
<para>
At this point, you will have an index on the t text column that you
can use for similarity searching. Example:
At this point, you will have an index on the <structfield>t</> column that
you can use for similarity searching. A typical query is
</para>
<programlisting>
SELECT
t,
similarity(t, 'word') AS sml
FROM
test_trgm
WHERE
t % 'word'
ORDER BY
sml DESC, t;
SELECT t, similarity(t, '<replaceable>word</>') AS sml
FROM test_trgm
WHERE t % '<replaceable>word</>'
ORDER BY sml DESC, t;
</programlisting>
<para>
This will return all values in the text column that are sufficiently
similar to 'word', sorted from best match to worst. The index will
be used to make this a fast operation over very large data sets.
similar to <replaceable>word</>, sorted from best match to worst. The
index will be used to make this a fast operation even over very large data
sets.
</para>
<para>
The choice between GiST and GIN indexing depends on the relative
performance characteristics of GiST and GIN, which are discussed elsewhere.
As a rule of thumb, a GIN index is faster to search than a GiST index, but
slower to build or update; so GIN is better suited for static data and GiST
for often-updated data.
</para>
</sect2>
<sect2>
<title>Text Search Integration</title>
<para>
Trigram matching is a very useful tool when used in conjunction
with a full text index.
with a full text index. In particular it can help to recognize
misspelled input words that will not be matched directly by the
full text search mechanism.
</para>
<para>
The first step is to generate an auxiliary table containing all
the unique words in the documents:
</para>
<programlisting>
CREATE TABLE words AS SELECT word FROM
stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
CREATE TABLE words AS SELECT word FROM
ts_stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
</programlisting>
<para>
where <structname>documents</> is a table that has a text field
<structfield>bodytext</> that we wish to search. The use of the
<literal>simple</> configuration with the <function>to_tsvector</>
function, instead of just using the already
existing vector is to avoid creating a list of already stemmed
words. This way, only the original, unstemmed words are added
to the word list.
<structfield>bodytext</> that we wish to search. The reason for using
the <literal>simple</> configuration with the <function>to_tsvector</>
function, instead of using a language-specific configuration,
is that we want a list of the original (unstemmed) words.
</para>
<para>
Next, create a trigram index on the word column:
</para>
<programlisting>
CREATE INDEX words_idx ON words USING gist(word gist_trgm_ops);
</programlisting>
<para>
or
</para>
<programlisting>
CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);
CREATE INDEX words_idx ON words USING gin(word gin_trgm_ops);
</programlisting>
<para>
Now, a <literal>SELECT</literal> query similar to the example above can be
used to suggest spellings for misspelled words in user search terms. A
useful extra clause is to ensure that the similar words are also
of similar length to the misspelled word.
</para>
<para>
<note>
<para>
Since the <structname>words</> table has been generated as a separate,
static table, it will need to be periodically regenerated so that
it remains up to date with the document collection.
</para>
</note>
Now, a <command>SELECT</command> query similar to the previous example can
be used to suggest spellings for misspelled words in user search terms.
A useful extra test is to require that the selected words are also of
similar length to the misspelled word.
</para>
<note>
<para>
Since the <structname>words</> table has been generated as a separate,
static table, it will need to be periodically regenerated so that
it remains reasonably up-to-date with the document collection.
Keeping it exactly current is usually unnecessary.
</para>
</note>
</sect2>
<sect2>
<title>References</title>
<para>
GiST Development Site
<ulink url="http://www.sai.msu.su/~megera/postgres/gist/"></ulink>
......@@ -196,6 +240,7 @@ CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);
<sect2>
<title>Authors</title>
<para>
Oleg Bartunov <email>oleg@sai.msu.su</email>, Moscow, Moscow University, Russia
</para>
......@@ -203,7 +248,7 @@ CREATE INDEX words_idx ON words USING gin(word gist_trgm_ops);
Teodor Sigaev <email>teodor@sigaev.ru</email>, Moscow, Delta-Soft Ltd.,Russia
</para>
<para>
Documentation: Christopher Kings-Lynne
Documentation: Christopher Kings-Lynne
</para>
<para>
This module is sponsored by Delta-Soft Ltd., Moscow, Russia.
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment