Commit c3c69ab4 authored by Bruce Momjian

Move most /contrib README files into SGML. Some still need conversion
or will never be converted.

parent 6e414a17
PostgreSQL Administration Functions
===================================
This directory is a PostgreSQL 'contrib' module which implements a number of
support functions which pgAdmin and other administration and management tools
can use to provide additional functionality if installed on a server.
Installation
============
This module is normally distributed as a PostgreSQL 'contrib' module. To
install it from a pre-configured source tree run the following commands
as a user with appropriate privileges from the adminpack source directory:
make
make install
Alternatively, if you have a PostgreSQL 8.2 or higher installation but no
source tree, you can install using PGXS. Simply run the following commands from
the adminpack source directory:
make USE_PGXS=1
make USE_PGXS=1 install
pgAdmin will look for the functions in the Maintenance Database (usually
"postgres" for 8.2 servers) specified in the connection dialogue for the server.
To install the functions in the database, either run the adminpack.sql script
using the pgAdmin SQL tool (and then close and reopen the connection to the
freshly instrumented server), or run the script using psql, eg:
psql -U postgres postgres < adminpack.sql
Other administration tools that use this module may have different requirements;
please consult the tool's documentation for further details.
Objects implemented (superuser only)
====================================
int8 pg_catalog.pg_file_write(fname text, data text, append bool)
bool pg_catalog.pg_file_rename(oldname text, newname text, archivename text)
bool pg_catalog.pg_file_rename(oldname text, newname text)
bool pg_catalog.pg_file_unlink(fname text)
setof record pg_catalog.pg_logdir_ls()
/* Renaming of existing backend functions for pgAdmin compatibility */
text pg_catalog.pg_file_read(fname text, offset bigint, len bigint)
bigint pg_catalog.pg_file_length(text)
int4 pg_catalog.pg_logfile_rotate()
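As a quick sanity check (a sketch only; the file name here is arbitrary and
paths are relative to the server's data directory), a superuser could exercise
the file functions from psql:
-- write, then remove, a scratch file in the data directory
SELECT pg_catalog.pg_file_write('adminpack_test.txt', 'hello', true);
SELECT pg_catalog.pg_file_unlink('adminpack_test.txt');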
This is a B-Tree implementation using GiST that supports the int2, int4,
int8, float4, float8, timestamp with/without time zone, time
with/without time zone, date, interval, oid, money, macaddr, char,
varchar/text, bytea, numeric, bit, varbit and inet/cidr types.
All work was done by Teodor Sigaev (teodor@stack.net), Oleg Bartunov
(oleg@sai.msu.su) and Janko Richter (jankorichter@yahoo.de).
See http://www.sai.msu.su/~megera/postgres/gist for additional
information.
NEWS:
Apr 17, 2004 - Performance optimizations
Jan 21, 2004 - Added support for bytea, numeric, bit, varbit, inet/cidr
Jan 17, 2004 - Reorganized code and added support for char, varchar/text
Jan 10, 2004 - btree_gist now supports oid, timestamp with time zone,
time with and without time zone, date, interval,
money and macaddr
Feb 5, 2003 - btree_gist now supports int2, int8, float4, float8
NOTICE:
This version will only work with PostgreSQL version 7.4 and above
because of changes in the system catalogs and the function call
interface.
If you want to index varchar attributes, you have to index using
the function text(<varchar>):
Example:
CREATE TABLE test ( a varchar(23) );
CREATE INDEX testidx ON test USING GIST ( text(a) );
INSTALLATION:
gmake
gmake install
-- load functions
psql <database> < btree_gist.sql
REGRESSION TEST:
gmake installcheck
EXAMPLE USAGE:
create table test (a int4);
-- create index
create index testidx on test using gist (a);
-- query
select * from test where a < 10;
$PostgreSQL: pgsql/contrib/chkpass/README.chkpass,v 1.5 2007/10/01 19:06:48 darcy Exp $
Chkpass is a password type that is automatically checked and converted upon
entry. It is stored encrypted. To compare, simply compare against a clear
text password and the comparison function will encrypt it before comparing.
It also returns an error if the code determines that the password is easily
crackable. This is currently a stub that does nothing.
I haven't worried about making this type indexable. I doubt that anyone
would ever need to sort a file in order of encrypted password.
If you precede the string with a colon, the encryption and checking are
skipped so that you can enter existing passwords into the field.
On output, a colon is prepended. This makes it possible to dump and reload
passwords without re-encrypting them. If you want the password (encrypted)
without the colon then use the raw() function. This allows you to use the
type with things like Apache's Auth_PostgreSQL module.
The encryption uses the standard Unix function crypt(), and so it suffers
from all the usual limitations of that function; notably that only the
first eight characters of a password are considered.
Here is some sample usage:
test=# create table test (p chkpass);
CREATE TABLE
test=# insert into test values ('hello');
INSERT 0 1
test=# select * from test;
p
----------------
:dVGkpXdOrE3ko
(1 row)
test=# select raw(p) from test;
raw
---------------
dVGkpXdOrE3ko
(1 row)
test=# select p = 'hello' from test;
?column?
----------
t
(1 row)
test=# select p = 'goodbye' from test;
?column?
----------
f
(1 row)
D'Arcy J.M. Cain
darcy@druid.net
/*
* dblink
*
* Functions returning results from a remote database
*
* Joe Conway <mail@joeconway.com>
* And contributors:
* Darko Prenosil <Darko.Prenosil@finteh.hr>
* Shridhar Daithankar <shridhar_daithankar@persistent.co.in>
* Kai Londenberg (K.Londenberg@librics.de)
*
* Copyright (c) 2001-2007, PostgreSQL Global Development Group
* ALL RIGHTS RESERVED;
*
* Permission to use, copy, modify, and distribute this software and its
* documentation for any purpose, without fee, and without a written agreement
* is hereby granted, provided that the above copyright notice and this
* paragraph and the following two paragraphs appear in all copies.
*
* IN NO EVENT SHALL THE AUTHOR OR DISTRIBUTORS BE LIABLE TO ANY PARTY FOR
* DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING
* LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS
* DOCUMENTATION, EVEN IF THE AUTHOR OR DISTRIBUTORS HAVE BEEN ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
* THE AUTHOR AND DISTRIBUTORS SPECIFICALLY DISCLAIMS ANY WARRANTIES,
* INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
* AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS
* ON AN "AS IS" BASIS, AND THE AUTHOR AND DISTRIBUTORS HAS NO OBLIGATIONS TO
* PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
*
*/
Release Notes:
27 August 2006
- Added async query capability. Original patch by
Kai Londenberg (K.Londenberg@librics.de), modified by Joe Conway
Version 0.7 (as of 25 Feb, 2004)
- Added new version of dblink, dblink_exec, dblink_open, dblink_close,
and, dblink_fetch -- allows ERROR on remote side of connection to
throw NOTICE locally instead of ERROR
Version 0.6
- functions deprecated in 0.5 have been removed
- added ability to create "named" persistent connections
Version 0.5
- dblink now supports use directly as a table function; this is the new
preferred usage going forward
- Use of dblink_tok is now deprecated; original form of dblink is also
deprecated. They _will_ be removed in the next version.
- dblink_last_oid is also deprecated; use dblink_exec() which returns
the command status as a single row, single column result.
- Original dblink, dblink_tok, and dblink_last_oid are commented out in
dblink.sql; remove the comments to use the deprecated functions.
- dblink_strtok() and dblink_replace() functions were removed. Use
split() and replace() respectively (new backend functions in
PostgreSQL 7.3) instead.
- New functions: dblink_exec() for non-SELECT queries; dblink_connect()
opens connection that persists for duration of a backend;
dblink_disconnect() closes a persistent connection; dblink_open()
opens a cursor; dblink_fetch() fetches results from an open cursor.
dblink_close() closes a cursor.
- New test suite: dblink_check.sh, dblink.test.sql,
dblink.test.expected.out. Execute dblink_check.sh from the same
directory as the other two files. Output is dblink.test.out and
dblink.test.diff. Note that dblink.test.sql is a good source
of example usage.
Version 0.4
- removed cursor wrap around input sql to allow for remote
execution of INSERT/UPDATE/DELETE
- dblink now returns a resource id instead of a real pointer
- added several utility functions -- see below
Version 0.3
- fixed dblink invalid pointer causing corrupt elog message
- fixed dblink_tok improper handling of null results
- fixed examples in README.dblink
Version 0.2
- initial release
Installation:
Place these files in a directory called 'dblink' under 'contrib' in the PostgreSQL source tree. Then run:
make
make install
You can use dblink.sql to create the functions in your database of choice, e.g.
psql template1 < dblink.sql
installs dblink functions into database template1
Documentation:
Note: Parameters representing relation names must include double
quotes if the names are mixed-case or contain special characters. They
must also be appropriately qualified with schema name if applicable.
See the following files:
doc/connection
doc/cursor
doc/query
doc/execute
doc/misc
==================================================================
-- Joe Conway
This contrib package contains two different approaches to calculating
great circle distances on the surface of the Earth. The one described
first depends on the contrib/cube package (which MUST be installed before
earthdistance is installed). The second one is based on the point
datatype using latitude and longitude for the coordinates. The install
script makes the defined functions executable by anyone.
Make sure contrib/cube has been installed.
make
make install
make installcheck
To use these functions in a particular database as a postgres superuser do:
psql databasename < earthdistance.sql
-------------------------------------------
contrib/cube based Earth distance functions
Bruno Wolff III
September 2002
A spherical model of the Earth is used.
Data is stored in cubes that are points (both corners are the same) using 3
coordinates representing the distance from the center of the Earth.
The radius of the Earth is obtained from the earth() function. It is
given in meters. But by changing this one function you can change it
to use some other units or to use a different value of the radius
that you feel is more appropriate.
This package also has applications to astronomical databases.
Astronomers will probably want to change earth() to return a radius of
180/pi() so that distances are in degrees.
Functions are provided to allow for input in latitude and longitude (in
degrees), to allow for output of latitude and longitude, to calculate
the great circle distance between two points and to easily specify a
bounding box usable for index searches.
The functions are all 'sql' functions. If you want to make these functions
executable by other people you will also have to make the referenced
cube functions executable. cube(text), cube(float8), cube(cube,float8),
cube_distance(cube,cube), cube_ll_coord(cube,int) and
cube_enlarge(cube,float8,int) are used indirectly by the earth distance
functions. is_point(cube) and cube_dim(cube) are used in constraints for data
in domain earth. cube_ur_coord(cube,int) is used in the regression tests and
might be useful for looking at bounding box coordinates in user applications.
A domain of type cube named earth is defined.
There are constraints on it defined to make sure the cube is a point,
that it does not have more than 3 dimensions and that it is very near
the surface of a sphere centered about the origin with the radius of
the Earth.
The following functions are provided:
earth() - Returns the radius of the Earth in meters.
sec_to_gc(float8) - Converts the normal straight line (secant) distance
between two points on the surface of the Earth to the great circle distance
between them.
gc_to_sec(float8) - Converts the great circle distance between two points
on the surface of the Earth to the normal straight line (secant) distance
between them.
ll_to_earth(float8, float8) - Returns the location of a point on the surface
of the Earth given its latitude (argument 1) and longitude (argument 2) in
degrees.
latitude(earth) - Returns the latitude in degrees of a point on the surface
of the Earth.
longitude(earth) - Returns the longitude in degrees of a point on the surface
of the Earth.
earth_distance(earth, earth) - Returns the great circle distance between
two points on the surface of the Earth.
earth_box(earth, float8) - Returns a box suitable for an indexed search using
the cube @> operator for points within a given great circle distance of a
location. Some points in this box are further than the specified great circle
distance from the location so a second check using earth_distance should be
made at the same time.
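A short sketch tying these functions together (coordinates are arbitrary
examples; distances are in meters with the default earth(), and the table
places(lat float8, lon float8) is hypothetical):
-- great circle distance between two latitude/longitude points
SELECT earth_distance(ll_to_earth(55.75, 37.62), ll_to_earth(59.95, 30.32));
-- index-friendly radius search; earth_box is lossy, so recheck with
-- earth_distance as advised above
SELECT * FROM places
WHERE earth_box(ll_to_earth(55.75, 37.62), 100000) @> ll_to_earth(lat, lon)
AND earth_distance(ll_to_earth(55.75, 37.62), ll_to_earth(lat, lon)) < 100000;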
One advantage of using the cube representation over a point using latitude and
longitude for coordinates is that you don't have to worry about special
conditions at +/- 180 degrees of longitude or near the poles.
Below is the documentation for the Earth distance operator that works
with the point data type.
---------------------------------------------------------------------
I corrected a bug in the geo_distance code where two double constants
were declared as int. I also changed the distance function to use
the haversine formula, which is more accurate for small distances.
Bruno Wolff
September 2002
---------------------------------------------------------------------
Date: Wed, 1 Apr 1998 15:19:32 -0600 (CST)
From: Hal Snyder <hal@vailsys.com>
To: vmehr@ctp.com
Subject: [QUESTIONS] Re: Spatial data, R-Trees
> From: Vivek Mehra <vmehr@ctp.com>
> Date: Wed, 1 Apr 1998 10:06:50 -0500
> Am just starting out with PostgreSQL and would like to learn more about
> the spatial data handling abilities of postgreSQL - in terms of using
> R-tree indexes, user defined types, operators and functions.
>
> Would you be able to suggest where I could find some code and SQL to
> look at to create these?
Here's the setup for adding an operator '<@>' to give distance in
statute miles between two points on the Earth's surface. Coordinates
are in degrees. Points are taken as (longitude, latitude) and not vice
versa as longitude is closer to the intuitive idea of x-axis and
latitude to y-axis.
There's C source, Makefile for FreeBSD, and SQL for installing and
testing the function.
Let me know if anything looks fishy!
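As a concrete sketch of the operator described in that message (points are
taken as (longitude, latitude), and the coordinates are arbitrary examples):
-- approximate distance in statute miles between two cities
SELECT '(-87.6,41.8)'::point <@> '(-74.0,40.7)'::point;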
/*
* fuzzystrmatch.c
*
* Functions for "fuzzy" comparison of strings
*
* Joe Conway <mail@joeconway.com>
*
* Copyright (c) 2001-2007, PostgreSQL Global Development Group
* ALL RIGHTS RESERVED;
*
* levenshtein()
* -------------
* Written based on a description of the algorithm by Michael Gilleland
* found at http://www.merriampark.com/ld.htm
* Also looked at levenshtein.c in the PHP 4.0.6 distribution for
* inspiration.
*
* metaphone()
* -----------
* Modified for PostgreSQL by Joe Conway.
* Based on CPAN's "Text-Metaphone-1.96" by Michael G Schwern <schwern@pobox.com>
* Code slightly modified for use as PostgreSQL function (palloc, elog, etc).
* Metaphone was originally created by Lawrence Philips and presented in article
* in "Computer Language" December 1990 issue.
*
* dmetaphone() and dmetaphone_alt()
* ---------------------------------
* A port of the DoubleMetaphone perl module by Andrew Dunstan. See dmetaphone.c
* for more detail.
*
* soundex()
* -----------
* Folded existing soundex contrib into this one. Renamed text_soundex() (C function)
* to soundex() for consistency.
*
* difference()
* ------------
* Return the difference between two strings' soundex values. Kris Jurka
*
* Permission to use, copy, modify, and distribute this software and its
* documentation for any purpose, without fee, and without a written agreement
* is hereby granted, provided that the above copyright notice and this
* paragraph and the following two paragraphs appear in all copies.
*
* IN NO EVENT SHALL THE AUTHORS OR DISTRIBUTORS BE LIABLE TO ANY PARTY FOR
* DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING
* LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS
* DOCUMENTATION, EVEN IF THE AUTHOR OR DISTRIBUTORS HAVE BEEN ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*
* THE AUTHORS AND DISTRIBUTORS SPECIFICALLY DISCLAIM ANY WARRANTIES,
* INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY
* AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS
* ON AN "AS IS" BASIS, AND THE AUTHOR AND DISTRIBUTORS HAS NO OBLIGATIONS TO
* PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
*
*/
Version 0.3 (30 June, 2004):
Release Notes:
Version 0.3
- added double metaphone code from Andrew Dunstan
- change metaphone so that an empty input string causes an empty
output string to be returned, instead of throwing an ERROR
- fixed examples in README.soundex
Version 0.2
- folded soundex contrib into this one
Version 0.1
- initial release
Installation:
Place these files in a directory called 'fuzzystrmatch' under 'contrib' in the PostgreSQL source tree. Then run:
make
make install
You can use fuzzystrmatch.sql to create the functions in your database of choice, e.g.
psql -U postgres template1 < fuzzystrmatch.sql
installs the following functions into database template1:
levenshtein() - calculates the levenshtein distance between two strings
metaphone() - calculates the metaphone code of an input string
Documentation
==================================================================
Name
levenshtein -- calculates the levenshtein distance between two strings
Synopsis
levenshtein(text source, text target)
Inputs
source
any text string, 255 characters max, NOT NULL
target
any text string, 255 characters max, NOT NULL
Outputs
Returns int
Example usage
select levenshtein('GUMBO','GAMBOL');
==================================================================
Name
metaphone -- calculates the metaphone code of an input string
Synopsis
metaphone(text source, int max_output_length)
Inputs
source
any text string, 255 characters max, NOT NULL
max_output_length
maximum length of the output metaphone code; if longer, the output
is truncated to this length
Outputs
Returns text
Example usage
select metaphone('GUMBO',4);
==================================================================
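The header comment above also describes soundex(), difference(), dmetaphone()
and dmetaphone_alt(); a brief sketch of calling them, assuming the module's
SQL script has been loaded:
select soundex('Anne'), soundex('Ann'), difference('Anne', 'Ann');
select dmetaphone('gumbo'), dmetaphone_alt('gumbo');
==================================================================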
-- Joe Conway
Hstore - contrib module for storing (key,value) pairs
[Online version] (http://www.sai.msu.su/~megera/oddmuse/index.cgi?Hstore)
Motivation
Many attributes rarely searched, semistructured data, lazy DBA
Authors
* Oleg Bartunov <oleg@sai.msu.su>, Moscow, Moscow University, Russia
* Teodor Sigaev <teodor@sigaev.ru>, Moscow, Delta-Soft Ltd.,Russia
LEGAL NOTICES: This module is released under BSD license (as PostgreSQL
itself)
Operations
* hstore -> text - get value, perl analogy $h{key}
select 'a=>q, b=>g'->'a';
 ?column?
----------
 q
* hstore || hstore - concatenation, perl analogy %a=( %b, %c );
regression=# select 'a=>b'::hstore || 'c=>d'::hstore;
?column?
--------------------
"a"=>"b", "c"=>"d"
(1 row)
but, notice
regression=# select 'a=>b'::hstore || 'a=>d'::hstore;
?column?
----------
"a"=>"d"
(1 row)
* text => text - creates hstore type from two text strings
select 'a'=>'b';
?column?
----------
"a"=>"b"
* hstore @> hstore - contains operator, checks whether the left operand contains the right.
regression=# select 'a=>b, b=>1, c=>NULL'::hstore @> 'a=>c';
?column?
----------
f
(1 row)
regression=# select 'a=>b, b=>1, c=>NULL'::hstore @> 'b=>1';
?column?
----------
t
(1 row)
* hstore <@ hstore - contained-in operator, checks whether the left operand
is contained in the right
(Before PostgreSQL 8.2, the containment operators @> and <@ were
respectively called @ and ~. These names are still available, but are
deprecated and will eventually be retired. Notice that the old names
are reversed from the convention formerly followed by the core geometric
datatypes!)
Functions
* akeys(hstore) - returns all keys from hstore as array
regression=# select akeys('a=>1,b=>2');
akeys
-------
{a,b}
* skeys(hstore) - returns all keys from hstore as strings
regression=# select skeys('a=>1,b=>2');
skeys
-------
a
b
* avals(hstore) - returns all values from hstore as array
regression=# select avals('a=>1,b=>2');
avals
-------
{1,2}
* svals(hstore) - returns all values from hstore as strings
regression=# select svals('a=>1,b=>2');
svals
-------
1
2
* delete (hstore,text) - delete (key,value) from hstore if key matches
argument.
regression=# select delete('a=>1,b=>2','b');
delete
----------
"a"=>"1"
* each(hstore) - returns (key, value) pairs
regression=# select * from each('a=>1,b=>2');
key | value
-----+-------
a | 1
b | 2
* exist (hstore,text)
* hstore ? text
- returns true if key exists in hstore and false otherwise.
regression=# select exist('a=>1','a'), 'a=>1' ? 'a';
exist | ?column?
-------+----------
t | t
* defined (hstore,text) - returns true if key exists in hstore and
its value is not NULL.
regression=# select defined('a=>NULL','a');
defined
---------
f
Indices
The module provides index support for the '@>' and '?' operators.
create index hidx on testhstore using gist(h);
create index hidx on testhstore using gin(h);
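A sketch of a query these indexes can accelerate, reusing the testhstore
table and its h column from the statistics examples below:
select * from testhstore where h @> 'pos=>203';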
Note
Use parentheses in the select below, because the precedence of IS is higher than that of '->':
select id from entrants where (info->'education_period') is not null;
Examples
* add key
update tt set h=h||'c=>3';
* delete key
update tt set h=delete(h,'k1');
* Statistics
The hstore type, because of its intrinsic liberality, could contain a lot of
different keys. Checking for valid keys is the task of the application.
The examples below demonstrate several techniques for checking key statistics.
o simple example
select * from each('aaa=>bq, b=>NULL, ""=>1 ');
o using table
select (each(h)).key, (each(h)).value into stat from testhstore ;
o online stat
select key, count(*) from (select (each(h)).key from testhstore) as stat group by key order by count desc, key;
key | count
-----------+-------
line | 883
query | 207
pos | 203
node | 202
space | 197
status | 195
public | 194
title | 190
org | 189
...................
Integer aggregator/enumerator.
Many database systems have the notion of a one to many table.
A one to many table usually sits between two indexed tables,
as:
create table one_to_many(left int, right int) ;
And it is used like this:
SELECT right.* from right JOIN one_to_many ON (right.id = one_to_many.right)
WHERE one_to_many.left = item;
This will return all the items in the right hand table for an entry
in the left hand table. This is a very common construct in SQL.
Now, this methodology can be cumbersome with a very large number of
entries in the one_to_many table. Depending on the order in which
data was entered, a join like this could result in an index scan
and a fetch for each right hand entry in the table for a particular
left hand entry.
If you have a very dynamic system, there is not much you can do.
However, if you have some data which is fairly static, you can
create a summary table with the aggregator.
CREATE TABLE summary as SELECT left, int_array_aggregate(right)
AS right FROM one_to_many GROUP BY left;
This will create a table with one row per left item, and an array
of right items. Now this is pretty useless without some way of using
the array; that's why there is an array enumerator.
SELECT left, int_array_enum(right) FROM summary WHERE left = item;
The above query using int_array_enum, produces the same results as:
SELECT left, right FROM one_to_many WHERE left = item;
The difference is that the query against the summary table has to get
only one row from the table, whereas the query against "one_to_many"
must index scan and fetch a row for each entry.
On our system, an EXPLAIN shows that a query with a cost of 8488 is reduced
to a cost of 329. The query is a join involving the one_to_many table:
select right, count(right) from
(
select left, int_array_enum(right) as right from summary join
(select left from left_table where left = item) as lefts
ON (summary.left = lefts.left )
) as list group by right order by count desc ;
This is an implementation of the RD-tree data structure using the GiST
interface of PostgreSQL. It has built-in lossy compression.
The current implementation provides index support for one-dimensional arrays
of integers: gist__int_ops, suitable for small- and medium-size arrays (used by
default), and gist__intbig_ops for indexing large arrays (we use a superimposed
signature with a length of 4096 bits to represent sets). There is also a
non-default gin__int_ops for GIN indexes on integer arrays.
All work was done by Teodor Sigaev (teodor@stack.net) and Oleg Bartunov
(oleg@sai.msu.su). See http://www.sai.msu.su/~megera/postgres/gist
for additional information. Andrey Oktyabrski did great work on
adding new functions and operations.
FUNCTIONS:
int icount(int[]) - the number of elements in intarray
test=# select icount('{1,2,3}'::int[]);
icount
--------
3
(1 row)
int[] sort(int[], 'asc' | 'desc') - sort intarray
test=# select sort('{1,2,3}'::int[],'desc');
sort
---------
{3,2,1}
(1 row)
int[] sort(int[]) - sort in ascending order
int[] sort_asc(int[]),sort_desc(int[]) - shortcuts for sort
int[] uniq(int[]) - returns unique elements
test=# select uniq(sort('{1,2,3,2,1}'::int[]));
uniq
---------
{1,2,3}
(1 row)
int idx(int[], int item) - returns the index of the first array element
matching item, or 0 if there is no match.
test=# select idx('{1,2,3,2,1}'::int[],2);
idx
-----
2
(1 row)
int[] subarray(int[],int START [, int LEN]) - returns the part of the intarray
starting at element number START (counting from 1), with length LEN.
test=# select subarray('{1,2,3,2,1}'::int[],2,3);
subarray
----------
{2,3,2}
(1 row)
int[] intset(int4) - casting int4 to int[]
test=# select intset(1);
intset
--------
{1}
(1 row)
OPERATIONS:
int[] && int[] - overlap - returns TRUE if arrays have at least one common element
int[] @> int[] - contains - returns TRUE if left array contains right array
int[] <@ int[] - contained - returns TRUE if left array is contained in right array
# int[] - returns the number of elements in array
int[] + int - push element to array ( add to end of array)
int[] + int[] - merge of arrays (right array added to the end of left one)
int[] - int - remove entries matched by right argument from array
int[] - int[] - remove right array from left
int[] | int - returns intarray - union of arguments
int[] | int[] - returns intarray as a union of two arrays
int[] & int[] - returns intersection of arrays
int[] @@ query_int - returns TRUE if array satisfies query (like '1&(2|3)')
query_int ~~ int[] - returns TRUE if array satisfies query (commutator of @@)
(Before PostgreSQL 8.2, the containment operators @> and <@ were
respectively called @ and ~. These names are still available, but are
deprecated and will eventually be retired. Notice that the old names
are reversed from the convention formerly followed by the core geometric
datatypes!)
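A short sketch of the query_int operator described above (the query syntax
follows the '1&(2|3)' form):
-- true: the array contains 1 and at least one of 2 or 3
select '{1,3,5}'::int[] @@ '1&(2|3)'::query_int;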
CHANGES:
August 6, 2002
1. Reworked patch from Andrey Oktyabrski (ano@spider.ru) with
functions: icount, sort, sort_asc, uniq, idx, subarray
operations: #, +, -, |, &
October 1, 2001
1. Change search method in array to binary
September 28, 2001
1. gist__int_ops is now non-lossy
2. add sort entry in picksplit
September 21, 2001
1. Added support for boolean queries (indexable operator @@, used like
a @@ '1|(2&3)'; performance is better in any case)
2. Done some small optimizations
March 19, 2001
1. Added support for toastable keys
2. Improved split algorithm for intbig (selection speedup is about 30%)
INSTALLATION:
gmake
gmake install
-- load functions
psql <database> < _int.sql
REGRESSION TEST:
gmake installcheck
EXAMPLE USAGE:
create table message (mid int not null,sections int[]);
create table message_section_map (mid int not null,sid int not null);
-- create indices
CREATE unique index message_key on message ( mid );
CREATE unique index message_section_map_key2 on message_section_map (sid, mid );
CREATE INDEX message_rdtree_idx on message using gist ( sections gist__int_ops);
-- select some messages with section in 1 OR 2 - OVERLAP operator
select message.mid from message where message.sections && '{1,2}';
-- select messages contains in sections 1 AND 2 - CONTAINS operator
select message.mid from message where message.sections @> '{1,2}';
-- the same, CONTAINED operator
select message.mid from message where '{1,2}' <@ message.sections;
BENCHMARK:
subdirectory bench contains benchmark suite.
cd ./bench
1. createdb TEST
2. psql TEST < ../_int.sql
3. ./create_test.pl | psql TEST
4. ./bench.pl - perl script to benchmark queries, supports OR, AND queries
with/without RD-Tree. Run script without arguments to
see available options.
a)test without RD-Tree (OR)
./bench.pl -d TEST -c -s 1,2 -v
b)test with RD-Tree
./bench.pl -d TEST -c -s 1,2 -v -r
BENCHMARKS:
Size of table <message>: 200000
Size of table <message_section_map>: 269133
Distribution of messages by sections:
section 0: 74377 messages
section 1: 16284 messages
section 50: 1229 messages
section 99: 683 messages
old - without RD-Tree support,
new - with RD-Tree
+----------+---------------+----------------+
|Search set|OR, time in sec|AND, time in sec|
| +-------+-------+--------+-------+
| | old | new | old | new |
+----------+-------+-------+--------+-------+
| 1| 0.625| 0.101| -| -|
+----------+-------+-------+--------+-------+
| 99| 0.018| 0.017| -| -|
+----------+-------+-------+--------+-------+
| 1,2| 0.766| 0.133| 0.628| 0.045|
+----------+-------+-------+--------+-------+
| 1,2,50,65| 0.794| 0.141| 0.030| 0.006|
+----------+-------+-------+--------+-------+
-- EAN13 - UPC - ISBN (books) - ISMN (music) - ISSN (serials)
-------------------------------------------------------------
Copyright Germán Méndez Bravo (Kronuz), 2004 - 2006
This module is released under the same BSD license as the rest of PostgreSQL.
The information to implement this module was collected through
several sites, including:
http://www.isbn-international.org/
http://www.issn.org/
http://www.ismn-international.org/
http://www.wikipedia.org/
the prefixes used for hyphenation were also compiled from:
http://www.gs1.org/productssolutions/idkeys/support/prefix_list.html
http://www.isbn-international.org/en/identifiers.html
http://www.ismn-international.org/ranges.html
Care was taken during the creation of the algorithms and they
were meticulously verified against the suggested algorithms
in the official ISBN, ISMN, ISSN User Manuals.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
THIS MODULE IS PROVIDED "AS IS" AND WITHOUT ANY WARRANTY
OF ANY KIND, EXPRESS OR IMPLIED.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-- Content of the Module
-------------------------------------------------
This directory contains definitions for a few PostgreSQL
data types, for the following international-standard namespaces:
EAN13, UPC, ISBN (books), ISMN (music), and ISSN (serials). This module
is inspired by Garrett A. Wollman's isbn_issn code.
I wanted the database to fully validate numbers and also to use the
upcoming ISBN-13 and the EAN13 standards, as well as to have it
automatically doing hyphenations for ISBN numbers.
This new module validates, and automatically adds the correct
hyphenations to the numbers. Also, it supports the new ISBN-13
numbers to be used starting in January 2007.
Premises:
1. ISBN13, ISMN13, ISSN13 numbers are all EAN13 numbers
2. EAN13 numbers aren't always ISBN13, ISMN13 or ISSN13 (some are)
3. some ISBN13 numbers can be displayed as ISBN
4. some ISMN13 numbers can be displayed as ISMN
5. some ISSN13 numbers can be displayed as ISSN
6. all UPC, ISBN, ISMN and ISSN can be represented as EAN13 numbers
Note: All types are internally represented as 64 bit integers,
and internally all are consistently interchangeable.
We have the following data types:
+ EAN13 for European Article Numbers.
This type will always show the EAN13-display format.
The output function for this is -> ean13_out()
+ ISBN13 for International Standard Book Numbers to be displayed in
the new EAN13-display format.
+ ISMN13 for International Standard Music Numbers to be displayed in
the new EAN13-display format.
+ ISSN13 for International Standard Serial Numbers to be displayed
in the new EAN13-display format.
These types will always display the long version of the ISxN (EAN13)
The output function to do this is -> ean13_out()
* The need for these types is just for displaying the same data in
different ways:
ISBN13 is actually the same as ISBN, ISMN13=ISMN and ISSN13=ISSN.
+ ISBN for International Standard Book Numbers to be displayed in
the current short-display format.
+ ISMN for International Standard Music Numbers to be displayed in
the current short-display format.
+ ISSN for International Standard Serial Numbers to be displayed
in the current short-display format.
These types will display the short version of the ISxN (ISxN 10)
whenever possible, and will show ISxN 13 when it is
impossible to show the short version.
The output function to do this is -> isn_out()
+ UPC for Universal Product Codes.
UPC numbers are a subset of the EAN13 numbers (they are basically
EAN13 without the first '0' digit.)
The output function to do this is also -> isn_out()
We have the following input functions:
+ To take a string and return an EAN13 -> ean13_in()
+ To take a string and return valid ISBN or ISBN13 numbers -> isbn_in()
+ To take a string and return valid ISMN or ISMN13 numbers -> ismn_in()
+ To take a string and return valid ISSN or ISSN13 numbers -> issn_in()
+ To take a string and return a UPC code -> upc_in()
We are able to cast from:
+ ISBN13 -> EAN13
+ ISMN13 -> EAN13
+ ISSN13 -> EAN13
+ ISBN -> EAN13
+ ISMN -> EAN13
+ ISSN -> EAN13
+ UPC -> EAN13
+ ISBN <-> ISBN13
+ ISMN <-> ISMN13
+ ISSN <-> ISSN13
We have two operator classes (for btree and for hash) so each data type
can be indexed for faster access.
The C API is implemented as:
extern Datum isn_out(PG_FUNCTION_ARGS);
extern Datum ean13_out(PG_FUNCTION_ARGS);
extern Datum ean13_in(PG_FUNCTION_ARGS);
extern Datum isbn_in(PG_FUNCTION_ARGS);
extern Datum ismn_in(PG_FUNCTION_ARGS);
extern Datum issn_in(PG_FUNCTION_ARGS);
extern Datum upc_in(PG_FUNCTION_ARGS);
On success:
+ isn_out() takes any of our types and returns a string containing
the shortest possible representation of the number.
+ ean13_out() takes any of our types and returns the
EAN13 (long) representation of the number.
+ ean13_in() takes a string and returns an EAN13 which, as stated in (2),
may or may not be any of our types, but certainly is an EAN13 number.
It succeeds only if the string is a valid EAN13 number; otherwise it fails.
+ isbn_in() takes a string and returns an ISBN/ISBN13, but only if the string
is really an ISBN/ISBN13; otherwise it fails.
+ ismn_in() takes a string and returns an ISMN/ISMN13, but only if the string
is really an ISMN/ISMN13; otherwise it fails.
+ issn_in() takes a string and returns an ISSN/ISSN13, but only if the string
is really an ISSN/ISSN13; otherwise it fails.
+ upc_in() takes a string and returns a UPC, but only if the string is
really a UPC; otherwise it fails.
(on failure, the functions 'ereport' the error)
-- Testing/Playing Functions
-------------------------------------------------
isn_weak(boolean) - Sets the weak input mode.
This function is intended for testing use only!
isn_weak() gets the current status of the weak mode.
"Weak" mode is used to be able to insert "invalid" data to a table.
"Invalid" as in the check digit being wrong, not missing numbers.
Why would you want to use the weak mode? Well, it could be that
you have a huge collection of ISBN numbers, and that there are so many of
them that for weird reasons some have the wrong check digit (perhaps the
numbers were scanned from a printed list and the OCR got the numbers wrong,
perhaps the numbers were manually captured... who knows). Anyway, the thing
is you might want to clean the mess up, but you still want to be able to have
all the numbers in your database and maybe use an external tool to access
the invalid numbers in the database so you can verify the information and
validate it more easily, e.g. by selecting all the invalid numbers in the table.
When you insert invalid numbers in a table using the weak mode, the number
will be inserted with the corrected check digit, but it will be flagged
with an exclamation mark ('!') at the end (i.e. 0-11-000322-5!)
You can also force the insertion of invalid numbers even when not in the weak
mode, by appending the '!' character at the end of the number.
To work with invalid numbers, you can use two functions:
+ make_valid(), which validates an invalid number (deleting the invalid flag)
+ is_valid(), which checks for the invalid flag presence.
-- Examples of Use
-------------------------------------------------
--Using the types directly:
select isbn('978-0-393-04002-9');
select isbn13('0901690546');
select issn('1436-4522');
--Casting types:
-- note that you can only cast from ean13 to other type when the casted
-- number would be valid in the realm of the casted type;
-- thus, the following will NOT work: select isbn(ean13('0220356483481'));
-- but these will:
select upc(ean13('0220356483481'));
select ean13(upc('220356483481'));
--Create a table with a single column to hold ISBN numbers:
create table test ( id isbn );
insert into test values('9780393040029');
--Automatically calculating check digits (observe the '?'):
insert into test values('220500896?');
insert into test values('978055215372?');
select issn('3251231?');
select ismn('979047213542?');
--Using the weak mode:
select isn_weak(true);
insert into test values('978-0-11-000533-4');
insert into test values('9780141219307');
insert into test values('2-205-00876-X');
select isn_weak(false);
select id from test where not is_valid(id);
update test set id=make_valid(id) where id = '2-205-00876-X!';
select * from test;
select isbn13(id) from test;
-- Contact
-------------------------------------------------
Please send suggestions or bug reports to kronuz at users.sourceforge.net
Last reviewed on August 23, 2006 by Kronuz.
PostgreSQL type extension for managing Large Objects
----------------------------------------------------
Overview
One of the problems with the JDBC driver (and this affects the ODBC driver
also), is that the specification assumes that references to BLOBS (Binary
Large OBjectS) are stored within a table, and if that entry is changed, the
associated BLOB is deleted from the database.
As PostgreSQL stands, this doesn't occur. Large objects are treated as
objects in their own right; a table entry can reference a large object by
OID, but there can be multiple table entries referencing the same large
object OID, so the system doesn't delete the large object just because you
change or remove one such entry.
Now this is fine for new PostgreSQL-specific applications, but existing ones
using JDBC or ODBC won't delete the objects, resulting in orphaning - objects
that are not referenced by anything, and simply occupy disk space.
The Fix
I've fixed this by creating a new data type 'lo', some support functions, and
a Trigger which handles the orphaning problem. The trigger essentially just
does a 'lo_unlink' whenever you delete or modify a value referencing a large
object. When you use this trigger, you are assuming that there is only one
database reference to any large object that is referenced in a
trigger-controlled column!
The 'lo' type was created because we needed to differentiate between plain
OIDs and Large Objects. Currently the JDBC driver handles this dilemma easily,
but (after talking to Byron), the ODBC driver needed a unique type. They had
created an 'lo' type, but not the solution to orphaning.
You don't actually have to use the 'lo' type to use the trigger, but it may be
convenient to use it to keep track of which columns in your database represent
large objects that you are managing with the trigger.
Install
Ok, first build the shared library, and install. Typing 'make install' in the
contrib/lo directory should do it.
Then, as the postgres super user, run the lo.sql script in any database that
needs the features. This will install the type, and define the support
functions. You can run the script once in template1, and the objects will be
inherited by subsequently-created databases.
How to Use
The easiest way is by an example:
> create table image (title text, raster lo);
> create trigger t_raster before update or delete on image
> for each row execute procedure lo_manage(raster);
Create a trigger for each column that contains a lo type, and give the column
name as the trigger procedure argument. You can have more than one trigger on
a table if you need multiple lo columns in the same table, but don't forget to
give a different name to each trigger.
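For example (a sketch; lo_import() here is the server-side function, so the
path is read on the server and is only a placeholder):
> insert into image (title, raster) values ('sample', lo_import('/etc/motd'));
> update image set raster = lo_import('/etc/motd') where title = 'sample';
On the update, the trigger above will lo_unlink the previously stored large
object.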
Issues
* Dropping a table will still orphan any objects it contains, as the trigger
is not executed.
Avoid this by preceding the 'drop table' with 'delete from {table}'.
If you already have, or suspect you have, orphaned large objects, see
the contrib/vacuumlo module to help you clean them up. It's a good idea
to run contrib/vacuumlo occasionally as a back-stop to the lo_manage
trigger.
* Some frontends may create their own tables, and will not create the
associated trigger(s). Also, users may not remember (or know) to create
the triggers.
As the ODBC driver needs a permanent lo type (& JDBC could be optimised to
use it if its Oid is fixed), and as the above issues can only be fixed by
some internal changes, I feel it should become a permanent built-in type.
I'm releasing this into contrib, just to get it out, and tested.
Peter Mount <peter@retep.org.uk> June 13 1998
pg_standby README 2006/12/08 Simon Riggs
o What is pg_standby?
pg_standby allows the creation of a Warm Standby server.
It is designed to be a production-ready program, as well as a
customisable template should you require specific modifications.
Other configuration is required as well, all of which is
described in the main server manual.
The program is designed to be a wait-for restore_command,
required to turn a normal archive recovery into a Warm Standby.
Within the restore_command of the recovery.conf you could
configure pg_standby in the following way:
restore_command = 'pg_standby archiveDir %f %p %r'
which would be sufficient to define that files will be restored
from archiveDir.
o features of pg_standby
- pg_standby is written in C. So it is very portable
and easy to install.
- supports copy or link from a directory (only)
- source easy to modify, with specifically designated
sections to modify for your own needs, allowing
interfaces to be written for additional Backup Archive Restore
(BAR) systems
- portable: tested on Linux and Windows
o How to install pg_standby
$make
$make install
o How to use pg_standby?
pg_standby should be used within the restore_command of the
recovery.conf file. See the main PostgreSQL manual for details.
The basic usage should be like this:
restore_command = 'pg_standby archiveDir %f %p %r'
with the pg_standby command usage as
pg_standby [OPTION]... ARCHIVELOCATION NEXTWALFILE XLOGFILEPATH [RESTARTWALFILE]
When used within the restore_command the %f and %p macros
will provide the actual file and path required for the restore/recovery.
pg_standby assumes that ARCHIVELOCATION is a directory accessible by the
server-owning user.
If RESTARTWALFILE is specified, typically by using the %r option, then all files
prior to this file will be removed from ARCHIVELOCATION. This then minimises
the number of files that need to be held, whilst at the same time maintaining
restart capability. This capability additionally assumes that ARCHIVELOCATION
directory is writable.
o options
pg_standby allows the following command line switches
-c
use copy/cp command to restore WAL files from archive
-d
debug/logging option.
-k numfiles
Clean up files in the archive so that we maintain no more
than this many files in the archive. This parameter will
be silently ignored if RESTARTWALFILE is specified, since
that specification method is more accurate in determining
the correct cut-off point in archive.
You should be wary of setting this number too low,
since this may mean you cannot restart the standby. This
is because the last restartpoint marked in the WAL files
may be many files in the past and can vary considerably.
This should be set to a value exceeding the number of WAL
files that can be recovered in 2*checkpoint_timeout seconds,
according to the value in the warm standby postgresql.conf.
It is wholly unrelated to the setting of checkpoint_segments
on either primary or standby.
Setting numfiles to be zero will disable deletion of files
from ARCHIVELOCATION.
If in doubt, use a large value or do not set a value at all.
If you specify neither RESTARTWALFILE nor -k, then -k 0
will be assumed, i.e. keep all files in archive.
Default=0, Min=0
-l
use ln command to restore WAL files from archive
WAL files will remain in archive
Link is more efficient, but the default is copy to
allow you to maintain the WAL archive for recovery
purposes as well as high-availability.
The default setting is not necessarily recommended,
consult the main database server manual for discussion.
This option uses the Windows Vista command mklink
to provide a file-to-file symbolic link. -l will
not work on versions of Windows prior to Vista.
Use the -c option instead.
see http://en.wikipedia.org/wiki/NTFS_symbolic_link
-r maxretries
the maximum number of times to retry the restore command if it
fails. After each failure, we wait for sleeptime * num_retries
so that the wait time increases progressively, so by default
we will wait 5 secs, 10 secs then 15 secs before reporting
the failure back to the database server. This will be
interpreted as an end of recovery and the Standby will come
up fully as a result.
Default=3, Min=0
-s sleeptime
the number of seconds to sleep between testing to see
if the file to be restored is available in the archive yet.
The default setting is not necessarily recommended,
consult the main database server manual for discussion.
Default=5, Min=1, Max=60
-t triggerfile
the presence of the triggerfile will cause recovery to end
whether or not the next file is available
It is recommended that you use a structured filename to
avoid confusion as to which server is being triggered
when multiple servers exist on same system.
e.g. /tmp/pgsql.trigger.5432
-w maxwaittime
the maximum number of seconds to wait for the next file,
after which recovery will end and the Standby will come up.
A setting of zero means wait forever.
The default setting is not necessarily recommended,
consult the main database server manual for discussion.
Default=0, Min=0
Note: --help is not supported since pg_standby is not intended
for interactive use, except during dev/test
o examples
Linux
archive_command = 'cp %p ../archive/%f'
restore_command = 'pg_standby -l -d -k 255 -r 2 -s 2 -w 0 -t /tmp/pgsql.trigger.5442 $PWD/../archive %f %p 2>> standby.log'
which will
- use a ln command to restore WAL files from archive
- produce logfile output in standby.log
- keep the last 255 full WAL files, plus the current one
- sleep for 2 seconds between checks for the next WAL file
- never timeout if file not found
- stop waiting when a trigger file called /tmp/pgsql.trigger.5442 appears
Windows
archive_command = 'copy %p ..\\archive\\%f'
Note that backslashes need to be doubled in the archive_command, but
*not* in the restore_command, in 8.2, 8.1, 8.0 on Windows.
restore_command = 'pg_standby -c -d -s 5 -w 0 -t C:\pgsql.trigger.5442 ..\archive %f %p 2>> standby.log'
which will
- use a copy command to restore WAL files from archive
- produce logfile output in standby.log
- sleep for 5 seconds between checks for the next WAL file
- never timeout if file not found
- stop waiting when a trigger file called C:\pgsql.trigger.5442 appears
o supported versions
pg_standby is designed to work with PostgreSQL 8.2 and later. It is
currently compatible across minor changes between the way 8.3 and 8.2
operate.
PostgreSQL 8.3 provides the %r command line substitution, designed to
let pg_standby know the last file it needs to keep. If the last
parameter is omitted, no error is generated, allowing pg_standby to
function correctly with PostgreSQL 8.2 also. With PostgreSQL 8.2,
the -k option must be used if archive cleanup is required. This option
remains available in 8.3.
o reported test success
SUSE Linux 10.2
Windows XP Pro
o additional design notes
The use of a move command seems like it would be a good idea, but
this would prevent recovery from being restartable. Also, the last WAL
file is always requested twice from the archive.
trgm - Trigram matching for PostgreSQL
--------------------------------------
Introduction
This module is sponsored by Delta-Soft Ltd., Moscow, Russia.
The pg_trgm contrib module provides functions and index classes
for determining the similarity of text based on trigram
matching.
Definitions
Trigram (or Trigraph)
A trigram is a set of three consecutive characters taken
from a string. A string is considered to have two spaces
prefixed and one space suffixed when determining the set
of trigrams that comprise the string.
eg. The set of trigrams in the word "cat" is "  c", " ca",
"at " and "cat".
Public Functions
real similarity(text, text)
Returns a number that indicates how closely the two
arguments match. A zero result indicates that the two words
are completely dissimilar, and a result of one indicates that
the two words are identical.
real show_limit()
Returns the current similarity threshold used by the '%'
operator. This in effect sets the minimum similarity between
two words in order that they be considered similar enough to
be misspellings of each other, for example.
real set_limit(real)
Sets the current similarity threshold that is used by the '%'
operator, and is returned by the show_limit() function.
text[] show_trgm(text)
Returns an array of all the trigrams of the supplied text
parameter.
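A quick sketch of these functions in use (assuming the module is installed;
exact output values depend on the implementation):
select show_trgm('cat');
select similarity('word', 'two words');
select set_limit(0.6);
select show_limit();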
Public Operators
text % text (returns boolean)
The '%' operator returns TRUE if its two arguments have a similarity
that is greater than the similarity threshold set by set_limit(). It
will return FALSE if the similarity is less than the current
threshold.
Public Index Operator Classes
gist_trgm_ops
The pg_trgm module comes with an index operator class that allows a
developer to create an index over a text column for the purpose
of very fast similarity searches.
To use this index, the '%' operator must be used and an appropriate
similarity threshold for the application must be set.
eg.
CREATE TABLE test_trgm (t text);
CREATE INDEX trgm_idx ON test_trgm USING gist (t gist_trgm_ops);
At this point, you will have an index on the t text column that you
can use for similarity searching.
eg.
SELECT
t,
similarity(t, 'word') AS sml
FROM
test_trgm
WHERE
t % 'word'
ORDER BY
sml DESC, t;
This will return all values in the text column that are sufficiently
similar to 'word', sorted from best match to worst. The index will
be used to make this a fast operation over very large data sets.
Tsearch2 Integration
Trigram matching is a very useful tool when used in conjunction
with a text index created by the Tsearch2 contrib module. (See
contrib/tsearch2)
The first step is to generate an auxiliary table containing all
the unique words in the Tsearch2 index:
CREATE TABLE words AS SELECT word FROM
stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
Where 'documents' is a table that has a text field 'bodytext'
that TSearch2 is used to search. The use of the 'simple' dictionary
with the to_tsvector function, instead of just using the already
existing vector is to avoid creating a list of already stemmed
words. This way, only the original, unstemmed words are added
to the word list.
Next, create a trigram index on the word column:
CREATE INDEX words_idx ON words USING gist(word gist_trgm_ops);
or
CREATE INDEX words_idx ON words USING gin(word gin_trgm_ops);
Now, a SELECT query similar to the example above can be used to
suggest spellings for misspelled words in user search terms. A
useful extra clause is to ensure that the similar words are also
of similar length to the misspelled word.
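For instance, a sketch of such a query against the 'words' table built above
('wrod' stands in for a user's misspelled search term):
SELECT word, similarity(word, 'wrod') AS sml
FROM words
WHERE word % 'wrod'
AND length(word) BETWEEN length('wrod') - 2 AND length('wrod') + 2
ORDER BY sml DESC, word;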
Note: Since the 'words' table has been generated as a separate,
static table, it will need to be periodically regenerated so that
it remains up to date with the word list in the Tsearch2 index.
Authors
Oleg Bartunov <oleg@sai.msu.su>, Moscow, Moscow University, Russia
Teodor Sigaev <teodor@sigaev.ru>, Moscow, Delta-Soft Ltd.,Russia
Contributors
Christopher Kings-Lynne wrote this README file
References
Tsearch2 Development Site
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/
GiST Development Site
http://www.sai.msu.su/~megera/postgres/gist/
$PostgreSQL: pgsql/contrib/pgbench/README.pgbench,v 1.20 2007/07/06 20:17:02 wieck Exp $
pgbench README
o What is pgbench?
pgbench is a simple program to run a benchmark test. pgbench is a
client application of PostgreSQL and runs with PostgreSQL only. It
performs lots of small and simple transactions including
SELECT/UPDATE/INSERT operations, then calculates the number of
transactions successfully completed within a second (transactions
per second, tps). The target data includes a table with at least 100k
tuples.
Example outputs from pgbench look like:
number of clients: 4
number of transactions per client: 100
number of processed transactions: 400/400
tps = 19.875015(including connections establishing)
tps = 20.098827(excluding connections establishing)
A similar program called "JDBCBench" already exists, but it requires
Java, which may not be available on every platform. Moreover, some
people are concerned that the overhead of Java might lead to
inaccurate results. So I decided to write it in pure C, and named
it "pgbench."
o features of pgbench
- pgbench is written in C using libpq only. So it is very portable
and easy to install.
- pgbench can simulate concurrent connections using asynchronous
capability of libpq. No threading is required.
o How to install pgbench
$make
$make install
o How to use pgbench?
(1) (optional) Initialize the database with:
pgbench -i <dbname>
where <dbname> is the name of the database. pgbench uses four tables:
accounts, branches, history and tellers. These tables will be
destroyed. Be very careful if you have tables with the same
names. Default test data contains:
table # of tuples
-------------------------
branches 1
tellers 10
accounts 100000
history 0
You can increase the number of tuples by using the -s option. The branches,
tellers and accounts tables are created with a fillfactor, which is
set using the -F option. See below.
(2) Run the benchmark test
pgbench <dbname>
The default configuration is:
number of clients: 1
number of transactions per client: 10
o options
pgbench has a number of options.
-h hostname
hostname where the backend is running. If this option
is omitted, pgbench will connect to localhost via a
Unix domain socket.
-p port
the port number on which the backend is listening. Default is
libpq's default, usually 5432.
-c number_of_clients
Number of clients simulated. default is 1.
-t number_of_transactions
Number of transactions each client runs. default is 10.
-s scaling_factor
this should be used with the -i (initialize) option.
The number of tuples generated will be a multiple of the
scaling factor. For example, -s 100 will imply 10M
(10,000,000) tuples in the accounts table.
default is 1. NOTE: scaling factor should be at least
as large as the largest number of clients you intend
to test; else you'll mostly be measuring update contention.
Regular (not initializing) runs using one of the
built-in tests will detect scale based on the number of
branches in the database. For custom (-f) runs it can
be manually specified with this parameter.
-D varname=value
Define a variable. It can be referred to by a script
provided via the -f option. Multiple -D options are allowed.
-U login
Specify db user's login name if it is different from
the Unix login name.
-P password
Specify the db password. CAUTION: using this option
might be a security hole since ps command will
show the password. Use this for TESTING PURPOSE ONLY.
-n
Do not perform vacuuming and cleaning of the history table
prior to the test.
-v
Do vacuuming before testing. This will take some time.
With neither -n nor -v, pgbench will vacuum tellers and
branches tables only.
-S
Perform select only transactions instead of TPC-B.
-N Do not update "branches" and "tellers". This will
avoid heavy update contention on branches and tellers,
but the transactions are then no longer TPC-B-like.
-f filename
Read transaction script from file. Detailed
explanation will appear later.
-C
Establish a new connection for each transaction, rather than
doing it just once at the beginning of the run. This is
useful for measuring connection overhead.
-l
Write the time taken by each transaction to a logfile
named "pgbench_log.xxx", where xxx is the PID
of the pgbench process. The format of the log is:
client_id transaction_no time file_no time-epoch time-us
where time is measured in microseconds, file_no indicates
which test file was used (useful when multiple were
specified with -f), and time-epoch/time-us are a
UNIX epoch timestamp followed by an offset
in microseconds (suitable for creating an ISO 8601
timestamp with fractional seconds) of when
the transaction completed.
Here is some example output:
0 199 2241 0 1175850568 995598
0 200 2465 0 1175850568 998079
0 201 2513 0 1175850569 608
0 202 2038 0 1175850569 2663
-F fillfactor
Create the tables (accounts, tellers and branches) with the given
fillfactor. The default is 100. This should be used with the -i
(initialize) option.
-d
Debug option.
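As an example of combining these options, the following sketch (again
assuming a database named "bench" initialized with -i) runs a
select-only test with 8 clients, 100 transactions per client, and
per-transaction logging:
pgbench -c 8 -t 100 -S -l bench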
o What is the "transaction" actually performed in pgbench?
(1) begin;
(2) update accounts set abalance = abalance + :delta where aid = :aid;
(3) select abalance from accounts where aid = :aid;
(4) update tellers set tbalance = tbalance + :delta where tid = :tid;
(5) update branches set bbalance = bbalance + :delta where bid = :bid;
(6) insert into history(tid,bid,aid,delta) values(:tid,:bid,:aid,:delta);
(7) end;
If you specify -N, (4) and (5) aren't included in the transaction.
o -f option
This option supports reading a transaction script from a specified
file. The file should contain one SQL command per line; SQL
commands spanning multiple lines are not supported. Empty lines
and lines beginning with "--" are ignored.
Multiple -f options are allowed. In this case each transaction is
assigned a randomly chosen script.
SQL commands can include "meta commands", which begin with "\" (back
slash). A meta command takes some arguments separated by white
space. Currently the following meta commands are supported:
\set name operand1 [ operator operand2 ]
Sets variable "name" to the value calculated from "operand1",
"operator" and "operand2". If "operator" and "operand2"
are omitted, the value of operand1 is assigned to variable "name".
example:
\set ntellers 10 * :scale
\setrandom name min max
Assigns a random integer between min and max to name.
example:
\setrandom aid 1 100000
Variables can be referred to in SQL commands by adding ":" in front
of the variable name.
example:
SELECT abalance FROM accounts WHERE aid = :aid
Variables can also be defined by using -D option.
\sleep num [us|ms|s]
Causes script execution to sleep for the specified duration in
microseconds (us), milliseconds (ms) or seconds (s, the default).
example:
\setrandom millisec 1000 2500
\sleep :millisec ms
For example, a TPC-B-like benchmark can be defined as follows (scaling
factor = 1):
\set nbranches :scale
\set ntellers 10 * :scale
\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta 1 10000
BEGIN
UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid
SELECT abalance FROM accounts WHERE aid = :aid
UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid
UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid
INSERT INTO history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, 'now')
END
If you want to automatically set the scaling factor from the number of
tuples in the branches table, use the -s option and a shell command like this:
pgbench -s $(psql -At -c "SELECT count(*) FROM branches") -f tpc_b.sql
Notice that the -f option does not vacuum or clear the history
table before starting the benchmark.
o License?
Basically it is the same as the BSD license. See pgbench.c for more details.
o History before contributed to PostgreSQL
2000/1/15 pgbench-1.2 contributed to PostgreSQL
* Add -v option
1999/09/29 pgbench-1.1 released
* Apply cygwin patches contributed by Yutaka Tanida
* More robust when backends die
* Add -S option (select only)
1999/09/04 pgbench-1.0 released
$PostgreSQL: pgsql/contrib/pgrowlocks/README.pgrowlocks,v 1.2 2007/08/27 00:13:51 tgl Exp $
pgrowlocks README Tatsuo Ishii
1. What is pgrowlocks?
pgrowlocks shows row locking information for a specified table.
pgrowlocks returns the following columns:
locked_row TID, -- row TID
lock_type TEXT, -- lock type
locker XID, -- locking XID
multi bool, -- multi XID?
xids xid[], -- multi XIDs
pids INTEGER[] -- locker's process id
Here is a sample execution of pgrowlocks:
test=# SELECT * FROM pgrowlocks('t1');
locked_row | lock_type | locker | multi | xids | pids
------------+-----------+--------+-------+-----------+---------------
(0,1) | Shared | 19 | t | {804,805} | {29066,29068}
(0,2) | Shared | 19 | t | {804,805} | {29066,29068}
(0,3) | Exclusive | 804 | f | {804} | {29066}
(0,4) | Exclusive | 804 | f | {804} | {29066}
(4 rows)
locked_row -- tuple ID (TID) of each locked row
lock_type -- "Shared" for shared lock, "Exclusive" for exclusive lock
locker -- transaction ID of locker (note 1)
multi -- "t" if locker is a multitransaction, otherwise "f"
xids -- XIDs of lockers (note 2)
pids -- process IDs of locking backends
note 1: if the locker is a multitransaction, this represents the multi ID
note 2: if the locker is a multitransaction, multiple XIDs are shown
2. Installing pgrowlocks
Installing pgrowlocks requires PostgreSQL 8.0 or later source tree.
$ cd /usr/local/src/postgresql-8.1/contrib
$ tar xfz /tmp/pgrowlocks-1.0.tar.gz
If you are using PostgreSQL 8.0, you need to modify pgrowlocks source code.
Around line 61, you will see:
#undef MAKERANGEVARFROMNAMELIST_HAS_TWO_ARGS
change this to:
#define MAKERANGEVARFROMNAMELIST_HAS_TWO_ARGS
$ make
$ make install
$ psql -e -f pgrowlocks.sql test
3. How to use pgrowlocks
pgrowlocks grabs AccessShareLock on the target table and reads each
row one by one to collect the row locking information. You should
notice that:
1) if the table is exclusively locked by someone else, pgrowlocks
will be blocked.
2) pgrowlocks may show inconsistent information if a new lock is
taken or a lock is freed during its execution.
pgrowlocks does not show the contents of locked rows. If you want
to take a look at the row contents at the same time, you could do
something like this:
SELECT * FROM accounts AS a, pgrowlocks('accounts') AS p WHERE p.locked_row = a.ctid;
4. License
pgrowlocks is distributed under the (modified) BSD license described in
the source file.
5. History
2006/03/21 pgrowlocks version 1.1 released (tested on 8.2 current)
2005/08/22 pgrowlocks version 1.0 released
pgstattuple README 2002/08/29 Tatsuo Ishii
1. Functions supported:
pgstattuple
-----------
pgstattuple() returns the relation length, the percentage of "dead"
tuples in the relation, and other info. This may help users determine
whether vacuum is necessary. Here is an example session:
test=> \x
Expanded display is on.
test=> SELECT * FROM pgstattuple('pg_catalog.pg_proc');
-[ RECORD 1 ]------+-------
table_len | 458752
tuple_count | 1470
tuple_len | 438896
tuple_percent | 95.67
dead_tuple_count | 11
dead_tuple_len | 3157
dead_tuple_percent | 0.69
free_space | 8932
free_percent | 1.95
Here are explanations for each column:
table_len -- physical relation length in bytes
tuple_count -- number of live tuples
tuple_len -- total tuples length in bytes
tuple_percent -- live tuples in %
dead_tuple_count -- number of dead tuples
dead_tuple_len -- total dead tuples length in bytes
dead_tuple_percent -- dead tuples in %
free_space -- free space in bytes
free_percent -- free space in %
pg_relpages
-----------
pg_relpages() returns the number of pages in the relation.
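For example (the result depends entirely on the current size of the
relation):
test=> SELECT pg_relpages('pg_catalog.pg_proc');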
pgstatindex
-----------
pgstatindex() returns a record showing information about an index:
test=> \x
Expanded display is on.
test=> SELECT * FROM pgstatindex('pg_cast_oid_index');
-[ RECORD 1 ]------+------
version | 2
tree_level | 0
index_size | 8192
root_block_no | 1
internal_pages | 0
leaf_pages | 1
empty_pages | 0
deleted_pages | 0
avg_leaf_density | 50.27
leaf_fragmentation | 0
2. Installing pgstattuple
$ make
$ make install
$ psql -e -f /usr/local/pgsql/share/contrib/pgstattuple.sql test
3. Using pgstattuple
pgstattuple may be called as a relation function and is
defined as follows:
CREATE OR REPLACE FUNCTION pgstattuple(text) RETURNS pgstattuple_type
AS 'MODULE_PATHNAME', 'pgstattuple'
LANGUAGE C STRICT;
CREATE OR REPLACE FUNCTION pgstattuple(oid) RETURNS pgstattuple_type
AS 'MODULE_PATHNAME', 'pgstattuplebyid'
LANGUAGE C STRICT;
The argument is the relation name (optionally it may be qualified)
or the OID of the relation. Note that pgstattuple only returns
one row.
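For example, the OID variant can be combined with a catalog lookup (a
sketch; any relation name will do):
SELECT * FROM pgstattuple((SELECT oid FROM pg_class WHERE relname = 'pg_proc'));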
4. Notes
pgstattuple acquires only a read lock on the relation. So concurrent
updates may affect the result.
pgstattuple judges a tuple to be "dead" if HeapTupleSatisfiesNow()
returns false.
5. History
2007/05/17
Moved page-level functions to contrib/pageinspect.
2006/06/28
Extended to work against indexes.
sslinfo - information about current SSL certificate for PostgreSQL
==================================================================
Author: Victor Wagner <vitus@cryptocom.ru>, Cryptocom LTD
E-Mail of Cryptocom OpenSSL development group: <openssl@cryptocom.ru>
1. Notes
--------
This extension won't build unless your PostgreSQL server is configured
with --with-openssl. The information provided by these functions is of
no use if you don't use SSL to connect to the database.
2. Functions Description
------------------------
2.1. ssl_is_used()
~~~~~~~~~~~~~~~~~~
ssl_is_used() RETURNS boolean;
Returns TRUE if the current connection to the server uses SSL, and
FALSE otherwise.
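For example (the result depends on how the current session connected):
test=> SELECT ssl_is_used();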
2.2. ssl_client_cert_present()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ssl_client_cert_present() RETURNS boolean
Returns TRUE if the current client has presented a valid SSL client
certificate to the server, and FALSE otherwise (e.g., no SSL, or the
certificate was not requested by the server).
2.3. ssl_client_serial()
~~~~~~~~~~~~~~~~~~~~~~~~
ssl_client_serial() RETURNS numeric
Returns the serial number of the current client certificate. The
combination of certificate serial number and certificate issuer is
guaranteed to uniquely identify a certificate (but not its owner --
the owner ought to change his keys regularly, and get new certificates
from the issuer).
So, if you run your own CA and allow only certificates from this CA to
be accepted by the server, the serial number is the most reliable
(albeit not very mnemonic) means to identify a user.
2.4. ssl_client_dn()
~~~~~~~~~~~~~~~~~~~~
ssl_client_dn() RETURNS text
Returns the full subject of the current client certificate, converting
character data into the current database encoding. It is assumed that
if you use non-Latin characters in the certificate names, your
database is able to represent these characters, too. If your database
uses the SQL_ASCII encoding, non-Latin characters in the name will be
represented as UTF-8 sequences.
The result looks like '/CN=Somebody /C=Some country/O=Some organization'.
2.5. ssl_issuer_dn()
~~~~~~~~~~~~~~~~~~~~
Returns the full issuer name of the client certificate, converting
character data into the current database encoding.
The combination of the return value of this function with the
certificate serial number uniquely identifies the certificate.
The result of this function is really useful only if you have more
than one trusted CA certificate in your server's root.crt file, or if
this CA has issued some intermediate certificate authority
certificates.
2.6. ssl_client_dn_field()
~~~~~~~~~~~~~~~~~~~~~~~~~~
ssl_client_dn_field(fieldName text) RETURNS text
This function returns the value of the specified field in the
certificate subject. Field names are string constants that are
converted into ASN.1 object identifiers using the OpenSSL object
database. The following values are acceptable:
commonName (alias CN)
surname (alias SN)
name
givenName (alias GN)
countryName (alias C)
localityName (alias L)
stateOrProvinceName (alias ST)
organizationName (alias O)
organizationUnitName (alias OU)
title
description
initials
postalCode
streetAddress
generationQualifier
dnQualifier
x500UniqueIdentifier
pseudonym
role
emailAddress
All of these fields are optional, except commonName. Which of them
are included and which are not depends entirely on your CA's policy.
The meaning of these fields, however, is strictly defined by
the X.500 and X.509 standards, so you cannot just assign arbitrary
meaning to them.
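For example (if the field is not present in the certificate, NULL is
returned):
SELECT ssl_client_dn_field('commonName');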
2.7. ssl_issuer_field()
~~~~~~~~~~~~~~~~~~~~~~~
ssl_issuer_field(fieldName text) RETURNS text;
Does the same as ssl_client_dn_field, but for the certificate issuer
rather than the certificate subject.
UUID Generation Functions
=========================
Peter Eisentraut <peter_e@gmx.net>
This module provides functions to generate universally unique
identifiers (UUIDs) using one of several standard algorithms, as
well as functions to produce certain special UUID constants.
Installation
------------
The extra library required can be found at
<http://www.ossp.org/pkg/lib/uuid/>.
UUID Generation
---------------
The relevant standards ITU-T Rec. X.667, ISO/IEC 9834-8:2005, and RFC
4122 specify four algorithms for generating UUIDs, identified by the
version numbers 1, 3, 4, and 5. (There is no version 2 algorithm.)
Each of these algorithms could be suitable for a different set of
applications.
uuid_generate_v1()
~~~~~~~~~~~~~~~~~~
This function generates a version 1 UUID. This involves the MAC
address of the computer and a time stamp. Note that UUIDs of this
kind reveal the identity of the computer that created the identifier
and the time at which it did so, which might make it unsuitable for
certain security-sensitive applications.
uuid_generate_v1mc()
~~~~~~~~~~~~~~~~~~~~
This function generates a version 1 UUID but uses a random multicast
MAC address instead of the real MAC address of the computer.
uuid_generate_v3(namespace uuid, name text)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This function generates a version 3 UUID in the given namespace using
the specified input name. The namespace should be one of the special
constants produced by the uuid_ns_*() functions shown below. (It
could be any UUID in theory.) The name is an identifier in the
selected namespace. For example:
uuid_generate_v3(uuid_ns_url(), 'http://www.postgresql.org')
The name parameter will be MD5-hashed, so the cleartext cannot be
derived from the generated UUID.
The generation of UUIDs by this method has no random or
environment-dependent element and is therefore reproducible.
uuid_generate_v4()
~~~~~~~~~~~~~~~~~~
This function generates a version 4 UUID, which is derived entirely
from random numbers.
uuid_generate_v5(namespace uuid, name text)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This function generates a version 5 UUID, which works like a version 3
UUID except that SHA-1 is used as a hashing method. Version 5 should
be preferred over version 3 because SHA-1 is thought to be more secure
than MD5.
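For example, the version 5 analogue of the version 3 example above:
uuid_generate_v5(uuid_ns_url(), 'http://www.postgresql.org')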
UUID Constants
--------------
uuid_nil()
A "nil" UUID constant, which does not occur as a real UUID.
uuid_ns_dns()
Constant designating the DNS namespace for UUIDs.
uuid_ns_url()
Constant designating the URL namespace for UUIDs.
uuid_ns_oid()
Constant designating the ISO object identifier (OID) namespace for
UUIDs. (This pertains to ASN.1 OIDs, unrelated to the OIDs used in
PostgreSQL.)
uuid_ns_x500()
Constant designating the X.500 distinguished name (DN) namespace for
UUIDs.
$PostgreSQL: pgsql/contrib/vacuumlo/README.vacuumlo,v 1.5 2005/06/23 00:06:37 tgl Exp $
This is a simple utility that will remove any orphaned large objects from a
PostgreSQL database. An orphaned LO is considered to be any LO whose OID
does not appear in any OID data column of the database.
If you use this, you may also be interested in the lo_manage trigger in
contrib/lo. lo_manage is useful to try to avoid creating orphaned LOs
in the first place.
Compiling
---------
Simply run make. A single executable "vacuumlo" is created.
Usage
-----
vacuumlo [options] database [database2 ... databasen]
All databases named on the command line are processed. Available options
include:
-v Write a lot of progress messages
-n Don't remove large objects, just show what would be done
-U username Username to connect as
-W Prompt for password
-h hostname Database server host
-p port Database server port
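For example, to see which large objects would be removed from a
database named "mydb" (a hypothetical name) without actually deleting
them:
vacuumlo -v -n mydb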
Method
------
First, it builds a temporary table which contains all of the OIDs of the
large objects in that database.
It then scans through all columns in the database that are of type "oid"
or "lo", and removes matching entries from the temporary table.
The remaining entries in the temp table identify orphaned LOs. These are
removed.
Notes
-----
I decided to place this in contrib as it needs further testing, but
hopefully this (or a variant of it) will make it into the backend as a
"vacuum lo" command in a later release.
Peter Mount <peter@retep.org.uk>
http://www.retep.org.uk
March 21 1999
Committed April 10 1999 Peter
<sect1>
<title>adminpack</title>
<para>
adminpack is a PostgreSQL standard module that implements a number of
support functions which pgAdmin and other administration and management tools
can use to provide additional functionality if installed on a server.
</para>
<sect2>
<title>Functions implemented</title>
<para>
Functions implemented by adminpack can only be run by a superuser. Here's a
list of these functions:
</para>
<para>
<programlisting>
int8 pg_catalog.pg_file_write(fname text, data text, append bool)
bool pg_catalog.pg_file_rename(oldname text, newname text, archivname text)
bool pg_catalog.pg_file_rename(oldname text, newname text)
bool pg_catalog.pg_file_unlink(fname text)
setof record pg_catalog.pg_logdir_ls()
/* Renaming of existing backend functions for pgAdmin compatibility */
int8 pg_catalog.pg_file_read(fname text, data text, append bool)
bigint pg_catalog.pg_file_length(text)
int4 pg_catalog.pg_logfile_rotate()
</programlisting>
</para>
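<para>
For example, a minimal sketch of calling one of these functions (the
file name and contents here are purely illustrative):
</para>
<programlisting>
SELECT pg_catalog.pg_file_write('test.txt', 'hello', false);
</programlisting>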
</sect2>
</sect1>
<sect1>
<!--
<indexterm zone="btree-gist">
<primary>btree-gist</primary>
</indexterm>
-->
<title>btree-gist</title>
<para>
btree-gist is a B-Tree implementation using GiST that supports the int2, int4,
int8, float4, float8, timestamp with/without time zone, time
with/without time zone, date, interval, oid, money, macaddr, char,
varchar/text, bytea, numeric, bit, varbit and inet/cidr types.
</para>
<sect2>
<title>Example usage</title>
<programlisting>
CREATE TABLE test (a int4);
-- create index
CREATE INDEX testidx ON test USING gist (a);
-- query
SELECT * FROM test WHERE a &lt; 10;
</programlisting>
</sect2>
<sect2>
<title>Authors</title>
<para>
All work was done by Teodor Sigaev (<email>teodor@stack.net</email>),
Oleg Bartunov (<email>oleg@sai.msu.su</email>), and Janko Richter
(<email>jankorichter@yahoo.de</email>). See
<ulink url="http://www.sai.msu.su/~megera/postgres/gist"></ulink> for additional
information.
</para>
</sect2>
</sect1>
Pg_buffercache - Real time queries on the shared buffer cache.
--------------
This module consists of a C function 'pg_buffercache_pages()' that returns
a set of records, plus a view 'pg_buffercache' to wrap the function.
The intent is to do for the buffer cache what pg_locks does for locks, i.e.,
the ability to examine what is happening at any given time without having
to restart or rebuild the server with debugging code added.
<sect1 id="buffercache">
<title>pg_buffercache</title>
<indexterm zone="buffercache">
<primary>pg_buffercache</primary>
</indexterm>
<para>
<literal>pg_buffercache</literal> module provides the means for examining
what's happening to the buffercache at any given time without having to
restart or rebuild the server with debugging code added. The intent is to
do for the buffercache what pg_locks does for locks.
</para>
<para>
This module consists of a C function <literal>pg_buffercache_pages()</literal>
that returns a set of records, plus a view <literal>pg_buffercache</literal>
to wrap the function.
</para>
<para>
By default, public access is REVOKED from both of these, just in case there
are security issues lurking.
</para>
<sect2>
<title>Installation</title>
<para>
Build and install the main PostgreSQL source, then this contrib module:
</para>
<programlisting>
$ cd contrib/pg_buffercache
$ gmake
$ gmake install
</programlisting>
<para>
To register the functions:
</para>
<programlisting>
$ psql -d &lt;database&gt; -f pg_buffercache.sql
</programlisting>
</sect2>
<sect2>
<title>Notes</title>
<para>
The definition of the columns exposed in the view is:
</para>
<programlisting>
Column | references | Description
----------------+----------------------+------------------------------------
bufferid | | Id, 1..shared_buffers.
relblocknumber | | Offset of the page in the relation.
isdirty | | Is the page dirty?
usagecount | | Page LRU count
</programlisting>
<para>
There is one row for each buffer in the shared cache. Unused buffers are
shown with all fields null except bufferid.
</para>
<para>
Because the cache is shared by all the databases, there are pages from
relations not belonging to the current database.
</para>
<para>
When the pg_buffercache view is accessed, internal buffer manager locks are
taken, and a copy of the buffer cache data is made for the view to display.
This ensures that the view produces a consistent set of results, while not
blocking normal buffer activity longer than necessary. Nonetheless there
could be some impact on database performance if this view is read often.
</para>
</sect2>
<sect2>
<title>Sample output</title>
<programlisting>
regression=# \d pg_buffercache;
View "public.pg_buffercache"
Column | Type | Modifiers
(10 rows)
regression=#
</programlisting>
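<para>
For example, a simple aggregate over the view, using the
<literal>usagecount</literal> column described above (a sketch; the
output depends on the state of the cache):
</para>
<programlisting>
regression=# SELECT usagecount, count(*) FROM pg_buffercache GROUP BY usagecount;
</programlisting>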
</sect2>
<sect2>
<title>Authors</title>
<itemizedlist>
<listitem>
<para>
Mark Kirkwood <email>markir@paradise.net.nz</email>
</para>
</listitem>
<listitem>
<para>Design suggestions: Neil Conway <email>neilc@samurai.com</email></para>
</listitem>
<listitem>
<para>Debugging advice: Tom Lane <email>tgl@sss.pgh.pa.us</email></para>
</listitem>
</itemizedlist>
</sect2>
</sect1>
<sect1 id="chkpass">
<title>chkpass</title>
<!--
<indexterm zone="chkpass">
<primary>chkpass</primary>
</indexterm>
-->
<para>
chkpass is a password type that is automatically checked and converted upon
entry. It is stored encrypted. To compare, simply compare against a clear
text password and the comparison function will encrypt it before comparing.
It also returns an error if the code determines that the password is easily
crackable. This is currently a stub that does nothing.
</para>
<para>
Note that the chkpass data type is not indexable.
<!--
I haven't worried about making this type indexable. I doubt that anyone
would ever need to sort a file in order of encrypted password.
-->
</para>
<para>
If you precede the string with a colon, the encryption and checking are
skipped so that you can enter existing passwords into the field.
</para>
<para>
On output, a colon is prepended. This makes it possible to dump and reload
passwords without re-encrypting them. If you want the password (encrypted)
without the colon then use the raw() function. This allows you to use the
type with things like Apache's Auth_PostgreSQL module.
</para>
<para>
The encryption uses the standard Unix function crypt(), and so it suffers
from all the usual limitations of that function; notably that only the
first eight characters of a password are considered.
</para>
<para>
Here is some sample usage:
</para>
<programlisting>
test=# create table test (p chkpass);
CREATE TABLE
test=# insert into test values ('hello');
INSERT 0 1
test=# select * from test;
p
----------------
:dVGkpXdOrE3ko
(1 row)
test=# select raw(p) from test;
raw
---------------
dVGkpXdOrE3ko
(1 row)
test=# select p = 'hello' from test;
?column?
----------
t
(1 row)
test=# select p = 'goodbye' from test;
?column?
----------
f
(1 row)
</programlisting>
<sect2>
<title>Author</title>
<para>
D'Arcy J.M. Cain <email>darcy@druid.net</email>
</para>
</sect2>
</sect1>
<chapter id="contrib">
<title>Standard Modules</title>
<para>
This section contains information regarding the standard modules which
can be found in the <literal>contrib</literal> directory of the
PostgreSQL distribution. These are porting tools, analysis utilities,
and plug-in features that are not part of the core PostgreSQL system,
mainly because they address a limited audience or are too experimental
to be part of the main source tree. This does not preclude their
usefulness.
</para>
<para>
Some modules supply new user-defined functions, operators, or types. In
these cases, you will need to run <literal>make</literal> and <literal>make
install</literal> in <literal>contrib/module</literal>. After you have
installed the files you need to register the new entities in the database
system by running the commands in the supplied .sql file. For example,
<programlisting>
$ psql -d dbname -f module.sql
</programlisting>
</para>
&adminpack;
&btree-gist;
&chkpass;
&cube;
&dblink;
&earthdistance;
&fuzzystrmatch;
&hstore;
&intagg;
&intarray;
&isn;
&lo;
&ltree;
&oid2name;
&pageinspect;
&pgbench;
&buffercache;
&pgcrypto;
&freespacemap;
&pgrowlocks;
&standby;
&pgstattuple;
&trgm;
&seg;
&sslinfo;
&tablefunc;
&uuid-ossp;
&vacuumlo;
&xml2;
</chapter>
<!-- $PostgreSQL: pgsql/doc/src/sgml/filelist.sgml,v 1.52 2007/11/10 23:30:46 momjian Exp $ -->
<!entity history SYSTEM "history.sgml">
<!entity info SYSTEM "info.sgml">
<!entity sources SYSTEM "sources.sgml">
<!entity storage SYSTEM "storage.sgml">
<!-- contrib information -->
<!entity contrib SYSTEM "contrib.sgml">
<!entity adminpack SYSTEM "adminpack.sgml">
<!entity btree-gist SYSTEM "btree-gist.sgml">
<!entity chkpass SYSTEM "chkpass.sgml">
<!entity cube SYSTEM "cube.sgml">
<!entity dblink SYSTEM "dblink.sgml">
<!entity earthdistance SYSTEM "earthdistance.sgml">
<!entity fuzzystrmatch SYSTEM "fuzzystrmatch.sgml">
<!entity hstore SYSTEM "hstore.sgml">
<!entity intagg SYSTEM "intagg.sgml">
<!entity intarray SYSTEM "intarray.sgml">
<!entity isn SYSTEM "isn.sgml">
<!entity lo SYSTEM "lo.sgml">
<!entity ltree SYSTEM "ltree.sgml">
<!entity oid2name SYSTEM "oid2name.sgml">
<!entity pageinspect SYSTEM "pageinspect.sgml">
<!entity pgbench SYSTEM "pgbench.sgml">
<!entity buffercache SYSTEM "buffercache.sgml">
<!entity pgcrypto SYSTEM "pgcrypto.sgml">
<!entity freespacemap SYSTEM "freespacemap.sgml">
<!entity pgrowlocks SYSTEM "pgrowlocks.sgml">
<!entity standby SYSTEM "standby.sgml">
<!entity pgstattuple SYSTEM "pgstattuple.sgml">
<!entity trgm SYSTEM "trgm.sgml">
<!entity seg SYSTEM "seg.sgml">
<!entity sslinfo SYSTEM "sslinfo.sgml">
<!entity tablefunc SYSTEM "tablefunc.sgml">
<!entity uuid-ossp SYSTEM "uuid-ossp.sgml">
<!entity vacuumlo SYSTEM "vacuumlo.sgml">
<!entity xml2 SYSTEM "xml2.sgml">
<!-- appendixes -->
<!entity contacts SYSTEM "contacts.sgml">
<!entity cvs SYSTEM "cvs.sgml">
<!-- $PostgreSQL: pgsql/doc/src/sgml/postgres.sgml,v 1.84 2007/11/10 23:30:46 momjian Exp $ -->
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
&typeconv;
&indices;
&textsearch;
&contrib;
&mvcc;
&perform;