Commit 706a32cd authored by Peter Eisentraut

Big editing for consistent content and presentation.

parent 31e69ccb
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/advanced.sgml,v 1.32 2003/02/19 04:06:27 momjian Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/advanced.sgml,v 1.33 2003/03/13 01:30:24 petere Exp $
-->
<chapter id="tutorial-advanced">
......@@ -344,14 +344,14 @@ SELECT name, altitude
which returns:
<screen>
<screen>
name | altitude
-----------+----------
Las Vegas | 2174
Mariposa | 1953
Madison | 845
(3 rows)
</screen>
</screen>
</para>
<para>
......
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/array.sgml,v 1.24 2002/11/11 20:14:02 petere Exp $ -->
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/array.sgml,v 1.25 2003/03/13 01:30:26 petere Exp $ -->
<sect1 id="arrays">
<title>Arrays</title>
......@@ -10,8 +10,14 @@
<para>
<productname>PostgreSQL</productname> allows columns of a table to be
defined as variable-length multidimensional arrays. Arrays of any
built-in type or user-defined type can be created. To illustrate
their use, we create this table:
built-in type or user-defined type can be created.
</para>
<sect2>
<title>Declaration of Array Types</title>
<para>
To illustrate the use of array types, we create this table:
<programlisting>
CREATE TABLE sal_emp (
name text,
......@@ -20,24 +26,27 @@ CREATE TABLE sal_emp (
);
</programlisting>
As shown, an array data type is named by appending square brackets
(<literal>[]</>) to the data type name of the array elements.
The above command will create a table named
<structname>sal_emp</structname> with columns including
a <type>text</type> string (<structfield>name</structfield>),
a one-dimensional array of type
<type>integer</type> (<structfield>pay_by_quarter</structfield>),
which represents the employee's salary by quarter, and a
two-dimensional array of <type>text</type>
(<structfield>schedule</structfield>), which represents the
employee's weekly schedule.
(<literal>[]</>) to the data type name of the array elements. The
above command will create a table named
<structname>sal_emp</structname> with a column of type
<type>text</type> (<structfield>name</structfield>), a
one-dimensional array of type <type>integer</type>
(<structfield>pay_by_quarter</structfield>), which represents the
employee's salary by quarter, and a two-dimensional array of
<type>text</type> (<structfield>schedule</structfield>), which
represents the employee's weekly schedule.
</para>
</sect2>
<sect2>
<title>Array Value Input</title>
<para>
Now we do some <command>INSERT</command>s. Observe that to write an array
Now we can show some <command>INSERT</command> statements. To write an array
value, we enclose the element values within curly braces and separate them
by commas. If you know C, this is not unlike the syntax for
initializing structures. (More details appear below.)
<programlisting>
INSERT INTO sal_emp
VALUES ('Bill',
......@@ -51,8 +60,21 @@ INSERT INTO sal_emp
</programlisting>
</para>
<note>
<para>
A limitation of the present array implementation is that individual
elements of an array cannot be SQL null values. The entire array can be set
to null, but you can't have an array with some elements null and some
not. Fixing this is on the to-do list.
</para>
</note>
</sect2>
<sect2>
<title>Array Value References</title>
<para>
Now, we can run some queries on <structname>sal_emp</structname>.
Now, we can run some queries on the table.
First, we show how to access a single element of an array at a time.
This query retrieves the names of the employees whose pay changed in
the second quarter:
......@@ -91,7 +113,7 @@ SELECT pay_by_quarter[3] FROM sal_emp;
We can also access arbitrary rectangular slices of an array, or
subarrays. An array slice is denoted by writing
<literal><replaceable>lower-bound</replaceable>:<replaceable>upper-bound</replaceable></literal>
for one or more array dimensions. This query retrieves the first
for one or more array dimensions. For example, this query retrieves the first
item on Bill's schedule for the first two days of the week:
<programlisting>
......@@ -109,7 +131,7 @@ SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill';
SELECT schedule[1:2][1] FROM sal_emp WHERE name = 'Bill';
</programlisting>
with the same result. An array subscripting operation is taken to
with the same result. An array subscripting operation is always taken to
represent an array slice if any of the subscripts are written in the
form
<literal><replaceable>lower</replaceable>:<replaceable>upper</replaceable></literal>.
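The slice rule in this hunk can be sketched outside the database. The following Python fragment is only an illustration, not how PostgreSQL implements subscripting: it assumes the default lower bound of 1, and both the helper name pg_slice and the schedule value are hypothetical.

```python
# Hypothetical 2-D schedule value: two days, two items per day.
SCHEDULE = [["meeting", "lunch"], ["training", "presentation"]]


def pg_slice(array, *bounds):
    """Slice a nested list with 1-based, inclusive (lower, upper) bounds,
    one pair per dimension, mimicking array[lower:upper][lower:upper]."""
    if not bounds:
        return array
    (lower, upper), rest = bounds[0], bounds[1:]
    # Python slices are 0-based and exclusive at the top, hence lower - 1.
    return [pg_slice(item, *rest) for item in array[lower - 1:upper]]


# schedule[1:2][1:1]: first item for the first two days.
first_items = pg_slice(SCHEDULE, (1, 2), (1, 1))

# schedule[1:2][1]: once any subscript is a slice, a bare subscript n is
# read as 1:n, so this is the same request.
same_items = pg_slice(SCHEDULE, (1, 2), (1, 1))
```

Both calls select the same sub-array, which is why the two queries in the diff are described as giving the same result.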
......@@ -199,10 +221,15 @@ SELECT array_dims(schedule) FROM sal_emp WHERE name = 'Carol';
array_lower</function> return the upper/lower bound of the
given array dimension, respectively.
</para>
</sect2>
<sect2>
<title>Searching in Arrays</title>
<para>
To search for a value in an array, you must check each value of the
array. This can be done by hand (if you know the size of the array):
array. This can be done by hand (if you know the size of the array).
For example:
<programlisting>
SELECT * FROM sal_emp WHERE pay_by_quarter[1] = 10000 OR
......@@ -212,8 +239,8 @@ SELECT * FROM sal_emp WHERE pay_by_quarter[1] = 10000 OR
</programlisting>
However, this quickly becomes tedious for large arrays, and is not
helpful if the size of the array is unknown. Although it is not part
of the primary <productname>PostgreSQL</productname> distribution,
helpful if the size of the array is unknown. Although it is not built
into <productname>PostgreSQL</productname>,
there is an extension available that defines new functions and
operators for iterating over array values. Using this, the above
query could be:
......@@ -222,7 +249,7 @@ SELECT * FROM sal_emp WHERE pay_by_quarter[1] = 10000 OR
SELECT * FROM sal_emp WHERE pay_by_quarter[1:4] *= 10000;
</programlisting>
To search the entire array (not just specified columns), you could
To search the entire array (not just specified slices), you could
use:
<programlisting>
......@@ -249,18 +276,11 @@ SELECT * FROM sal_emp WHERE pay_by_quarter **= 10000;
Tables can obviously be searched easily.
</para>
</tip>
</sect2>
<note>
<para>
A limitation of the present array implementation is that individual
elements of an array cannot be SQL null values. The entire array can be set
to null, but you can't have an array with some elements null and some
not. Fixing this is on the to-do list.
</para>
</note>
<sect2>
<title>Array Input and Output Syntax</title>
<formalpara>
<title>Array input and output syntax.</title>
<para>
The external representation of an array value consists of items that
are interpreted according to the I/O conversion rules for the array's
......@@ -280,10 +300,11 @@ SELECT * FROM sal_emp WHERE pay_by_quarter **= 10000;
is not ignored, however: after skipping leading whitespace, everything
up to the next right brace or delimiter is taken as the item value.
</para>
</formalpara>
</sect2>
<sect2>
<title>Quoting Array Elements</title>
<formalpara>
<title>Quoting array elements.</title>
<para>
As shown above, when writing an array value you may write double
quotes around any individual array
......@@ -295,7 +316,6 @@ SELECT * FROM sal_emp WHERE pay_by_quarter **= 10000;
Alternatively, you can use backslash-escaping to protect all data characters
that would otherwise be taken as array syntax or ignorable white space.
</para>
</formalpara>
<para>
The array output routine will put double quotes around element values
......@@ -308,7 +328,7 @@ SELECT * FROM sal_emp WHERE pay_by_quarter **= 10000;
<productname>PostgreSQL</productname> releases.)
</para>
<tip>
<note>
<para>
Remember that what you write in an SQL command will first be interpreted
as a string literal, and then as an array. This doubles the number of
......@@ -325,6 +345,7 @@ INSERT ... VALUES ('{"\\\\","\\""}');
<type>bytea</> for example, we might need as many as eight backslashes
in the command to get one backslash into the stored array element.)
</para>
</tip>
</note>
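The backslash doubling described in this note can be traced with a small simulation. This is a simplified sketch, not the server's parser: it assumes that a backslash simply makes the next character literal at both levels, it handles only one-dimensional arrays, and the function names are hypothetical.

```python
def unescape_string_literal(text):
    """First pass: the string-literal processor removes one level of
    backslashes (a backslash makes the next character literal)."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            ch = next(chars, "")
        out.append(ch)
    return "".join(out)


def parse_array(text):
    """Second pass: split a one-dimensional array literal such as {"a","b"}
    into its elements, honoring double quotes and backslash escapes.
    Simplification: an empty array {} is not treated specially."""
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError("array literal must be enclosed in braces")
    items, current, in_quotes = [], [], False
    chars = iter(text[1:-1])
    for ch in chars:
        if ch == "\\":
            current.append(next(chars, ""))  # escaped character is data
        elif ch == '"':
            in_quotes = not in_quotes        # quotes delimit, not data
        elif ch == "," and not in_quotes:
            items.append("".join(current))
            current = []
        else:
            current.append(ch)
    items.append("".join(current))
    return items


# The text typed between the single quotes in the INSERT shown above.
sql_text = r'{"\\\\","\\""}'
array_text = unescape_string_literal(sql_text)
elements = parse_array(array_text)
```

After the first pass the array parser sees two backslashes in the first element and an escaped double quote in the second; after the second pass the stored elements are a single backslash and a single double quote.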
</sect2>
</sect1>
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v 2.31 2003/01/19 00:13:28 momjian Exp $ -->
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v 2.32 2003/03/13 01:30:26 petere Exp $ -->
<chapter id="charset">
<title>Localization</>
......@@ -75,7 +75,7 @@
<command>initdb</command> exactly which locale you want with the
option <option>--locale</option>. For example:
<screen>
<prompt>$ </><userinput>initdb --locale=sv_SE</>
initdb --locale=sv_SE
</screen>
</para>
......@@ -517,7 +517,7 @@ perl: warning: Falling back to the standard locale ("C").
for a <productname>PostgreSQL</productname> installation. For example:
<screen>
$ <userinput>initdb -E EUC_JP</>
initdb -E EUC_JP
</screen>
sets the default encoding to <literal>EUC_JP</literal> (Extended Unix Code for Japanese).
......@@ -531,7 +531,7 @@ $ <userinput>initdb -E EUC_JP</>
You can create a database with a different encoding:
<screen>
$ <userinput>createdb -E EUC_KR korean</>
createdb -E EUC_KR korean
</screen>
will create a database named <database>korean</database> with <literal>EUC_KR</literal> encoding.
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/client-auth.sgml,v 1.45 2003/02/13 05:47:46 momjian Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/client-auth.sgml,v 1.46 2003/03/13 01:30:26 petere Exp $
-->
<chapter id="client-authentication">
......@@ -40,7 +40,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/client-auth.sgml,v 1.45 2003/02/13 05:47:46
runs. If all the users of a particular server also have accounts on
the server's machine, it makes sense to assign database user names
that match their operating system user names. However, a server that
accepts remote connections may have many users who have no local
accepts remote connections may have many database users who have no local operating system
account, and in such cases there need be no connection between
database user names and OS user names.
</para>
......@@ -64,7 +64,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/client-auth.sgml,v 1.45 2003/02/13 05:47:46
<para>
The general format of the <filename>pg_hba.conf</filename> file is
a set of records, one per line. Blank lines are ignored, as is any
text after the <quote>#</quote> comment character. A record is made
text after the <literal>#</literal> comment character. A record is made
up of a number of fields which are separated by spaces and/or tabs.
Fields can contain white space if the field value is quoted. Records
cannot be continued across lines.
......@@ -84,11 +84,11 @@ $Header: /cvsroot/pgsql/doc/src/sgml/client-auth.sgml,v 1.45 2003/02/13 05:47:46
<para>
A record may have one of the three formats
<synopsis>
<synopsis>
local <replaceable>database</replaceable> <replaceable>user</replaceable> <replaceable>authentication-method</replaceable> <optional><replaceable>authentication-option</replaceable></optional>
host <replaceable>database</replaceable> <replaceable>user</replaceable> <replaceable>IP-address</replaceable> <replaceable>IP-mask</replaceable> <replaceable>authentication-method</replaceable> <optional><replaceable>authentication-option</replaceable></optional>
hostssl <replaceable>database</replaceable> <replaceable>user</replaceable> <replaceable>IP-address</replaceable> <replaceable>IP-mask</replaceable> <replaceable>authentication-method</replaceable> <optional><replaceable>authentication-option</replaceable></optional>
</synopsis>
</synopsis>
The meaning of the fields is as follows:
<variablelist>
......@@ -96,7 +96,7 @@ hostssl <replaceable>database</replaceable> <replaceable>user</replaceable> <
<term><literal>local</literal></term>
<listitem>
<para>
This record matches connection attempts using Unix domain
This record matches connection attempts using Unix-domain
sockets. Without a record of this type, Unix-domain socket
connections are disallowed
</para>
......@@ -181,11 +181,9 @@ hostssl <replaceable>database</replaceable> <replaceable>user</replaceable> <
numerically, not as domain or host names.) Taken together they
specify the client machine IP addresses that this record
matches. The precise logic is that
<blockquote>
<informalfigure>
<programlisting>(<replaceable>actual-IP-address</replaceable> xor <replaceable>IP-address-field</replaceable>) and <replaceable>IP-mask-field</replaceable></programlisting>
</informalfigure>
</blockquote>
<programlisting>
(<replaceable>actual-IP-address</replaceable> xor <replaceable>IP-address-field</replaceable>) and <replaceable>IP-mask-field</replaceable>
</programlisting>
must be zero for the record to match. (Of course IP addresses
can be spoofed but this consideration is beyond the scope of
<productname>PostgreSQL</productname>.) If your machine supports
......@@ -217,7 +215,7 @@ hostssl <replaceable>database</replaceable> <replaceable>user</replaceable> <
<para>
The connection is allowed unconditionally. This method
allows anyone that can connect to the
<productname>PostgreSQL</productname> database to log in as
<productname>PostgreSQL</productname> database server to log in as
any <productname>PostgreSQL</productname> user they like,
without the need for a password. See <xref
linkend="auth-trust"> for details.
......@@ -251,7 +249,7 @@ hostssl <replaceable>database</replaceable> <replaceable>user</replaceable> <
<term><literal>crypt</></term>
<listitem>
<para>
Like <literal>md5</literal> method but uses older crypt
Like the <literal>md5</literal> method but uses older <function>crypt()</>
encryption, which is needed for pre-7.2 clients.
<literal>md5</literal> is preferred for 7.2 and later clients.
See <xref linkend="auth-password"> for details.
......@@ -263,7 +261,7 @@ hostssl <replaceable>database</replaceable> <replaceable>user</replaceable> <
<term><literal>password</></term>
<listitem>
<para>
Same as "md5", but the password is sent in clear text over the
Same as <literal>md5</>, but the password is sent in clear text over the
network. This should not be used on untrusted networks.
See <xref linkend="auth-password"> for details.
</para>
......@@ -306,11 +304,11 @@ hostssl <replaceable>database</replaceable> <replaceable>user</replaceable> <
<para>
If you use the map <literal>sameuser</literal>, the user
names are assumed to be identical. If not, the map name is
names are required to be identical. If not, the map name is
looked up in the file <filename>pg_ident.conf</filename>
in the same directory as <filename>pg_hba.conf</filename>.
The connection is accepted if that file contains an
entry for this map name with the ident-supplied user name
entry for this map name with the operating-system user name
and the requested <productname>PostgreSQL</productname> user
name.
</para>
......@@ -365,8 +363,8 @@ hostssl <replaceable>database</replaceable> <replaceable>user</replaceable> <
match parameters and weaker authentication methods, while later
records will have looser match parameters and stronger authentication
methods. For example, one might wish to use <literal>trust</>
authentication for local TCP connections but require a password for
remote TCP connections. In this case a record specifying
authentication for local TCP/IP connections but require a password for
remote TCP/IP connections. In this case a record specifying
<literal>trust</> authentication for connections from 127.0.0.1 would
appear before a record specifying password authentication for a wider
range of allowed client IP addresses.
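The ordering advice above relies on the first matching record being the one that is used. A minimal Python sketch of that behavior, combined with the address test quoted earlier, (actual-IP-address xor IP-address-field) and IP-mask-field must be zero, might look like this. The record list, the function names, and the restriction to IPv4 host records are assumptions made for illustration; database and user matching are left out.

```python
import ipaddress


def ip_matches(actual, address, mask):
    """Return True when (actual xor address) and mask is zero."""
    a = int(ipaddress.IPv4Address(actual))
    f = int(ipaddress.IPv4Address(address))
    m = int(ipaddress.IPv4Address(mask))
    return ((a ^ f) & m) == 0


# Hypothetical host records, tightest match first:
# (IP-address, IP-mask, authentication-method)
RECORDS = [
    ("127.0.0.1", "255.255.255.255", "trust"),  # local TCP/IP connections
    ("192.168.0.0", "255.255.0.0", "md5"),      # wider range, password
]


def pick_method(client_ip, records=RECORDS):
    """Return the method of the first matching record, or None if no
    record matches (the connection would be rejected)."""
    for address, mask, method in records:
        if ip_matches(client_ip, address, mask):
            return method
    return None
```

With these records a client at 127.0.0.1 gets trust, a client elsewhere on the 192.168 network must supply a password, and any other address matches nothing. Swapping the two records would not change the result here, but a loose record placed before a tight one that it overlaps would shadow the tight one.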
......@@ -374,27 +372,26 @@ hostssl <replaceable>database</replaceable> <replaceable>user</replaceable> <
<important>
<para>
Do not prevent the superuser from accessing the template1
database. Various utility commands need access to template1.
Do not prevent the superuser from accessing the <literal>template1</literal>
database. Various utility commands need access to <literal>template1</literal>.
</para>
</important>
<para>
<indexterm>
<primary>SIGHUP</primary>
</indexterm>
The <filename>pg_hba.conf</filename> file is read on start-up and when
the <application>postmaster</> receives a
<systemitem>SIGHUP</systemitem> signal. If you edit the file on an
active system, you will need to signal the <application>postmaster</>
the main server process (<command>postmaster</>) receives a
<systemitem>SIGHUP</systemitem><indexterm><primary>SIGHUP</primary></indexterm>
signal. If you edit the file on an
active system, you will need to signal the <command>postmaster</>
(using <literal>pg_ctl reload</> or <literal>kill -HUP</>) to make it
re-read the file.
</para>
<para>
An example of a <filename>pg_hba.conf</filename> file is shown in
<xref linkend="example-pg-hba.conf">. See below for details on the
<xref linkend="example-pg-hba.conf">. See the next section for details on the
different authentication methods.
</para>
<example id="example-pg-hba.conf">
<title>An example <filename>pg_hba.conf</filename> file</title>
......@@ -462,7 +459,6 @@ local all @admins,+support md5
local db1,db2,@demodbs all md5
</programlisting>
</example>
</para>
</sect1>
<sect1 id="auth-methods">
......@@ -479,8 +475,8 @@ local db1,db2,@demodbs all md5
<productname>PostgreSQL</productname> assumes that anyone who can
connect to the server is authorized to access the database as
whatever database user he specifies (including the database superuser).
This method should only be used when there is adequate system-level
protection on connections to the postmaster port.
This method should only be used when there is adequate operating system-level
protection on connections to the server.
</para>
<para>
......@@ -488,8 +484,8 @@ local db1,db2,@demodbs all md5
convenient for local connections on a single-user workstation. It
is usually <emphasis>not</> appropriate by itself on a multiuser
machine. However, you may be able to use <literal>trust</> even
on a multiuser machine, if you restrict access to the postmaster's
socket file using file-system permissions. To do this, set the
on a multiuser machine, if you restrict access to the server's
Unix-domain socket file using file-system permissions. To do this, set the
<varname>unix_socket_permissions</varname> (and possibly
<varname>unix_socket_group</varname>) configuration parameters as
described in <xref linkend="runtime-config-general">. Or you
......@@ -500,18 +496,18 @@ local db1,db2,@demodbs all md5
<para>
Setting file-system permissions only helps for Unix-socket connections.
Local TCP connections are not restricted by it; therefore, if you want
to use permissions for local security, remove the <literal>host ...
Local TCP/IP connections are not restricted by it; therefore, if you want
to use file-system permissions for local security, remove the <literal>host ...
127.0.0.1 ...</> line from <filename>pg_hba.conf</>, or change it to a
non-<literal>trust</> authentication method.
</para>
<para>
<literal>trust</> authentication is only suitable for TCP connections
<literal>trust</> authentication is only suitable for TCP/IP connections
if you trust every user on every machine that is allowed to connect
to the server by the <filename>pg_hba.conf</> lines that specify
<literal>trust</>. It is seldom reasonable to use <literal>trust</>
for any TCP connections other than those from <systemitem>localhost</> (127.0.0.1).
for any TCP/IP connections other than those from <systemitem>localhost</> (127.0.0.1).
</para>
</sect2>
......@@ -530,7 +526,7 @@ local db1,db2,@demodbs all md5
</indexterm>
<para>
Password-based authentication methods include <literal>md5</>,
The password-based authentication methods are <literal>md5</>,
<literal>crypt</>, and <literal>password</>. These methods operate
similarly except for the way that the password is sent across the
connection. If you are at all concerned about password
......@@ -545,7 +541,7 @@ local db1,db2,@demodbs all md5
<productname>PostgreSQL</productname> database passwords are
separate from operating system user passwords. The password for
each database user is stored in the <literal>pg_shadow</> system
catalog table. Passwords can be managed with the query language
catalog table. Passwords can be managed with the SQL
commands <command>CREATE USER</command> and <command>ALTER
USER</command>, e.g., <userinput>CREATE USER foo WITH PASSWORD
'secret';</userinput>. By default, that is, if no password has
......@@ -554,15 +550,10 @@ local db1,db2,@demodbs all md5
</para>
<para>
To restrict the set of users that are allowed to connect to certain
databases, list the users separated by commas, or in a separate
file. The file should contain user names separated by commas or one
user name per line, and be in the same directory as
<filename>pg_hba.conf</>. Mention the (base) name of the file
preceded with <literal>@</> in the user column. The
database column can similarly accept a list of values or
a file name. You can also specify group names by preceding the group
name with <literal>+</>.
To restrict the set of users that are allowed to connect to
certain databases, list the users in the <replaceable>user</>
column of <filename>pg_hba.conf</filename>, as explained in the
previous section.
</para>
</sect2>
......@@ -598,11 +589,11 @@ local db1,db2,@demodbs all md5
<para>
<productname>PostgreSQL</> operates like a normal Kerberos service.
The name of the service principal is
<replaceable>servicename/hostname@realm</>, where
<literal><replaceable>servicename</>/<replaceable>hostname</>@<replaceable>realm</></literal>, where
<replaceable>servicename</> is <literal>postgres</literal> (unless a
different service name was selected at configure time with
<literal>./configure --with-krb-srvnam=whatever</>).
<replaceable>hostname</> is the fully qualified domain name of the
<replaceable>hostname</> is the fully qualified host name of the
server machine. The service principal's realm is the preferred realm
of the server machine.
</para>
......@@ -610,7 +601,7 @@ local db1,db2,@demodbs all md5
<para>
Client principals must have their <productname>PostgreSQL</> user
name as their first component, for example
<replaceable>pgusername/otherstuff@realm</>. At present the realm of
<literal>pgusername/otherstuff@realm</>. At present the realm of
the client is not checked by <productname>PostgreSQL</>; so if you
have cross-realm authentication enabled, then any principal in any
realm that can communicate with yours will be accepted.
......@@ -619,9 +610,9 @@ local db1,db2,@demodbs all md5
<para>
Make sure that your server key file is readable (and preferably only
readable) by the <productname>PostgreSQL</productname> server
account (see <xref linkend="postgres-user">). The location of the
key file is specified with the <varname>krb_server_keyfile</> run
time configuration parameter. (See also <xref
account. (See also <xref linkend="postgres-user">.) The location of the
key file is specified with the <varname>krb_server_keyfile</> run-time
configuration parameter. (See also <xref
linkend="runtime-config">.) The default is <filename>/etc/srvtab</>
if you are using Kerberos 4 and
<filename>FILE:/usr/local/pgsql/etc/krb5.keytab</> (or whichever
......@@ -745,7 +736,7 @@ local db1,db2,@demodbs all md5
<productname>PostgreSQL</productname> checks whether that user is
allowed to connect as the database user he is requesting to connect
as. This is controlled by the ident map argument that follows the
<literal>ident</> keyword in the <filename>pg_hba.conf</filename>
<literal>ident</> key word in the <filename>pg_hba.conf</filename>
file. There is a predefined ident map <literal>sameuser</literal>,
which allows any operating system user to connect as the database
user of the same name (if the latter exists). Other maps must be
......@@ -753,10 +744,10 @@ local db1,db2,@demodbs all md5
</para>
<para>
<indexterm><primary>pg_ident.conf</primary></indexterm> Ident maps
Ident maps
other than <literal>sameuser</literal> are defined in the file
<filename>pg_ident.conf</filename> in the data directory, which
contains lines of the general form:
<filename>pg_ident.conf</filename><indexterm><primary>pg_ident.conf</primary></indexterm>
in the data directory, which contains lines of the general form:
<synopsis>
<replaceable>map-name</> <replaceable>ident-username</> <replaceable>database-username</>
</synopsis>
......@@ -771,13 +762,11 @@ local db1,db2,@demodbs all md5
</para>
<para>
<indexterm>
<primary>SIGHUP</primary>
</indexterm>
The <filename>pg_ident.conf</filename> file is read on start-up and
when the <application>postmaster</> receives a
<systemitem>SIGHUP</systemitem> signal. If you edit the file on an
active system, you will need to signal the <application>postmaster</>
when the main server process (<command>postmaster</>) receives a
<systemitem>SIGHUP</systemitem><indexterm><primary>SIGHUP</primary></indexterm>
signal. If you edit the file on an
active system, you will need to signal the <command>postmaster</>
(using <literal>pg_ctl reload</> or <literal>kill -HUP</>) to make it
re-read the file.
</para>
......@@ -788,14 +777,14 @@ local db1,db2,@demodbs all md5
linkend="example-pg-hba.conf"> is shown in <xref
linkend="example-pg-ident.conf">. In this example setup, anyone
logged in to a machine on the 192.168 network that does not have the
Unix user name <systemitem>bryanh</>, <systemitem>ann</>, or
<systemitem>robert</> would not be granted access. Unix user
<systemitem>robert</> would only be allowed access when he tries to
connect as <productname>PostgreSQL</> user <systemitem>bob</>, not
as <systemitem>robert</> or anyone else. <systemitem>ann</> would
only be allowed to connect as <systemitem>ann</>. User
<systemitem>bryanh</> would be allowed to connect as either
<systemitem>bryanh</> himself or as <systemitem>guest1</>.
Unix user name <literal>bryanh</>, <literal>ann</>, or
<literal>robert</> would not be granted access. Unix user
<literal>robert</> would only be allowed access when he tries to
connect as <productname>PostgreSQL</> user <literal>bob</>, not
as <literal>robert</> or anyone else. <literal>ann</> would
only be allowed to connect as <literal>ann</>. User
<literal>bryanh</> would be allowed to connect as either
<literal>bryanh</> himself or as <literal>guest1</>.
</para>
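The lookup described in this paragraph can be sketched in a few lines of Python. The map contents below are reconstructed from the prose and are an assumption, as are the function names and the handling of # comments; only the map name omicron and the general line form map-name ident-username database-username come from the surrounding text.

```python
# Assumed pg_ident.conf contents, reconstructed from the description above.
IDENT_CONF = """
# map-name  ident-username  database-username
omicron     bryanh          bryanh
omicron     ann             ann
omicron     robert          bob
omicron     bryanh          guest1
"""


def parse_ident_map(text):
    """Collect (map-name, ident-username, database-username) triples,
    skipping blank lines and anything after a # character."""
    entries = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        map_name, ident_user, db_user = line.split()
        entries.add((map_name, ident_user, db_user))
    return entries


def ident_allows(entries, map_name, ident_user, db_user):
    """Decide whether an operating-system user may connect as db_user."""
    if map_name == "sameuser":
        # Predefined map: the two names must be identical.
        return ident_user == db_user
    return (map_name, ident_user, db_user) in entries


ENTRIES = parse_ident_map(IDENT_CONF)
```

Under these assumed entries, robert is accepted only as bob, ann only as ann, bryanh as either bryanh or guest1, and any other operating-system user is refused.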
<example id="example-pg-ident.conf">
......@@ -818,12 +807,12 @@ omicron bryanh guest1
<title>PAM Authentication</title>
<para>
This authentication type operates similarly to
<firstterm>password</firstterm> except that it uses PAM (Pluggable
This authentication method operates similarly to
<literal>password</literal> except that it uses PAM (Pluggable
Authentication Modules) as the authentication mechanism. The
default PAM service name is <literal>postgresql</literal>. You can
optionally supply your own service name after the <literal>pam</>
keyword in the file. For more information about PAM, please read
key word in the file <filename>pg_hba.conf</filename>. For more information about PAM, please read
the <ulink
url="http://www.kernel.org/pub/linux/libs/pam/"><productname>Linux-PAM</>
Page</ulink> and the <ulink
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 momjian Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.116 2003/03/13 01:30:27 petere Exp $
-->
<chapter id="datatype">
......@@ -22,8 +22,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
</para>
<para>
<xref linkend="datatype-table"> shows all general-purpose data types
included in the standard distribution. Most of the alternative names
<xref linkend="datatype-table"> shows all built-in general-purpose data types.
Most of the alternative names
listed in the
<quote>Aliases</quote> column are the names used internally by
<productname>PostgreSQL</productname> for historical reasons. In
......@@ -31,13 +31,12 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
but they are not listed here.
</para>
<para>
<table id="datatype-table">
<title>Data Types</title>
<tgroup cols="3">
<thead>
<row>
<entry>Type Name</entry>
<entry>Name</entry>
<entry>Aliases</entry>
<entry>Description</entry>
</row>
......@@ -77,7 +76,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
<row>
<entry><type>box</type></entry>
<entry></entry>
<entry>rectangular box in 2D plane</entry>
<entry>rectangular box in the plane</entry>
</row>
<row>
......@@ -107,7 +106,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
<row>
<entry><type>circle</type></entry>
<entry></entry>
<entry>circle in 2D plane</entry>
<entry>circle in the plane</entry>
</row>
<row>
......@@ -137,19 +136,19 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
<row>
<entry><type>interval(<replaceable>p</replaceable>)</type></entry>
<entry></entry>
<entry>general-use time span</entry>
<entry>time span</entry>
</row>
<row>
<entry><type>line</type></entry>
<entry></entry>
<entry>infinite line in 2D plane (not implemented)</entry>
<entry>infinite line in the plane (not fully implemented)</entry>
</row>
<row>
<entry><type>lseg</type></entry>
<entry></entry>
<entry>line segment in 2D plane</entry>
<entry>line segment in the plane</entry>
</row>
<row>
......@@ -175,19 +174,19 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
<row>
<entry><type>path</type></entry>
<entry></entry>
<entry>open and closed geometric path in 2D plane</entry>
<entry>open and closed geometric path in the plane</entry>
</row>
<row>
<entry><type>point</type></entry>
<entry></entry>
<entry>geometric point in 2D plane</entry>
<entry>geometric point in the plane</entry>
</row>
<row>
<entry><type>polygon</type></entry>
<entry></entry>
<entry>closed geometric path in 2D plane</entry>
<entry>closed geometric path in the plane</entry>
</row>
<row>
......@@ -240,7 +239,6 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
</tbody>
</tgroup>
</table>
</para>
<note>
<title>Compatibility</title>
......@@ -264,11 +262,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
to <productname>PostgreSQL</productname>, such as open and closed
paths, or have several possibilities for formats, such as the date
and time types.
Most of the input and output functions corresponding to the
base types (e.g., integers and floating-point numbers) do some
error-checking.
Some of the input and output functions are not invertible. That is,
the result of an output function may lose precision when compared to
the result of an output function may lose accuracy when compared to
the original input.
</para>
......@@ -277,7 +272,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
addition and multiplication) do not perform run-time error-checking in the
interests of improving execution speed.
On some systems, for example, the numeric operators for some data types may
silently underflow or overflow.
silently cause underflow or overflow.
</para>
<sect1 id="datatype-numeric">
......@@ -358,8 +353,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
<tgroup cols="4">
<thead>
<row>
<entry>Type name</entry>
<entry>Storage size</entry>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
<entry>Range</entry>
</row>
......@@ -369,19 +364,19 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
<row>
<entry><type>smallint</></entry>
<entry>2 bytes</entry>
<entry>small range fixed-precision</entry>
<entry>small-range integer</entry>
<entry>-32768 to +32767</entry>
</row>
<row>
<entry><type>integer</></entry>
<entry>4 bytes</entry>
<entry>usual choice for fixed-precision</entry>
<entry>usual choice for integer</entry>
<entry>-2147483648 to +2147483647</entry>
</row>
<row>
<entry><type>bigint</></entry>
<entry>8 bytes</entry>
<entry>large range fixed-precision</entry>
<entry>large-range integer</entry>
<entry>-9223372036854775808 to 9223372036854775807</entry>
</row>
......@@ -437,10 +432,10 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
</para>
<sect2 id="datatype-int">
<title>The Integer Types</title>
<title>Integer Types</title>
<para>
The types <type>smallint</type>, <type>integer</type>,
The types <type>smallint</type>, <type>integer</type>, and
<type>bigint</type> store whole numbers, that is, numbers without
fractional components, of various ranges. Attempts to store
values outside of the allowed range will result in an error.
......@@ -501,7 +496,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.115 2003/02/19 04:06:27 m
<title>Arbitrary Precision Numbers</title>
<para>
The type <type>numeric</type> can store numbers with up to 1,000
The type <type>numeric</type> can store numbers with up to 1000
digits of precision and perform calculations exactly. It is
especially recommended for storing monetary amounts and other
quantities where exactness is required. However, the
......@@ -625,7 +620,7 @@ NUMERIC
</sect2>
<sect2 id="datatype-serial">
<title>The Serial Types</title>
<title>Serial Types</title>
<indexterm zone="datatype-serial">
<primary>serial</primary>
......@@ -654,7 +649,8 @@ NUMERIC
</indexterm>
<para>
The <type>serial</type> data type is not a true type, but merely
The data types <type>serial</type> and <type>bigserial</type>
are not true types, but merely
a notational convenience for setting up identifier columns
(similar to the <literal>AUTO_INCREMENT</literal> property
supported by some other databases). In the current
......@@ -684,6 +680,16 @@ CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
not automatic.
</para>
<note>
<para>
Prior to <productname>PostgreSQL</productname> 7.3, <type>serial</type>
implied <literal>UNIQUE</literal>. This is no longer automatic. If
you wish a serial column to be in a unique constraint or a
primary key, it must now be specified, same as with
any other data type.
</para>
</note>
<para>
To use a <type>serial</type> column to insert the next value of
the sequence into the table, specify that the <type>serial</type>
......@@ -705,7 +711,7 @@ CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
<para>
The sequence created by a <type>serial</type> type is
automatically dropped when the owning column is dropped, and
automatically dropped when the owning column is dropped and
cannot be dropped otherwise. (This was not true in
<productname>PostgreSQL</productname> releases before 7.3. Note
that this automatic drop linkage will not occur for a sequence
......@@ -714,49 +720,32 @@ CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
dependency link.) Furthermore, this dependency between sequence
and column is made only for the <type>serial</> column itself; if
any other columns reference the sequence (perhaps by manually
calling the <function>nextval()</>) function), they may be broken
calling the <function>nextval</> function), they may be broken
if the sequence is removed. Using <type>serial</> columns in
this fashion is considered bad form.
</para>
<note>
<para>
Prior to <productname>PostgreSQL</> 7.3, <type>serial</type>
implied <literal>UNIQUE</literal>. This is no longer automatic.
If you wish a serial column to be <literal>UNIQUE</literal> or a
<literal>PRIMARY KEY</literal> it must now be specified, just as
with any other data type.
</para>
</note>
</sect2>
</sect1>
<sect1 id="datatype-money">
<title>Monetary Type</title>
<title>Monetary Types</title>
<note>
<title>Note</title>
<para>
The <type>money</type> type is deprecated. Use
<type>numeric</type> or <type>decimal</type> instead, in
combination with the <function>to_char</function> function. The
money type may become a locale-aware layer over the
<type>numeric</type> type in a future release.
combination with the <function>to_char</function> function.
</para>
</note>
<para>
The <type>money</type> type stores a currency amount with fixed
decimal point representation; see <xref
linkend="datatype-money-table">. The output format is
locale-specific.
</para>
<para>
The <type>money</type> type stores a currency amount with a fixed
fractional precision; see <xref
linkend="datatype-money-table">.
Input is accepted in a variety of formats, including integer and
floating-point literals, as well as <quote>typical</quote>
currency formatting, such as <literal>'$1,000.00'</literal>.
Output is in the latter form.
Output is generally in the latter form but depends on the locale.
</para>
<table id="datatype-money-table">
......@@ -764,8 +753,8 @@ CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
<tgroup cols="4">
<thead>
<row>
<entry>Type Name</entry>
<entry>Storage</entry>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
<entry>Range</entry>
</row>
......@@ -806,7 +795,7 @@ CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
<tgroup cols="2">
<thead>
<row>
<entry>Type name</entry>
<entry>Name</entry>
<entry>Description</entry>
</row>
</thead>
......@@ -850,7 +839,6 @@ CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
string.
</para>
<note>
<para>
If one explicitly casts a value to <type>character
varying(<replaceable>n</>)</type> or
......@@ -859,7 +847,6 @@ CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
raising an error. (This too is required by the
<acronym>SQL</acronym> standard.)
</para>
</note>
<note>
<para>
......@@ -881,13 +868,11 @@ CREATE TABLE <replaceable class="parameter">tablename</replaceable> (
</para>
<para>
In addition, <productname>PostgreSQL</productname> supports the
more general <type>text</type> type, which stores strings of any
length. Unlike <type>character varying</type>, <type>text</type>
does not require an explicit declared upper limit on the size of
the string. Although the type <type>text</type> is not in the
<acronym>SQL</acronym> standard, many other RDBMS packages have it
as well.
In addition, <productname>PostgreSQL</productname> provides the
<type>text</type> type, which stores strings of any
length. Although the type <type>text</type> is not in the
<acronym>SQL</acronym> standard, several other SQL database products
have it as well.
</para>
<para>
......@@ -963,8 +948,8 @@ SELECT b, char_length(b) FROM test2;
There are two other fixed-length character types in
<productname>PostgreSQL</productname>, shown in <xref
linkend="datatype-character-special-table">. The <type>name</type>
type exists <emphasis>only</emphasis> for storage of internal
catalog names and is not intended for use by the general user. Its
type exists <emphasis>only</emphasis> for storage of identifiers
in the internal system catalogs and is not intended for use by the general user. Its
length is currently defined as 64 bytes (63 usable characters plus
terminator) but should be referenced using the constant
<symbol>NAMEDATALEN</symbol>. The length is set at compile time (and
......@@ -976,12 +961,12 @@ SELECT b, char_length(b) FROM test2;
</para>
<table id="datatype-character-special-table">
<title>Specialty Character Types</title>
<title>Special Character Types</title>
<tgroup cols="3">
<thead>
<row>
<entry>Type Name</entry>
<entry>Storage</entry>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
</row>
</thead>
......@@ -989,12 +974,12 @@ SELECT b, char_length(b) FROM test2;
<row>
<entry><type>"char"</type></entry>
<entry>1 byte</entry>
<entry>single character internal type</entry>
<entry>single-character internal type</entry>
</row>
<row>
<entry><type>name</type></entry>
<entry>64 bytes</entry>
<entry>sixty-three character internal type</entry>
<entry>internal type for object names</entry>
</row>
</tbody>
</tgroup>
......@@ -1003,19 +988,19 @@ SELECT b, char_length(b) FROM test2;
</sect1>
<sect1 id="datatype-binary">
<title>Binary Strings</title>
<title>Binary Data Types</title>
<para>
The <type>bytea</type> data type allows storage of binary strings;
see <xref linkend="datatype-binary-table">.
</para>
<table id="datatype-binary-table">
<title>Binary String Types</title>
<title>Binary Data Types</title>
<tgroup cols="3">
<thead>
<row>
<entry>Type Name</entry>
<entry>Storage</entry>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
</row>
</thead>
......@@ -1023,8 +1008,7 @@ SELECT b, char_length(b) FROM test2;
<row>
<entry><type>bytea</type></entry>
<entry>4 bytes plus the actual binary string</entry>
<entry>Variable (not specifically limited)
length binary string</entry>
<entry>variable-length binary string</entry>
</row>
</tbody>
</tgroup>
......@@ -1034,7 +1018,7 @@ SELECT b, char_length(b) FROM test2;
A binary string is a sequence of octets (or bytes). Binary
strings are distinguished from character strings by two
characteristics: First, binary strings specifically allow storing
octets of zero value and other <quote>non-printable</quote>
octets of value zero and other <quote>non-printable</quote>
octets. Second, operations on binary strings process the actual
bytes, whereas the encoding and processing of character strings
depends on locale settings.
......@@ -1058,9 +1042,9 @@ SELECT b, char_length(b) FROM test2;
<row>
<entry>Decimal Octet Value</entry>
<entry>Description</entry>
<entry>Input Escaped Representation</entry>
<entry>Escaped Input Representation</entry>
<entry>Example</entry>
<entry>Printed Result</entry>
<entry>Output Representation</entry>
</row>
</thead>
......@@ -1096,13 +1080,37 @@ SELECT b, char_length(b) FROM test2;
<para>
Note that the result in each of the examples in <xref linkend="datatype-binary-sqlesc"> was exactly one
octet in length, even though the output representation of the zero
octet and backslash are more than one character. <type>Bytea</type>
output octets are also escaped. In general, each
<quote>non-printable</quote> octet decimal value is converted into
its equivalent three digit octal value, and preceded by one backslash.
octet and backslash are more than one character.
</para>
<para>
The reason that you have to write so many backslashes, as shown in
<xref linkend="datatype-binary-sqlesc">, is that an input string
written as a string literal must pass through two parse phases in
the <productname>PostgreSQL</productname> server. The first
backslash of each pair is interpreted as an escape character by
the string-literal parser and is therefore consumed, leaving the
second backslash of the pair. The remaining backslash is then
recognized by the <type>bytea</type> input function as either
starting a three-digit octal value or escaping another backslash.
For example, a string literal passed to the server as
<literal>'\\001'</literal> becomes <literal>\001</literal> after
passing through the string-literal parser. The
<literal>\001</literal> is then sent to the <type>bytea</type>
input function, where it is converted to a single octet with a
decimal value of 1. Note that the apostrophe character is not
treated specially by <type>bytea</type>, so it follows the normal
rules for string literals. (See also <xref
linkend="sql-syntax-strings">.)
</para>
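<para>
As a rough illustration of the two parse phases (a sketch that
assumes backslashes are treated as escape characters in string
literals, as described above), each of the following queries
should report a <type>bytea</type> value that is one octet long:
<programlisting>
-- '\\001' reaches the bytea input function as \001 (octet value 1)
SELECT octet_length('\\001'::bytea);

-- '\\\\' reaches the bytea input function as \\ (octet value 92)
SELECT octet_length('\\\\'::bytea);
</programlisting>
</para>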
<para>
<type>Bytea</type> octets are also escaped in the output. In general, each
<quote>non-printable</quote> octet is converted into
its equivalent three-digit octal value and preceded by one backslash.
Most <quote>printable</quote> octets are represented by their standard
representation in the client character set. The octet with decimal
value 92 (backslash) has a special alternate output representation.
value 92 (backslash) has a special alternative output representation.
Details are in <xref linkend="datatype-binary-resesc">.
</para>
......@@ -1113,9 +1121,9 @@ SELECT b, char_length(b) FROM test2;
<row>
<entry>Decimal Octet Value</entry>
<entry>Description</entry>
<entry>Output Escaped Representation</entry>
<entry>Escaped Output Representation</entry>
<entry>Example</entry>
<entry>Printed Result</entry>
<entry>Output Result</entry>
</row>
</thead>
......@@ -1132,7 +1140,7 @@ SELECT b, char_length(b) FROM test2;
<row>
<entry>0 to 31 and 127 to 255</entry>
<entry><quote>non-printable</quote> octets</entry>
<entry><literal>\### (octal value)</literal></entry>
<entry><literal>\<replaceable>xxx</></literal> (octal value)</entry>
<entry><literal>SELECT '\\001'::bytea;</literal></entry>
<entry><literal>\001</literal></entry>
</row>
......@@ -1149,60 +1157,12 @@ SELECT b, char_length(b) FROM test2;
</tgroup>
</table>
<para>
To use the <type>bytea</type> escaped octet notation, string
literals (input strings) must contain two backslashes because they
must pass through two parsers in the <productname>PostgreSQL</>
server. The first backslash is interpreted as an escape character
by the string-literal parser, and therefore is consumed, leaving
the characters that follow. The remaining backslash is recognized
by the <type>bytea</type> input function as the prefix of a three
digit octal value. For example, a string literal passed to the
backend as <literal>'\\001'</literal> becomes
<literal>'\001'</literal> after passing through the string-literal
parser. The <literal>'\001'</literal> is then sent to the
<type>bytea</type> input function, where it is converted to a
single octet with a decimal value of 1.
</para>
<para>
For a similar reason, a backslash must be input as
<literal>'\\\\'</literal> (or <literal>'\\134'</literal>). The first
and third backslashes are interpreted as escape characters by the
string-literal parser, and therefore are consumed, leaving two
backslashes in the string passed to the <type>bytea</type> input function,
which interprets them as representing a single backslash.
For example, a string literal passed to the
server as <literal>'\\\\'</literal> becomes <literal>'\\'</literal>
after passing through the string-literal parser. The
<literal>'\\'</literal> is then sent to the <type>bytea</type> input
function, where it is converted to a single octet with a decimal
value of 92.
</para>
<para>
A single quote is a bit different in that it must be input as
<literal>'\''</literal> (or <literal>'\\047'</literal>),
<emphasis>not</emphasis> as <literal>'\\''</literal>. This is because,
while the literal parser interprets the single quote as a special
character, and will consume the single backslash, the
<type>bytea</type> input function does <emphasis>not</emphasis>
recognize a single quote as a special octet. Therefore a string
literal passed to the backend as <literal>'\''</literal> becomes
<literal>'''</literal> after passing through the string-literal
parser. The <literal>'''</literal> is then sent to the
<type>bytea</type> input function, where it retains its single
octet decimal value of 39.
</para>
<para>
Depending on the front end to <productname>PostgreSQL</> you use,
you may have additional work to do in terms of escaping and
unescaping <type>bytea</type> strings. For example, you may also
have to escape line feeds and carriage returns if your interface
automatically translates these. Or you may have to double up on
backslashes if the parser for your language of choice also treats
them as an escape character.
automatically translates these.
</para>
<para>
......@@ -1229,59 +1189,59 @@ SELECT b, char_length(b) FROM test2;
<tgroup cols="6">
<thead>
<row>
<entry>Type</entry>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
<entry>Storage</entry>
<entry>Earliest</entry>
<entry>Latest</entry>
<entry>Low Value</entry>
<entry>High Value</entry>
<entry>Resolution</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
<entry>both date and time</entry>
<entry>8 bytes</entry>
<entry>both date and time</entry>
<entry>4713 BC</entry>
<entry>AD 5874897</entry>
<entry>1 microsecond / 14 digits</entry>
</row>
<row>
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
<entry>both date and time</entry>
<entry>8 bytes</entry>
<entry>both date and time, with time zone</entry>
<entry>4713 BC</entry>
<entry>AD 5874897</entry>
<entry>1 microsecond / 14 digits</entry>
</row>
<row>
<entry><type>interval [ (<replaceable>p</replaceable>) ]</type></entry>
<entry>time intervals</entry>
<entry>12 bytes</entry>
<entry>time intervals</entry>
<entry>-178000000 years</entry>
<entry>178000000 years</entry>
<entry>1 microsecond</entry>
</row>
<row>
<entry><type>date</type></entry>
<entry>dates only</entry>
<entry>4 bytes</entry>
<entry>dates only</entry>
<entry>4713 BC</entry>
<entry>32767 AD</entry>
<entry>1 day</entry>
</row>
<row>
<entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
<entry>times of day only</entry>
<entry>8 bytes</entry>
<entry>times of day only</entry>
<entry>00:00:00.00</entry>
<entry>23:59:59.99</entry>
<entry>1 microsecond</entry>
</row>
<row>
<entry><type>time [ (<replaceable>p</replaceable>) ] with time zone</type></entry>
<entry>times of day only</entry>
<entry>12 bytes</entry>
<entry>times of day only, with time zone</entry>
<entry>00:00:00.00+12</entry>
<entry>23:59:59.99-12</entry>
<entry>1 microsecond</entry>
......@@ -1304,8 +1264,8 @@ SELECT b, char_length(b) FROM test2;
<para>
When <type>timestamp</> values are stored as double precision floating-point
numbers (currently the default), the effective limit of precision
may be less than 6, since timestamp values are stored as seconds
since 2000-01-01. Microsecond precision is achieved for dates within
may be less than 6. Timestamp values are stored as seconds
since 2000-01-01, and microsecond precision is achieved for dates within
a few years of 2000-01-01, but the precision degrades for dates further
away. When timestamps are stored as eight-byte integers (a compile-time
option), microsecond precision is available over the full range of
......@@ -1314,34 +1274,26 @@ SELECT b, char_length(b) FROM test2;
</para>
</note>
<note>
<para>
Prior to <productname>PostgreSQL</productname> 7.3, writing just
<type>timestamp</type> was equivalent to <type>timestamp with
time zone</type>. This was changed for SQL compliance.
</para>
</note>
<para>
For the <type>time</type> types, the allowed range of
<replaceable>p</replaceable> is from 0 to 6 when eight-byte integer
storage is used, or from 0 to 10 when floating-point storage is used.
</para>
<para>
Time zones, and time-zone conventions, are influenced by
political decisions, not just earth geometry. Time zones around the
world became somewhat standardized during the 1900's,
but continue to be prone to arbitrary changes.
<productname>PostgreSQL</productname> uses your operating
system's underlying features to provide output time-zone
support, and these systems usually contain information for only
the time period 1902 through 2038 (corresponding to the full
range of conventional Unix system time).
<type>timestamp with time zone</type> and <type>time with time
zone</type> will use time zone
information only within that year range, and assume that times
outside that range are in <acronym>UTC</acronym>.
</para>
<para>
The type <type>time with time zone</type> is defined by the SQL
standard, but the definition exhibits properties which lead to
questionable usefulness. In most cases, a combination of
<type>date</type>, <type>time</type>, <type>timestamp without time
zone</type> and <type>timestamp with time zone</type> should
zone</type>, and <type>timestamp with time zone</type> should
provide a complete range of date/time functionality required by
any application.
</para>
......@@ -1360,22 +1312,22 @@ SELECT b, char_length(b) FROM test2;
<para>
Date and time input is accepted in almost any reasonable format, including
<acronym>ISO 8601</acronym>, <acronym>SQL</acronym>-compatible,
traditional <productname>PostgreSQL</productname>, and others.
ISO 8601, <acronym>SQL</acronym>-compatible,
traditional <productname>POSTGRES</productname>, and others.
For some formats, ordering of month and day in date input can be
ambiguous and there is support for specifying the expected
ordering of these fields.
The command
<literal>SET DateStyle TO 'US'</literal>
or <literal>SET DateStyle TO 'NonEuropean'</literal>
<literal>SET datestyle TO 'US'</literal>
or <literal>SET datestyle TO 'NonEuropean'</literal>
specifies the variant <quote>month before day</quote>, the command
<literal>SET DateStyle TO 'European'</literal> sets the variant
<literal>SET datestyle TO 'European'</literal> sets the variant
<quote>day before month</quote>.
</para>
<para>
<productname>PostgreSQL</productname> is more flexible in
handling date/time than the
handling date/time input than the
<acronym>SQL</acronym> standard requires.
See <xref linkend="datetime-appendix">
for the exact parsing rules of date/time input and for the
......@@ -1393,11 +1345,12 @@ SELECT b, char_length(b) FROM test2;
<replaceable>type</replaceable> [ (<replaceable>p</replaceable>) ] '<replaceable>value</replaceable>'
</synopsis>
where <replaceable>p</replaceable> in the optional precision
specification is an integer corresponding to the
number of fractional digits in the seconds field. Precision can
be specified
for <type>time</type>, <type>timestamp</type>, and
<type>interval</type> types.
specification is an integer corresponding to the number of
fractional digits in the seconds field. Precision can be
specified for <type>time</type>, <type>timestamp</type>, and
<type>interval</type> types. The allowed values are mentioned
above. If no precision is specified in a constant specification,
it defaults to the precision of the literal value.
</para>
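<para>
A few constants written in this form might look as follows (an
illustrative sketch; the literal values are arbitrary, and the
precisions shown are only examples of the optional
<replaceable>p</replaceable>):
<programlisting>
-- No precision given: the precision of the literal is used
SELECT DATE '1999-01-08';
SELECT TIME '04:05:06.789';

-- Explicit precision: two fractional digits in the seconds field
SELECT TIMESTAMP(2) '1999-01-08 04:05:06.789';
</programlisting>
</para>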
<sect3>
......@@ -1433,23 +1386,19 @@ SELECT b, char_length(b) FROM test2;
</row>
<row>
<entry>1/8/1999</entry>
<entry>U.S.; read as August 1 in European mode</entry>
</row>
<row>
<entry>8/1/1999</entry>
<entry>European; read as August 1 in U.S. mode</entry>
<entry>ambiguous (January 8 in U.S. mode; August 1 in European mode)</entry>
</row>
<row>
<entry>1/18/1999</entry>
<entry>U.S.; read as January 18 in any mode</entry>
<entry>U.S. notation; January 18 in any mode</entry>
</row>
<row>
<entry>19990108</entry>
<entry>ISO-8601 year, month, day</entry>
<entry>ISO-8601; year, month, day</entry>
</row>
<row>
<entry>990108</entry>
<entry>ISO-8601 year, month, day</entry>
<entry>ISO-8601; year, month, day</entry>
</row>
<row>
<entry>1999.008</entry>
......@@ -1497,12 +1446,10 @@ SELECT b, char_length(b) FROM test2;
</para>
<para>
Valid input for these types consists of a time of day followed by an
optional time zone. (See <xref linkend="datatype-datetime-time-table">.)
The optional precision
<replaceable>p</replaceable> should be between 0 and 6, and
defaults to the precision of the input time literal. If a time zone
is specified in the input for <type>time without time zone</type>,
Valid input for these types consists of a time of day followed
by an optional time zone. (See <xref
linkend="datatype-datetime-time-table">.) If a time zone is
specified in the input for <type>time without time zone</type>,
it is silently ignored.
</para>
......@@ -1571,7 +1518,7 @@ SELECT b, char_length(b) FROM test2;
</sect3>
<sect3>
<title>Time stamps</title>
<title>Time Stamps</title>
<indexterm>
<primary>timestamp</primary>
......@@ -1588,22 +1535,6 @@ SELECT b, char_length(b) FROM test2;
<secondary>data type</secondary>
</indexterm>
<para>
The time stamp types are <type>timestamp [
(<replaceable>p</replaceable>) ] without time zone</type> and
<type>timestamp [ (<replaceable>p</replaceable>) ] with time
zone</type>. Writing just <type>timestamp</type> is equivalent to
<type>timestamp without time zone</type>.
</para>
<note>
<para>
Prior to <productname>PostgreSQL</productname> 7.3, writing just
<type>timestamp</type> was equivalent to <type>timestamp with time
zone</type>. This was changed for SQL spec compliance.
</para>
</note>
<para>
Valid input for the time stamp types consists of a concatenation
of a date and a time, followed by an optional
......@@ -1629,13 +1560,7 @@ January 8 04:05:06 1999 PST
</para>
<para>
The optional precision
<replaceable>p</replaceable> should be between 0 and 6, and
defaults to the precision of the input <type>timestamp</type> literal.
</para>
<para>
For <type>timestamp without time zone</type>, any explicit time
For <type>timestamp [without time zone]</type>, any explicit time
zone specified in the input is silently ignored. That is, the
resulting date/time value is derived from the explicit date/time
fields in the input value, and is not adjusted for time zone.
......@@ -1643,20 +1568,22 @@ January 8 04:05:06 1999 PST
<para>
For <type>timestamp with time zone</type>, the internally stored
value is always in UTC (GMT). An input value that has an explicit
value is always in UTC (Universal
Coordinated Time, traditionally known as Greenwich Mean Time,
<acronym>GMT</>). An input value that has an explicit
time zone specified is converted to UTC using the appropriate offset
for that time zone. If no time zone is stated in the input string,
then it is assumed to be in the time zone indicated by the system's
<varname>TimeZone</> parameter, and is converted to UTC using the
offset for the <varname>TimeZone</> zone.
<varname>timezone</> parameter, and is converted to UTC using the
offset for the <varname>timezone</> zone.
</para>
<para>
When a <type>timestamp with time
zone</type> value is output, it is always converted from UTC to the
current <varname>TimeZone</> zone, and displayed as local time in that
current <varname>timezone</> zone, and displayed as local time in that
zone. To see the time in another time zone, either change
<varname>TimeZone</> or use the <literal>AT TIME ZONE</> construct
<varname>timezone</> or use the <literal>AT TIME ZONE</> construct
(see <xref linkend="functions-datetime-zoneconvert">).
</para>
......@@ -1664,7 +1591,7 @@ January 8 04:05:06 1999 PST
Conversions between <type>timestamp without time zone</type> and
<type>timestamp with time zone</type> normally assume that the
<type>timestamp without time zone</type> value should be taken or given
as <varname>TimeZone</> local time. A different zone reference can
as <varname>timezone</> local time. A different zone reference can
be specified for the conversion using <literal>AT TIME ZONE</>.
</para>
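<para>
For instance (an illustrative sketch; the literal values and the
zone name <literal>MST</literal> are arbitrary choices, and the
displayed results depend on the session's time zone setting):
<programlisting>
-- Take a time stamp without time zone as MST local time
SELECT TIMESTAMP '2001-02-16 20:38:40' AT TIME ZONE 'MST';

-- Show a time stamp with time zone as local time in MST
SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40-05' AT TIME ZONE 'MST';
</programlisting>
</para>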
......@@ -1673,7 +1600,7 @@ January 8 04:05:06 1999 PST
<tgroup cols="2">
<thead>
<row>
<entry>Time Zone</entry>
<entry>Example</entry>
<entry>Description</entry>
</row>
</thead>
......@@ -1710,17 +1637,16 @@ January 8 04:05:06 1999 PST
<type>interval</type> values can be written with the following syntax:
<programlisting>
Quantity Unit [Quantity Unit...] [Direction]
@ Quantity Unit [Quantity Unit...] [Direction]
<optional>@</> <replaceable>quantity</> <replaceable>unit</> <optional><replaceable>quantity</> <replaceable>unit</>...</> <optional><replaceable>direction</></optional>
</programlisting>
where: <literal>Quantity</literal> is a number (possibly signed),
<literal>Unit</literal> is <literal>second</literal>,
Where: <replaceable>quantity</> is a number (possibly signed);
<replaceable>unit</> is <literal>second</literal>,
<literal>minute</literal>, <literal>hour</literal>, <literal>day</literal>,
<literal>week</literal>, <literal>month</literal>, <literal>year</literal>,
<literal>decade</literal>, <literal>century</literal>, <literal>millennium</literal>,
or abbreviations or plurals of these units;
<literal>Direction</literal> can be <literal>ago</literal> or
<replaceable>direction</> can be <literal>ago</literal> or
empty. The at sign (<literal>@</>) is optional noise. The amounts
of different units are implicitly added up with appropriate
sign accounting.
......@@ -1740,7 +1666,7 @@ January 8 04:05:06 1999 PST
</sect3>
<sect3>
<title>Special values</title>
<title>Special Values</title>
<indexterm>
<primary>time</primary>
......@@ -1769,6 +1695,8 @@ January 8 04:05:06 1999 PST
are specially represented inside the system and will be displayed
the same way; but the others are simply notational shorthands
that will be converted to ordinary date/time values when read.
All of these values are treated as normal constants and need to be
written in single quotes.
</para>
<table id="datatype-datetime-special-table">
......@@ -1776,44 +1704,51 @@ January 8 04:05:06 1999 PST
<tgroup cols="2">
<thead>
<row>
<entry>Input string</entry>
<entry>Input String</entry>
<entry>Valid Types</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>epoch</literal></entry>
<entry><type>date</type>, <type>timestamp</type></entry>
<entry>1970-01-01 00:00:00+00 (Unix system time zero)</entry>
</row>
<row>
<entry><literal>infinity</literal></entry>
<entry>later than all other timestamps (not available for
type <type>date</>)</entry>
<entry><type>timestamp</type></entry>
<entry>later than all other time stamps</entry>
</row>
<row>
<entry><literal>-infinity</literal></entry>
<entry>earlier than all other timestamps (not available for
type <type>date</>)</entry>
<entry><type>timestamp</type></entry>
<entry>earlier than all other time stamps</entry>
</row>
<row>
<entry><literal>now</literal></entry>
<entry><type>date</type>, <type>time</type>, <type>timestamp</type></entry>
<entry>current transaction time</entry>
</row>
<row>
<entry><literal>today</literal></entry>
<entry><type>date</type>, <type>timestamp</type></entry>
<entry>midnight today</entry>
</row>
<row>
<entry><literal>tomorrow</literal></entry>
<entry><type>date</type>, <type>timestamp</type></entry>
<entry>midnight tomorrow</entry>
</row>
<row>
<entry><literal>yesterday</literal></entry>
<entry><type>date</type>, <type>timestamp</type></entry>
<entry>midnight yesterday</entry>
</row>
<row>
<entry><literal>zulu</>, <literal>allballs</>, <literal>z</></entry>
<entry>00:00:00.00 GMT</entry>
<entry><type>time</type></entry>
<entry>00:00:00.00 UTC</entry>
</row>
</tbody>
</tgroup>
......@@ -1838,9 +1773,9 @@ January 8 04:05:06 1999 PST
</indexterm>
<para>
Output formats can be set to one of the four styles ISO 8601,
<acronym>SQL</acronym> (Ingres), traditional PostgreSQL, and
German, using the <command>SET DateStyle</command>. The default
The output format of the date/time types can be set to one of the four styles ISO 8601,
<acronym>SQL</acronym> (Ingres), traditional POSTGRES, and
German, using the <literal>SET datestyle</literal> command. The default
is the <acronym>ISO</acronym> format. (The
<acronym>SQL</acronym> standard requires the use of the ISO 8601
format. The name of the <quote>SQL</quote> output format is a
......@@ -1873,7 +1808,7 @@ January 8 04:05:06 1999 PST
<entry>12/17/1997 07:37:16.00 PST</entry>
</row>
<row>
<entry>PostgreSQL</entry>
<entry>POSTGRES</entry>
<entry>original style</entry>
<entry>Wed Dec 17 07:37:16 1997 PST</entry>
</row>
......@@ -1909,7 +1844,7 @@ January 8 04:05:06 1999 PST
<row>
<entry>European</entry>
<entry><replaceable>day</replaceable>/<replaceable>month</replaceable>/<replaceable>year</replaceable></entry>
<entry>17/12/1997 15:37:16.00 MET</entry>
<entry>17/12/1997 15:37:16.00 CET</entry>
</row>
<row>
<entry>US</entry>
......@@ -1921,18 +1856,20 @@ January 8 04:05:06 1999 PST
</table>
<para>
<type>interval</type> output looks like the input format, except that units like
<literal>week</literal> or <literal>century</literal> are converted to years and days.
In ISO mode the output looks like
<type>interval</type> output looks like the input format, except
that units like <literal>century</literal> or
<literal>week</literal> are converted to years and days and that
<literal>ago</literal> is converted to an appropriate sign. In
ISO mode the output looks like
<programlisting>
[ Quantity Units [ ... ] ] [ Days ] Hours:Minutes [ ago ]
<optional> <replaceable>quantity</> <replaceable>unit</> <optional> ... </> </> <optional> <replaceable>days</> </> <optional> <replaceable>hours</>:<replaceable>minutes</>:<replaceable>seconds</> </optional>
</programlisting>
</para>
<para>
The date/time styles can be selected by the user using the
<command>SET DATESTYLE</command> command, the
<command>SET datestyle</command> command, the
<varname>datestyle</varname> parameter in the
<filename>postgresql.conf</filename> configuration file, and the
<envar>PGDATESTYLE</envar> environment variable on the server or
......@@ -1949,6 +1886,25 @@ January 8 04:05:06 1999 PST
<primary>time zones</primary>
</indexterm>
<para>
Time zones, and time-zone conventions, are influenced by
political decisions, not just earth geometry. Time zones around the
world became somewhat standardized during the 1900's,
but continue to be prone to arbitrary changes.
<productname>PostgreSQL</productname> uses your operating
system's underlying features to provide output time-zone
support, and these systems usually contain information for only
the time period 1902 through 2038 (corresponding to the full
range of conventional Unix system time).
<type>timestamp with time zone</type> and <type>time with time
zone</type> will use time zone
information only within that year range, and assume that times
outside that range are in <acronym>UTC</acronym>.
But since time zone support is derived from the underlying operating
system time-zone capabilities, it can handle daylight-saving time
and other special behavior.
</para>
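The 1902 through 2038 range mentioned above can be reproduced with a little arithmetic. The following is a minimal sketch in Python, assuming that "conventional Unix system time" means a signed 32-bit count of seconds since 1970-01-01 00:00:00 UTC; the function name is hypothetical and is not part of PostgreSQL.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Unix system time is a signed 32-bit number of seconds
# counted from 1970-01-01 00:00:00 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_time_range_32bit():
    """Return the earliest and latest instants a signed 32-bit counter can hold."""
    earliest = EPOCH + timedelta(seconds=-2**31)
    latest = EPOCH + timedelta(seconds=2**31 - 1)
    return earliest, latest


earliest, latest = unix_time_range_32bit()
print(earliest.isoformat())
print(latest.isoformat())
```

Under this assumption the earliest representable instant falls in December 1901, so 1902 is the first year that is covered in full.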
<para>
<productname>PostgreSQL</productname> endeavors to be compatible with
the <acronym>SQL</acronym> standard definitions for typical usage.
......@@ -1970,8 +1926,8 @@ January 8 04:05:06 1999 PST
<listitem>
<para>
The default time zone is specified as a constant integer offset
from <acronym>GMT</>/<acronym>UTC</>. It is not possible to adapt to daylight-saving
The default time zone is specified as a constant numeric offset
from <acronym>UTC</>. It is not possible to adapt to daylight-saving
time when doing date/time arithmetic across
<acronym>DST</acronym> boundaries.
</para>
......@@ -1988,26 +1944,13 @@ January 8 04:05:06 1999 PST
<productname>PostgreSQL</productname> for legacy applications and
for compatibility with other <acronym>SQL</acronym>
implementations). <productname>PostgreSQL</productname> assumes
your local time zone for any type containing only date or
time. Further, time zone support is derived from the underlying
operating system time-zone capabilities, and hence can handle
daylight-saving time and other expected behavior.
</para>
<para>
<productname>PostgreSQL</productname> obtains time-zone support
from the underlying operating system for dates between 1902 and
2038 (near the typical date limits for Unix-style
systems). Outside of this range, all dates are assumed to be
specified and used in Universal Coordinated Time
(<acronym>UTC</acronym>).
your local time zone for any type containing only date or time.
</para>
<para>
All dates and times are stored internally in
<acronym>UTC</acronym>, traditionally known as Greenwich Mean
Time (<acronym>GMT</acronym>). Times are converted to local time
on the database server before being sent to the client frontend,
<acronym>UTC</acronym>. Times are converted to local time
on the database server before being sent to the client,
hence by default are in the server time zone.
</para>
......@@ -2026,7 +1969,7 @@ January 8 04:05:06 1999 PST
<listitem>
<para>
The <varname>timezone</varname> configuration parameter can be
set in <filename>postgresql.conf</>.
set in the file <filename>postgresql.conf</>.
</para>
</listitem>
......@@ -2191,8 +2134,8 @@ SELECT * FROM test1 WHERE a;
<tgroup cols="4">
<thead>
<row>
<entry>Geometric Type</entry>
<entry>Storage</entry>
<entry>Name</entry>
<entry>Storage Size</entry>
<entry>Representation</entry>
<entry>Description</entry>
</row>
......@@ -2201,50 +2144,50 @@ SELECT * FROM test1 WHERE a;
<row>
<entry><type>point</type></entry>
<entry>16 bytes</entry>
<entry>Point on the plane</entry>
<entry>(x,y)</entry>
<entry>Point in space</entry>
</row>
<row>
<entry><type>line</type></entry>
<entry>32 bytes</entry>
<entry>((x1,y1),(x2,y2))</entry>
<entry>Infinite line (not fully implemented)</entry>
<entry>((x1,y1),(x2,y2))</entry>
</row>
<row>
<entry><type>lseg</type></entry>
<entry>32 bytes</entry>
<entry>((x1,y1),(x2,y2))</entry>
<entry>Finite line segment</entry>
<entry>((x1,y1),(x2,y2))</entry>
</row>
<row>
<entry><type>box</type></entry>
<entry>32 bytes</entry>
<entry>((x1,y1),(x2,y2))</entry>
<entry>Rectangular box</entry>
<entry>((x1,y1),(x2,y2))</entry>
</row>
<row>
<entry><type>path</type></entry>
<entry>16+16n bytes</entry>
<entry>((x1,y1),...)</entry>
<entry>Closed path (similar to polygon)</entry>
<entry>((x1,y1),...)</entry>
</row>
<row>
<entry><type>path</type></entry>
<entry>16+16n bytes</entry>
<entry>[(x1,y1),...]</entry>
<entry>Open path</entry>
<entry>[(x1,y1),...]</entry>
</row>
<row>
<entry><type>polygon</type></entry>
<entry>40+16n bytes</entry>
<entry>((x1,y1),...)</entry>
<entry>Polygon (similar to closed path)</entry>
<entry>((x1,y1),...)</entry>
</row>
<row>
<entry><type>circle</type></entry>
<entry>24 bytes</entry>
<entry><(x,y),r></entry>
<entry>Circle (center and radius)</entry>
<entry>Circle</entry>
<entry><(x,y),r> (center and radius)</entry>
</row>
</tbody>
</tgroup>
......@@ -2257,7 +2200,7 @@ SELECT * FROM test1 WHERE a;
</para>
<sect2>
<title>Point</title>
<title>Points</title>
<indexterm>
<primary>point</primary>
......@@ -2265,39 +2208,20 @@ SELECT * FROM test1 WHERE a;
<para>
Points are the fundamental two-dimensional building block for geometric types.
<type>point</type> is specified using the following syntax:
Values of type <type>point</type> are specified using the following syntax:
<synopsis>
( <replaceable>x</replaceable> , <replaceable>y</replaceable> )
<replaceable>x</replaceable> , <replaceable>y</replaceable>
</synopsis>
where the arguments are
<variablelist>
<varlistentry>
<term><replaceable>x</replaceable></term>
<listitem>
<para>
the x-axis coordinate as a floating-point number
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable>y</replaceable></term>
<listitem>
<para>
the y-axis coordinate as a floating-point number
</para>
</listitem>
</varlistentry>
</variablelist>
where <replaceable>x</> and <replaceable>y</> are the respective
coordinates as floating-point numbers.
</para>
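The two input forms shown above (with and without the surrounding parentheses) can be illustrated with a small parser. This is only a sketch in Python of the syntax as described here; `parse_point` is a hypothetical helper and does not reflect how the server actually parses its input.

```python
import re

# One optional opening parenthesis, two comma-separated fields,
# one optional closing parenthesis.
_POINT_RE = re.compile(
    r"^\s*(\()?\s*([^,()\s]+)\s*,\s*([^,()\s]+)\s*(\))?\s*$"
)


def parse_point(text):
    """Parse '( x , y )' or 'x , y' into a pair of floats."""
    m = _POINT_RE.match(text)
    # The parentheses must be either both present or both absent.
    if m is None or (m.group(1) is None) != (m.group(4) is None):
        raise ValueError(f"invalid point syntax: {text!r}")
    return float(m.group(2)), float(m.group(3))


print(parse_point("( 1.5 , -2 )"))
print(parse_point("3,4"))
```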
</sect2>
<sect2>
<title>Line Segment</title>
<title>Line Segments</title>
<indexterm>
<primary>line</primary>
......@@ -2305,7 +2229,7 @@ SELECT * FROM test1 WHERE a;
<para>
Line segments (<type>lseg</type>) are represented by pairs of points.
<type>lseg</type> is specified using the following syntax:
Values of type <type>lseg</type> are specified using the following syntax:
<synopsis>
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
......@@ -2313,24 +2237,16 @@ SELECT * FROM test1 WHERE a;
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
</synopsis>
where the arguments are
<variablelist>
<varlistentry>
<term>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</term>
<term>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</term>
<listitem>
<para>
the end points of the line segment
</para>
</listitem>
</varlistentry>
</variablelist>
where
<literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
and
<literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
are the end points of the line segment.
</para>
</sect2>
<sect2>
<title>Box</title>
<title>Boxes</title>
<indexterm>
<primary>box (data type)</primary>
......@@ -2339,7 +2255,7 @@ SELECT * FROM test1 WHERE a;
<para>
Boxes are represented by pairs of points that are opposite
corners of the box.
<type>box</type> is specified using the following syntax:
Values of type <type>box</type> are specified using the following syntax:
<synopsis>
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ( <replaceable>x2</replaceable> , <replaceable>y2</replaceable> ) )
......@@ -2347,19 +2263,11 @@ SELECT * FROM test1 WHERE a;
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , <replaceable>x2</replaceable> , <replaceable>y2</replaceable>
</synopsis>
where the arguments are
<variablelist>
<varlistentry>
<term>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</term>
<term>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</term>
<listitem>
<para>
opposite corners of the box
</para>
</listitem>
</varlistentry>
</variablelist>
where
<literal>(<replaceable>x1</replaceable>,<replaceable>y1</replaceable>)</literal>
and
<literal>(<replaceable>x2</replaceable>,<replaceable>y2</replaceable>)</literal>
are the opposite corners of the box.
</para>
<para>
......@@ -2372,7 +2280,7 @@ SELECT * FROM test1 WHERE a;
</sect2>
<sect2>
<title>Path</title>
<title>Paths</title>
<indexterm>
<primary>path (data type)</primary>
......@@ -2382,19 +2290,19 @@ SELECT * FROM test1 WHERE a;
Paths are represented by connected sets of points. Paths can be
<firstterm>open</firstterm>, where
the first and last points in the set are not connected, and <firstterm>closed</firstterm>,
where the first and last point are connected. Functions
<function>popen(p)</function>
where the first and last point are connected. The functions
<function>popen(<replaceable>p</>)</function>
and
<function>pclose(p)</function>
are supplied to force a path to be open or closed, and functions
<function>isopen(p)</function>
<function>pclose(<replaceable>p</>)</function>
are supplied to force a path to be open or closed, and the functions
<function>isopen(<replaceable>p</>)</function>
and
<function>isclosed(p)</function>
are supplied to test for either type in a query.
<function>isclosed(<replaceable>p</>)</function>
are supplied to test for either type in an expression.
</para>
<para>
<type>path</type> is specified using the following syntax:
Values of type <type>path</type> are specified using the following syntax:
<synopsis>
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
......@@ -2404,20 +2312,10 @@ SELECT * FROM test1 WHERE a;
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
</synopsis>
where the arguments are
<variablelist>
<varlistentry>
<term>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</term>
<listitem>
<para>
End points of the line segments comprising the path.
A leading square bracket (<literal>[</>) indicates an open path, while
a leading parenthesis (<literal>(</>) indicates a closed path.
</para>
</listitem>
</varlistentry>
</variablelist>
where the points are the end points of the line segments
comprising the path. Square brackets (<literal>[]</>) indicate
an open path, while parentheses (<literal>()</>) indicate a
closed path.
</para>
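The delimiter rule above can be sketched as follows. The Python helper `path_kind` is hypothetical and inspects only the outermost delimiters of the textual form; it does not validate the points inside.

```python
def path_kind(text):
    """Classify a path literal as 'open' or 'closed' by its outer delimiters."""
    s = text.strip()
    if s.startswith("[") and s.endswith("]"):
        return "open"      # square brackets: first and last point not connected
    if s.startswith("(") and s.endswith(")"):
        return "closed"    # parentheses: first and last point connected
    raise ValueError(f"unrecognized path delimiters: {text!r}")


print(path_kind("[(0,0),(1,1),(2,0)]"))
print(path_kind("((0,0),(1,1),(2,0))"))
```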
<para>
......@@ -2426,7 +2324,7 @@ SELECT * FROM test1 WHERE a;
</sect2>
<sect2>
<title>Polygon</title>
<title>Polygons</title>
<indexterm>
<primary>polygon</primary>
......@@ -2439,7 +2337,7 @@ SELECT * FROM test1 WHERE a;
</para>
<para>
<type>polygon</type> is specified using the following syntax:
Values of type <type>polygon</type> are specified using the following syntax:
<synopsis>
( ( <replaceable>x1</replaceable> , <replaceable>y1</replaceable> ) , ... , ( <replaceable>xn</replaceable> , <replaceable>yn</replaceable> ) )
......@@ -2448,19 +2346,8 @@ SELECT * FROM test1 WHERE a;
<replaceable>x1</replaceable> , <replaceable>y1</replaceable> , ... , <replaceable>xn</replaceable> , <replaceable>yn</replaceable>
</synopsis>
where the arguments are
<variablelist>
<varlistentry>
<term>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</term>
<listitem>
<para>
End points of the line segments comprising the boundary of the
polygon
</para>
</listitem>
</varlistentry>
</variablelist>
where the points are the end points of the line segments
comprising the boundary of the polygon.
</para>
<para>
......@@ -2469,7 +2356,7 @@ SELECT * FROM test1 WHERE a;
</sect2>
<sect2>
<title>Circle</title>
<title>Circles</title>
<indexterm>
<primary>circle</primary>
......@@ -2477,7 +2364,7 @@ SELECT * FROM test1 WHERE a;
<para>
Circles are represented by a center point and a radius.
<type>circle</type> is specified using the following syntax:
Values of type <type>circle</type> are specified using the following syntax:
<synopsis>
&lt; ( <replaceable>x</replaceable> , <replaceable>y</replaceable> ) , <replaceable>r</replaceable> &gt;
......@@ -2486,27 +2373,9 @@ SELECT * FROM test1 WHERE a;
<replaceable>x</replaceable> , <replaceable>y</replaceable> , <replaceable>r</replaceable>
</synopsis>
where the arguments are
<variablelist>
<varlistentry>
<term>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</term>
<listitem>
<para>
center of the circle
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable>r</replaceable></term>
<listitem>
<para>
radius of the circle
</para>
</listitem>
</varlistentry>
</variablelist>
where
<literal>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</literal>
is the center and <replaceable>r</replaceable> is the radius of the circle.
</para>
<para>
......@@ -2517,7 +2386,7 @@ SELECT * FROM test1 WHERE a;
</sect1>
<sect1 id="datatype-net-types">
<title>Network Address Data Types</title>
<title>Network Address Types</title>
<indexterm zone="datatype-net-types">
<primary>network</primary>
......@@ -2533,14 +2402,13 @@ SELECT * FROM test1 WHERE a;
</para>
<table tocentry="1" id="datatype-net-types-table">
<title>Network Address Data Types</title>
<tgroup cols="4">
<title>Network Address Types</title>
<tgroup cols="3">
<thead>
<row>
<entry>Name</entry>
<entry>Storage</entry>
<entry>Storage Size</entry>
<entry>Description</entry>
<entry>Range</entry>
</row>
</thead>
<tbody>
......@@ -2548,22 +2416,19 @@ SELECT * FROM test1 WHERE a;
<row>
<entry><type>cidr</type></entry>
<entry>12 bytes</entry>
<entry>IP networks</entry>
<entry>valid IPv4 networks</entry>
<entry>IPv4 networks</entry>
</row>
<row>
<entry><type>inet</type></entry>
<entry>12 bytes</entry>
<entry>IP hosts and networks</entry>
<entry>valid IPv4 hosts or networks</entry>
<entry>IPv4 hosts and networks</entry>
</row>
<row>
<entry><type>macaddr</type></entry>
<entry>6 bytes</entry>
<entry>MAC addresses</entry>
<entry>customary formats</entry>
</row>
</tbody>
......@@ -2585,11 +2450,11 @@ SELECT * FROM test1 WHERE a;
<para>
The <type>inet</type> type holds an IP host address, and
optionally the identity of the subnet it is in, all in one field.
The subnet identity is represented by the number of bits in the
network part of the address (the <quote>netmask</quote>). If the
netmask is 32,
then the value does not indicate a subnet, only a single host.
Note that if you want to accept networks only, you should use the
The subnet identity is represented by stating how many bits of
the host address represent the network address (the
<quote>netmask</quote>). If the netmask is 32, then the value
does not indicate a subnet, only a single host. Note that if you
want to accept networks only, you should use the
<type>cidr</type> type rather than <type>inet</type>.
</para>
......@@ -2617,15 +2482,15 @@ SELECT * FROM test1 WHERE a;
The <type>cidr</type> type holds an IP network specification.
Input and output formats follow Classless Inter-Domain Routing
conventions.
The format for
specifying classless networks is <replaceable
The format for specifying networks is <replaceable
class="parameter">x.x.x.x/y</> where <replaceable
class="parameter">x.x.x.x</> is the network and <replaceable
class="parameter">y</> is the number of bits in the netmask. If
<replaceable class="parameter">y</> is omitted, it is calculated
using assumptions from the older classful numbering system, except
using assumptions from the older classful network numbering system, except
that it will be at least large enough to include all of the octets
written in the input.
written in the input. It is an error to specify a network address
that has bits set to the right of the specified netmask.
</para>
<para>
......@@ -2637,9 +2502,9 @@ SELECT * FROM test1 WHERE a;
<tgroup cols="3">
<thead>
<row>
<entry><type>CIDR</type> Input</entry>
<entry><type>CIDR</type> Displayed</entry>
<entry><function>abbrev</function>(<type>CIDR</type>)</entry>
<entry><type>cidr</type> Input</entry>
<entry><type>cidr</type> Output</entry>
<entry><literal><function>abbrev</function>(<type>cidr</type>)</literal></entry>
</row>
</thead>
<tbody>
......@@ -2704,21 +2569,21 @@ SELECT * FROM test1 WHERE a;
</sect2>
<sect2 id="datatype-inet-vs-cidr">
<title><type>inet</type> vs <type>cidr</type></title>
<title><type>inet</type> vs. <type>cidr</type></title>
<para>
The essential difference between <type>inet</type> and <type>cidr</type>
data types is that <type>inet</type> accepts values with nonzero bits to
the right of the netmask, whereas <type>cidr</type> does not.
</para>
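The distinction can be illustrated outside the database with Python's standard `ipaddress` module, used here purely as an analogy. The assumption is that `ip_network` in strict mode behaves like <type>cidr</type> with respect to bits to the right of the netmask, and that `ip_interface` behaves like <type>inet</type>; the two helper functions are hypothetical, and other details of the input syntax (such as abbreviated network forms) differ from PostgreSQL.

```python
import ipaddress


def accepts_as_cidr(text):
    """True if no bits are set to the right of the netmask (cidr-like check)."""
    try:
        ipaddress.ip_network(text, strict=True)
        return True
    except ValueError:
        return False


def accepts_as_inet(text):
    """True if the value is a valid host address with optional netmask (inet-like check)."""
    try:
        ipaddress.ip_interface(text)
        return True
    except ValueError:
        return False


for value in ("192.168.0.0/24", "192.168.0.1/24"):
    print(value, accepts_as_inet(value), accepts_as_cidr(value))
```

The second value has a nonzero bit to the right of the 24-bit netmask, so only the inet-like check accepts it.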
<tip>
<para>
If you do not like the output format for <type>inet</type> or
<type>cidr</type> values, try the <function>host</>(),
<function>text</>(), and <function>abbrev</>() functions.
<type>cidr</type> values, try the functions <function>host</>,
<function>text</>, and <function>abbrev</>.
</para>
</tip>
</para>
</sect2>
<sect2 id="datatype-macaddr">
......@@ -2774,37 +2639,37 @@ SELECT * FROM test1 WHERE a;
<para>
Bit strings are strings of 1's and 0's. They can be used to store
or visualize bit masks. There are two SQL bit types:
<type>BIT(<replaceable>n</replaceable>)</type> and <type>BIT
VARYING(<replaceable>n</replaceable>)</type>, where
<type>bit(<replaceable>n</replaceable>)</type> and <type>bit
varying(<replaceable>n</replaceable>)</type>, where
<replaceable>n</replaceable> is a positive integer.
</para>
<para>
<type>BIT</type> type data must match the length
<type>bit</type> type data must match the length
<replaceable>n</replaceable> exactly; it is an error to attempt to
store shorter or longer bit strings. <type>BIT VARYING</type> data is
store shorter or longer bit strings. <type>bit varying</type> data is
of variable length up to the maximum length
<replaceable>n</replaceable>; longer strings will be rejected.
Writing <type>BIT</type> without a length is equivalent to
<literal>BIT(1)</literal>, while <type>BIT VARYING</type> without a length
Writing <type>bit</type> without a length is equivalent to
<literal>bit(1)</literal>, while <type>bit varying</type> without a length
specification means unlimited length.
</para>
<note>
<para>
If one explicitly casts a bit-string value to
<type>BIT(<replaceable>n</>)</type>, it will be truncated or
<type>bit(<replaceable>n</>)</type>, it will be truncated or
zero-padded on the right to be exactly <replaceable>n</> bits,
without raising an error. Similarly,
if one explicitly casts a bit-string value to
<type>BIT VARYING(<replaceable>n</>)</type>, it will be truncated
<type>bit varying(<replaceable>n</>)</type>, it will be truncated
on the right if it is more than <replaceable>n</> bits.
</para>
</note>
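The effect of the explicit casts described in this note can be modeled on plain strings of 0's and 1's. The following Python sketch uses two hypothetical helper functions and only mirrors the truncation and padding rules stated above; it is not the server's implementation.

```python
def _check_bits(value):
    # Bit strings consist only of the characters 0 and 1.
    if set(value) - {"0", "1"}:
        raise ValueError("bit strings may contain only 0 and 1")


def cast_to_bit(value, n):
    """Model an explicit cast to bit(n): truncate or zero-pad on the right."""
    _check_bits(value)
    return value[:n].ljust(n, "0")


def cast_to_bit_varying(value, n):
    """Model an explicit cast to bit varying(n): truncate on the right only."""
    _check_bits(value)
    return value[:n]


print(cast_to_bit("101", 5))
print(cast_to_bit("10111", 3))
print(cast_to_bit_varying("10111", 3))
```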
<note>
<para>
Prior to <productname>PostgreSQL</> 7.2, <type>BIT</type> data
Prior to <productname>PostgreSQL</> 7.2, <type>bit</type> data
was always silently truncated or zero-padded on the right, with
or without an explicit cast. This was changed to comply with the
<acronym>SQL</acronym> standard.
......@@ -2842,6 +2707,8 @@ SELECT * FROM test;
</sect1>
&array;
<sect1 id="datatype-oid">
<title>Object Identifier Types</title>
......@@ -2896,7 +2763,7 @@ SELECT * FROM test;
tables. Also, an OID system column is added to user-created tables
(unless <literal>WITHOUT OIDS</> is specified at table creation time).
Type <type>oid</> represents an object identifier. There are also
several aliases for <type>oid</>: <type>regproc</>, <type>regprocedure</>,
several alias types for <type>oid</>: <type>regproc</>, <type>regprocedure</>,
<type>regoper</>, <type>regoperator</>, <type>regclass</>,
and <type>regtype</>. <xref linkend="datatype-oid-table"> shows an overview.
</para>
......@@ -2911,15 +2778,15 @@ SELECT * FROM test;
</para>
<para>
The <type>oid</> type itself has few operations beyond comparison
(which is implemented as unsigned comparison). It can be cast to
The <type>oid</> type itself has few operations beyond comparison.
It can be cast to
integer, however, and then manipulated using the standard integer
operators. (Beware of possible signed-versus-unsigned confusion
if you do this.)
</para>
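The signed-versus-unsigned caveat can be made concrete. The sketch below, in Python, assumes that the integer type is a signed 32-bit type and that the cast keeps the same 32 bits; both points are assumptions made for illustration, and `oid_as_signed_int32` is a hypothetical helper.

```python
def oid_as_signed_int32(oid):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    if not 0 <= oid < 2**32:
        raise ValueError("OIDs are 32-bit quantities")
    # Values with the top bit set appear as negative numbers.
    return oid - 2**32 if oid >= 2**31 else oid


print(oid_as_signed_int32(16384))
print(oid_as_signed_int32(4294967295))
```

Under these assumptions, large OIDs compare as smaller than small ones once they are treated as signed integers.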
<para>
The <type>oid</> alias types have no operations of their own except
The OID alias types have no operations of their own except
for specialized input and output routines. These routines are able
to accept and display symbolic names for system objects, rather than
the raw numeric value that type <type>oid</> would use. The alias
......@@ -2936,10 +2803,10 @@ SELECT * FROM test;
<tgroup cols="4">
<thead>
<row>
<entry>Type name</entry>
<entry>Name</entry>
<entry>References</entry>
<entry>Description</entry>
<entry>Value example</entry>
<entry>Value Example</entry>
</row>
</thead>
......@@ -2990,7 +2857,7 @@ SELECT * FROM test;
<row>
<entry><type>regtype</></entry>
<entry><structname>pg_type</></entry>
<entry>type name</entry>
<entry>data type name</entry>
<entry><literal>integer</></entry>
</row>
</tbody>
......@@ -3009,42 +2876,16 @@ SELECT * FROM test;
operand.
</para>
<para>
OIDs are 32-bit quantities and are assigned from a single cluster-wide
counter. In a large or long-lived database, it is possible for the
counter to wrap around. Hence, it is bad practice to assume that OIDs
are unique, unless you take steps to ensure that they are unique.
Recommended practice when using OIDs for row identification is to create
a unique constraint on the OID column of each table for which the OID will
be used. Never assume that OIDs are unique across tables; use the
combination of <structfield>tableoid</> and row OID if you need a
database-wide identifier. (Future releases of
<productname>PostgreSQL</productname> are likely to use a separate
OID counter for each table, so that <structfield>tableoid</>
<emphasis>must</> be included to arrive at a globally unique identifier.)
</para>
<para>
Another identifier type used by the system is <type>xid</>, or transaction
(abbreviated <abbrev>xact</>) identifier. This is the data type of the system columns
<structfield>xmin</> and <structfield>xmax</>.
Transaction identifiers are 32-bit quantities. In a long-lived
database it is possible for transaction IDs to wrap around. This
is not a fatal problem given appropriate maintenance procedures;
see the &cite-admin; for details. However, it is
unwise to depend on uniqueness of transaction IDs over the long term
(more than one billion transactions).
<structfield>xmin</> and <structfield>xmax</>. Transaction identifiers are 32-bit quantities.
</para>
<para>
A third identifier type used by the system is <type>cid</>, or
command identifier. This is the data type of the system columns
<structfield>cmin</> and <structfield>cmax</>. Command
identifiers are also 32-bit quantities. This creates a hard limit
of 2<superscript>32</> (4 billion) <acronym>SQL</acronym> commands
within a single transaction. In practice this limit is not a
problem --- note that the limit is on number of
<acronym>SQL</acronym> commands, not number of tuples processed.
<structfield>cmin</> and <structfield>cmax</>. Command identifiers are also 32-bit quantities.
</para>
<para>
......@@ -3055,6 +2896,10 @@ SELECT * FROM test;
physical location of the tuple within its table.
</para>
<para>
(The system columns are further explained in <xref
linkend="ddl-system-columns">.)
</para>
</sect1>
<sect1 id="datatype-pseudo">
......@@ -3114,57 +2959,56 @@ SELECT * FROM test;
<tgroup cols="2">
<thead>
<row>
<entry>Type name</entry>
<entry>Name</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>record</></entry>
<entry>Identifies a function returning an unspecified row type</entry>
<entry>Identifies a function returning an unspecified row type.</entry>
</row>
<row>
<entry><type>any</></entry>
<entry>Indicates that a function accepts any input data type whatever</entry>
<entry>Indicates that a function accepts any input data type whatever.</entry>
</row>
<row>
<entry><type>anyarray</></entry>
<entry>Indicates that a function accepts any array data type</entry>
<entry>Indicates that a function accepts any array data type.</entry>
</row>
<row>
<entry><type>void</></entry>
<entry>Indicates that a function returns no value</entry>
<entry>Indicates that a function returns no value.</entry>
</row>
<row>
<entry><type>trigger</></entry>
<entry>A trigger function is declared to return <type>trigger</></entry>
<entry>A trigger function is declared to return <type>trigger</>.</entry>
</row>
<row>
<entry><type>language_handler</></entry>
<entry>A procedural language call handler is declared to return <type>language_handler</></entry>
<entry>A procedural language call handler is declared to return <type>language_handler</>.</entry>
</row>
<row>
<entry><type>cstring</></entry>
<entry>Indicates that a function accepts or returns a null-terminated C string</entry>
<entry>Indicates that a function accepts or returns a null-terminated C string.</entry>
</row>
<row>
<entry><type>internal</></entry>
<entry>Indicates that a function accepts or returns a server-internal
data type</entry>
data type.</entry>
</row>
<row>
<entry><type>opaque</></entry>
<entry>An obsolete type name that formerly served all the above purposes</entry>
<entry>An obsolete type name that formerly served all the above purposes.</entry>
</row>
</tbody>
</tgroup>
......@@ -3199,8 +3043,6 @@ SELECT * FROM test;
</sect1>
&array;
</chapter>
<!-- Keep this comment at the end of the file
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/datetime.sgml,v 2.29 2002/11/11 20:14:02 petere Exp $
Date/time details
$Header: /cvsroot/pgsql/doc/src/sgml/datetime.sgml,v 2.30 2003/03/13 01:30:27 petere Exp $
-->
<appendix id="datetime-appendix">
<title id="datetime-appendix-title">Date/Time Support</title>
<title>Date/Time Support</title>
<para>
<productname>PostgreSQL</productname> uses an internal heuristic
......@@ -28,12 +27,10 @@ Date/time details
<title>Date/Time Input Interpretation</title>
<para>
The date/time types are all decoded using a common set of routines.
The date/time type inputs are all decoded using the following routine.
</para>
<procedure>
<title>Date/Time Input Interpretation</title>
<step>
<para>
Break the input string into tokens and categorize each token as
......@@ -61,7 +58,7 @@ Date/time details
If the token is numeric only, then it is either a single field
or an ISO 8601 concatenated date (e.g.,
<literal>19990113</literal> for January 13, 1999) or time
(e.g. <literal>141516</literal> for 14:15:16).
(e.g., <literal>141516</literal> for 14:15:16).
</para>
</step>
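The two concatenated forms from this step can be decoded as shown in the following Python sketch. The helper names are hypothetical, and the code covers only the eight-digit date and six-digit time examples given above, not the full set of inputs the server accepts.

```python
from datetime import datetime


def parse_concatenated_date(token):
    """Decode an ISO 8601 concatenated date of the form YYYYMMDD."""
    return datetime.strptime(token, "%Y%m%d").date()


def parse_concatenated_time(token):
    """Decode an ISO 8601 concatenated time of the form HHMMSS."""
    return datetime.strptime(token, "%H%M%S").time()


print(parse_concatenated_date("19990113").isoformat())
print(parse_concatenated_time("141516").isoformat())
```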
......@@ -187,7 +184,7 @@ Date/time details
<para>
If BC has been specified, negate the year and add one for
internal storage. (There is no year zero in the Gregorian
calendar, so numerically <literal>1BC</literal> becomes year
calendar, so numerically 1 BC becomes year
zero.)
</para>
</step>
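The BC adjustment in this step amounts to one line of arithmetic. A minimal Python sketch follows; `internal_year` is a hypothetical name, and the function models only the rule stated above.

```python
def internal_year(year, bc):
    """Map a calendar year (1 or greater) to the numeric year used internally."""
    if year < 1:
        raise ValueError("calendar years start at 1")
    # With BC, negate the year and add one, so that 1 BC becomes year zero.
    return -year + 1 if bc else year


print(internal_year(1, bc=True))
print(internal_year(2, bc=True))
print(internal_year(1999, bc=False))
```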
......@@ -195,8 +192,8 @@ Date/time details
<step>
<para>
If BC was not specified, and if the year field was two digits in length, then
adjust the year to 4 digits. If the field was less than 70, then add 2000;
otherwise, add 1900.
adjust the year to four digits. If the field is less than 70, then add 2000;
otherwise, add 1900.
<tip>
<para>
......@@ -382,8 +379,8 @@ Date/time details
<para>
The key word <literal>ABSTIME</literal> is ignored for historical
reasons; in very old releases of
<productname>PostgreSQL</productname> invalid fields of type <type>abstime</type>
reasons: In very old releases of
<productname>PostgreSQL</productname>, invalid values of type <type>abstime</type>
were emitted as <literal>Invalid Abstime</literal>. This is no
longer the case, however, and this key word will likely be dropped in
a future release.
......@@ -406,7 +403,7 @@ Date/time details
<para>
The table is organized by time zone offset from <acronym>UTC</>,
rather than alphabetically; this is intended to facilitate
rather than alphabetically. This is intended to facilitate
matching local usage with recognized abbreviations for cases where
these might differ.
</para>
......@@ -425,7 +422,7 @@ Date/time details
<row>
<entry>NZDT</entry>
<entry>+13:00</entry>
<entry>New Zealand Daylight Time</entry>
<entry>New Zealand Daylight-Saving Time</entry>
</row>
<row>
<entry>IDLE</entry>
......@@ -455,12 +452,12 @@ Date/time details
<row>
<entry>CADT</entry>
<entry>+10:30</entry>
<entry>Central Australia Daylight Savings Time</entry>
<entry>Central Australia Daylight-Saving Time</entry>
</row>
<row>
<entry>SADT</entry>
<entry>+10:30</entry>
<entry>South Australian Daylight Time</entry>
<entry>South Australian Daylight-Saving Time</entry>
</row>
<row>
<entry>AEST</entry>
......@@ -475,7 +472,7 @@ Date/time details
<row>
<entry>GST</entry>
<entry>+10:00</entry>
<entry>Guam Standard Time, USSR Zone 9</entry>
<entry>Guam Standard Time, Russia zone 9</entry>
</row>
<row>
<entry>LIGT</entry>
......@@ -500,7 +497,7 @@ Date/time details
<row>
<entry>JST</entry>
<entry>+09:00</entry>
<entry>Japan Standard Time,USSR Zone 8</entry>
<entry>Japan Standard Time, Russia zone 8</entry>
</row>
<row>
<entry>KST</entry>
......@@ -515,7 +512,7 @@ Date/time details
<row>
<entry>WDT</entry>
<entry>+09:00</entry>
<entry>West Australian Daylight Time</entry>
<entry>West Australian Daylight-Saving Time</entry>
</row>
<row>
<entry>MT</entry>
......@@ -535,7 +532,7 @@ Date/time details
<row>
<entry>WADT</entry>
<entry>+08:00</entry>
<entry>West Australian Daylight Time</entry>
<entry>West Australian Daylight-Saving Time</entry>
</row>
<row>
<entry>WST</entry>
......@@ -608,7 +605,7 @@ Date/time details
<row>
<entry>EAST</entry>
<entry>+04:00</entry>
<entry>Antananarivo Savings Time</entry>
<entry>Antananarivo Summer Time</entry>
</row>
<row>
<entry>MUT</entry>
......@@ -643,7 +640,7 @@ Date/time details
<row>
<entry>EETDST</entry>
<entry>+03:00</entry>
<entry>Eastern Europe Daylight Savings Time</entry>
<entry>Eastern Europe Daylight-Saving Time</entry>
</row>
<row>
<entry>HMT</entry>
......@@ -658,17 +655,17 @@ Date/time details
<row>
<entry>CEST</entry>
<entry>+02:00</entry>
<entry>Central European Savings Time</entry>
<entry>Central European Summer Time</entry>
</row>
<row>
<entry>CETDST</entry>
<entry>+02:00</entry>
<entry>Central European Daylight Savings Time</entry>
<entry>Central European Daylight-Saving Time</entry>
</row>
<row>
<entry>EET</entry>
<entry>+02:00</entry>
<entry>Eastern Europe, USSR Zone 1</entry>
<entry>Eastern European Time, Russia zone 1</entry>
</row>
<row>
<entry>FWT</entry>
......@@ -683,12 +680,12 @@ Date/time details
<row>
<entry>MEST</entry>
<entry>+02:00</entry>
<entry>Middle Europe Summer Time</entry>
<entry>Middle European Summer Time</entry>
</row>
<row>
<entry>METDST</entry>
<entry>+02:00</entry>
<entry>Middle Europe Daylight Time</entry>
<entry>Middle Europe Daylight-Saving Time</entry>
</row>
<row>
<entry>SST</entry>
......@@ -718,17 +715,17 @@ Date/time details
<row>
<entry>MET</entry>
<entry>+01:00</entry>
<entry>Middle Europe Time</entry>
<entry>Middle European Time</entry>
</row>
<row>
<entry>MEWT</entry>
<entry>+01:00</entry>
<entry>Middle Europe Winter Time</entry>
<entry>Middle European Winter Time</entry>
</row>
<row>
<entry>MEZ</entry>
<entry>+01:00</entry>
<entry>Middle Europe Zone</entry>
<entry><foreignphrase>Mitteleuropäische Zeit</></entry>
</row>
<row>
<entry>NOR</entry>
......@@ -748,37 +745,37 @@ Date/time details
<row>
<entry>WETDST</entry>
<entry>+01:00</entry>
<entry>Western Europe Daylight Savings Time</entry>
<entry>Western European Daylight-Saving Time</entry>
</row>
<row>
<entry>GMT</entry>
<entry>+00:00</entry>
<entry>00:00</entry>
<entry>Greenwich Mean Time</entry>
</row>
<row>
<entry>UT</entry>
<entry>+00:00</entry>
<entry>00:00</entry>
<entry>Universal Time</entry>
</row>
<row>
<entry>UTC</entry>
<entry>+00:00</entry>
<entry>Universal Time, Coordinated</entry>
<entry>00:00</entry>
<entry>Coordinated Universal Time</entry>
</row>
<row>
<entry>Z</entry>
<entry>+00:00</entry>
<entry>00:00</entry>
<entry>Same as UTC</entry>
</row>
<row>
<entry>ZULU</entry>
<entry>+00:00</entry>
<entry>00:00</entry>
<entry>Same as UTC</entry>
</row>
<row>
<entry>WET</entry>
<entry>+00:00</entry>
<entry>Western Europe</entry>
<entry>00:00</entry>
<entry>Western European Time</entry>
</row>
<row>
<entry>WAT</entry>
......@@ -788,12 +785,12 @@ Date/time details
<row>
<entry>NDT</entry>
<entry>-02:30</entry>
<entry>Newfoundland Daylight Time</entry>
<entry>Newfoundland Daylight-Saving Time</entry>
</row>
<row>
<entry>ADT</entry>
<entry>-03:00</entry>
<entry>Atlantic Daylight Time</entry>
<entry>Atlantic Daylight-Saving Time</entry>
</row>
<row>
<entry>AWT</entry>
......@@ -828,7 +825,7 @@ Date/time details
<row>
<entry>EDT</entry>
<entry>-04:00</entry>
<entry>Eastern Daylight Time</entry>
<entry>Eastern Daylight-Saving Time</entry>
</row>
<!--
<row>
......@@ -840,7 +837,7 @@ Date/time details
<row>
<entry>CDT</entry>
<entry>-05:00</entry>
<entry>Central Daylight Time</entry>
<entry>Central Daylight-Saving Time</entry>
</row>
<row>
<entry>EST</entry>
......@@ -862,7 +859,7 @@ Date/time details
<row>
<entry>MDT</entry>
<entry>-06:00</entry>
<entry>Mountain Daylight Time</entry>
<entry>Mountain Daylight-Saving Time</entry>
</row>
<!--
<row>
......@@ -879,12 +876,12 @@ Date/time details
<row>
<entry>PDT</entry>
<entry>-07:00</entry>
<entry>Pacific Daylight Time</entry>
<entry>Pacific Daylight-Saving Time</entry>
</row>
<row>
<entry>AKDT</entry>
<entry>-08:00</entry>
<entry>Alaska Daylight Time</entry>
<entry>Alaska Daylight-Saving Time</entry>
</row>
<row>
<entry>PST</entry>
......@@ -894,7 +891,7 @@ Date/time details
<row>
<entry>YDT</entry>
<entry>-08:00</entry>
<entry>Yukon Daylight Time</entry>
<entry>Yukon Daylight-Saving Time</entry>
</row>
<row>
<entry>AKST</entry>
......@@ -904,7 +901,7 @@ Date/time details
<row>
<entry>HDT</entry>
<entry>-09:00</entry>
<entry>Hawaii/Alaska Daylight Time</entry>
<entry>Hawaii/Alaska Daylight-Saving Time</entry>
</row>
<row>
<entry>YST</entry>
......@@ -919,7 +916,7 @@ Date/time details
<row>
<entry>AHST</entry>
<entry>-10:00</entry>
<entry>Alaska-Hawaii Standard Time</entry>
<entry>Alaska/Hawaii Standard Time</entry>
</row>
<row>
<entry>HST</entry>
......@@ -950,10 +947,10 @@ Date/time details
<para>
There are three naming conflicts between Australian time zone
names with time zones commonly used in North and South America:
names and time zone names commonly used in North and South America:
<literal>ACST</literal>, <literal>CST</literal>, and
<literal>EST</literal>. If the run-time option
<varname>AUSTRALIAN_TIMEZONES</varname> is set to true then
<varname>australian_timezones</varname> is set to true then
<literal>ACST</literal>, <literal>CST</literal>,
<literal>EST</literal>, and <literal>SAT</literal> are interpreted
as Australian time zone names, as shown in <xref
......@@ -1002,29 +999,23 @@ Date/time details
</sect1>
<sect1 id="units-history">
<sect1 id="datetime-units-history">
<title>History of Units</title>
<note>
<para>
Contributed by José Soares (<email>jose@sferacarta.com</email>)
</para>
</note>
<para>
The Julian Day was invented by the French scholar
The Julian Date was invented by the French scholar
Joseph Justus Scaliger (1540-1609)
and probably takes its name from Scaliger's father,
the Italian scholar Julius Caesar Scaliger (1484-1558).
Astronomers have used the Julian period to assign a unique number to
every day since 1 January 4713 BC. This is the so-called Julian Day
every day since 1 January 4713 BC. This is the so-called Julian Date
(JD). JD 0 designates the 24 hours from noon UTC on 1 January 4713 BC
to noon UTC on 2 January 4713 BC.
</para>
<para>
The <quote>Julian Day</quote> is different from the <quote>Julian
Date</quote>. The Julian date refers to the Julian calendar, which
The <quote>Julian Date</quote> is different from the <quote>Julian
Calendar</quote>. The Julian calendar
was introduced by Julius Caesar in 45 BC. It was in common use
until 1582, when countries started changing to the Gregorian
calendar. In the Julian calendar, the tropical year is
......
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/ddl.sgml,v 1.12 2003/02/19 04:06:27 momjian Exp $ -->
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/ddl.sgml,v 1.13 2003/03/13 01:30:28 petere Exp $ -->
<chapter id="ddl">
<title>Data Definition</title>
......@@ -171,9 +171,9 @@ DROP TABLE products;
The object identifier (object ID) of a row. This is a serial
number that is automatically added by
<productname>PostgreSQL</productname> to all table rows (unless
the table was created <literal>WITHOUT OIDS</literal>, in which
the table was created using <literal>WITHOUT OIDS</literal>, in which
case this column is not present). This column is of type
<literal>oid</literal> (same name as the column); see <xref
<type>oid</type> (same name as the column); see <xref
linkend="datatype-oid"> for more information about the type.
</para>
</listitem>
......@@ -183,7 +183,7 @@ DROP TABLE products;
<term><structfield>tableoid</></term>
<listitem>
<para>
The OID of the table containing this row. This attribute is
The OID of the table containing this row. This column is
particularly handy for queries that select from inheritance
hierarchies, since without it, it's difficult to tell which
individual table a row came from. The
......@@ -221,7 +221,7 @@ DROP TABLE products;
<listitem>
<para>
The identity (transaction ID) of the deleting transaction, or
zero for an undeleted tuple. It is possible for this field to
zero for an undeleted tuple. It is possible for this column to
be nonzero in a visible tuple: That usually indicates that the
deleting transaction hasn't committed yet, or that an attempted
deletion was rolled back.
......@@ -254,9 +254,42 @@ DROP TABLE products;
</listitem>
</varlistentry>
</variablelist>
<para>
OIDs are 32-bit quantities and are assigned from a single cluster-wide
counter. In a large or long-lived database, it is possible for the
counter to wrap around. Hence, it is bad practice to assume that OIDs
are unique, unless you take steps to ensure that they are unique.
Recommended practice when using OIDs for row identification is to create
a unique constraint on the OID column of each table for which the OID will
be used. Never assume that OIDs are unique across tables; use the
combination of <structfield>tableoid</> and row OID if you need a
database-wide identifier. (Future releases of
<productname>PostgreSQL</productname> are likely to use a separate
OID counter for each table, so that <structfield>tableoid</>
<emphasis>must</> be included to arrive at a globally unique identifier.)
</para>
<para>
Transaction identifiers are also 32-bit quantities. In a long-lived
database it is possible for transaction IDs to wrap around. This
is not a fatal problem given appropriate maintenance procedures;
see the &cite-admin; for details. However, it is
unwise to depend on uniqueness of transaction IDs over the long term
(more than one billion transactions).
</para>
<para>
Command
identifiers are also 32-bit quantities. This creates a hard limit
of 2<superscript>32</> (4 billion) <acronym>SQL</acronym> commands
within a single transaction. In practice this limit is not a
problem --- note that the limit is on number of
<acronym>SQL</acronym> commands, not number of tuples processed.
</para>
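<para>
As a minimal sketch of why 32-bit identifiers can repeat, the following C
fragment models a counter that wraps around and shows one way to combine
<structfield>tableoid</> with a row OID into a single 64-bit key. The
helper names <function>next_counter</function> and
<function>global_row_id</function> are hypothetical and are not part of
<productname>PostgreSQL</productname>; the fragment illustrates only the
arithmetic, not how the server assigns identifiers.
</para>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a PostgreSQL API: advance a 32-bit counter.
 * The result is reduced modulo 2^32, so the value after UINT32_MAX is 0
 * again. This is why previously used values can reappear. */
static uint32_t next_counter(uint32_t current)
{
    return (uint32_t) (current + 1u);
}

/* Hypothetical helper: pack a table OID and a row OID into one 64-bit
 * key, so that equal row OIDs in different tables stay distinguishable. */
static uint64_t global_row_id(uint32_t tableoid, uint32_t row_oid)
{
    return ((uint64_t) tableoid << 32) | (uint64_t) row_oid;
}
```

<para>
The same reasoning applies to transaction and command identifiers: any
value stored in 32 bits can take at most 2<superscript>32</> distinct
values before it repeats.
</para>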
</sect1>
<sect1>
<sect1 id="ddl-default">
<title>Default Values</title>
<para>
......@@ -279,7 +312,7 @@ DROP TABLE products;
data type. For example:
<programlisting>
CREATE TABLE products (
product_no integer PRIMARY KEY,
product_no integer,
name text,
price numeric <emphasis>DEFAULT 9.99</emphasis>
);
......@@ -1194,7 +1227,7 @@ GRANT SELECT ON accounts TO GROUP staff;
REVOKE ALL ON accounts FROM PUBLIC;
</programlisting>
The special privileges of the table owner (i.e., the right to do
<command>DROP</>, <command>GRANT</>, <command>REVOKE</>, etc)
<command>DROP</>, <command>GRANT</>, <command>REVOKE</>, etc.)
are always implicit in being the owner,
and cannot be granted or revoked. But the table owner can choose
to revoke his own ordinary privileges, for example to make a
......@@ -1214,7 +1247,7 @@ REVOKE ALL ON accounts FROM PUBLIC;
</indexterm>
<para>
A <productname>PostgreSQL</productname> database cluster (installation)
A <productname>PostgreSQL</productname> database cluster
contains one or more named databases. Users and groups of users are
shared across the entire cluster, but no other data is shared across
databases. Any given client connection to the server can access
......@@ -1536,10 +1569,10 @@ REVOKE CREATE ON public FROM PUBLIC;
no longer true: you may create such a table name if you wish, in
any non-system schema. However, it's best to continue to avoid
such names, to ensure that you won't suffer a conflict if some
future version defines a system catalog named the same as your
future version defines a system table named the same as your
table. (With the default search path, an unqualified reference to
your table name would be resolved as the system catalog instead.)
System catalogs will continue to follow the convention of having
your table name would be resolved as the system table instead.)
System tables will continue to follow the convention of having
names beginning with <literal>pg_</>, so that they will not
conflict with unqualified user-table names so long as users avoid
the <literal>pg_</> prefix.
......@@ -1681,7 +1714,8 @@ REVOKE CREATE ON public FROM PUBLIC;
linkend="ddl-constraints-fk">, with the orders table depending on
it, would result in an error message such as this:
<screen>
<userinput>DROP TABLE products;</userinput>
DROP TABLE products;
NOTICE: constraint $1 on table orders depends on table products
ERROR: Cannot drop table products because other objects depend on it
Use DROP ... CASCADE to drop the dependent objects too
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/ecpg.sgml,v 1.41 2003/01/19 00:13:28 momjian Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/ecpg.sgml,v 1.42 2003/03/13 01:30:28 petere Exp $
-->
<chapter id="ecpg">
......@@ -44,7 +44,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/ecpg.sgml,v 1.41 2003/01/19 00:13:28 momjia
implementation is designed to match this standard as much as
possible, and it is usually possible to port embedded
<acronym>SQL</acronym> programs written for other
<acronym>RDBMS</acronym> to <productname>PostgreSQL</productname>
SQL databases to <productname>PostgreSQL</productname>
with relative ease.
</para>
......@@ -124,30 +124,30 @@ EXEC SQL CONNECT TO <replaceable>target</replaceable> <optional>AS <replaceable>
<itemizedlist>
<listitem>
<simpara>
<literal><replaceable>userid</replaceable></literal>
<literal><replaceable>username</replaceable></literal>
</simpara>
</listitem>
<listitem>
<simpara>
<literal><replaceable>userid</replaceable>/<replaceable>password</replaceable></literal>
<literal><replaceable>username</replaceable>/<replaceable>password</replaceable></literal>
</simpara>
</listitem>
<listitem>
<simpara>
<literal><replaceable>userid</replaceable> IDENTIFIED BY <replaceable>password</replaceable></literal>
<literal><replaceable>username</replaceable> IDENTIFIED BY <replaceable>password</replaceable></literal>
</simpara>
</listitem>
<listitem>
<simpara>
<literal><replaceable>userid</replaceable> USING <replaceable>password</replaceable></literal>
<literal><replaceable>username</replaceable> USING <replaceable>password</replaceable></literal>
</simpara>
</listitem>
</itemizedlist>
The <replaceable>userid</replaceable> and
<replaceable>password</replaceable> may be a constant text, a
The <replaceable>username</replaceable> and
<replaceable>password</replaceable> may be an SQL name, a
character variable, or a character string.
</para>
......@@ -164,7 +164,7 @@ EXEC SQL CONNECT TO <replaceable>target</replaceable> <optional>AS <replaceable>
<para>
To close a connection, use the following statement:
<programlisting>
EXEC SQL DISCONNECT [<replaceable>connection</replaceable>];
EXEC SQL DISCONNECT <optional><replaceable>connection</replaceable></optional>;
</programlisting>
The <replaceable>connection</replaceable> can be specified
in the following ways:
......@@ -275,7 +275,7 @@ EXEC SQL COMMIT;
other interfaces) via the <option>-t</option> command-line option
to <command>ecpg</command> (see below) or via the <literal>EXEC SQL
SET AUTOCOMMIT TO ON</literal> statement. In autocommit mode, each
query is automatically committed unless it is inside an explicit
command is automatically committed unless it is inside an explicit
transaction block. This mode can be explicitly turned off using
<literal>EXEC SQL SET AUTOCOMMIT TO OFF</literal>.
</para>
......@@ -324,16 +324,16 @@ char foo[16], bar[16];
<para>
The special types <type>VARCHAR</type> and <type>VARCHAR2</type>
are converted into a named <type>struct</> for every variable. A
declaration like:
declaration like
<programlisting>
VARCHAR var[180];
</programlisting>
is converted into:
is converted into
<programlisting>
struct varchar_var { int len; char arr[180]; } var;
</programlisting>
This structure is suitable for interfacing with SQL datums of type
<type>VARCHAR</type>.
<type>varchar</type>.
</para>
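<para>
The generated structure can be handled like any other C
<type>struct</>. The following sketch, which assumes plain C without the
preprocessor, fills such a structure from a C string. The helper
<function>varchar_set</function> is hypothetical, and the sketch assumes
that <structfield>len</> is the authoritative length and that
<structfield>arr</> need not be zero-terminated.
</para>

```c
#include <assert.h>
#include <string.h>

/* The struct that the text above gives for "VARCHAR var[180];". */
struct varchar_var { int len; char arr[180]; };

/* Hypothetical helper: copy a C string into the struct, truncating it
 * to the capacity of arr if necessary. Returns the number of bytes
 * stored. No terminating zero byte is written; len carries the length. */
static int varchar_set(struct varchar_var *v, const char *src)
{
    size_t n;

    assert(v != NULL && src != NULL);
    n = strlen(src);
    if (n > sizeof v->arr)
        n = sizeof v->arr;
    memcpy(v->arr, src, n);
    v->len = (int) n;
    return v->len;
}
```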
<para>
......@@ -389,7 +389,7 @@ struct sqlca
long sqlerrd[6];
/* 0: empty */
/* 1: OID of processed tuple if applicable */
/* 1: OID of processed row if applicable */
/* 2: number of rows processed in an INSERT, UPDATE */
/* or DELETE statement */
/* 3: empty */
......@@ -400,7 +400,7 @@ struct sqlca
/* 0: set to 'W' if at least one other is 'W' */
/* 1: if 'W' at least one character string */
/* value was truncated when it was */
/* stored into a host variable. */
/* stored into a host variable */
/* 2: empty */
/* 3: empty */
/* 4: empty */
......@@ -418,7 +418,7 @@ struct sqlca
If no error occurred in the last <acronym>SQL</acronym> statement,
<literal>sqlca.sqlcode</literal> will be 0
(<symbol>ECPG_NO_ERROR</>). If <literal>sqlca.sqlcode</literal> is
less that zero, this is a serious error, like the database
less than zero, this is a serious error, such as the database
definition not matching the query. If it is greater than zero, it
is a normal error, such as the table not containing the requested row.
</para>
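<para>
The sign convention described above can be sketched as a small
classification function. The enumeration and the function name
<function>classify_sqlcode</function> are hypothetical; in a real program
the argument would be <literal>sqlca.sqlcode</literal>.
</para>

```c
#include <assert.h>

/* Hypothetical categories for the three cases the text distinguishes. */
enum sqlcode_class {
    SQLCODE_OK,             /* sqlcode == 0 */
    SQLCODE_NORMAL_ERROR,   /* sqlcode > 0  */
    SQLCODE_SERIOUS_ERROR   /* sqlcode < 0  */
};

/* Classify a value taken from sqlca.sqlcode by its sign only. */
static enum sqlcode_class classify_sqlcode(long sqlcode)
{
    if (sqlcode == 0)
        return SQLCODE_OK;
    return sqlcode < 0 ? SQLCODE_SERIOUS_ERROR : SQLCODE_NORMAL_ERROR;
}
```

<para>
For example, the code -402 (<symbol>ECPG_CONNECT</>) from the list below
falls into the serious category, because it is negative.
</para>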
......@@ -434,7 +434,7 @@ struct sqlca
<variablelist>
<varlistentry>
<term><computeroutput>-12, Out of memory in line %d.</computeroutput></term>
<term><computeroutput>-12: Out of memory in line %d.</computeroutput></term>
<listitem>
<para>
Should not normally occur. This indicates your virtual memory
......@@ -462,7 +462,7 @@ struct sqlca
This means that the server has returned more arguments than we
have matching variables. Perhaps you have forgotten a couple
of the host variables in the <command>INTO
:var1,:var2</command> list.
:var1, :var2</command> list.
</para>
</listitem>
</varlistentry>
......@@ -481,7 +481,7 @@ struct sqlca
<term><computeroutput>-203 (ECPG_TOO_MANY_MATCHES): Too many matches line %d.</computeroutput></term>
<listitem>
<para>
This means the query has returned several rows but the
This means the query has returned multiple rows but the
variables specified are not arrays. The
<command>SELECT</command> command was not unique.
</para>
......@@ -627,7 +627,7 @@ struct sqlca
</varlistentry>
<varlistentry>
<term><computeroutput>-242 (ECPG_UNKNOWN_DESCRIPTOR_ITEM): Descriptor %s not found in line %d.</computeroutput></term>
<term><computeroutput>-242 (ECPG_UNKNOWN_DESCRIPTOR_ITEM): Unknown descriptor item %s in line %d.</computeroutput></term>
<listitem>
<para>
The descriptor specified was not found. The statement you are trying to use has not been prepared.
......@@ -656,12 +656,12 @@ struct sqlca
</varlistentry>
<varlistentry>
<term><computeroutput>-400 (ECPG_PGSQL): Postgres error: %s line %d.</computeroutput></term>
<term><computeroutput>-400 (ECPG_PGSQL): '%s' in line %d.</computeroutput></term>
<listitem>
<para>
Some <productname>PostgreSQL</productname> error. The message
contains the error message from the
<productname>PostgreSQL</productname> backend.
<productname>PostgreSQL</productname> server.
</para>
</listitem>
</varlistentry>
......@@ -670,7 +670,7 @@ struct sqlca
<term><computeroutput>-401 (ECPG_TRANS): Error in transaction processing line %d.</computeroutput></term>
<listitem>
<para>
<productname>PostgreSQL</productname> signaled that we cannot
The <productname>PostgreSQL</productname> server signaled that we cannot
start, commit, or rollback the transaction.
</para>
</listitem>
......@@ -680,7 +680,7 @@ struct sqlca
<term><computeroutput>-402 (ECPG_CONNECT): Could not connect to database %s in line %d.</computeroutput></term>
<listitem>
<para>
The connect to the database did not work.
The connection attempt to the database did not work.
</para>
</listitem>
</varlistentry>
......@@ -718,7 +718,7 @@ EXEC SQL INCLUDE <replaceable>filename</replaceable>;
<programlisting>
#include &lt;<replaceable>filename</replaceable>.h&gt;
</programlisting>
because the file would not be subject to SQL command preprocessing.
because this file would not be subject to SQL command preprocessing.
Naturally, you can continue to use the C
<literal>#include</literal> directive to include other header
files.
......@@ -744,7 +744,7 @@ EXEC SQL INCLUDE <replaceable>filename</replaceable>;
<acronym>SQL</acronym> statements you used to special function
calls. After compiling, you must link with a special library that
contains the needed functions. These functions fetch information
from the arguments, perform the <acronym>SQL</acronym> query using
from the arguments, perform the <acronym>SQL</acronym> command using
the <application>libpq</application> interface, and put the result
in the arguments specified for output.
</para>
......@@ -766,7 +766,7 @@ ecpg prog1.pgc
</para>
<para>
The preprocessed file can be compiled normally, for example
The preprocessed file can be compiled normally, for example:
<programlisting>
cc -c prog1.c
</programlisting>
......@@ -823,83 +823,33 @@ ECPG = ecpg
<function>ECPGdebug(int <replaceable>on</replaceable>, FILE
*<replaceable>stream</replaceable>)</function> turns on debug
logging if called with the first argument non-zero. Debug logging
is done on <replaceable>stream</replaceable>. Most
<acronym>SQL</acronym> statement log their arguments and results.
</para>
<para>
The most important function, <function>ECPGdo</function>, logs
all <acronym>SQL</acronym> statements with both the expanded
string, i.e. the string with all the input variables inserted,
and the result from the <productname>PostgreSQL</productname>
server. This can be very useful when searching for errors in your
<acronym>SQL</acronym> statements.
is done on <replaceable>stream</replaceable>. The log contains
all <acronym>SQL</acronym> statements with all the input
variables inserted, and the results from the
<productname>PostgreSQL</productname> server. This can be very
useful when searching for errors in your <acronym>SQL</acronym>
statements.
</para>
</listitem>
<listitem>
<para>
<function>ECPGstatus()</function> This method returns true if we
<function>ECPGstatus()</function> returns true if you
are connected to a database and false if not.
</para>
</listitem>
</itemizedlist>
</sect1>
<sect1 id="ecpg-porting">
<title>Porting From Other <acronym>RDBMS</acronym> Packages</title>
<para>
The design of <application>ecpg</application> follows the SQL
standard. Porting from a standard RDBMS should not be a problem.
Unfortunately there is no such thing as a standard RDBMS. Therefore
<application>ecpg</application> tries to understand syntax
extensions as long as they do not create conflicts with the
standard.
</para>
<para>
The following list shows all the known incompatibilities. If you
find one not listed please notify the developers. Note, however,
that we list only incompatibilities from a preprocessor of another
RDBMS to <application>ecpg</application> and not
<application>ecpg</application> features that these RDBMS do not
support.
</para>
<variablelist>
<varlistentry>
<term>Syntax of <command>FETCH</command></term>
<indexterm><primary>FETCH</><secondary>embedded SQL</></indexterm>
<listitem>
<para>
The standard syntax for <command>FETCH</command> is:
<synopsis>
FETCH <optional><replaceable>direction</></> <optional><replaceable>amount</></> IN|FROM <replaceable>cursor</replaceable>
</synopsis>
<indexterm><primary>Oracle</></>
<productname>Oracle</productname>, however, does not use the
keywords <literal>IN</literal> or <literal>FROM</literal>. This
feature cannot be added since it would create parsing conflicts.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect1>
<sect1 id="ecpg-develop">
<title>For the Developer</title>
<title>Internals</title>
<para>
This section explain how <application>ecpg</application> works
This section explains how <application>ECPG</application> works
internally. This information can occasionally be useful to help
users understand how to use <application>ecpg</application>.
users understand how to use <application>ECPG</application>.
</para>
<sect2>
<title>The Preprocessor</title>
<para>
The first four lines written by <command>ecpg</command> to the
output are fixed lines. Two are comments and two are include
......@@ -910,8 +860,8 @@ FETCH <optional><replaceable>direction</></> <optional><replaceable>amount</></>
<para>
When it sees an <command>EXEC SQL</command> statement, it
intervenes and changes it. The command starts with <command>exec
sql</command> and ends with <command>;</command>. Everything in
intervenes and changes it. The command starts with <command>EXEC
SQL</command> and ends with <command>;</command>. Everything in
between is treated as an <acronym>SQL</acronym> statement and
parsed for variable substitution.
</para>
......@@ -920,16 +870,89 @@ FETCH <optional><replaceable>direction</></> <optional><replaceable>amount</></>
Variable substitution occurs when a symbol starts with a colon
(<literal>:</literal>). The variable with that name is looked up
among the variables that were previously declared within a
<literal>EXEC SQL DECLARE</> section. Depending on whether the
variable is being use for input or output, a pointer to the
variable is output to allow access by the function.
<literal>EXEC SQL DECLARE</> section.
</para>
<para>
The most important function in the library is
<function>ECPGdo</function>, which takes care of executing most
commands. It takes a variable number of arguments. This can easily
add up to 50 or so arguments, and we hope this will not be a
problem on any platform.
</para>
<para>
The arguments are:
<variablelist>
<varlistentry>
<term>A line number</term>
<listitem>
<para>
This is the line number of the original line; used in error
messages only.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>A string</term>
<listitem>
<para>
This is the <acronym>SQL</acronym> command that is to be issued.
It is modified by the input variables, i.e., the variables that
were not known at compile time but are to be entered in the
command. Where the variables should go, the string contains
<literal>?</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Input variables</term>
<listitem>
<para>
Every input variable causes ten arguments to be created. (See below.)
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><parameter>ECPGt_EOIT</></term>
<listitem>
<para>
An <type>enum</> telling that there are no more input
variables.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Output variables</term>
<listitem>
<para>
Every output variable causes ten arguments to be created.
(See below.) These variables are filled by the function.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><parameter>ECPGt_EORT</></term>
<listitem>
<para>
An <type>enum</> telling that there are no more variables.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<para>
For every variable that is part of the <acronym>SQL</acronym>
query, the function gets other arguments:
command, the function gets ten arguments:
<itemizedlist>
<orderedlist>
<listitem>
<para>
The type as a special symbol.
......@@ -968,8 +991,7 @@ FETCH <optional><replaceable>direction</></> <optional><replaceable>amount</></>
<listitem>
<para>
A pointer to the value of the indicator variable or a pointer
to the pointer of the indicator variable.
A pointer to the indicator variable.
</para>
</listitem>
......@@ -981,7 +1003,7 @@ FETCH <optional><replaceable>direction</></> <optional><replaceable>amount</></>
<listitem>
<para>
Number of elements in the indicator array (for array fetches).
The number of elements in the indicator array (for array fetches).
</para>
</listitem>
......@@ -991,7 +1013,7 @@ FETCH <optional><replaceable>direction</></> <optional><replaceable>amount</></>
array fetches).
</para>
</listitem>
</itemizedlist>
</orderedlist>
</para>
<para>
......@@ -1039,92 +1061,9 @@ ECPGdo(__LINE__, NULL, "SELECT res FROM mytable WHERE index = ? ",
ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EORT);
#line 147 "foo.pgc"
</programlisting>
(The indentation in this manual is added for readability and not
(The indentation here is added for readability and not
something the preprocessor does.)
</para>
</sect2>
<sect2>
<title>The Library</title>
<para>
The most important function in the library is
<function>ECPGdo</function>. It takes a variable number of
arguments. Hopefully there are no computers that limit the number
of variables that can be accepted by a
<function>varargs()</function> function. This can easily add up to
50 or so arguments.
</para>
<para>
The arguments are:
<variablelist>
<varlistentry>
<term>A line number</term>
<listitem>
<para>
This is a line number of the original line; used in error
messages only.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>A string</term>
<listitem>
<para>
This is the <acronym>SQL</acronym> query that is to be issued.
It is modified by the input variables, i.e. the variables that
where not known at compile time but are to be entered in the
query. Where the variables should go the string contains
<literal>?</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Input variables</term>
<listitem>
<para>
As described in the section about the preprocessor, every
input variable gets ten arguments.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><parameter>ECPGt_EOIT</></term>
<listitem>
<para>
An <type>enum</> telling that there are no more input
variables.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Output variables</term>
<listitem>
<para>
As described in the section about the preprocessor, every
input variable gets ten arguments. These variables are filled
by the function.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><parameter>ECPGt_EORT</></term>
<listitem>
<para>
An <type>enum</> telling that there are no more variables.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
</sect2>
</sect1>
</chapter>
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/features.sgml,v 2.17 2003/01/15 21:55:52 momjian Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/features.sgml,v 2.18 2003/03/13 01:30:28 petere Exp $
-->
<appendix id="features">
......@@ -105,7 +105,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/features.sgml,v 2.17 2003/01/15 21:55:52 mo
<para>
The following features defined in <acronym>SQL99</acronym> are not
implemented in the current release of
implemented in this release of
<productname>PostgreSQL</productname>. In a few cases, equivalent
functionality is available.
......
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.38 2002/11/11 20:14:03 petere Exp $ -->
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.39 2003/03/13 01:30:28 petere Exp $ -->
<chapter id="indexes">
<title id="indexes-title">Indexes</title>
......@@ -83,8 +83,8 @@ CREATE INDEX test1_id_index ON test1 (id);
</para>
<para>
Indexes can benefit <command>UPDATE</command>s and
<command>DELETE</command>s with search conditions. Indexes can also be
Indexes can also benefit <command>UPDATE</command> and
<command>DELETE</command> commands with search conditions. Indexes can moreover be
used in join queries. Thus,
an index defined on a column that is part of a join condition can
significantly speed up queries with joins.
......@@ -119,7 +119,7 @@ CREATE INDEX test1_id_index ON test1 (id);
By
default, the <command>CREATE INDEX</command> command will create a
B-tree index, which fits the most common situations. In
particular, the <productname>PostgreSQL</productname> query optimizer
particular, the <productname>PostgreSQL</productname> query planner
will consider using a B-tree index whenever an indexed column is
involved in a comparison using one of these operators:
......@@ -146,7 +146,7 @@ CREATE INDEX test1_id_index ON test1 (id);
<synopsis>
CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING RTREE (<replaceable>column</replaceable>);
</synopsis>
The <productname>PostgreSQL</productname> query optimizer will
The <productname>PostgreSQL</productname> query planner will
consider using an R-tree index whenever an indexed column is
involved in a comparison using one of these operators:
......@@ -172,7 +172,7 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
<primary>hash</primary>
<see>indexes</see>
</indexterm>
The query optimizer will consider using a hash index whenever an
The query planner will consider using a hash index whenever an
indexed column is involved in a comparison using the
<literal>=</literal> operator. The following command is used to
create a hash index:
......@@ -196,9 +196,8 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
standard R-trees using Guttman's quadratic split algorithm. The
hash index is an implementation of Litwin's linear hashing. We
mention the algorithms used solely to indicate that all of these
access methods are fully dynamic and do not have to be optimized
periodically (as is the case with, for example, static hash access
methods).
index methods are fully dynamic and do not have to be optimized
periodically (as is the case with, for example, static hash methods).
</para>
</sect1>
......@@ -242,17 +241,17 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
</para>
<para>
The query optimizer can use a multicolumn index for queries that
involve the first <parameter>n</parameter> consecutive columns in
the index (when used with appropriate operators), up to the total
number of columns specified in the index definition. For example,
The query planner can use a multicolumn index for queries that
involve the leftmost column in the index definition and any number
of columns listed to the right of it without a gap (when
used with appropriate operators). For example,
an index on <literal>(a, b, c)</literal> can be used in queries
involving all of <literal>a</literal>, <literal>b</literal>, and
<literal>c</literal>, or in queries involving both
<literal>a</literal> and <literal>b</literal>, or in queries
involving only <literal>a</literal>, but not in other combinations.
(In a query involving <literal>a</literal> and <literal>c</literal>
the optimizer might choose to use the index for
the planner might choose to use the index for
<literal>a</literal> only and treat <literal>c</literal> like an
ordinary unindexed column.)
</para>
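<para>
The leftmost-column rule can be modeled as counting constrained columns
from the left until the first gap. The following sketch is a simplified
model of the rule as stated above, not the planner's actual algorithm;
the function name <function>usable_prefix_len</function> is hypothetical.
</para>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* constrained[i] is true if the query has a suitable condition on the
 * i-th column of the index definition (0 = leftmost column).
 * Returns the number of leading index columns that can be used: the
 * count stops at the first column without a condition. */
static size_t usable_prefix_len(const bool constrained[], size_t ncols)
{
    size_t n = 0;

    while (n < ncols && constrained[n])
        n++;
    return n;
}
```

<para>
For an index on <literal>(a, b, c)</literal>, a query on
<literal>a</literal> and <literal>c</literal> yields a prefix length of 1
in this model (only <literal>a</literal> is usable), and a query on
<literal>b</literal> and <literal>c</literal> yields 0.
</para>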
......@@ -296,7 +295,7 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
<para>
When an index is declared unique, multiple table rows with equal
indexed values will not be allowed. NULL values are not considered
indexed values will not be allowed. Null values are not considered
equal.
</para>
......@@ -342,7 +341,7 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
SELECT * FROM test1 WHERE lower(col1) = 'value';
</programlisting>
This query can use an index, if one has been
defined on the result of the <literal>lower(column)</literal>
defined on the result of the <literal>lower(col1)</literal>
operation:
<programlisting>
CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));
......@@ -353,7 +352,7 @@ CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));
The function in the index definition can take more than one
argument, but they must be table columns, not constants.
Functional indexes are always single-column (namely, the function
result) even if the function uses more than one input field; there
result) even if the function uses more than one input column; there
cannot be multicolumn indexes that contain function calls.
</para>
......@@ -377,29 +376,32 @@ CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));
CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> (<replaceable>column</replaceable> <replaceable>opclass</replaceable> <optional>, ...</optional>);
</synopsis>
The operator class identifies the operators to be used by the index
for that column. For example, a B-tree index on four-byte integers
for that column. For example, a B-tree index on the type <type>int4</type>
would use the <literal>int4_ops</literal> class; this operator
class includes comparison functions for four-byte integers. In
class includes comparison functions for values of type <type>int4</type>. In
practice the default operator class for the column's data type is
usually sufficient. The main point of having operator classes is
that for some data types, there could be more than one meaningful
ordering. For example, we might want to sort a complex-number data
type either by absolute value or by real part. We could do this by
defining two operator classes for the data type and then selecting
the proper class when making an index. There are also some
operator classes with special purposes:
the proper class when making an index.
</para>
<para>
There are also some built-in operator classes besides the default ones:
<itemizedlist>
<listitem>
<para>
The operator classes <literal>box_ops</literal> and
<literal>bigbox_ops</literal> both support R-tree indexes on the
<literal>box</literal> data type. The difference between them is
<type>box</type> data type. The difference between them is
that <literal>bigbox_ops</literal> scales box coordinates down,
to avoid floating-point exceptions from doing multiplication,
addition, and subtraction on very large floating-point
coordinates. If the field on which your rectangles lie is about
20 000 units square or larger, you should use
20 000 square units or larger, you should use
<literal>bigbox_ops</literal>.
</para>
</listitem>
......@@ -409,25 +411,25 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
<para>
The following query shows all defined operator classes:
<programlisting>
SELECT am.amname AS acc_method,
opc.opcname AS ops_name
<programlisting>
SELECT am.amname AS index_method,
opc.opcname AS opclass_name
FROM pg_am am, pg_opclass opc
WHERE opc.opcamid = am.oid
ORDER BY acc_method, ops_name;
</programlisting>
ORDER BY index_method, opclass_name;
</programlisting>
It can be extended to show all the operators included in each class:
<programlisting>
SELECT am.amname AS acc_method,
opc.opcname AS ops_name,
opr.oprname AS ops_comp
<programlisting>
SELECT am.amname AS index_method,
opc.opcname AS opclass_name,
opr.oprname AS opclass_operator
FROM pg_am am, pg_opclass opc, pg_amop amop, pg_operator opr
WHERE opc.opcamid = am.oid AND
amop.amopclaid = opc.oid AND
amop.amopopr = opr.oid
ORDER BY acc_method, ops_name, ops_comp;
</programlisting>
ORDER BY index_method, opclass_name, opclass_operator;
</programlisting>
</para>
</sect1>
......@@ -465,7 +467,7 @@ SELECT am.amname AS acc_method,
<para>
Suppose you are storing web server access logs in a database.
Most accesses originate from the IP range of your organization but
Most accesses originate from the IP address range of your organization but
some are from elsewhere (say, employees on dial-up connections).
If your searches by IP are primarily for outside accesses,
you probably do not need to index the IP range that corresponds to your
......@@ -575,16 +577,16 @@ SELECT * FROM orders WHERE order_nr = 3501;
predicate must match the conditions used in the queries that
are supposed to benefit from the index. To be precise, a partial
index can be used in a query only if the system can recognize that
the query's WHERE condition mathematically <firstterm>implies</>
the index's predicate.
the <literal>WHERE</> condition of the query mathematically implies
the predicate of the index.
<productname>PostgreSQL</productname> does not have a sophisticated
theorem prover that can recognize mathematically equivalent
predicates that are written in different forms. (Not
expressions that are written in different forms. (Not
only is such a general theorem prover extremely difficult to
create, it would probably be too slow to be of any real use.)
The system can recognize simple inequality implications, for example
<quote>x &lt; 1</quote> implies <quote>x &lt; 2</quote>; otherwise
the predicate condition must exactly match the query's WHERE condition
the predicate condition must exactly match the query's <literal>WHERE</> condition
or the index will not be recognized to be usable.
</para>
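<para>
 The simple inequality implication mentioned above can be modeled in a
 few lines of C. This is a hedged sketch of the idea only; the function
 name <function>less_than_implies</function> is hypothetical, and the
 real implication test in the server is not shown here.
</para>

```c
#include <stdbool.h>

/*
 * Hypothetical model: does the query condition "x < query_bound"
 * imply the index predicate "x < index_bound"?  Every x below the
 * query bound is also below the index bound exactly when the query
 * bound is not larger than the index bound.
 */
bool less_than_implies(double query_bound, double index_bound)
{
    return query_bound <= index_bound;
}
```

<para>
 With these definitions, a query condition <quote>x &lt; 1</quote>
 implies an index predicate <quote>x &lt; 2</quote>, but not the other
 way around.
</para>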
......@@ -606,15 +608,18 @@ SELECT * FROM orders WHERE order_nr = 3501;
a given subject and target combination, but there might be any number of
<quote>unsuccessful</> entries. Here is one way to do it:
<programlisting>
CREATE TABLE tests (subject text,
target text,
success bool,
...);
CREATE TABLE tests (
subject text,
target text,
success boolean,
...
);
CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
WHERE success;
</programlisting>
This is a particularly efficient way of doing it when there are few
successful trials and many unsuccessful ones.
successful tests and many unsuccessful ones.
</para>
</example>
......
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/installation.sgml,v 1.128 2003/01/19 00:13:28 momjian Exp $ -->
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/installation.sgml,v 1.129 2003/03/13 01:30:28 petere Exp $ -->
<chapter id="installation">
<title><![%standalone-include[<productname>PostgreSQL</>]]>
......@@ -69,7 +69,7 @@ su - postgres
<acronym>GNU</> <application>make</> is often installed under
the name <filename>gmake</filename>; this document will always
refer to it by that name. (On some systems
<acronym>GNU</acronym> make is the default tool with the name
<acronym>GNU</acronym> <application>make</> is the default tool with the name
<filename>make</>.) To test for <acronym>GNU</acronym>
<application>make</application> enter
<screen>
......@@ -91,8 +91,8 @@ su - postgres
<listitem>
<para>
<application>gzip</> is needed to unpack the distribution in the
first place. If you are reading this, you probably already got
past that hurdle.
first place.<![%standalone-include;[ If you are reading this, you probably already got
past that hurdle.]]>
</para>
</listitem>
......@@ -108,7 +108,7 @@ su - postgres
specify the <option>--without-readline</option> option for
<filename>configure</>. (On <productname>NetBSD</productname>,
the <filename>libedit</filename> library is
<productname>readline</productname>-compatible and is used if
<productname>Readline</productname>-compatible and is used if
<filename>libreadline</filename> is not found.)
</para>
</listitem>
......@@ -259,7 +259,7 @@ JAVACMD=$JAVA_HOME/bin/java
<systemitem class="osname">Solaris</>), for other systems you
can download an add-on package from here: <ulink
url="http://www.postgresql.org/~petere/gettext.html" ></ulink>.
If you are using the <application>gettext</> implementation in
If you are using the <application>Gettext</> implementation in
the <acronym>GNU</acronym> C library then you will additionally
need the <productname>GNU Gettext</productname> package for some
utility programs. For any of the other implementations you will
......@@ -278,7 +278,7 @@ JAVACMD=$JAVA_HOME/bin/java
</para>
<para>
If you are build from a <acronym>CVS</acronym> tree instead of
If you are building from a <acronym>CVS</acronym> tree instead of
using a released source package, or if you want to do development,
you also need the following packages:
......@@ -427,7 +427,7 @@ JAVACMD=$JAVA_HOME/bin/java
</screen>
Versions prior to 7.0 do not have this
<filename>postmaster.pid</> file. If you are using such a version
you must find out the process id of the server yourself, for
you must find out the process ID of the server yourself, for
example by typing <userinput>ps ax | grep postmaster</>, and
supply it to the <command>kill</> command.
</para>
......@@ -732,7 +732,7 @@ JAVACMD=$JAVA_HOME/bin/java
<para>
To use this option, you will need an implementation of the
<application>gettext</> API; see above.
<application>Gettext</> API; see above.
</para>
</listitem>
</varlistentry>
......@@ -1082,7 +1082,7 @@ All of PostgreSQL is successfully made. Ready to install.
<screen>
<userinput>gmake -C src/interfaces/python install</userinput>
</screen>
If you do not have superuser access you are on your own:
If you do not have root access you are on your own:
you can still take the required files and place them in
other directories where Python can find them, but how to
do that is left as an exercise.
......@@ -1133,7 +1133,7 @@ All of PostgreSQL is successfully made. Ready to install.
<para>
After the installation you can make room by removing the built
files from the source tree with the command <command>gmake
clean</>. This will preserve the files made by the configure
clean</>. This will preserve the files made by the <command>configure</command>
program, so that you can rebuild everything with <command>gmake</>
later on. To reset the source tree to the state in which it was
distributed, use <command>gmake distclean</>. If you are going to
......@@ -1143,8 +1143,8 @@ All of PostgreSQL is successfully made. Ready to install.
</formalpara>
<para>
If you perform a build and then discover that your configure
options were wrong, or if you change anything that configure
If you perform a build and then discover that your <command>configure</>
options were wrong, or if you change anything that <command>configure</>
investigates (for example, software upgrades), then it's a good
idea to do <command>gmake distclean</> before reconfiguring and
rebuilding. Without this, your changes in configuration choices
......@@ -1207,7 +1207,7 @@ setenv LD_LIBRARY_PATH /usr/local/pgsql/lib
<para>
On <systemitem class="osname">Cygwin</systemitem>, put the library
directory in the <envar>PATH</envar> or move the
<filename>.dll</filename> files into the <filename>bin/</filename>
<filename>.dll</filename> files into the <filename>bin</filename>
directory.
</para>
......@@ -1283,7 +1283,7 @@ set path = ( /usr/local/pgsql/bin $path )
<seealso>man pages</seealso>
</indexterm>
To enable your system to find the <application>man</>
documentation, you need to add a line like the following to a
documentation, you need to add lines like the following to a
shell start-up file unless you installed into a location that is
searched by default.
<programlisting>
......@@ -1544,8 +1544,8 @@ gunzip -c user.ps.gz \
<entry>7.3</entry>
<entry>2002-10-28,
10.20 Tom Lane (<email>tgl@sss.pgh.pa.us</email>),
11.00, 11.11, 32 &amp; 64 bit, Giles Lean (<email>giles@nemeton.com.au</email>)</entry>
<entry>gcc and cc; see also <filename>doc/FAQ_HPUX</filename></entry>
11.00, 11.11, 32 and 64 bit, Giles Lean (<email>giles@nemeton.com.au</email>)</entry>
<entry><command>gcc</> and <command>cc</>; see also <filename>doc/FAQ_HPUX</filename></entry>
</row>
<row>
<entry><systemitem class="osname">IRIX</></entry>
......@@ -1585,7 +1585,7 @@ gunzip -c user.ps.gz \
<entry>7.3</entry>
<entry>2002-11-19,
Permaine Cheung (<email>pcheung@redhat.com</email>)</entry>
<entry>#undef HAS_TEST_AND_SET, remove slock_t typedef</entry>
<entry><literal>#undef HAS_TEST_AND_SET</>, remove <type>slock_t</> <literal>typedef</></entry>
</row>
<row>
<entry><systemitem class="osname">Linux</></entry>
......@@ -1715,7 +1715,7 @@ gunzip -c user.ps.gz \
<entry><systemitem>x86</></entry>
<entry>7.3.1</entry>
<entry>2002-12-11, Shibashish Satpathy (<email>shib@postmark.net</>)</entry>
<entry>5.0.4, gcc; see also <filename>doc/FAQ_SCO</filename></entry>
<entry>5.0.4, <command>gcc</>; see also <filename>doc/FAQ_SCO</filename></entry>
</row>
<row>
<entry><systemitem class="osname">Solaris</></entry>
......@@ -1723,7 +1723,7 @@ gunzip -c user.ps.gz \
<entry>7.3</entry>
<entry>2002-10-28,
Andrew Sullivan (<email>andrew@libertyrms.info</email>)</entry>
<entry>Solaris 7 &amp; 8; see also <filename>doc/FAQ_Solaris</filename></entry>
<entry>Solaris 7 and 8; see also <filename>doc/FAQ_Solaris</filename></entry>
</row>
<row>
<entry><systemitem class="osname">Solaris</></entry>
......@@ -1813,7 +1813,7 @@ gunzip -c user.ps.gz \
<entry>7.2</entry>
<entry>2001-11-29,
Cyril Velter (<email>cyril.velter@libertysurf.fr</email>)</entry>
<entry>needs updates to semaphore code</entry>
<entry>needs updates to semaphore code</entry>
</row>
<row>
<entry><systemitem class="osname">DG/UX 5.4R4.11</></entry>
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/lobj.sgml,v 1.27 2002/04/18 14:28:14 momjian Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/lobj.sgml,v 1.28 2003/03/13 01:30:28 petere Exp $
-->
<chapter id="largeObjects">
......@@ -8,9 +8,6 @@ $Header: /cvsroot/pgsql/doc/src/sgml/lobj.sgml,v 1.27 2002/04/18 14:28:14 momjia
<indexterm zone="largeobjects"><primary>large object</></>
<indexterm><primary>BLOB</><see>large object</></>
<sect1 id="lo-intro">
<title>Introduction</title>
<para>
In <productname>PostgreSQL</productname> releases prior to 7.1,
the size of any row in the database could not exceed the size of a
......@@ -19,10 +16,24 @@ $Header: /cvsroot/pgsql/doc/src/sgml/lobj.sgml,v 1.27 2002/04/18 14:28:14 momjia
size of a data value was relatively low. To support the storage of
larger atomic values, <productname>PostgreSQL</productname>
provided and continues to provide a large object interface. This
interface provides file-oriented access to user data that has been
declared to be a large object.
interface provides file-oriented access to user data that is stored in
a special large-object structure.
</para>
<para>
This chapter describes the implementation and the programming and
query language interfaces to <productname>PostgreSQL</productname>
large object data. We use the <application>libpq</application> C
library for the examples in this chapter, but most programming
interfaces native to <productname>PostgreSQL</productname> support
equivalent functionality. Other interfaces may use the large
object interface internally to provide generic support for large
values. This is not described here.
</para>
<sect1 id="lo-history">
<title>History</title>
<para>
<productname>POSTGRES 4.2</productname>, the indirect predecessor
of <productname>PostgreSQL</productname>, supported three standard
......@@ -50,21 +61,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/lobj.sgml,v 1.27 2002/04/18 14:28:14 momjia
(nicknamed <quote><acronym>TOAST</acronym></quote>) that allows
data rows to be much larger than individual data pages. This
makes the large object interface partially obsolete. One
remaining advantage of the large object interface is that it
allows random access to the data, i.e., the ability to read or
write small chunks of a large value. It is planned to equip
<acronym>TOAST</acronym> with such functionality in the future.
</para>
<para>
This section describes the implementation and the programming and
query language interfaces to <productname>PostgreSQL</productname>
large object data. We use the <application>libpq</application> C
library for the examples in this section, but most programming
interfaces native to <productname>PostgreSQL</productname> support
equivalent functionality. Other interfaces may use the large
object interface internally to provide generic support for large
values. This is not described here.
remaining advantage of the large object interface is that it allows values up
to 2 GB in size, whereas <acronym>TOAST</acronym> can only handle 1 GB.
</para>
</sect1>
......@@ -75,64 +73,45 @@ $Header: /cvsroot/pgsql/doc/src/sgml/lobj.sgml,v 1.27 2002/04/18 14:28:14 momjia
<para>
The large object implementation breaks large
objects up into <quote>chunks</quote> and stores the chunks in
tuples in the database. A B-tree index guarantees fast
rows in the database. A B-tree index guarantees fast
searches for the correct chunk number when doing random
access reads and writes.
</para>
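<para>
 The arithmetic behind locating a chunk for a random-access read or
 write can be sketched as follows. This is an illustration under stated
 assumptions, not the server's code: the type
 <type>chunk_pos</type>, the function
 <function>locate_chunk</function>, and the chunk size passed by the
 caller are all hypothetical.
</para>

```c
#include <assert.h>

/* Location of one byte inside a value that is stored in chunks. */
struct chunk_pos {
    long chunk_no;   /* zero-based chunk number */
    long offset;     /* byte offset inside that chunk */
};

/*
 * Hypothetical helper: map a byte position to the chunk that holds it.
 * chunk_size is supplied by the caller; no particular value is implied.
 */
struct chunk_pos locate_chunk(long byte_pos, long chunk_size)
{
    struct chunk_pos p;

    assert(byte_pos >= 0 && chunk_size > 0);

    p.chunk_no = byte_pos / chunk_size;
    p.offset = byte_pos % chunk_size;
    return p;
}
```

<para>
 With an assumed chunk size of 2048 bytes, byte position 5000 falls in
 chunk number 2 at offset 904. The chunk number is the key that the
 B-tree index search would then look up.
</para>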
</sect1>
<sect1 id="lo-interfaces">
<title>Interfaces</title>
<title>Client Interfaces</title>
<para>
The facilities <productname>PostgreSQL</productname> provides to
access large objects, both in the backend as part of user-defined
functions or the front end as part of an application
using the interface, are described below. For users
familiar with <productname>POSTGRES 4.2</productname>,
<productname>PostgreSQL</productname> has a new set of
functions providing a more coherent interface.
<note>
<para>
All large object manipulation <emphasis>must</emphasis> take
place within an SQL transaction. This requirement is strictly
enforced as of <productname>PostgreSQL 6.5</>, though it has been an
implicit requirement in previous versions, resulting in
misbehavior if ignored.
</para>
</note>
This section describes the facilities that
<productname>PostgreSQL</productname> client interface libraries
provide for accessing large objects. All large object
manipulation using these functions <emphasis>must</emphasis> take
place within an SQL transaction block. (This requirement is
strictly enforced as of <productname>PostgreSQL 6.5</>, though it
has been an implicit requirement in previous versions, resulting
in misbehavior if ignored.)
The <productname>PostgreSQL</productname> large object interface is modeled after
the <acronym>Unix</acronym> file-system interface, with analogues of
<function>open</function>, <function>read</function>,
<function>write</function>,
<function>lseek</function>, etc.
</para>
<para>
The <productname>PostgreSQL</productname> large object interface is modeled after
the <acronym>Unix</acronym> file-system interface, with analogues of
<function>open(2)</function>, <function>read(2)</function>,
<function>write(2)</function>,
<function>lseek(2)</function>, etc. User
functions call these routines to retrieve only the data of
interest from a large object. For example, if a large
object type called <type>mugshot</type> existed that stored
photographs of faces, then a function called <function>beard</function> could
be declared on <type>mugshot</type> data. <function>beard</> could look at the
lower third of a photograph, and determine the color of
the beard that appeared there, if any. The entire
large-object value need not be buffered, or even
examined, by the <function>beard</function> function.
Large objects may be accessed from dynamically-loaded <acronym>C</acronym>
functions or database client programs that link the
library. <productname>PostgreSQL</productname> provides a set of routines that
support opening, reading, writing, closing, and seeking on
large objects.
Client applications which use the large object interface in
<application>libpq</application> should include the header file
<filename>libpq/libpq-fs.h</filename> and link with the
<application>libpq</application> library.
</para>
<sect2>
<title>Creating a Large Object</title>
<para>
The routine
The function
<synopsis>
Oid lo_creat(PGconn *<replaceable class="parameter">conn</replaceable>, int <replaceable class="parameter">mode</replaceable>)
Oid lo_creat(PGconn *conn, int mode);
</synopsis>
creates a new large object.
<replaceable class="parameter">mode</replaceable> is a bit mask
......@@ -145,7 +124,11 @@ Oid lo_creat(PGconn *<replaceable class="parameter">conn</replaceable>, int <rep
historically been used at Berkeley to designate the storage manager number on which the large object
should reside. These
bits should always be zero now.
The commands below create a large object:
The return value is the OID that was assigned to the new large object.
</para>
<para>
An example:
<programlisting>
inv_oid = lo_creat(conn, INV_READ|INV_WRITE);
</programlisting>
......@@ -158,11 +141,12 @@ inv_oid = lo_creat(INV_READ|INV_WRITE);
<para>
To import an operating system file as a large object, call
<synopsis>
Oid lo_import(PGconn *<replaceable class="parameter">conn</replaceable>, const char *<replaceable class="parameter">filename</replaceable>)
Oid lo_import(PGconn *conn, const char *filename);
</synopsis>
<replaceable class="parameter">filename</replaceable>
specifies the operating system name of
the file to be imported as a large object.
The return value is the OID that was assigned to the new large object.
</para>
</sect2>
......@@ -173,7 +157,7 @@ Oid lo_import(PGconn *<replaceable class="parameter">conn</replaceable>, const c
To export a large object
into an operating system file, call
<synopsis>
int lo_export(PGconn *<replaceable class="parameter">conn</replaceable>, Oid <replaceable class="parameter">lobjId</replaceable>, const char *<replaceable class="parameter">filename</replaceable>)
int lo_export(PGconn *conn, Oid lobjId, const char *filename);
</synopsis>
The <parameter>lobjId</parameter> argument specifies the OID of the large
object to export and the <parameter>filename</parameter> argument specifies
......@@ -187,7 +171,7 @@ int lo_export(PGconn *<replaceable class="parameter">conn</replaceable>, Oid <re
<para>
To open an existing large object, call
<synopsis>
int lo_open(PGconn *conn, Oid lobjId, int mode)
int lo_open(PGconn *conn, Oid lobjId, int mode);
</synopsis>
The <parameter>lobjId</parameter> argument specifies the OID of the large
object to open. The <parameter>mode</parameter> bits control whether the
......@@ -205,10 +189,10 @@ int lo_open(PGconn *conn, Oid lobjId, int mode)
<title>Writing Data to a Large Object</title>
<para>
The routine
<programlisting>
int lo_write(PGconn *conn, int fd, const char *buf, size_t len)
</programlisting>
The function
<synopsis>
int lo_write(PGconn *conn, int fd, const char *buf, size_t len);
</synopsis>
writes <parameter>len</parameter> bytes from <parameter>buf</parameter> to large object <parameter>fd</>. The <parameter>fd</parameter>
argument must have been returned by a previous <function>lo_open</function>.
The number of bytes actually written is returned. In
......@@ -220,10 +204,10 @@ int lo_write(PGconn *conn, int fd, const char *buf, size_t len)
<title>Reading Data from a Large Object</title>
<para>
The routine
<programlisting>
int lo_read(PGconn *conn, int fd, char *buf, size_t len)
</programlisting>
The function
<synopsis>
int lo_read(PGconn *conn, int fd, char *buf, size_t len);
</synopsis>
reads <parameter>len</parameter> bytes from large object <parameter>fd</parameter> into <parameter>buf</parameter>. The <parameter>fd</parameter>
argument must have been returned by a previous <function>lo_open</function>.
The number of bytes actually read is returned. In
......@@ -237,13 +221,26 @@ int lo_read(PGconn *conn, int fd, char *buf, size_t len)
<para>
To change the current read or write location on a large
object, call
<programlisting>
int lo_lseek(PGconn *conn, int fd, int offset, int whence)
</programlisting>
This routine moves the current location pointer for the
<synopsis>
int lo_lseek(PGconn *conn, int fd, int offset, int whence);
</synopsis>
This function moves the current location pointer for the
large object described by <parameter>fd</> to the new location specified
by <parameter>offset</>. The valid values for <parameter>whence</> are
<symbol>SEEK_SET</>, <symbol>SEEK_CUR</>, and <symbol>SEEK_END</>.
<symbol>SEEK_SET</> (seek from object start), <symbol>SEEK_CUR</> (seek from current position), and <symbol>SEEK_END</> (seek from object end). The return value is the new location pointer.
</para>
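<para>
 The effect of the three <parameter>whence</parameter> values can be
 modeled with plain position arithmetic. The sketch below is not the
 <application>libpq</application> implementation and does not call it:
 <function>model_seek</function> is a hypothetical function, and its
 treatment of a negative result as an error is an assumption made for
 the illustration.
</para>

```c
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/*
 * Hypothetical model of the location pointer arithmetic.
 * current = present location, size = length of the object in bytes.
 * Returns the new location, or -1 for an unknown whence value or a
 * location before the start of the object.
 */
long model_seek(long current, long size, long offset, int whence)
{
    long base;

    switch (whence) {
    case SEEK_SET:              /* seek from object start */
        base = 0;
        break;
    case SEEK_CUR:              /* seek from current position */
        base = current;
        break;
    case SEEK_END:              /* seek from object end */
        base = size;
        break;
    default:
        return -1;
    }

    if (base + offset < 0)
        return -1;
    return base + offset;
}
```

<para>
 For a 100-byte object with the pointer at 10, an offset of 5 gives 5
 with <symbol>SEEK_SET</> and 15 with <symbol>SEEK_CUR</>, while an
 offset of -20 with <symbol>SEEK_END</> gives 80.
</para>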
</sect2>
<sect2>
<title>Obtaining the Seek Position of a Large Object</title>
<para>
To obtain the current read or write location of a large object,
call
<synopsis>
int lo_tell(PGconn *conn, int fd);
</synopsis>
If there is an error, the return value is negative.
</para>
</sect2>
......@@ -252,9 +249,9 @@ int lo_lseek(PGconn *conn, int fd, int offset, int whence)
<para>
A large object may be closed by calling
<programlisting>
int lo_close(PGconn *conn, int fd)
</programlisting>
<synopsis>
int lo_close(PGconn *conn, int fd);
</synopsis>
where <parameter>fd</> is a large object descriptor returned by
<function>lo_open</function>. On success, <function>lo_close</function>
returns zero. On error, the return value is negative.
......@@ -267,7 +264,7 @@ int lo_close(PGconn *conn, int fd)
<para>
To remove a large object from the database, call
<synopsis>
int lo_unlink(PGconn *<replaceable class="parameter">conn</replaceable>, Oid lobjId)
int lo_unlink(PGconn *conn, Oid lobjId);
</synopsis>
The <parameter>lobjId</parameter> argument specifies the OID of the large
object to remove. In the event of an error, the return value is negative.
......@@ -278,14 +275,14 @@ int lo_unlink(PGconn *<replaceable class="parameter">conn</replaceable>, Oid lob
</sect1>
<sect1 id="lo-funcs">
<title>Server-side Built-in Functions</title>
<title>Server-side Functions</title>
<para>
There are two built-in registered functions, <function>lo_import</function>
and <function>lo_export</function> which are convenient for use
There are two built-in server-side functions, <function>lo_import</function>
and <function>lo_export</function>, for large object access, which are available for use
in <acronym>SQL</acronym>
queries.
Here is an example of their use
commands.
Here is an example of their use:
<programlisting>
CREATE TABLE image (
name text,
......@@ -301,23 +298,20 @@ SELECT lo_export(image.raster, '/tmp/motd') FROM image
</para>
</sect1>
<sect1 id="lo-libpq">
<title>Accessing Large Objects from <application>Libpq</application></title>
<sect1 id="lo-examplesect">
<title>Example Program</title>
<para>
<xref linkend="lo-example"> is a sample program which shows how the large object
interface
in <application>libpq</> can be used. Parts of the program are
commented out but are left in the source for the reader's
benefit. This program can be found in
benefit. This program can also be found in
<filename>src/test/examples/testlo.c</filename> in the source distribution.
Frontend applications which use the large object interface
in <application>libpq</application> should include the header file
<filename>libpq/libpq-fs.h</filename> and link with the <application>libpq</application> library.
</para>
<example id="lo-example">
<title>Large Objects with <application>Libpq</application> Example Program</title>
<title>Large Objects with <application>libpq</application> Example Program</title>
<programlisting>
/*--------------------------------------------------------------
*
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/manage-ag.sgml,v 2.24 2002/11/15 03:11:17 momjian Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/manage-ag.sgml,v 2.25 2003/03/13 01:30:29 petere Exp $
-->
<chapter id="managing-databases">
......@@ -16,7 +16,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/manage-ag.sgml,v 2.24 2002/11/15 03:11:17 m
them.
</para>
<sect1>
<sect1 id="manage-ag-overview">
<title>Overview</title>
<para>
......@@ -24,8 +24,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/manage-ag.sgml,v 2.24 2002/11/15 03:11:17 m
(<quote>database objects</quote>). Generally, every database
object (tables, functions, etc.) belongs to one and only one
database. (But there are a few system catalogs, for example
<literal>pg_database</>, that belong to a whole installation and
are accessible from each database within the installation.) More
<literal>pg_database</>, that belong to a whole cluster and
are accessible from each database within the cluster.) More
accurately, a database is a collection of schemas and the schemas
contain the tables, functions, etc. So the full hierarchy is:
server, database, schema, table (or something else instead of a
......@@ -70,10 +70,10 @@ $Header: /cvsroot/pgsql/doc/src/sgml/manage-ag.sgml,v 2.24 2002/11/15 03:11:17 m
</para>
<para>
Databases are created with the query language command
Databases are created with the SQL command
<command>CREATE DATABASE</command>:
<synopsis>
CREATE DATABASE <replaceable>name</>
CREATE DATABASE <replaceable>name</>;
</synopsis>
where <replaceable>name</> follows the usual rules for
<acronym>SQL</acronym> identifiers. The current user automatically
......@@ -93,14 +93,14 @@ CREATE DATABASE <replaceable>name</>
question remains how the <emphasis>first</> database at any given
site can be created. The first database is always created by the
<command>initdb</> command when the data storage area is
initialized. (See <xref linkend="creating-cluster">.) By convention
this database is called <literal>template1</>. So to create the
initialized. (See <xref linkend="creating-cluster">.)
This database is called <literal>template1</>. So to create the
first <quote>real</> database you can connect to
<literal>template1</>.
</para>
<para>
The name <quote>template1</quote> is no accident: When a new
The name <literal>template1</literal> is no accident: When a new
database is created, the template database is essentially cloned.
This means that any changes you make in <literal>template1</> are
propagated to all subsequently created databases. This implies that
......@@ -118,9 +118,9 @@ CREATE DATABASE <replaceable>name</>
createdb <replaceable class="parameter">dbname</replaceable>
</synopsis>
<command>createdb</> does no magic. It connects to the template1
<command>createdb</> does no magic. It connects to the <literal>template1</>
database and issues the <command>CREATE DATABASE</> command,
exactly as described above. It uses the <application>psql</> program
exactly as described above. It uses the <command>psql</> program
internally. The reference page on <command>createdb</> contains the invocation
details. Note that <command>createdb</> without any arguments will create
a database with the current user name, which may or may not be what
......@@ -174,7 +174,7 @@ createdb -O <replaceable>username</> <replaceable>dbname</>
<literal>template1</>, that is, only the standard objects predefined by
your version of <productname>PostgreSQL</productname>.
<literal>template0</> should never be changed
after <literal>initdb</>. By instructing <command>CREATE DATABASE</> to
after <command>initdb</>. By instructing <command>CREATE DATABASE</> to
copy <literal>template0</> instead of <literal>template1</>, you can
create a <quote>virgin</> user database that contains none of the
site-local additions in <literal>template1</>. This is particularly
......@@ -198,7 +198,7 @@ createdb -T template0 <replaceable>dbname</>
<para>
It is possible to create additional template databases, and indeed
one might copy any database in an installation by specifying its name
one might copy any database in a cluster by specifying its name
as the template for <command>CREATE DATABASE</>. It is important to
understand, however, that this is not (yet) intended as
a general-purpose <quote><command>COPY DATABASE</command></quote> facility. In particular, it is
......@@ -206,7 +206,7 @@ createdb -T template0 <replaceable>dbname</>
in progress)
for the duration of the copying operation. <command>CREATE DATABASE</>
will check
that no backend processes (other than itself) are connected to
that no session (other than itself) is connected to
the source database at the start of the operation, but this does not
guarantee that changes cannot be made while the copy proceeds, which
would result in an inconsistent copied database. Therefore,
......@@ -225,11 +225,9 @@ createdb -T template0 <replaceable>dbname</>
If <literal>datallowconn</literal> is false, then no new connections
to that database will be allowed (but existing sessions are not killed
simply by setting the flag false). The <literal>template0</literal>
database is normally marked <literal>datallowconn</literal> =
<literal>false</> to prevent modification of it.
database is normally marked <literal>datallowconn = false</> to prevent modification of it.
Both <literal>template0</literal> and <literal>template1</literal>
should always be marked with <literal>datistemplate</literal> =
<literal>true</>.
should always be marked with <literal>datistemplate = true</>.
</para>
<para>
......@@ -237,11 +235,11 @@ createdb -T template0 <replaceable>dbname</>
it is a good idea to perform
<command>VACUUM FREEZE</> or <command>VACUUM FULL FREEZE</> in that
database. If this is done when there are no other open transactions
in the same database, then it is guaranteed that all tuples in the
in the same database, then it is guaranteed that all rows in the
database are <quote>frozen</> and will not be subject to transaction
ID wraparound problems. This is particularly important for a database
that will have <literal>datallowconn</literal> set to false, since it
will be impossible to do routine maintenance <command>VACUUM</>s on
will be impossible to do routine maintenance <command>VACUUM</> in
such a database.
See <xref linkend="vacuum-for-wraparound"> for more information.
</para>
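<para>
 As a minimal sketch of the preparation described above, one might
 freeze a template database and then set its flags. The database name
 <literal>mytemplate</> is hypothetical, and changing the flags by
 updating <structname>pg_database</> directly as a superuser is an
 assumption of this example:
<programlisting>
-- Run while connected to the (hypothetical) database mytemplate,
-- at a time when it has no other open transactions.
VACUUM FREEZE;

-- Then, as a superuser, mark it as a template and refuse new connections.
UPDATE pg_database
    SET datistemplate = true, datallowconn = false
    WHERE datname = 'mytemplate';
</programlisting>
 Once <literal>datallowconn</literal> is false, no further
 <command>VACUUM</> can be run in that database, so the freeze has to
 come first.
</para>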
......@@ -295,7 +293,7 @@ ALTER DATABASE mydb SET geqo TO off;
<para>
It is possible to create a database in a location other than the
default location for the installation. Remember that all database access
default location for the installation. But remember that all database access
occurs through the
database server, so any location specified must be
accessible by the server.
......@@ -317,7 +315,7 @@ ALTER DATABASE mydb SET geqo TO off;
<para>
To create the variable in the environment of the server process
you must first shut down the server, define the variable,
initialize the data area, and finally restart the server. (See
initialize the data area, and finally restart the server. (See also
<xref linkend="postmaster-shutdown"> and <xref
linkend="postmaster-start">.) To set an environment variable, type
<programlisting>
......@@ -328,7 +326,7 @@ export PGDATA2
<programlisting>
setenv PGDATA2 /home/postgres/data
</programlisting>
in <application>csh</> or <application>tcsh</>. You have to make sure that this environment
in <command>csh</> or <command>tcsh</>. You have to make sure that this environment
variable is always defined in the server environment, otherwise
you won't be able to access that database. Therefore you probably
want to set it in some sort of shell start-up file or server
......@@ -352,7 +350,7 @@ initlocation PGDATA2
<para>
To create a database within the new location, use the command
<synopsis>
CREATE DATABASE <replaceable>name</> WITH LOCATION = '<replaceable>location</>'
CREATE DATABASE <replaceable>name</> WITH LOCATION '<replaceable>location</>';
</synopsis>
where <replaceable>location</> is the environment variable you
used, <envar>PGDATA2</> in this example. The <command>createdb</>
......@@ -386,9 +384,9 @@ gmake CPPFLAGS=-DALLOW_ABSOLUTE_DBPATHS all
<para>
Databases are destroyed with the command <command>DROP DATABASE</command>:
<synopsis>
DROP DATABASE <replaceable>name</>
DROP DATABASE <replaceable>name</>;
</synopsis>
Only the owner of the database (i.e., the user that created it), or
Only the owner of the database (i.e., the user that created it) or
a superuser, can drop a database. Dropping a database removes all objects
that were
contained within the database. The destruction of a database cannot
......@@ -399,8 +397,8 @@ DROP DATABASE <replaceable>name</>
You cannot execute the <command>DROP DATABASE</command> command
while connected to the victim database. You can, however, be
connected to any other database, including the <literal>template1</>
database,
which would be the only option for dropping the last user database of a
database.
<literal>template1</> would be the only option for dropping the last user database of a
given cluster.
</para>
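<para>
 For illustration, a <command>psql</> session that drops a database
 might look like the following sketch. The database name
 <literal>mydb</> is only an example:
<programlisting>
-- Leave the database to be dropped by connecting to template1.
\c template1

-- This cannot be undone; all objects in mydb are removed.
DROP DATABASE mydb;
</programlisting>
</para>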
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.33 2003/02/19 04:06:28 momjian Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.34 2003/03/13 01:30:29 petere Exp $
-->
<chapter id="mvcc">
......@@ -116,7 +116,6 @@ $Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.33 2003/02/19 04:06:28 momjia
<table tocentry="1" id="mvcc-isolevel-table">
<title><acronym>SQL</acronym> Transaction Isolation Levels</title>
<titleabbrev>Isolation Levels</titleabbrev>
<tgroup cols="4">
<thead>
<row>
......@@ -222,7 +221,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.33 2003/02/19 04:06:28 momjia
executed within its own transaction, even though they are not yet
committed.) In effect, a <command>SELECT</command> query
sees a snapshot of the database as of the instant that that query
begins to run. Notice that two successive <command>SELECT</command>s can
begins to run. Notice that two successive <command>SELECT</command> commands can
see different data, even though they are within a single transaction, if
other transactions
commit changes during execution of the first <command>SELECT</command>.
......@@ -232,7 +231,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.33 2003/02/19 04:06:28 momjia
<command>UPDATE</command>, <command>DELETE</command>, and <command>SELECT
FOR UPDATE</command> commands behave the same as <command>SELECT</command>
in terms of searching for target rows: they will only find target rows
that were committed as of the query start time. However, such a target
that were committed as of the command start time. However, such a target
row may have already been updated (or deleted or marked for update) by
another concurrent transaction by the time it is found. In this case, the
would-be updater will wait for the first updating transaction to commit or
......@@ -241,18 +240,18 @@ $Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.33 2003/02/19 04:06:28 momjia
updating the originally found row. If the first updater commits, the
second updater will ignore the row if the first updater deleted it,
otherwise it will attempt to apply its operation to the updated version of
the row. The query search condition (<literal>WHERE</> clause) is
the row. The search condition of the command (the <literal>WHERE</> clause) is
re-evaluated to see if the updated version of the row still matches the
search condition. If so, the second updater proceeds with its operation,
starting from the updated version of the row.
</para>
<para>
Because of the above rule, it is possible for updating queries to see
inconsistent snapshots --- they can see the effects of concurrent updating
queries that affected the same rows they are trying to update, but they
do not see effects of those queries on other rows in the database.
This behavior makes Read Committed mode unsuitable for queries that
Because of the above rule, it is possible for an updating command to see an
inconsistent snapshot: it can see the effects of concurrent updating
commands that affected the same rows it is trying to update, but it
does not see effects of those commands on other rows in the database.
This behavior makes Read Committed mode unsuitable for commands that
involve complex search conditions. However, it is just right for simpler
cases. For example, consider updating bank balances with transactions
like
......@@ -266,17 +265,17 @@ COMMIT;
If two such transactions concurrently try to change the balance of account
12345, we clearly want the second transaction to start from the updated
version of the account's row. Because each query is affecting only a
version of the account's row. Because each command is affecting only a
predetermined row, letting it see the updated version of the row does
not create any troublesome inconsistency.
</para>
<para>
Since in Read Committed mode each new query starts with a new snapshot
Since in Read Committed mode each new command starts with a new snapshot
that includes all transactions committed up to that instant, subsequent
queries in the same transaction will see the effects of the committed
commands in the same transaction will see the effects of the committed
concurrent transaction in any case. The point at issue here is whether
or not within a <emphasis>single</> query we see an absolutely consistent
or not within a <emphasis>single</> command we see an absolutely consistent
view of the database.
</para>
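<para>
 The per-command snapshot can be sketched as follows, assuming an
 <literal>accounts</> table with the columns <literal>acctnum</> and
 <literal>balance</>, and a second session that commits an update
 between the two commands:
<programlisting>
BEGIN;
SELECT balance FROM accounts WHERE acctnum = 12345;

-- Meanwhile, another session updates this row and commits.

-- In Read Committed mode this command starts with a new snapshot,
-- so it can return a different balance than the first SELECT.
SELECT balance FROM accounts WHERE acctnum = 12345;
COMMIT;
</programlisting>
</para>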
......@@ -294,11 +293,11 @@ COMMIT;
<indexterm>
<primary>isolation levels</primary>
<secondary>read serializable</secondary>
<secondary>serializable</secondary>
</indexterm>
<para>
<firstterm>Serializable</firstterm> provides the strictest transaction
The level <firstterm>Serializable</firstterm> provides the strictest transaction
isolation. This level emulates serial transaction execution,
as if transactions had been executed one after another, serially,
rather than concurrently. However, applications using this level must
......@@ -317,7 +316,7 @@ COMMIT;
<command>SELECT</command>
sees a snapshot as of the start of the transaction, not as of the start
of the current query within the transaction. Thus, successive
<command>SELECT</command>s within a single transaction always see the same
<command>SELECT</command> commands within a single transaction always see the same
data.
</para>
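<para>
 A minimal sketch of this behavior, assuming an <literal>accounts</>
 table with a <literal>balance</> column:
<programlisting>
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT sum(balance) FROM accounts;

-- Other sessions may commit changes to accounts at this point.

-- Same snapshot as the first SELECT, so the same result is returned.
SELECT sum(balance) FROM accounts;
COMMIT;
</programlisting>
</para>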
......@@ -354,7 +353,7 @@ ERROR: Can't serialize access due to concurrent update
</para>
<para>
Note that only updating transactions may need to be retried --- read-only
Note that only updating transactions may need to be retried; read-only
transactions will never have serialization conflicts.
</para>
......@@ -367,7 +366,7 @@ ERROR: Can't serialize access due to concurrent update
this mode is recommended only when updating transactions contain logic
sufficiently complex that they may give wrong answers in Read
Committed mode. Most commonly, Serializable mode is necessary when
a transaction performs several successive queries that must see
a transaction executes several successive commands that must see
identical views of the database.
</para>
</sect2>
......@@ -401,29 +400,29 @@ ERROR: Can't serialize access due to concurrent update
<productname>PostgreSQL</productname>.
Remember that all of these lock modes are table-level locks,
even if the name contains the word
<quote>row</quote>. The names of the lock modes are historical.
<quote>row</quote>; the names of the lock modes are historical.
To some extent the names reflect the typical usage of each lock
mode --- but the semantics are all the same. The only real difference
between one lock mode and another is the set of lock modes with
which each conflicts. Two transactions cannot hold locks of conflicting
modes on the same table at the same time. (However, a transaction
never conflicts with itself --- for example, it may acquire
never conflicts with itself. For example, it may acquire
<literal>ACCESS EXCLUSIVE</literal> lock and later acquire
<literal>ACCESS SHARE</literal> lock on the same table.) Non-conflicting
lock modes may be held concurrently by many transactions. Notice in
particular that some lock modes are self-conflicting (for example,
<literal>ACCESS EXCLUSIVE</literal> cannot be held by more than one
an <literal>ACCESS EXCLUSIVE</literal> lock cannot be held by more than one
transaction at a time) while others are not self-conflicting (for example,
<literal>ACCESS SHARE</literal> can be held by multiple transactions).
Once acquired, a lock mode is held till end of transaction.
an <literal>ACCESS SHARE</literal> lock can be held by multiple transactions).
Once acquired, a lock is held until the end of the transaction.
</para>
<para>
To examine a list of the currently outstanding locks in a
database server, use the <literal>pg_locks</literal> system
view. For more information on monitoring the status of the lock
manager subsystem, refer to the &cite-admin;.
</para>
<para>
To examine a list of the currently outstanding locks in a database
server, use the <literal>pg_locks</literal> system view. For more
information on monitoring the status of the lock manager
subsystem, refer to the &cite-admin;.
</para>
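<para>
 For example, the following query lists the locks known to the lock
 manager at that moment. The exact set of output columns depends on
 the server version, so none are named here:
<programlisting>
SELECT * FROM pg_locks;
</programlisting>
</para>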
<variablelist>
<title>Table-level lock modes</title>
......@@ -482,7 +481,7 @@ ERROR: Can't serialize access due to concurrent update
acquire this lock mode on the target table (in addition to
<literal>ACCESS SHARE</literal> locks on any other referenced
tables). In general, this lock mode will be acquired by any
query that modifies the data in a table.
command that modifies the data in a table.
</para>
</listitem>
</varlistentry>
......@@ -557,7 +556,7 @@ ERROR: Can't serialize access due to concurrent update
EXCLUSIVE</literal>, <literal>SHARE</literal>, <literal>SHARE
ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
<literal>ACCESS EXCLUSIVE</literal> lock modes.
This mode allows only concurrent <literal>ACCESS SHARE</literal>,
This mode allows only concurrent <literal>ACCESS SHARE</literal> locks,
i.e., only reads from the table can proceed in parallel with a
transaction holding this lock mode.
</para>
......@@ -596,13 +595,13 @@ ERROR: Can't serialize access due to concurrent update
</varlistentry>
</variablelist>
<note>
<tip>
<para>
Only an <literal>ACCESS EXCLUSIVE</literal> lock blocks a
<command>SELECT</command> (without <option>FOR UPDATE</option>)
statement.
</para>
</note>
</tip>
</sect2>
......@@ -635,7 +634,7 @@ ERROR: Can't serialize access due to concurrent update
<para>
In addition to table and row locks, page-level share/exclusive locks are
used to control read/write access to table pages in the shared buffer
pool. These locks are released immediately after a tuple is fetched or
pool. These locks are released immediately after a row is fetched or
updated. Application developers normally need not be concerned with
page-level locks, but we mention them for completeness.
</para>
......@@ -777,7 +776,7 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
example, a banking application might wish to check that the sum of
all credits in one table equals the sum of debits in another table,
when both tables are being actively updated. Comparing the results of two
successive <literal>SELECT SUM(...)</literal> commands will not work reliably under
successive <literal>SELECT sum(...)</literal> commands will not work reliably under
Read Committed mode, since the second query will likely include the results
of transactions not counted by the first. Doing the two sums in a
single serializable transaction will give an accurate picture of the
......@@ -800,10 +799,11 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
Read Committed mode, or in Serializable mode be careful to obtain the
lock(s) before performing queries. An explicit lock obtained in a
serializable transaction guarantees that no other transactions modifying
the table are still running --- but if the snapshot seen by the
the table are still running, but if the snapshot seen by the
transaction predates obtaining the lock, it may predate some now-committed
changes in the table. A serializable transaction's snapshot is actually
frozen at the start of its first query (<literal>SELECT</>, <literal>INSERT</>,
frozen at the start of its first query or data-modification command
(<literal>SELECT</>, <literal>INSERT</>,
<literal>UPDATE</>, or <literal>DELETE</>), so
it's possible to obtain explicit locks before the snapshot is
frozen.
......@@ -819,9 +819,6 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
data, nonblocking read/write access is not currently offered for every
index access method implemented
in <productname>PostgreSQL</productname>.
</para>
<para>
The various index types are handled as follows:
<variablelist>
......@@ -833,7 +830,7 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
<para>
Short-term share/exclusive page-level locks are used for
read/write access. Locks are released immediately after each
index tuple is fetched or inserted. B-tree indexes provide
index row is fetched or inserted. B-tree indexes provide
the highest concurrency without deadlock conditions.
</para>
</listitem>
......@@ -846,7 +843,7 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
<listitem>
<para>
Share/exclusive index-level locks are used for read/write access.
Locks are released after the statement (command) is done.
Locks are released after the command is done.
</para>
</listitem>
</varlistentry>
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.26 2003/01/28 03:34:29 momjian Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.27 2003/03/13 01:30:29 petere Exp $
-->
<chapter id="performance-tips">
......@@ -39,8 +39,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.26 2003/01/28 03:34:29 mom
<listitem>
<para>
Estimated total cost (If all rows are retrieved, which they may not
be --- a query with a <literal>LIMIT</> clause will stop short of paying the total cost,
Estimated total cost (If all rows were to be retrieved, which they may not
be: a query with a <literal>LIMIT</> clause will stop short of paying the total cost,
for example.)
</para>
</listitem>
......@@ -48,7 +48,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.26 2003/01/28 03:34:29 mom
<listitem>
<para>
Estimated number of rows output by this plan node (Again, only if
executed to completion.)
executed to completion)
</para>
</listitem>
......@@ -74,8 +74,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.26 2003/01/28 03:34:29 mom
the cost of all its child nodes. It's also important to realize that
the cost only reflects things that the planner/optimizer cares about.
In particular, the cost does not consider the time spent transmitting
result rows to the frontend --- which could be a pretty dominant
factor in the true elapsed time, but the planner ignores it because
result rows to the frontend, which could be a pretty dominant
factor in the true elapsed time; but the planner ignores it because
it cannot change it by altering the plan. (Every correct plan will
output the same row set, we trust.)
</para>
......@@ -83,19 +83,20 @@ $Header: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v 1.26 2003/01/28 03:34:29 mom
<para>
Rows output is a little tricky because it is <emphasis>not</emphasis> the
number of rows
processed/scanned by the query --- it is usually less, reflecting the
estimated selectivity of any <literal>WHERE</>-clause constraints that are being
processed/scanned by the query; it is usually less, reflecting the
estimated selectivity of any <literal>WHERE</>-clause conditions that are being
applied at this node. Ideally the top-level rows estimate will
approximate the number of rows actually returned, updated, or deleted
by the query.
</para>
<para>
Here are some examples (using the regress test database after a
Here are some examples (using the regression test database after a
<literal>VACUUM ANALYZE</>, and 7.3 development sources):
<programlisting>
regression=# EXPLAIN SELECT * FROM tenk1;
EXPLAIN SELECT * FROM tenk1;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..333.00 rows=10000 width=148)
......@@ -119,7 +120,8 @@ SELECT * FROM pg_class WHERE relname = 'tenk1';
Now let's modify the query to add a <literal>WHERE</> condition:
<programlisting>
regression=# EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000;
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000;
QUERY PLAN
------------------------------------------------------------
Seq Scan on tenk1 (cost=0.00..358.00 rows=1033 width=148)
......@@ -145,7 +147,8 @@ regression=# EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000;
Modify the query to restrict the condition even more:
<programlisting>
regression=# EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 50;
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 50;
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using tenk1_unique1 on tenk1 (cost=0.00..179.33 rows=49 width=148)
......@@ -161,11 +164,11 @@ regression=# EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 50;
</para>
<para>
Add another clause to the <literal>WHERE</> condition:
Add another condition to the <literal>WHERE</> clause:
<programlisting>
regression=# EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 50 AND
regression-# stringu1 = 'xxx';
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 50 AND stringu1 = 'xxx';
QUERY PLAN
-------------------------------------------------------------------------------
Index Scan using tenk1_unique1 on tenk1 (cost=0.00..179.45 rows=1 width=148)
......@@ -173,7 +176,7 @@ regression-# stringu1 = 'xxx';
Filter: (stringu1 = 'xxx'::name)
</programlisting>
The added clause <literal>stringu1 = 'xxx'</literal> reduces the
The added condition <literal>stringu1 = 'xxx'</literal> reduces the
output-rows estimate, but not the cost because we still have to visit the
same set of rows. Notice that the <literal>stringu1</> clause
cannot be applied as an index condition (since this index is only on
......@@ -183,11 +186,11 @@ regression-# stringu1 = 'xxx';
</para>
<para>
Let's try joining two tables, using the fields we have been discussing:
Let's try joining two tables, using the columns we have been discussing:
<programlisting>
regression=# EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 50
regression-# AND t1.unique2 = t2.unique2;
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 50 AND t1.unique2 = t2.unique2;
QUERY PLAN
----------------------------------------------------------------------------
Nested Loop (cost=0.00..327.02 rows=49 width=296)
......@@ -203,7 +206,7 @@ regression-# AND t1.unique2 = t2.unique2;
<para>
In this nested-loop join, the outer scan is the same index scan we had
in the example before last, and so its cost and row count are the same
because we are applying the <literal>unique1 &lt; 50</literal> <literal>WHERE</> clause at that node.
because we are applying the <literal>WHERE</> clause <literal>unique1 &lt; 50</literal> at that node.
The <literal>t1.unique2 = t2.unique2</literal> clause is not relevant yet, so it doesn't
affect row count of the outer scan. For the inner scan, the <literal>unique2</> value of the
current
......@@ -218,9 +221,9 @@ regression-# AND t1.unique2 = t2.unique2;
</para>
<para>
In this example the loop's output row count is the same as the product
In this example the join's output row count is the same as the product
of the two scans' row counts, but that's not true in general, because
in general you can have <literal>WHERE</> clauses that mention both relations and
in general you can have <literal>WHERE</> clauses that mention both tables and
so can only be applied at the join point, not to either input scan.
For example, if we added <literal>WHERE ... AND t1.hundred &lt; t2.hundred</literal>,
that would decrease the output row count of the join node, but not change
......@@ -234,10 +237,9 @@ regression-# AND t1.unique2 = t2.unique2;
also <xref linkend="explicit-joins">.)
<programlisting>
regression=# SET enable_nestloop = off;
SET
regression=# EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 50
regression-# AND t1.unique2 = t2.unique2;
SET enable_nestloop = off;
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 50 AND t1.unique2 = t2.unique2;
QUERY PLAN
--------------------------------------------------------------------------
Hash Join (cost=179.45..563.06 rows=49 width=296)
......@@ -269,9 +271,8 @@ regression-# AND t1.unique2 = t2.unique2;
For example, we might get a result like this:
<screen>
regression=# EXPLAIN ANALYZE
regression-# SELECT * FROM tenk1 t1, tenk2 t2
regression-# WHERE t1.unique1 &lt; 50 AND t1.unique2 = t2.unique2;
EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 50 AND t1.unique2 = t2.unique2;
QUERY PLAN
-------------------------------------------------------------------------------
Nested Loop (cost=0.00..327.02 rows=49 width=296)
......@@ -345,14 +346,14 @@ regression-# WHERE t1.unique1 &lt; 50 AND t1.unique2 = t2.unique2;
<para>
One component of the statistics is the total number of entries in each
table and index, as well as the number of disk blocks occupied by each
table and index. This information is kept in
<structname>pg_class</structname>'s <structfield>reltuples</structfield>
and <structfield>relpages</structfield> columns. We can look at it
table and index. This information is kept in the table
<structname>pg_class</structname> in the columns <structfield>reltuples</structfield>
and <structfield>relpages</structfield>. We can look at it
with queries similar to this one:
<screen>
regression=# SELECT relname, relkind, reltuples, relpages FROM pg_class
regression-# WHERE relname LIKE 'tenk1%';
SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE relname LIKE 'tenk1%';
relname | relkind | reltuples | relpages
---------------+---------+-----------+----------
tenk1 | r | 10000 | 233
......@@ -385,10 +386,10 @@ regression-# WHERE relname LIKE 'tenk1%';
to having <literal>WHERE</> clauses that restrict the rows to be examined.
The planner thus needs to make an estimate of the
<firstterm>selectivity</> of <literal>WHERE</> clauses, that is, the fraction of
rows that match each clause of the <literal>WHERE</> condition. The information
rows that match each condition in the <literal>WHERE</> clause. The information
used for this task is stored in the <structname>pg_statistic</structname>
system catalog. Entries in <structname>pg_statistic</structname> are
updated by <command>ANALYZE</> and <command>VACUUM ANALYZE</> commands,
updated by <command>ANALYZE</> and <command>VACUUM ANALYZE</> commands
and are always approximate even when freshly updated.
</para>
......@@ -398,7 +399,7 @@ regression-# WHERE relname LIKE 'tenk1%';
when examining the statistics manually. <structname>pg_stats</structname>
is designed to be more easily readable. Furthermore,
<structname>pg_stats</structname> is readable by all, whereas
<structname>pg_statistic</structname> is only readable by the superuser.
<structname>pg_statistic</structname> is only readable by a superuser.
(This prevents unprivileged users from learning something about
the contents of other people's tables from the statistics. The
<structname>pg_stats</structname> view is restricted to show only
......@@ -406,13 +407,13 @@ regression-# WHERE relname LIKE 'tenk1%';
For example, we might do:
<screen>
regression=# SELECT attname, n_distinct, most_common_vals FROM pg_stats WHERE tablename = 'road';
SELECT attname, n_distinct, most_common_vals FROM pg_stats WHERE tablename = 'road';
attname | n_distinct | most_common_vals
---------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
name | -0.467008 | {"I- 580 Ramp","I- 880 Ramp","Sp Railroad ","I- 580 ","I- 680 Ramp","I- 80 Ramp","14th St ","5th St ","Mission Blvd","I- 880 "}
thepath | 20 | {"[(-122.089,37.71),(-122.0886,37.711)]"}
(2 rows)
regression=#
</screen>
</para>
......@@ -428,7 +429,7 @@ regression=#
<thead>
<row>
<entry>Name</entry>
<entry>Type</entry>
<entry>Data Type</entry>
<entry>Description</entry>
</row>
</thead>
......@@ -437,25 +438,25 @@ regression=#
<row>
<entry><literal>tablename</literal></entry>
<entry><type>name</type></entry>
<entry>Name of the table containing the column</entry>
<entry>Name of the table containing the column.</entry>
</row>
<row>
<entry><literal>attname</literal></entry>
<entry><type>name</type></entry>
<entry>Column described by this row</entry>
<entry>Name of the column described by this row.</entry>
</row>
<row>
<entry><literal>null_frac</literal></entry>
<entry><type>real</type></entry>
<entry>Fraction of column's entries that are null</entry>
<entry>Fraction of column entries that are null.</entry>
</row>
<row>
<entry><literal>avg_width</literal></entry>
<entry><type>integer</type></entry>
<entry>Average width in bytes of the column's entries</entry>
<entry>Average width in bytes of the column entries.</entry>
</row>
<row>
......@@ -488,25 +489,25 @@ regression=#
</row>
<row>
<entry>histogram_bounds</entry>
<entry><literal>histogram_bounds</literal></entry>
<entry><type>text[]</type></entry>
<entry>A list of values that divide the column's values into
groups of approximately equal population. The
<structfield>most_common_vals</>, if present, are omitted from the
histogram calculation. (Omitted if column data type does not have a
<literal>&lt;</> operator, or if the <structfield>most_common_vals</>
groups of approximately equal population. The values in
<structfield>most_common_vals</>, if present, are omitted from this
histogram calculation. (This column is not filled if the column data type does not have a
<literal>&lt;</> operator or if the <structfield>most_common_vals</>
list accounts for the entire population.)
</entry>
</row>
<row>
<entry>correlation</entry>
<entry><literal>correlation</literal></entry>
<entry><type>real</type></entry>
<entry>Statistical correlation between physical row ordering and
logical ordering of the column values. This ranges from -1 to +1.
When the value is near -1 or +1, an index scan on the column will
be estimated to be cheaper than when it is near zero, due to reduction
of random access to the disk. (Omitted if column data type does
of random access to the disk. (This column is not filled if the column data type does
not have a <literal>&lt;</> operator.)
</entry>
</row>
......@@ -532,7 +533,7 @@ regression=#
<title>Controlling the Planner with Explicit <literal>JOIN</> Clauses</title>
<para>
Beginning with <productname>PostgreSQL</productname> 7.1 it has been possible
It is possible
to control the query planner to some extent by using the explicit <literal>JOIN</>
syntax. To see why this matters, we first need some background.
</para>
......@@ -547,7 +548,7 @@ SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
the <literal>WHERE</> condition <literal>a.id = b.id</>, and then
joins C to this joined table, using the other <literal>WHERE</>
condition. Or it could join B to C and then join A to that result.
Or it could join A to C and then join them with B --- but that
Or it could join A to C and then join them with B, but that
would be inefficient, since the full Cartesian product of A and C
would have to be formed, there being no applicable condition in the
<literal>WHERE</> clause to allow optimization of the join. (All
......@@ -570,7 +571,7 @@ SELECT * FROM a, b, c WHERE a.id = b.id AND b.ref = c.id;
<productname>PostgreSQL</productname> planner will switch from exhaustive
search to a <firstterm>genetic</firstterm> probabilistic search
through a limited number of possibilities. (The switch-over threshold is
set by the <varname>GEQO_THRESHOLD</varname> run-time
set by the <varname>geqo_threshold</varname> run-time
parameter described in the &cite-admin;.)
The genetic search takes less time, but it won't
necessarily find the best possible plan.
......@@ -611,7 +612,7 @@ SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
<para>
To force the planner to follow the <literal>JOIN</> order for inner joins,
set the <varname>JOIN_COLLAPSE_LIMIT</> run-time parameter to 1.
set the <varname>join_collapse_limit</> run-time parameter to 1.
(Other possible values are discussed below.)
</para>
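<para>
 As a sketch, using the three-table example from above, the following
 would make the planner join B to C first and then join A to that
 result, because the explicit <literal>JOIN</> nesting is followed as
 written:
<programlisting>
SET join_collapse_limit = 1;
SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
</programlisting>
</para>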
......@@ -622,7 +623,7 @@ SELECT * FROM a JOIN (b JOIN c ON (b.ref = c.id)) ON (a.id = b.id);
<programlisting>
SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
</programlisting>
With <varname>JOIN_COLLAPSE_LIMIT</> = 1, this
With <varname>join_collapse_limit</> = 1, this
forces the planner to join A to B before joining them to other tables,
but doesn't constrain its choices otherwise. In this example, the
number of possible join orders is reduced by a factor of 5.
......@@ -639,43 +640,43 @@ SELECT * FROM a CROSS JOIN b, c, d, e WHERE ...;
<para>
A closely related issue that affects planning time is collapsing of
sub-SELECTs into their parent query. For example, consider
subqueries into their parent query. For example, consider
<programlisting>
SELECT *
FROM x, y,
(SELECT * FROM a, b, c WHERE something) AS ss
WHERE somethingelse
(SELECT * FROM a, b, c WHERE something) AS ss
WHERE somethingelse;
</programlisting>
This situation might arise from use of a view that contains a join;
the view's SELECT rule will be inserted in place of the view reference,
the view's <literal>SELECT</> rule will be inserted in place of the view reference,
yielding a query much like the above. Normally, the planner will try
to collapse the sub-query into the parent, yielding
to collapse the subquery into the parent, yielding
<programlisting>
SELECT * FROM x, y, a, b, c WHERE something AND somethingelse
SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
</programlisting>
This usually results in a better plan than planning the sub-query
separately. (For example, the outer WHERE conditions might be such that
This usually results in a better plan than planning the subquery
separately. (For example, the outer <literal>WHERE</> conditions might be such that
joining X to A first eliminates many rows of A, thus avoiding the need to
form the full logical output of the sub-select.) But at the same time,
form the full logical output of the subquery.) But at the same time,
we have increased the planning time; here, we have a five-way join
problem replacing two separate three-way join problems. Because of the
exponential growth of the number of possibilities, this makes a big
difference. The planner tries to avoid getting stuck in huge join search
problems by not collapsing a sub-query if more than
<varname>FROM_COLLAPSE_LIMIT</> FROM-items would result in the parent
problems by not collapsing a subquery if more than
<varname>from_collapse_limit</> <literal>FROM</> items would result in the parent
query. You can trade off planning time against quality of plan by
adjusting this run-time parameter up or down.
</para>
<para>
<varname>FROM_COLLAPSE_LIMIT</> and <varname>JOIN_COLLAPSE_LIMIT</>
<varname>from_collapse_limit</> and <varname>join_collapse_limit</>
are similarly named because they do almost the same thing: one controls
when the planner will <quote>flatten out</> sub-SELECTs, and the
other controls when it will flatten out explicit inner JOINs. Typically
you would either set <varname>JOIN_COLLAPSE_LIMIT</> equal to
<varname>FROM_COLLAPSE_LIMIT</> (so that explicit JOINs and sub-SELECTs
act similarly) or set <varname>JOIN_COLLAPSE_LIMIT</> to 1 (if you want
to control join order with explicit JOINs). But you might set them
when the planner will <quote>flatten out</> subselects, and the
other controls when it will flatten out explicit inner joins. Typically
you would either set <varname>join_collapse_limit</> equal to
<varname>from_collapse_limit</> (so that explicit joins and subselects
act similarly) or set <varname>join_collapse_limit</> to 1 (if you want
to control join order with explicit joins). But you might set them
differently if you are trying to fine-tune the tradeoff between planning
time and run time.
</para>
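<para>
 The two typical configurations described above might look like this in
 a session. The value 8 is only an illustrative choice, not a
 recommendation.
<programlisting>
-- Treat explicit inner joins and subqueries alike.
SET from_collapse_limit = 8;
SET join_collapse_limit = 8;

-- Or: keep subquery flattening, but control join order with explicit joins.
SET join_collapse_limit = 1;

-- Check the current values.
SHOW from_collapse_limit;
SHOW join_collapse_limit;
</programlisting>
</para>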
......@@ -701,19 +702,19 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse
make sure the library does it when you want it done.)
If you allow each insertion to be committed separately,
<productname>PostgreSQL</productname> is doing a lot of work for each
record added.
row added.
An additional benefit of doing all insertions in one transaction
is that if the insertion of one record were to fail then the
insertion of all records inserted up to that point would be rolled
is that if the insertion of one row were to fail then the
insertion of all rows inserted up to that point would be rolled
back, so you won't be stuck with partially loaded data.
</para>
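<para>
 A minimal sketch of such a load, assuming a hypothetical table
 <literal>items</literal> with two columns:
<programlisting>
BEGIN;
INSERT INTO items VALUES (1, 'first');
INSERT INTO items VALUES (2, 'second');
INSERT INTO items VALUES (3, 'third');
-- If any INSERT above fails, the transaction is aborted and
-- nothing is stored, so no partially loaded data remains.
COMMIT;
</programlisting>
</para>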
</sect2>
<sect2 id="populate-copy-from">
<title>Use COPY FROM</title>
<title>Use <command>COPY FROM</command></title>
<para>
Use <command>COPY FROM STDIN</command> to load all the records in one
Use <command>COPY FROM STDIN</command> to load all the rows in one
command, instead of using
a series of <command>INSERT</command> commands. This reduces parsing,
planning, etc.
......@@ -730,12 +731,12 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse
create the table, bulk-load with <command>COPY</command>, then create any
indexes needed
for the table. Creating an index on pre-existing data is quicker than
updating it incrementally as each record is loaded.
updating it incrementally as each row is loaded.
</para>
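<para>
 For a new table, the order of steps could be sketched as follows. The
 table <literal>items</literal>, the index name, and the file path are
 hypothetical.
<programlisting>
CREATE TABLE items (
    id   integer,
    name text
);

-- Load all rows with a single command (the file is read by the server).
COPY items FROM '/path/to/items.data';

-- Build the index once, after the data is in place.
CREATE INDEX items_id_idx ON items (id);
</programlisting>
</para>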
<para>
If you are augmenting an existing table, you can <command>DROP
INDEX</command>, load the table, then recreate the index. Of
If you are augmenting an existing table, you can drop the index,
load the table, then recreate the index. Of
course, the database performance for other users may be adversely
affected during the time that the index is missing. One should also
think twice before dropping unique indexes, since the error checking
......@@ -744,7 +745,7 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse
</sect2>
<sect2 id="populate-analyze">
<title>Run ANALYZE Afterwards</title>
<title>Run <command>ANALYZE</command> Afterwards</title>
<para>
It's a good idea to run <command>ANALYZE</command> or <command>VACUUM
......
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/queries.sgml,v 1.19 2002/11/11 20:14:03 petere Exp $ -->
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/queries.sgml,v 1.20 2003/03/13 01:30:29 petere Exp $ -->
<chapter id="queries">
<title>Queries</title>
......@@ -157,18 +157,17 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
row consisting of all columns in <replaceable>T1</replaceable>
followed by all columns in <replaceable>T2</replaceable>. If
the tables have N and M rows respectively, the joined
table will have N * M rows. A cross join is equivalent to an
<literal>INNER JOIN ON TRUE</literal>.
table will have N * M rows.
</para>
<tip>
<para>
<literal>FROM <replaceable>T1</replaceable> CROSS JOIN
<replaceable>T2</replaceable></literal> is equivalent to
<literal>FROM <replaceable>T1</replaceable>,
<replaceable>T2</replaceable></literal>.
</para>
</tip>
<para>
<literal>FROM <replaceable>T1</replaceable> CROSS JOIN
<replaceable>T2</replaceable></literal> is equivalent to
<literal>FROM <replaceable>T1</replaceable>,
<replaceable>T2</replaceable></literal>. It is also equivalent to
<literal>FROM <replaceable>T1</replaceable> INNER JOIN
<replaceable>T2</replaceable> ON TRUE</literal> (see below).
</para>
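<para>
 For instance, assuming two hypothetical tables <literal>t1</literal>
 and <literal>t2</literal>, these three queries return the same rows:
<programlisting>
SELECT * FROM t1 CROSS JOIN t2;
SELECT * FROM t1, t2;
SELECT * FROM t1 INNER JOIN t2 ON TRUE;
</programlisting>
</para>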
</listitem>
</varlistentry>
......@@ -240,7 +239,6 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
<para>
The possible types of qualified join are:
</para>
<variablelist>
<varlistentry>
......@@ -302,6 +300,7 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
</listitem>
</varlistentry>
</variablelist>
</para>
</listitem>
</varlistentry>
</variablelist>
......@@ -630,12 +629,12 @@ SELECT ... FROM fdt WHERE EXISTS (SELECT c1 FROM t2 WHERE c2 > fdt.c1)
condition of the <literal>WHERE</> clause are eliminated from
<literal>fdt</literal>. Notice the use of scalar subqueries as
value expressions. Just like any other query, the subqueries can
employ complex table expressions. Notice how
employ complex table expressions. Notice also how
<literal>fdt</literal> is referenced in the subqueries.
Qualifying <literal>c1</> as <literal>fdt.c1</> is only necessary
if <literal>c1</> is also the name of a column in the derived
input table of the subquery. Qualifying the column name adds
clarity even when it is not needed. This shows how the column
input table of the subquery. But qualifying the column name adds
clarity even when it is not needed. This example shows how the column
naming scope of an outer query extends into its inner queries.
</para>
</sect2>
......@@ -663,7 +662,7 @@ SELECT <replaceable>select_list</replaceable>
</synopsis>
<para>
The <literal>GROUP BY</> clause is used to group together rows in
The <literal>GROUP BY</> clause is used to group together those rows in
a table that share the same values in all the columns listed. The
order in which the columns are listed does not matter. The
purpose is to reduce each group of rows sharing common values into
......@@ -711,7 +710,7 @@ SELECT <replaceable>select_list</replaceable>
c | 2
(3 rows)
</screen>
Here <literal>sum()</literal> is an aggregate function that
Here <literal>sum</literal> is an aggregate function that
computes a single value over the entire group. More information
about the available aggregate functions can be found in <xref
linkend="functions-aggregate">.
......@@ -727,9 +726,8 @@ SELECT <replaceable>select_list</replaceable>
</tip>
<para>
Here is another example: <function>sum(sales)</function> on a
table grouped by product code gives the total sales for each
product, not the total sales on all products.
Here is another example: it calculates the total sales for each
product (rather than the total sales on all products).
<programlisting>
SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
FROM products p LEFT JOIN sales s USING (product_id)
......@@ -744,8 +742,8 @@ SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
unnecessary, but this is not implemented yet.) The column
<literal>s.units</> does not have to be in the <literal>GROUP
BY</> list since it is only used in an aggregate expression
(<function>sum()</function>), which represents the group of sales
of a product. For each product, a summary row is returned about
(<literal>sum(...)</literal>), which represents the sales
of a product. For each product, the query returns a summary row about
all sales of the product.
</para>
......@@ -800,10 +798,11 @@ SELECT product_id, p.name, (sum(s.units) * (p.price - p.cost)) AS profit
HAVING sum(p.price * s.units) > 5000;
</programlisting>
In the example above, the <literal>WHERE</> clause is selecting
rows by a column that is not grouped, while the <literal>HAVING</>
rows by a column that is not grouped (the expression is only true for
sales during the last four weeks), while the <literal>HAVING</>
clause restricts the output to groups with total gross sales over
5000. Note that the aggregate expressions do not necessarily need
to be the same everywhere.
to be the same in all parts of the query.
</para>
</sect2>
</sect1>
......@@ -852,7 +851,7 @@ SELECT a, b, c FROM ...
If more than one table has a column of the same name, the table
name must also be given, as in
<programlisting>
SELECT tbl1.a, tbl2.b, tbl1.c FROM ...
SELECT tbl1.a, tbl2.a, tbl1.b FROM ...
</programlisting>
(See also <xref linkend="queries-where">.)
</para>
......@@ -860,7 +859,7 @@ SELECT tbl1.a, tbl2.b, tbl1.c FROM ...
<para>
If an arbitrary value expression is used in the select list, it
conceptually adds a new virtual column to the returned table. The
value expression is evaluated once for each retrieved row, with
value expression is evaluated once for each result row, with
the row's values substituted for any column references. But the
expressions in the select list do not have to reference any
columns in the table expression of the <literal>FROM</> clause;
......@@ -888,7 +887,7 @@ SELECT a AS value, b + c AS sum FROM ...
</para>
<para>
If no output column name is specified via AS, the system assigns a
If no output column name is specified using <literal>AS</>, the system assigns a
default name. For simple column references, this is the name of the
referenced column. For function
calls, this is the name of the function. For complex expressions,
......@@ -1129,7 +1128,7 @@ SELECT <replaceable>select_list</replaceable>
<para>
<literal>OFFSET</> says to skip that many rows before beginning to
return rows to the client. <literal>OFFSET 0</> is the same as
return rows. <literal>OFFSET 0</> is the same as
omitting the <literal>OFFSET</> clause. If both <literal>OFFSET</>
and <literal>LIMIT</> appear, then <literal>OFFSET</> rows are
skipped before starting to count the <literal>LIMIT</> rows that
......@@ -1140,7 +1139,7 @@ SELECT <replaceable>select_list</replaceable>
When using <literal>LIMIT</>, it is a good idea to use an
<literal>ORDER BY</> clause that constrains the result rows into a
unique order. Otherwise you will get an unpredictable subset of
the query's rows---you may be asking for the tenth through
the query's rows. You may be asking for the tenth through
twentieth rows, but tenth through twentieth in what ordering? The
ordering is unknown, unless you specified <literal>ORDER BY</>.
</para>
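<para>
 As a sketch, assuming a hypothetical table <literal>products</literal>
 whose <literal>product_id</literal> column is unique, the tenth through
 twentieth rows in a well-defined order could be fetched like this:
<programlisting>
-- Skip the first 9 rows, then return the next 11 (rows 10 through 20).
SELECT product_id, name
    FROM products
    ORDER BY product_id
    LIMIT 11 OFFSET 9;
</programlisting>
</para>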
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/query.sgml,v 1.28 2002/11/11 20:14:03 petere Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/query.sgml,v 1.29 2003/03/13 01:30:29 petere Exp $
-->
<chapter id="tutorial-sql">
......@@ -214,7 +214,7 @@ INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');
The <type>point</type> type requires a coordinate pair as input,
as shown here:
<programlisting>
INSERT INTO cities VALUES ('San Francisco', '(-194.0, 53.0)');
INSERT INTO cities VALUES ('San Francisco', '(-194.0, 53.0)');
</programlisting>
</para>
......@@ -296,7 +296,7 @@ SELECT * FROM weather;
</para>
<para>
You may specify any arbitrary expressions in the target list. For
You may specify any arbitrary expressions in the select list. For
example, you can do:
<programlisting>
SELECT city, (temp_hi+temp_lo)/2 AS temp_avg, date FROM weather;
......@@ -339,7 +339,7 @@ SELECT * FROM weather
<indexterm><primary>DISTINCT</primary></indexterm>
<indexterm><primary>duplicate</primary></indexterm>
As a final note, you can request that the results of a select can
As a final note, you can request that the results of a query
be returned in sorted order or with duplicate rows removed:
<programlisting>
......@@ -710,7 +710,7 @@ SELECT city, max(temp_lo)
<literal>WHERE</literal> clause must not contain aggregate functions;
it makes no sense to try to use an aggregate to determine which rows
will be inputs to the aggregates. On the other hand,
<literal>HAVING</literal> clauses always contain aggregate functions.
the <literal>HAVING</literal> clause always contains aggregate functions.
(Strictly speaking, you are allowed to write a <literal>HAVING</literal>
clause that doesn't use aggregates, but it's wasteful: The same condition
could be used more efficiently at the <literal>WHERE</literal> stage.)
......
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/regress.sgml,v 1.30 2002/11/08 20:26:12 tgl Exp $ -->
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/regress.sgml,v 1.31 2003/03/13 01:30:29 petere Exp $ -->
<chapter id="regress">
<title id="regress-title">Regression Tests</title>
<sect1 id="regress-intro">
<title>Introduction</title>
<para>
The regression tests are a comprehensive set of tests for the SQL
implementation in <productname>PostgreSQL</productname>. They test
standard SQL operations as well as the extended capabilities of
<productname>PostgreSQL</productname>. The test suite was
originally developed by Jolly Chen and Andrew Yu, and was
extensively revised and repackaged by Marc Fournier and Thomas
Lockhart. From <productname>PostgreSQL</productname> 6.1 onward
the regression tests are current for every official release.
<productname>PostgreSQL</productname>. From
<productname>PostgreSQL</productname> 6.1 onward, the regression
tests are current for every official release.
</para>
</sect1>
<sect1 id="regress-run">
<title>Running the Tests</title>
......@@ -40,12 +33,12 @@
To run the regression tests after building but before installation,
type
<screen>
<prompt>$ </prompt><userinput>gmake check</userinput>
gmake check
</screen>
in the top-level directory. (Or you can change to
<filename>src/test/regress</filename> and run the command there.)
This will first build several auxiliary files, such as
platform-dependent <quote>expected</quote> files and some sample
some sample
user-defined trigger functions, and then run the test driver
script. At the end you should see something like
<screen>
......@@ -66,7 +59,7 @@
If you already did the build as root, you do not have to start all
over. Instead, make the regression test directory writable by
some other user, log in as that user, and restart the tests.
For example,
For example
<screen>
<prompt>root# </prompt><userinput>chmod -R a+w src/test/regress</userinput>
<prompt>root# </prompt><userinput>chmod -R a+w contrib/spi</userinput>
......@@ -87,7 +80,7 @@
<para>
The parallel regression test starts quite a few processes under your
user ID. Presently, the maximum concurrency is twenty parallel test
scripts, which means sixty processes --- there's a backend, a <application>psql</>,
scripts, which means sixty processes: there's a server process, a <application>psql</>,
and usually a shell parent process for the <application>psql</> for each test script.
So if your system enforces a per-user limit on the number of processes,
make sure this limit is at least seventy-five or so, else you may get
......@@ -105,11 +98,9 @@
too many child processes in parallel. This may cause the parallel
test run to lock up or fail. In such cases, specify a different
Bourne-compatible shell on the command line, for example:
<informalexample>
<screen>
<prompt>$ </prompt><userinput>gmake SHELL=/bin/ksh check</userinput>
gmake SHELL=/bin/ksh check
</screen>
</informalexample>
If no non-broken shell is available, you can alter the parallel test
schedule as suggested above.
</para>
......@@ -120,7 +111,7 @@
initialize a data area and start the
server, <![%standalone-ignore;[as explained in <xref linkend="runtime">, ]]> then type
<screen>
<prompt>$ </prompt><userinput>gmake installcheck</userinput>
gmake installcheck
</screen>
The tests will expect to contact the server at the local host and the
default port number, unless directed otherwise by <envar>PGHOST</envar> and <envar>PGPORT</envar>
......@@ -137,7 +128,7 @@
<quote>fail</quote> some of these regression tests due to
platform-specific artifacts such as varying floating-point representation
and time zone support. The tests are currently evaluated using a simple
<application>diff</application> comparison against the outputs
<command>diff</command> comparison against the outputs
generated on a reference system, so the results are sensitive to
small system differences. When a test is reported as
<quote>failed</quote>, always examine the differences between
......@@ -150,12 +141,12 @@
<para>
The actual outputs of the regression tests are in files in the
<filename>src/test/regress/results</filename> directory. The test
script uses <application>diff</application> to compare each output
script uses <command>diff</command> to compare each output
file against the reference outputs stored in the
<filename>src/test/regress/expected</filename> directory. Any
differences are saved for your inspection in
<filename>src/test/regress/regression.diffs</filename>. (Or you
can run <application>diff</application> yourself, if you prefer.)
can run <command>diff</command> yourself, if you prefer.)
</para>
<sect2>
......@@ -183,7 +174,7 @@
failures. The regression test suite is set up to handle this
problem by providing alternative result files that together are
known to handle a large number of locales. For example, for the
<quote>char</quote> test, the expected file
<literal>char</literal> test, the expected file
<filename>char.out</filename> handles the <literal>C</> and <literal>POSIX</> locales,
and the file <filename>char_1.out</filename> handles many other
locales. The regression test driver will automatically pick the
......@@ -214,28 +205,28 @@
fail if you run the test on the day of a daylight-saving time
changeover, or the day before or after one. These queries assume
that the intervals between midnight yesterday, midnight today and
midnight tomorrow are exactly twenty-four hours -- which is wrong
midnight tomorrow are exactly twenty-four hours --- which is wrong
if daylight-saving time went into or out of effect meanwhile.
</para>
<para>
Most of the date and time results are dependent on the time zone
environment. The reference files are generated for time zone
<literal>PST8PDT</literal> (Berkeley, California) and there will be apparent
<literal>PST8PDT</literal> (Berkeley, California), and there will be apparent
failures if the tests are not run with that time zone setting.
The regression test driver sets environment variable
<envar>PGTZ</envar> to <literal>PST8PDT</literal>, which normally
ensures proper results. However, your system must provide library
ensures proper results. However, your operating system must provide
support for the <literal>PST8PDT</literal> time zone, or the time zone-dependent
tests will fail. To verify that your machine does have this
support, type the following:
<screen>
<prompt>$ </prompt><userinput>env TZ=PST8PDT date</userinput>
env TZ=PST8PDT date
</screen>
The command above should have returned the current system time in
the <literal>PST8PDT</literal> time zone. If the <literal>PST8PDT</literal> database is not available,
the <literal>PST8PDT</literal> time zone. If the <literal>PST8PDT</literal> time zone is not available,
then your system may have returned the time in GMT. If the
<literal>PST8PDT</literal> time zone is not available, you can set the time zone
<literal>PST8PDT</literal> time zone is missing, you can set the time zone
rules explicitly:
<programlisting>
PGTZ='PST8PDT7,M04.01.0,M10.05.03'; export PGTZ
......@@ -250,7 +241,7 @@ PGTZ='PST8PDT7,M04.01.0,M10.05.03'; export PGTZ
</para>
<para>
Some systems using older time zone libraries fail to apply
Some systems using older time-zone libraries fail to apply
daylight-saving corrections to dates before 1970, causing
pre-1970 <acronym>PDT</acronym> times to be displayed in <acronym>PST</acronym> instead. This will
result in localized differences in the test results.
......@@ -261,8 +252,8 @@ PGTZ='PST8PDT7,M04.01.0,M10.05.03'; export PGTZ
<title>Floating-point differences</title>
<para>
Some of the tests involve computing 64-bit (<type>double
precision</type>) numbers from table columns. Differences in
Some of the tests involve computing 64-bit floating-point numbers (<type>double
precision</type>) from table columns. Differences in
results involving mathematical functions of <type>double
precision</type> columns have been observed. The <literal>float8</> and
<literal>geometry</> tests are particularly prone to small differences
......@@ -292,26 +283,26 @@ PGTZ='PST8PDT7,M04.01.0,M10.05.03'; export PGTZ
You might see differences in which the same rows are output in a
different order than what appears in the expected file. In most cases
this is not, strictly speaking, a bug. Most of the regression test
scripts are not so pedantic as to use an ORDER BY for every single
SELECT, and so their result row orderings are not well-defined
scripts are not so pedantic as to use an <literal>ORDER BY</> for every single
<literal>SELECT</>, and so their result row orderings are not well-defined
according to the letter of the SQL specification. In practice, since we are
looking at the same queries being executed on the same data by the same
software, we usually get the same result ordering on all platforms, and
so the lack of ORDER BY isn't a problem. Some queries do exhibit
so the lack of <literal>ORDER BY</> isn't a problem. Some queries do exhibit
cross-platform ordering differences, however. (Ordering differences
can also be triggered by non-C locale settings.)
</para>
<para>
Therefore, if you see an ordering difference, it's not something to
worry about, unless the query does have an ORDER BY that your result
worry about, unless the query does have an <literal>ORDER BY</> that your result
is violating. But please report it anyway, so that we can add an
ORDER BY to that particular query and thereby eliminate the bogus
<literal>ORDER BY</> to that particular query and thereby eliminate the bogus
<quote>failure</quote> in future releases.
</para>
<para>
You might wonder why we don't order all the regress test queries explicitly to
You might wonder why we don't order all the regression test queries explicitly to
get rid of this issue once and for all. The reason is that that would
make the regression tests less useful, not more, since they'd tend
to exercise query plan types that produce ordered results to the
......@@ -323,7 +314,7 @@ exclusion of those that don't.
<title>The <quote>random</quote> test</title>
<para>
There is at least one case in the <quote>random</quote> test
There is at least one case in the <literal>random</literal> test
script that is intended to produce random results. This causes
random to fail the regression test once in a while (perhaps once
in every five to ten trials). Typing
......@@ -362,11 +353,11 @@ diff results/random.out expected/random.out
testname/platformpattern=comparisonfilename
</synopsis>
The test name is just the name of the particular regression test
module. The platform pattern is a pattern in the style of
<citerefentry><refentrytitle>expr</><manvolnum>1</></citerefentry> (that is, a regular expression with an implicit
module. The platform pattern is a pattern in the style of the Unix
tool <command>expr</> (that is, a regular expression with an implicit
<literal>^</literal> anchor
at the start). It is matched against the platform name as printed
by <filename>config.guess</filename> followed by
by <command>config.guess</command> followed by
<literal>:gcc</literal> or <literal>:cc</literal>, depending on
whether you use the GNU compiler or the system's native compiler
(on systems where there is a difference). The comparison file
......@@ -387,7 +378,7 @@ testname/platformpattern=comparisonfilename
horology/hppa=horology-no-DST-before-1970
</programlisting>
which will trigger on any machine for which the output of <command>config.guess</command>
begins with <quote><literal>hppa</literal></quote>. Other lines
begins with <literal>hppa</literal>. Other lines
in <filename>resultmap</> select the variant comparison file for other
platforms where it's appropriate.
</para>
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.75 2003/02/19 03:13:24 momjian Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.76 2003/03/13 01:30:29 petere Exp $
-->
<chapter id="sql-syntax">
......@@ -179,7 +179,7 @@ UPDATE "my_table" SET "a" = 5;
<para>
Quoting an identifier also makes it case-sensitive, whereas
unquoted names are always folded to lower case. For example, the
identifiers <literal>FOO</literal>, <literal>foo</literal> and
identifiers <literal>FOO</literal>, <literal>foo</literal>, and
<literal>"foo"</literal> are considered the same by
<productname>PostgreSQL</productname>, but <literal>"Foo"</literal>
and <literal>"FOO"</literal> are different from these three and
......@@ -414,10 +414,10 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
function-call syntaxes can also be used to specify run-time type
conversions of arbitrary expressions, as discussed in <xref
linkend="sql-syntax-type-casts">. But the form
<replaceable>type</replaceable> '<replaceable>string</replaceable>'
<literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
can only be used to specify the type of a literal constant.
Another restriction on
<replaceable>type</replaceable> '<replaceable>string</replaceable>'
<literal><replaceable>type</replaceable> '<replaceable>string</replaceable>'</literal>
is that it does not work for array types; use <literal>::</literal>
or <literal>CAST()</literal> to specify the type of an array constant.
</para>
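<para>
 To illustrate, the following sketch contrasts the forms; the values
 are arbitrary.
<programlisting>
-- Works: the type of a simple literal constant is specified.
SELECT integer '42';

-- For an array constant, use :: or CAST() instead.
SELECT '{1,2,3}'::integer[];
SELECT CAST('{1,2,3}' AS integer[]);
</programlisting>
</para>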
......@@ -597,7 +597,7 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
<listitem>
<para>
The period (<literal>.</literal>) is used in floating-point
The period (<literal>.</literal>) is used in numeric
constants, and to separate schema, table, and column names.
</para>
</listitem>
......@@ -870,7 +870,7 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;
<listitem>
<para>
A positional parameter reference, in the body of a function declaration.
A positional parameter reference, in the body of a function definition.
</para>
</listitem>
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/typeconv.sgml,v 1.27 2003/03/13 01:30:29 petere Exp $
-->
<chapter Id="typeconv">
<title>Type Conversion</title>
<para>
<acronym>SQL</acronym> queries can, intentionally or not, require
<acronym>SQL</acronym> statements can, intentionally or not, require
mixing of different data types in the same expression.
<productname>PostgreSQL</productname> has extensive facilities for
evaluating mixed-type expressions.
......@@ -14,7 +18,7 @@ to understand the details of the type conversion mechanism.
However, the implicit conversions done by <productname>PostgreSQL</productname>
can affect the results of a query. When necessary, these results
can be tailored by a user or programmer
using <emphasis>explicit</emphasis> type coercion.
using <emphasis>explicit</emphasis> type conversion.
</para>
<para>
......@@ -27,7 +31,7 @@ operators.
<para>
The &cite-programmer; has more details on the exact algorithms used for
implicit type conversion and coercion.
implicit type conversion.
</para>
<sect1 id="typeconv-overview">
......@@ -46,15 +50,16 @@ mixed-type expressions to be meaningful even with user-defined types.
<para>
The <productname>PostgreSQL</productname> scanner/parser decodes lexical
elements into only five fundamental categories: integers, floating-point numbers, strings,
names, and key words. Most extended types are first tokenized into
names, and key words. Most extended types are first classified as
strings. The <acronym>SQL</acronym> language definition allows specifying type
names with strings, and this mechanism can be used in
<productname>PostgreSQL</productname> to start the parser down the correct
path. For example, the query
<screen>
tgl=> SELECT text 'Origin' AS "Label", point '(0,0)' AS "Value";
Label | Value
SELECT text 'Origin' AS "label", point '(0,0)' AS "value";
label | value
--------+-------
Origin | (0,0)
(1 row)
......@@ -62,7 +67,7 @@ tgl=> SELECT text 'Origin' AS "Label", point '(0,0)' AS "Value";
has two literal constants, of type <type>text</type> and <type>point</type>.
If a type is not specified for a string literal, then the placeholder type
<firstterm>unknown</firstterm> is assigned initially, to be resolved in later
<type>unknown</type> is assigned initially, to be resolved in later
stages as described below.
</para>
......@@ -70,7 +75,6 @@ stages as described below.
There are four fundamental <acronym>SQL</acronym> constructs requiring
distinct type conversion rules in the <productname>PostgreSQL</productname>
parser:
</para>
<variablelist>
<varlistentry>
......@@ -92,9 +96,8 @@ Function calls
<listitem>
<para>
Much of the <productname>PostgreSQL</productname> type system is built around a
rich set of functions. Function calls have one or more arguments which, for
any specific query, must be matched to the functions available in the system
catalog. Since <productname>PostgreSQL</productname> permits function
rich set of functions. Function calls can have one or more arguments.
Since <productname>PostgreSQL</productname> permits function
overloading, the function name alone does not uniquely identify the function
to be called; the parser must select the right function based on the data
types of the supplied arguments.
......@@ -103,12 +106,12 @@ types of the supplied arguments.
</varlistentry>
<varlistentry>
<term>
Query targets
Value Storage
</term>
<listitem>
<para>
<acronym>SQL</acronym> <command>INSERT</command> and <command>UPDATE</command> statements place the results of
expressions into a table. The expressions in the query must be matched up
expressions into a table. The expressions in the statement must be matched up
with, and perhaps converted to, the types of the target columns.
</para>
</listitem>
......@@ -119,22 +122,15 @@ with, and perhaps converted to, the types of the target columns.
</term>
<listitem>
<para>
Since all select results from a unionized <literal>SELECT</literal> statement must appear in a single
Since all query results from a unionized <literal>SELECT</literal> statement must appear in a single
set of columns, the types of the results
of each <literal>SELECT</> clause must be matched up and converted to a uniform set.
Similarly, the result expressions of a <literal>CASE</> construct must be coerced to
Similarly, the branch expressions of a <literal>CASE</> construct must be converted to
a common type so that the <literal>CASE</> expression as a whole has a known output type.
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
Many of the general type conversion rules use simple conventions built on
the <productname>PostgreSQL</productname> function and operator system tables.
There are some heuristics included in the conversion rules to better support
conventions for the <acronym>SQL</acronym> standard native types such as
<type>smallint</type>, <type>integer</type>, and <type>real</type>.
</para>
<para>
......@@ -157,7 +153,7 @@ a <firstterm>preferred type</firstterm> which is preferentially selected
when there is ambiguity.
In the user-defined category, each type is its own preferred type.
Ambiguous expressions (those with multiple candidate parsing solutions)
can often be resolved when there are multiple possible built-in types, but
can therefore often be resolved when there are multiple possible built-in types, but
they will raise an error when there are multiple choices for user-defined
types.
</para>
......@@ -184,8 +180,7 @@ be converted to a user-defined type (of course, only if conversion is necessary)
<para>
User-defined types are not related. Currently, <productname>PostgreSQL</productname>
does not have information available to it on relationships between types, other than
hardcoded heuristics for built-in types and implicit relationships based on available functions
in the catalog.
hardcoded heuristics for built-in types and implicit relationships based on available functions.
</para>
</listitem>
......@@ -195,12 +190,12 @@ There should be no extra overhead from the parser or executor
if a query does not need implicit type conversion.
That is, if a query is well formulated and the types already match up, then the query should proceed
without spending extra time in the parser and without introducing unnecessary implicit conversion
functions into the query.
calls into the query.
</para>
<para>
Additionally, if a query usually requires an implicit conversion for a function, and
if then the user defines an explicit function with the correct argument types, the parser
if then the user defines a new function with the correct argument types, the parser
should use this new function and will no longer do the implicit conversion using the old function.
</para>
</listitem>
......@@ -226,7 +221,7 @@ should use this new function and will no longer do the implicit conversion using
<para>
Select the operators to be considered from the
<classname>pg_operator</classname> system catalog. If an unqualified
operator name is used (the usual case), the operators
operator name was used (the usual case), the operators
considered are those of the right name and argument count that are
visible in the current search path (see <xref linkend="ddl-schemas-path">).
If a qualified operator name was given, only operators in the specified
......@@ -255,7 +250,7 @@ operators considered), use it.
<substeps>
<step performance="optional">
<para>
If one argument of a binary operator is <type>unknown</type> type,
If one argument of a binary operator invocation is of the <type>unknown</type> type,
then assume it is the same type as the other argument for this check.
Other cases involving <type>unknown</type> will never find a match at
this step.
......@@ -272,9 +267,9 @@ Look for the best match.
<step performance="required">
<para>
Discard candidate operators for which the input types do not match
and cannot be coerced (using an implicit coercion function) to match.
and cannot be converted (using an implicit conversion) to match.
<type>unknown</type> literals are
assumed to be coercible to anything for this purpose. If only one
assumed to be convertible to anything for this purpose. If only one
candidate remains, use it; else continue to the next step.
</para>
</step>
......@@ -296,23 +291,22 @@ If only one candidate remains, use it; else continue to the next step.
<step performance="required">
<para>
Run through all candidates and keep those that accept preferred types at
the most positions where type coercion will be required.
the most positions where type conversion will be required.
Keep all candidates if none accept preferred types.
If only one candidate remains, use it; else continue to the next step.
</para>
</step>
<step performance="required">
<para>
If any input arguments are <quote>unknown</quote>, check the type
If any input arguments are <type>unknown</type>, check the type
categories accepted at those argument positions by the remaining
candidates. At each position, select the <quote>string</quote> category if any
candidate accepts that category (this bias towards string is appropriate
since an unknown-type literal does look like a string). Otherwise, if
candidates. At each position, select the <literal>string</literal> category if any
candidate accepts that category. (This bias towards string is appropriate
since an unknown-type literal does look like a string.) Otherwise, if
all the remaining candidates accept the same type category, select that
category; otherwise fail because the correct choice cannot be deduced
without more clues. Also note whether any of the candidates accept a
preferred data type within the selected category. Now discard operator
candidates that do not accept the selected type category; furthermore,
without more clues. Now discard operator
candidates that do not accept the selected type category. Furthermore,
if any candidate accepts a preferred type at a given argument position,
discard candidates that accept non-preferred types for that argument.
</para>
......@@ -328,7 +322,9 @@ then fail.
</step>
</procedure>
<bridgehead renderas="sect2">Examples</bridgehead>
<para>
Some examples follow.
</para>
<example>
<title>Exponentiation Operator Type Resolution</title>
......@@ -340,8 +336,9 @@ operator defined in the catalog, and it takes arguments of type
The scanner assigns an initial type of <type>integer</type> to both arguments
of this query expression:
<screen>
tgl=> SELECT 2 ^ 3 AS "Exp";
Exp
SELECT 2 ^ 3 AS "exp";
exp
-----
8
(1 row)
......@@ -351,30 +348,8 @@ So the parser does a type conversion on both operands and the query
is equivalent to
<screen>
tgl=> SELECT CAST(2 AS double precision) ^ CAST(3 AS double precision) AS "Exp";
Exp
-----
8
(1 row)
</screen>
or
<screen>
tgl=> SELECT 2.0 ^ 3.0 AS "Exp";
Exp
-----
8
(1 row)
SELECT CAST(2 AS double precision) ^ CAST(3 AS double precision) AS "exp";
</screen>
<note>
<para>
This last form has the least overhead, since no functions are called to do
implicit type conversion. This is not an issue for small queries, but may
have an impact on the performance of queries involving large tables.
</para>
</note>
</para>
</example>
......@@ -383,15 +358,16 @@ have an impact on the performance of queries involving large tables.
<para>
A string-like syntax is used for working with string types as well as for
working with complex extended types.
working with complex extension types.
Strings with unspecified type are matched with likely operator candidates.
</para>
<para>
An example with one unspecified argument:
<screen>
tgl=> SELECT text 'abc' || 'def' AS "Text and Unknown";
Text and Unknown
SELECT text 'abc' || 'def' AS "text and unknown";
text and unknown
------------------
abcdef
(1 row)
......@@ -405,10 +381,11 @@ be interpreted as of type <type>text</type>.
</para>
<para>
Concatenation on unspecified types:
Here is a concatenation on unspecified types:
<screen>
tgl=> SELECT 'abc' || 'def' AS "Unspecified";
Unspecified
SELECT 'abc' || 'def' AS "unspecified";
unspecified
-------------
abcdef
(1 row)
......@@ -421,7 +398,7 @@ are specified in the query. So, the parser looks for all candidate operators
and finds that there are candidates accepting both string-category and
bit-string-category inputs. Since string category is preferred when available,
that category is selected, and then the
<quote>preferred type</quote> for strings, <type>text</type>, is used as the specific
preferred type for strings, <type>text</type>, is used as the specific
type to resolve the unknown literals to.
</para>
</example>
......@@ -437,27 +414,29 @@ entries is for type <type>float8</type>, which is the preferred type in
the numeric category. Therefore, <productname>PostgreSQL</productname>
will use that entry when faced with a non-numeric input:
<screen>
tgl=> select @ text '-4.5' as "abs";
SELECT @ '-4.5' AS "abs";
abs
-----
4.5
(1 row)
</screen>
Here the system has performed an implicit text-to-float8 conversion
before applying the chosen operator. We can verify that float8 and
Here the system has performed an implicit conversion from <type>text</type> to <type>float8</type>
before applying the chosen operator. We can verify that <type>float8</type> and
not some other type was used:
<screen>
tgl=> select @ text '-4.5e500' as "abs";
SELECT @ '-4.5e500' AS "abs";
ERROR: Input '-4.5e500' is out of range for float8
</screen>
</para>
<para>
On the other hand, the postfix operator <literal>!</> (factorial)
is defined only for integer data types, not for float8. So, if we
is defined only for integer data types, not for <type>float8</type>. So, if we
try a similar case with <literal>!</>, we get:
<screen>
tgl=> select text '20' ! as "factorial";
SELECT '20' ! AS "factorial";
ERROR: Unable to identify a postfix operator '!' for type 'text'
You may need to add parentheses or an explicit cast
</screen>
......@@ -465,7 +444,8 @@ This happens because the system can't decide which of the several
possible <literal>!</> operators should be preferred. We can help
it out with an explicit cast:
<screen>
tgl=> select cast(text '20' as int8) ! as "factorial";
SELECT CAST('20' AS int8) ! AS "factorial";
factorial
---------------------
2432902008176640000
......@@ -491,7 +471,7 @@ tgl=> select cast(text '20' as int8) ! as "factorial";
<para>
Select the functions to be considered from the
<classname>pg_proc</classname> system catalog. If an unqualified
function name is used, the functions
function name was used, the functions
considered are those of the right name and argument count that are
visible in the current search path (see <xref linkend="ddl-schemas-path">).
If a qualified function name was given, only functions in the specified
......@@ -517,16 +497,18 @@ If one exists (there can be only one exact match in the set of
functions considered), use it.
(Cases involving <type>unknown</type> will never find a match at
this step.)
</para></step>
</para>
</step>
<step performance="required">
<para>
If no exact match is found, see whether the function call appears
to be a trivial type coercion request. This happens if the function call
to be a trivial type conversion request. This happens if the function call
has just one argument and the function name is the same as the (internal)
name of some data type. Furthermore, the function argument must be either
an unknown-type literal or a type that is binary-compatible with the named
data type. When these conditions are met, the function argument is coerced
to the named data type without any explicit function call.
data type. When these conditions are met, the function argument is converted
to the named data type without any actual function call.
</para>
</step>
<step performance="required">
......@@ -537,9 +519,9 @@ Look for the best match.
<step performance="required">
<para>
Discard candidate functions for which the input types do not match
and cannot be coerced (using an implicit coercion function) to match.
and cannot be converted (using an implicit conversion) to match.
<type>unknown</type> literals are
assumed to be coercible to anything for this purpose. If only one
assumed to be convertible to anything for this purpose. If only one
candidate remains, use it; else continue to the next step.
</para>
</step>
......@@ -561,7 +543,7 @@ If only one candidate remains, use it; else continue to the next step.
<step performance="required">
<para>
Run through all candidates and keep those that accept preferred types at
the most positions where type coercion will be required.
the most positions where type conversion will be required.
Keep all candidates if none accept preferred types.
If only one candidate remains, use it; else continue to the next step.
</para>
......@@ -570,13 +552,12 @@ If only one candidate remains, use it; else continue to the next step.
<para>
If any input arguments are <type>unknown</type>, check the type categories accepted
at those argument positions by the remaining candidates. At each position,
select the <type>string</type> category if any candidate accepts that category
(this bias towards string
is appropriate since an unknown-type literal does look like a string).
select the <type>string</type> category if any candidate accepts that category.
(This bias towards string
is appropriate since an unknown-type literal does look like a string.)
Otherwise, if all the remaining candidates accept the same type category,
select that category; otherwise fail because
the correct choice cannot be deduced without more clues. Also note whether
any of the candidates accept a preferred data type within the selected category.
the correct choice cannot be deduced without more clues.
Now discard candidates that do not accept the selected type category;
furthermore, if any candidate accepts a preferred type at a given argument
position, discard candidates that accept non-preferred types for that
......@@ -594,32 +575,41 @@ then fail.
</step>
</procedure>
<bridgehead renderas="sect2">Examples</bridgehead>
<para>
Some examples follow.
</para>
<example>
<title>Factorial Function Argument Type Resolution</title>
<title>Rounding Function Argument Type Resolution</title>
<para>
There is only one <function>int4fac</function> function defined in the
<classname>pg_proc</classname> catalog.
So the following query automatically converts the <type>int2</type> argument
to <type>int4</type>:
There is only one <function>round</function> function with two
arguments. (The first is <type>numeric</type>, the second is
<type>integer</type>.) So the following query automatically converts
the first argument of type <type>integer</type> to
<type>numeric</type>:
<screen>
tgl=> SELECT int4fac(int2 '4');
int4fac
---------
24
SELECT round(4, 4);
round
--------
4.0000
(1 row)
</screen>
and is actually transformed by the parser to
That query is actually transformed by the parser to
<screen>
tgl=> SELECT int4fac(int4(int2 '4'));
int4fac
---------
24
(1 row)
SELECT round(CAST (4 AS numeric), 4);
</screen>
</para>
<para>
Since numeric constants with decimal points are initially assigned the
type <type>numeric</type>, the following query will require no type
conversion and may therefore be slightly more efficient:
<screen>
SELECT round(4.0, 4);
</screen>
</para>
</example>
......@@ -628,15 +618,15 @@ tgl=> SELECT int4fac(int4(int2 '4'));
<title>Substring Function Type Resolution</title>
<para>
There are two <function>substr</function> functions declared in <classname>pg_proc</classname>. However,
only one takes two arguments, of types <type>text</type> and <type>int4</type>.
</para>
There are several <function>substr</function> functions, one of which
takes types <type>text</type> and <type>integer</type>. If called
with a string constant of unspecified type, the system chooses the
candidate function that accepts an argument of the preferred category
<literal>string</literal> (namely of type <type>text</type>).
<para>
If called with a string constant of unspecified type, the type is matched up
directly with the only candidate function type:
<screen>
tgl=> SELECT substr('1234', 3);
SELECT substr('1234', 3);
substr
--------
34
......@@ -646,28 +636,26 @@ tgl=> SELECT substr('1234', 3);
<para>
If the string is declared to be of type <type>varchar</type>, as might be the case
if it comes from a table, then the parser will try to coerce it to become <type>text</type>:
if it comes from a table, then the parser will try to convert it to become <type>text</type>:
<screen>
tgl=> SELECT substr(varchar '1234', 3);
SELECT substr(varchar '1234', 3);
substr
--------
34
(1 row)
</screen>
which is transformed by the parser to become
This is transformed by the parser to effectively become
<screen>
tgl=> SELECT substr(text(varchar '1234'), 3);
substr
--------
34
(1 row)
SELECT substr(CAST (varchar '1234' AS text), 3);
</screen>
</para>
<para>
<note>
<para>
Actually, the parser is aware that <type>text</type> and <type>varchar</type>
are <firstterm>binary-compatible</>, meaning that one can be passed to a function that
The parser is aware that <type>text</type> and <type>varchar</type>
are binary-compatible, meaning that one can be passed to a function that
accepts the other without doing any physical conversion. Therefore, no
explicit type conversion call is really inserted in this case.
</para>
......@@ -675,64 +663,67 @@ explicit type conversion call is really inserted in this case.
</para>
<para>
And, if the function is called with an <type>int4</type>, the parser will
And, if the function is called with an argument of type <type>integer</type>, the parser will
try to convert that to <type>text</type>:
<screen>
tgl=> SELECT substr(1234, 3);
SELECT substr(1234, 3);
substr
--------
34
(1 row)
</screen>
which actually executes as
This actually executes as
<screen>
tgl=> SELECT substr(text(1234), 3);
substr
--------
34
(1 row)
SELECT substr(CAST (1234 AS text), 3);
</screen>
This succeeds because there is a conversion function text(int4) in the
system catalog.
This automatic transformation can succeed because there is an
implicitly invocable cast from <type>integer</type> to
<type>text</type>.
</para>
</example>
</sect1>
<sect1 id="typeconv-query">
<title>Query Targets</title>
<title>Value Storage</title>
<para>
Values to be inserted into a table are coerced to the destination
Values to be inserted into a table are converted to the destination
column's data type according to the
following steps.
</para>
<procedure>
<title>Query Target Type Resolution</title>
<title>Value Storage Type Conversion</title>
<step performance="required">
<para>
Check for an exact match with the target.
</para></step>
</para>
</step>
<step performance="required">
<para>
Otherwise, try to coerce the expression to the target type. This will succeed
if the two types are known binary-compatible, or if there is a conversion
function. If the expression is an unknown-type literal, the contents of
Otherwise, try to convert the expression to the target type. This will succeed
if there is a registered cast between the two types.
If the expression is an unknown-type literal, the contents of
the literal string will be fed to the input conversion routine for the target
type.
</para></step>
</para>
</step>
<step performance="required">
<para>
If the target is a fixed-length type (e.g. <type>char</type> or <type>varchar</type>
If the target is a fixed-length type (e.g., <type>char</type> or <type>varchar</type>
declared with a length) then try to find a sizing function for the target
type. A sizing function is a function of the same name as the type,
taking two arguments of which the first is that type and the second is an
integer, and returning the same type. If one is found, it is applied,
taking two arguments of which the first is that type and the second is of type
<type>integer</type>, and returning the same type. If one is found, it is applied,
passing the column's declared length as the second parameter.
</para></step>
</para>
</step>
</procedure>
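The sizing step in the procedure above can be sketched in Python. This is only an illustrative model, not the server's implementation: the function name mirrors the `bpchar(bpchar, integer)` sizing function mentioned in the text, and the handling of over-length input (reject it unless the excess is only spaces) is an assumption made for this sketch.

```python
def bpchar(value: str, declared_length: int) -> str:
    """Toy stand-in for a sizing function: check the length, then pad."""
    excess = value[declared_length:]
    # Assumption for this sketch: excess characters are an error unless
    # they are all spaces, in which case they are simply dropped.
    if excess.strip(" "):
        raise ValueError(
            f"value too long for type character({declared_length})"
        )
    # Truncate any trailing-space excess, then pad with spaces on the right.
    return value[:declared_length].ljust(declared_length)


# The text concatenation result is passed through the sizing function
# together with the column's declared length.
stored = bpchar("abc" + "def", 20)
print(repr(stored), len(stored))
```

The stored value keeps the six characters of the concatenation and is padded to the declared length, which is why `length(v)` reports 20 in the example that follows.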
......@@ -740,30 +731,31 @@ passing the column's declared length as the second parameter.
<title><type>character</type> Storage Type Conversion</title>
<para>
For a target column declared as <type>character(20)</type> the following query
ensures that the target is sized correctly:
For a target column declared as <type>character(20)</type> the following statement
ensures that the stored value is sized correctly:
<screen>
tgl=> CREATE TABLE vv (v character(20));
CREATE
tgl=> INSERT INTO vv SELECT 'abc' || 'def';
INSERT 392905 1
tgl=> SELECT v, length(v) FROM vv;
CREATE TABLE vv (v character(20));
INSERT INTO vv SELECT 'abc' || 'def';
SELECT v, length(v) FROM vv;
v | length
----------------------+--------
abcdef | 20
(1 row)
</screen>
</para>
<para>
What has really happened here is that the two unknown literals are resolved
to <type>text</type> by default, allowing the <literal>||</literal> operator
to be resolved as <type>text</type> concatenation. Then the <type>text</type>
result of the operator is coerced to <type>bpchar</type> (<quote>blank-padded
char</>, the internal name of the character data type) to match the target
column type. (Since the parser knows that <type>text</type> and
<type>bpchar</type> are binary-compatible, this coercion is implicit and does
result of the operator is converted to <type>bpchar</type> (<quote>blank-padded
char</>, the internal name of the <type>character</type> data type) to match the target
column type. (Since the types <type>text</type> and
<type>bpchar</type> are binary-compatible, this conversion does
not insert any real function call.) Finally, the sizing function
<literal>bpchar(bpchar, integer)</literal> is found in the system catalogs
<literal>bpchar(bpchar, integer)</literal> is found in the system catalog
and applied to the operator's result and the stored column length. This
type-specific function performs the required length check and addition of
padding spaces.
......@@ -783,78 +775,87 @@ to each output column of a union query. The <literal>INTERSECT</> and
A <literal>CASE</> construct also uses the identical algorithm to match up its
component expressions and select a result data type.
</para>
<procedure>
<title><literal>UNION</> and <literal>CASE</> Type Resolution</title>
<step performance="required">
<para>
If all inputs are of type <type>unknown</type>, resolve as type
<type>text</type> (the preferred type for string category).
Otherwise, ignore the <type>unknown</type> inputs while choosing the type.
</para></step>
<type>text</type> (the preferred type of the string category).
Otherwise, ignore the <type>unknown</type> inputs while choosing the result type.
</para>
</step>
<step performance="required">
<para>
If the non-unknown inputs are not all of the same type category, fail.
</para></step>
</para>
</step>
<step performance="required">
<para>
Choose the first non-unknown input type which is a preferred type in
that category or allows all the non-unknown inputs to be implicitly
coerced to it.
</para></step>
converted to it.
</para>
</step>
<step performance="required">
<para>
Coerce all inputs to the selected type.
</para></step>
Convert all inputs to the selected type.
</para>
</step>
</procedure>
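The four steps above can be modelled with a short Python sketch. The category, preferred-type, and implicit-conversion tables below are simplified assumptions chosen for illustration; they are not the contents of the real system catalogs, and the function name `resolve_common_type` is hypothetical.

```python
# Toy tables (assumptions for illustration only).
CATEGORY = {
    "text": "string", "varchar": "string",
    "integer": "numeric", "numeric": "numeric",
    "real": "numeric", "float8": "numeric",
}
PREFERRED = {"string": "text", "numeric": "float8"}
# (source, target) pairs that this toy model treats as implicitly convertible.
IMPLICIT_CASTS = {
    ("varchar", "text"),
    ("integer", "numeric"), ("integer", "real"), ("integer", "float8"),
    ("numeric", "real"), ("numeric", "float8"),
    ("real", "float8"),
}


def can_convert(src: str, dst: str) -> bool:
    """True if src equals dst or an implicit conversion is registered."""
    return src == dst or (src, dst) in IMPLICIT_CASTS


def resolve_common_type(input_types: list[str]) -> str:
    """Pick the result type for UNION branches or CASE result expressions."""
    known = [t for t in input_types if t != "unknown"]
    # Step 1: all inputs unknown -> text; otherwise ignore the unknown inputs.
    if not known:
        return "text"
    # Step 2: the remaining inputs must share a single type category.
    categories = {CATEGORY[t] for t in known}
    if len(categories) != 1:
        raise TypeError(f"types {known} cannot be matched")
    category = categories.pop()
    # Step 3: first input type that is preferred in the category, or that
    # every other known input can be implicitly converted to.
    for candidate in known:
        if candidate == PREFERRED[category] or all(
            can_convert(other, candidate) for other in known
        ):
            return candidate
    raise TypeError(f"types {known} cannot be matched")
    # Step 4 (converting every input to the chosen type) is left to the caller.


print(resolve_common_type(["text", "unknown"]))     # text 'a' UNION 'b'
print(resolve_common_type(["numeric", "integer"]))  # 1.2 UNION 1
print(resolve_common_type(["integer", "real"]))     # 1 UNION CAST('2.2' AS REAL)
```

Under these assumed tables the three calls correspond to the three union examples in this section: the first resolves to `text`, the second to `numeric`, and the third to `real` because `integer` can be converted to `real` but not the reverse.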
<bridgehead renderas="sect2">Examples</bridgehead>
<para>
Some examples follow.
</para>
<example>
<title>Underspecified Types in a Union</title>
<title>Type Resolution with Underspecified Types in a Union</title>
<para>
<screen>
tgl=> SELECT text 'a' AS "Text" UNION SELECT 'b';
Text
SELECT text 'a' AS "text" UNION SELECT 'b';
text
------
a
b
(2 rows)
</screen>
Here, the unknown-type literal <literal>'b'</literal> will be resolved as type text.
Here, the unknown-type literal <literal>'b'</literal> will be resolved as type <type>text</type>.
</para>
</example>
<example>
<title>Type Conversion in a Simple Union</title>
<title>Type Resolution in a Simple Union</title>
<para>
<screen>
tgl=> SELECT 1.2 AS "Numeric" UNION SELECT 1;
Numeric
SELECT 1.2 AS "numeric" UNION SELECT 1;
numeric
---------
1
1.2
(2 rows)
</screen>
The literal <literal>1.2</> is of type <type>numeric</>,
and the integer value <literal>1</> can be cast implicitly to
and the <type>integer</type> value <literal>1</> can be cast implicitly to
<type>numeric</>, so that type is used.
</para>
</example>
<example>
<title>Type Conversion in a Transposed Union</title>
<title>Type Resolution in a Transposed Union</title>
<para>
<screen>
tgl=> SELECT 1 AS "Real"
tgl-> UNION SELECT CAST('2.2' AS REAL);
Real
SELECT 1 AS "real" UNION SELECT CAST('2.2' AS REAL);
real
------
1
2.2
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/user-manag.sgml,v 1.18 2002/11/11 20:14:04 petere Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/user-manag.sgml,v 1.19 2003/03/13 01:30:29 petere Exp $
-->
<chapter id="user-manag">
......@@ -31,20 +31,20 @@ $Header: /cvsroot/pgsql/doc/src/sgml/user-manag.sgml,v 1.18 2002/11/11 20:14:04
per individual database). To create a user use the <command>CREATE
USER</command> SQL command:
<synopsis>
CREATE USER <replaceable>name</replaceable>
CREATE USER <replaceable>name</replaceable>;
</synopsis>
<replaceable>name</replaceable> follows the rules for SQL
identifiers: either unadorned without special characters, or
double-quoted. To remove an existing user, use the analogous
<command>DROP USER</command> command:
<synopsis>
DROP USER <replaceable>name</replaceable>
DROP USER <replaceable>name</replaceable>;
</synopsis>
</para>
<para>
For convenience, the programs <application>createuser</application>
and <application>dropuser</application> are provided as wrappers
For convenience, the programs <command>createuser</command>
and <command>dropuser</command> are provided as wrappers
around these SQL commands that can be called from the shell command
line:
<synopsis>
......@@ -57,11 +57,11 @@ dropuser <replaceable>name</replaceable>
In order to bootstrap the database system, a freshly initialized
system always contains one predefined user. This user will have the
fixed ID 1, and by default (unless altered when running
<application>initdb</application>) it will have the same name as
the operating system user that initialized the database
<command>initdb</command>) it will have the same name as the
operating system user that initialized the database
cluster. Customarily, this user will be named
<systemitem>postgres</systemitem>. In order to create more users
you first have to connect as this initial user.
<literal>postgres</literal>. In order to create more users you
first have to connect as this initial user.
</para>
<para>
......@@ -69,11 +69,11 @@ dropuser <replaceable>name</replaceable>
database server. The user name to use for a particular database
connection is indicated by the client that is initiating the
connection request in an application-specific fashion. For example,
the <application>psql</application> program uses the
the <command>psql</command> program uses the
<option>-U</option> command line option to indicate the user to
connect as. Many applications assume the name of the current
operating system user by default (including
<application>createuser</> and <application>psql</>). Therefore it
<command>createuser</> and <command>psql</>). Therefore it
is convenient to maintain a naming correspondence between the two
user sets.
</para>
......@@ -134,7 +134,7 @@ dropuser <replaceable>name</replaceable>
make use of passwords. Database passwords are separate from
operating system passwords. Specify a password upon user
creation with <literal>CREATE USER
<replaceable>name</replaceable> PASSWORD 'string'</literal>.
<replaceable>name</replaceable> PASSWORD '<replaceable>string</>'</literal>.
</para>
</listitem>
</varlistentry>
......@@ -172,12 +172,12 @@ ALTER USER myname SET enable_indexscan TO off;
management of privileges: privileges can be granted to, or revoked
from, a group as a whole. To create a group, use
<synopsis>
CREATE GROUP <replaceable>name</replaceable>
CREATE GROUP <replaceable>name</replaceable>;
</synopsis>
To add users to or remove users from a group, use
<synopsis>
ALTER GROUP <replaceable>name</replaceable> ADD USER <replaceable>uname1</replaceable>, ...
ALTER GROUP <replaceable>name</replaceable> DROP USER <replaceable>uname1</replaceable>, ...
ALTER GROUP <replaceable>name</replaceable> ADD USER <replaceable>uname1</replaceable>, ... ;
ALTER GROUP <replaceable>name</replaceable> DROP USER <replaceable>uname1</replaceable>, ... ;
</synopsis>
</para>
</sect1>
......@@ -247,7 +247,7 @@ REVOKE ALL ON accounts FROM PUBLIC;
<para>
Functions and triggers allow users to insert code into the backend
server that other users may execute without knowing it. Hence, both
mechanisms permit users to <firstterm>Trojan horse</firstterm>
mechanisms permit users to <quote>Trojan horse</quote>
others with relative impunity. The only real protection is tight
control over who can define functions.
</para>
......