Commit ba36c48e authored by Bruce Momjian

Proofreading adjustments for first two parts of documentation (Tutorial and SQL).
parent 23a9ac61
<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.57 2009/02/04 21:30:41 alvherre Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.58 2009/04/27 16:27:35 momjian Exp $ -->
<chapter id="tutorial-advanced">
<title>Advanced Features</title>
......@@ -19,10 +19,10 @@
<para>
This chapter will on occasion refer to examples found in <xref
linkend="tutorial-sql"> to change or improve them, so it will be
of advantage if you have read that chapter. Some examples from
good if you have read that chapter. Some examples from
this chapter can also be found in
<filename>advanced.sql</filename> in the tutorial directory. This
file also contains some example data to load, which is not
file also contains some sample data to load, which is not
repeated here. (Refer to <xref linkend="tutorial-sql-intro"> for
how to use the file.)
</para>
......@@ -173,7 +173,7 @@ UPDATE branches SET balance = balance + 100.00
</para>
<para>
The details of these commands are not important here; the important
The details of these commands are not important; the important
point is that there are several separate updates involved to accomplish
this rather simple operation. Our bank's officers will want to be
assured that either all these updates happen, or none of them happen.
......@@ -307,7 +307,7 @@ COMMIT;
<para>
This example is, of course, oversimplified, but there's a lot of control
to be had over a transaction block through the use of savepoints.
possible in a transaction block through the use of savepoints.
Moreover, <command>ROLLBACK TO</> is the only way to regain control of a
transaction block that was put in aborted state by the
system due to an error, short of rolling it back completely and starting
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.31 2007/12/12 06:23:27 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/arch-dev.sgml,v 2.32 2009/04/27 16:27:35 momjian Exp $ -->
<chapter id="overview">
<title>Overview of PostgreSQL Internals</title>
......@@ -67,7 +67,7 @@
One application of the rewrite system is in the realization of
<firstterm>views</firstterm>.
Whenever a query against a view
(i.e. a <firstterm>virtual table</firstterm>) is made,
(i.e., a <firstterm>virtual table</firstterm>) is made,
the rewrite system rewrites the user's query to
a query that accesses the <firstterm>base tables</firstterm> given in
the <firstterm>view definition</firstterm> instead.
......@@ -145,7 +145,7 @@
<para>
Once a connection is established the client process can send a query
to the <firstterm>backend</firstterm> (server). The query is transmitted using plain text,
i.e. there is no parsing done in the <firstterm>frontend</firstterm> (client). The
i.e., there is no parsing done in the <firstterm>frontend</firstterm> (client). The
server parses the query, creates an <firstterm>execution plan</firstterm>,
executes the plan and returns the retrieved rows to the client
by transmitting them over the established connection.
......@@ -442,7 +442,7 @@
relations, a near-exhaustive search is conducted to find the best
join sequence. The planner preferentially considers joins between any
two relations for which there exists a corresponding join clause in the
<literal>WHERE</literal> qualification (i.e. for
<literal>WHERE</literal> qualification (i.e., for
which a restriction like <literal>where rel1.attr1=rel2.attr2</literal>
exists). Join pairs with no join clause are considered only when there
is no other choice, that is, a particular relation has no available
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/array.sgml,v 1.68 2008/11/12 13:09:27 petere Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/array.sgml,v 1.69 2009/04/27 16:27:35 momjian Exp $ -->
<sect1 id="arrays">
<title>Arrays</title>
......@@ -54,23 +54,24 @@ CREATE TABLE tictactoe (
);
</programlisting>
However, the current implementation does not enforce the array size
limits &mdash; the behavior is the same as for arrays of unspecified
However, the current implementation ignores any supplied array size
limits, i.e., the behavior is the same as for arrays of unspecified
length.
</para>
<para>
Actually, the current implementation does not enforce the declared
In addition, the current implementation does not enforce the declared
number of dimensions either. Arrays of a particular element type are
all considered to be of the same type, regardless of size or number
of dimensions. So, declaring number of dimensions or sizes in
of dimensions. So, declaring the number of dimensions or sizes in
<command>CREATE TABLE</command> is simply documentation, it does not
affect run-time behavior.
</para>
<para>
An alternative syntax, which conforms to the SQL standard, can
be used for one-dimensional arrays.
An alternative syntax, which conforms to the SQL standard by using
the keyword <literal>ARRAY</>, can
be used for one-dimensional arrays;
<structfield>pay_by_quarter</structfield> could have been defined
as:
<programlisting>
......@@ -107,9 +108,9 @@ CREATE TABLE tictactoe (
where <replaceable>delim</replaceable> is the delimiter character
for the type, as recorded in its <literal>pg_type</literal> entry.
Among the standard data types provided in the
<productname>PostgreSQL</productname> distribution, type
<literal>box</> uses a semicolon (<literal>;</>) but all the others
use comma (<literal>,</>). Each <replaceable>val</replaceable> is
<productname>PostgreSQL</productname> distribution, all use a comma
(<literal>,</>), except for the type <literal>box</> which uses a semicolon
(<literal>;</>). Each <replaceable>val</replaceable> is
either a constant of the array element type, or a subarray. An example
of an array constant is:
<programlisting>
......@@ -120,7 +121,7 @@ CREATE TABLE tictactoe (
</para>
<para>
To set an element of an array constant to NULL, write <literal>NULL</>
To set an element of an array to NULL, write <literal>NULL</>
for the element value. (Any upper- or lower-case variant of
<literal>NULL</> will do.) If you want an actual string value
<quote>NULL</>, you must put double quotes around it.
......@@ -163,6 +164,19 @@ SELECT * FROM sal_emp;
</programlisting>
</para>
<para>
Multidimensional arrays must have matching extents for each
dimension. A mismatch causes an error, for example:
<programlisting>
INSERT INTO sal_emp
VALUES ('Bill',
'{10000, 10000, 10000, 10000}',
'{{"meeting", "lunch"}, {"meeting"}}');
ERROR: multidimensional arrays must have array expressions with matching dimensions
</programlisting>
</para>
<para>
The <literal>ARRAY</> constructor syntax can also be used:
<programlisting>
......@@ -182,19 +196,6 @@ INSERT INTO sal_emp
constructor syntax is discussed in more detail in
<xref linkend="sql-syntax-array-constructors">.
</para>
<para>
Multidimensional arrays must have matching extents for each
dimension. A mismatch causes an error report, for example:
<programlisting>
INSERT INTO sal_emp
VALUES ('Bill',
'{10000, 10000, 10000, 10000}',
'{{"meeting", "lunch"}, {"meeting"}}');
ERROR: multidimensional arrays must have array expressions with matching dimensions
</programlisting>
</para>
</sect2>
<sect2 id="arrays-accessing">
......@@ -207,7 +208,7 @@ ERROR: multidimensional arrays must have array expressions with matching dimens
<para>
Now, we can run some queries on the table.
First, we show how to access a single element of an array at a time.
First, we show how to access a single element of an array.
This query retrieves the names of the employees whose pay changed in
the second quarter:
......@@ -221,7 +222,7 @@ SELECT name FROM sal_emp WHERE pay_by_quarter[1] &lt;&gt; pay_by_quarter[2];
</programlisting>
The array subscript numbers are written within square brackets.
By default <productname>PostgreSQL</productname> uses the
By default <productname>PostgreSQL</productname> uses a
one-based numbering convention for arrays, that is,
an array of <replaceable>n</> elements starts with <literal>array[1]</literal> and
ends with <literal>array[<replaceable>n</>]</literal>.
......@@ -257,7 +258,7 @@ SELECT schedule[1:2][1:1] FROM sal_emp WHERE name = 'Bill';
(1 row)
</programlisting>
If any dimension is written as a slice, i.e. contains a colon, then all
If any dimension is written as a slice, i.e., contains a colon, then all
dimensions are treated as slices. Any dimension that has only a single
number (no colon) is treated as being from <literal>1</>
to the number specified. For example, <literal>[2]</> is treated as
......@@ -288,13 +289,14 @@ SELECT schedule[1:2][2] FROM sal_emp WHERE name = 'Bill';
<para>
An array slice expression likewise yields null if the array itself or
any of the subscript expressions are null. However, in other corner
any of the subscript expressions are null. However, in other
cases such as selecting an array slice that
is completely outside the current array bounds, a slice expression
yields an empty (zero-dimensional) array instead of null. (This
does not match non-slice behavior and is done for historical reasons.)
If the requested slice partially overlaps the array bounds, then it
is silently reduced to just the overlapping region.
is silently reduced to just the overlapping region instead of
returning null.
</para>
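The slice rules in this paragraph can be modeled outside the database. The following is a minimal Python sketch, not PostgreSQL code: the function name `slice_one_dim` and the `first_index` parameter are hypothetical, only one-dimensional arrays are handled, and `None` stands in for SQL null.

```python
def slice_one_dim(arr, lower, upper, first_index=1):
    """Model a one-dimensional slice arr[lower:upper] with inclusive bounds."""
    # A null array or a null subscript yields null.
    if arr is None or lower is None or upper is None:
        return None
    last_index = first_index + len(arr) - 1
    # Reduce the requested range to the region that overlaps the array.
    lo = max(lower, first_index)
    hi = min(upper, last_index)
    if lo > hi:
        # Completely outside the bounds: an empty array, not null.
        return []
    return list(arr[lo - first_index : hi - first_index + 1])


pay = [10, 20, 30, 40]
partial = slice_one_dim(pay, 3, 6)   # overlaps positions 3 and 4 only
outside = slice_one_dim(pay, 7, 9)   # no overlap at all
```

Under these assumptions, `partial` holds only the two overlapping elements and `outside` is an empty list rather than `None`.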
<para>
......@@ -311,7 +313,7 @@ SELECT array_dims(schedule) FROM sal_emp WHERE name = 'Carol';
</programlisting>
<function>array_dims</function> produces a <type>text</type> result,
which is convenient for people to read but perhaps not so convenient
which is convenient for people to read but perhaps inconvenient
for programs. Dimensions can also be retrieved with
<function>array_upper</function> and <function>array_lower</function>,
which return the upper and lower bound of a
......@@ -380,12 +382,12 @@ UPDATE sal_emp SET pay_by_quarter[1:2] = '{27000,27000}'
</para>
<para>
A stored array value can be enlarged by assigning to element(s) not already
A stored array value can be enlarged by assigning to elements not already
present. Any positions between those previously present and the newly
assigned element(s) will be filled with nulls. For example, if array
assigned elements will be filled with nulls. For example, if array
<literal>myarray</> currently has 4 elements, it will have six
elements after an update that assigns to <literal>myarray[6]</>,
and <literal>myarray[5]</> will contain a null.
elements after an update that assigns to <literal>myarray[6]</>;
<literal>myarray[5]</> will contain null.
Currently, enlargement in this fashion is only allowed for one-dimensional
arrays, not multidimensional arrays.
</para>
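As an illustration of the enlargement rule, here is a small Python model, assuming a one-dimensional array with one-based subscripts. The helper `assign_element` is hypothetical, `None` stands in for SQL null, and the sketch only models growth past the upper bound.

```python
def assign_element(arr, index, value, first_index=1):
    """Return a copy of arr with arr[index] = value, padding gaps with None."""
    result = list(arr)
    position = index - first_index
    if position < 0:
        # Assignments below the lower bound are outside the scope of this sketch.
        raise ValueError("this sketch only models growth past the upper bound")
    if position >= len(result):
        # Fill every position between the old end and the new element with nulls.
        result.extend([None] * (position - len(result) + 1))
    result[position] = value
    return result


myarray = [1, 2, 3, 4]
enlarged = assign_element(myarray, 6, 60)
```

Here `enlarged` has six elements, and the fifth one is `None`, mirroring the null stored in `myarray[5]`.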
......@@ -393,11 +395,11 @@ UPDATE sal_emp SET pay_by_quarter[1:2] = '{27000,27000}'
<para>
Subscripted assignment allows creation of arrays that do not use one-based
subscripts. For example one might assign to <literal>myarray[-2:7]</> to
create an array with subscript values running from -2 to 7.
create an array with subscript values from -2 to 7.
</para>
<para>
New array values can also be constructed by using the concatenation operator,
New array values can also be constructed using the concatenation operator,
<literal>||</literal>:
<programlisting>
SELECT ARRAY[1,2] || ARRAY[3,4];
......@@ -415,14 +417,14 @@ SELECT ARRAY[5,6] || ARRAY[[1,2],[3,4]];
</para>
<para>
The concatenation operator allows a single element to be pushed on to the
The concatenation operator allows a single element to be pushed to the
beginning or end of a one-dimensional array. It also accepts two
<replaceable>N</>-dimensional arrays, or an <replaceable>N</>-dimensional
and an <replaceable>N+1</>-dimensional array.
</para>
<para>
When a single element is pushed on to either the beginning or end of a
When a single element is pushed to either the beginning or end of a
one-dimensional array, the result is an array with the same lower bound
subscript as the array operand. For example:
<programlisting>
......@@ -461,7 +463,7 @@ SELECT array_dims(ARRAY[[1,2],[3,4]] || ARRAY[[5,6],[7,8],[9,0]]);
</para>
<para>
When an <replaceable>N</>-dimensional array is pushed on to the beginning
When an <replaceable>N</>-dimensional array is pushed to the beginning
or end of an <replaceable>N+1</>-dimensional array, the result is
analogous to the element-array case above. Each <replaceable>N</>-dimensional
sub-array is essentially an element of the <replaceable>N+1</>-dimensional
......@@ -482,7 +484,7 @@ SELECT array_dims(ARRAY[1,2] || ARRAY[[3,4],[5,6]]);
arrays, but <function>array_cat</function> supports multidimensional arrays.
Note that the concatenation operator discussed above is preferred over
direct use of these functions. In fact, the functions exist primarily for use
direct use of these functions. In fact, these functions primarily exist for use
in implementing the concatenation operator. However, they might be directly
useful in the creation of user-defined aggregates. Some examples:
......@@ -528,8 +530,8 @@ SELECT array_cat(ARRAY[5,6], ARRAY[[1,2],[3,4]]);
</indexterm>
<para>
To search for a value in an array, you must check each value of the
array. This can be done by hand, if you know the size of the array.
To search for a value in an array, each value must be checked.
This can be done manually, if you know the size of the array.
For example:
<programlisting>
......@@ -540,7 +542,7 @@ SELECT * FROM sal_emp WHERE pay_by_quarter[1] = 10000 OR
</programlisting>
However, this quickly becomes tedious for large arrays, and is not
helpful if the size of the array is uncertain. An alternative method is
helpful if the size of the array is unknown. An alternative method is
described in <xref linkend="functions-comparisons">. The above
query could be replaced by:
......@@ -548,7 +550,7 @@ SELECT * FROM sal_emp WHERE pay_by_quarter[1] = 10000 OR
SELECT * FROM sal_emp WHERE 10000 = ANY (pay_by_quarter);
</programlisting>
In addition, you could find rows where the array had all values
In addition, you can find rows where the array has all values
equal to 10000 with:
<programlisting>
......@@ -578,7 +580,7 @@ SELECT * FROM
can be a sign of database misdesign. Consider
using a separate table with a row for each item that would be an
array element. This will be easier to search, and is likely to
scale up better to large numbers of elements.
scale better for a large number of elements.
</para>
</tip>
</sect2>
......@@ -600,9 +602,9 @@ SELECT * FROM
The delimiter character is usually a comma (<literal>,</>) but can be
something else: it is determined by the <literal>typdelim</> setting
for the array's element type. (Among the standard data types provided
in the <productname>PostgreSQL</productname> distribution, type
<literal>box</> uses a semicolon (<literal>;</>) but all the others
use comma.) In a multidimensional array, each dimension (row, plane,
in the <productname>PostgreSQL</productname> distribution, all
use a comma, except for <literal>box</>, which uses a semicolon (<literal>;</>).)
In a multidimensional array, each dimension (row, plane,
cube, etc.) gets its own level of curly braces, and delimiters
must be written between adjacent curly-braced entities of the same level.
</para>
......@@ -614,7 +616,7 @@ SELECT * FROM
<literal>NULL</>. Double quotes and backslashes
embedded in element values will be backslash-escaped. For numeric
data types it is safe to assume that double quotes will never appear, but
for textual data types one should be prepared to cope with either presence
for textual data types one should be prepared to cope with either the presence
or absence of quotes.
</para>
......@@ -647,27 +649,27 @@ SELECT f1[1][-2][3] AS e1, f1[1][-1][5] AS e2
or backslashes disables this and allows the literal string value
<quote>NULL</> to be entered. Also, for backwards compatibility with
pre-8.2 versions of <productname>PostgreSQL</>, the <xref
linkend="guc-array-nulls"> configuration parameter might be turned
linkend="guc-array-nulls"> configuration parameter can be turned
<literal>off</> to suppress recognition of <literal>NULL</> as a NULL.
</para>
<para>
As shown previously, when writing an array value you can write double
As shown previously, when writing an array value you can use double
quotes around any individual array element. You <emphasis>must</> do so
if the element value would otherwise confuse the array-value parser.
For example, elements containing curly braces, commas (or whatever the
delimiter character is), double quotes, backslashes, or leading or trailing
For example, elements containing curly braces, commas (or the matching
delimiter character), double quotes, backslashes, or leading or trailing
whitespace must be double-quoted. Empty strings and strings matching the
word <literal>NULL</> must be quoted, too. To put a double quote or
backslash in a quoted array element value, use escape string syntax
and precede it with a backslash. Alternatively, you can use
and precede it with a backslash. Alternatively, you can avoid quotes and use
backslash-escaping to protect all data characters that would otherwise
be taken as array syntax.
</para>
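A client program that builds array input text might apply the quoting rules above along the following lines. This is a hedged Python sketch: `quote_array_element` and `array_literal` are hypothetical helper names, not part of any PostgreSQL client library, and the result is only the array text, which would still have to be passed as a query parameter or written as an SQL string literal.

```python
def quote_array_element(value, delim=","):
    """Quote one text element for array input if the rules above require it."""
    needs_quotes = (
        value == ""                      # empty strings must be quoted
        or value.upper() == "NULL"       # otherwise read as a null element
        or value != value.strip()        # leading or trailing whitespace
        or any(ch in value for ch in ("{", "}", '"', "\\", delim))
    )
    if not needs_quotes:
        return value
    # Backslash-escape backslashes first, then double quotes.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def array_literal(values, delim=","):
    """Join already-stringified elements into one-dimensional array text."""
    return "{" + delim.join(quote_array_element(v, delim) for v in values) + "}"


schedule = array_literal(["meeting", "", "null", "a,b"])
```

Quoting only when needed keeps the output readable; always quoting every element would also be accepted on input.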
<para>
You can write whitespace before a left brace or after a right
brace. You can also write whitespace before or after any individual item
You can use whitespace before a left brace or after a right
brace. You can also add whitespace before or after any individual item
string. In all of these cases the whitespace will be ignored. However,
whitespace within double-quoted elements, or surrounded on both sides by
non-whitespace characters of an element, is not ignored.
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.124 2009/04/07 00:31:25 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.125 2009/04/27 16:27:35 momjian Exp $ -->
<chapter id="backup">
<title>Backup and Restore</title>
......@@ -1523,7 +1523,7 @@ archive_command = 'local_backup_script.sh'
</para>
<para>
It should be noted that the log shipping is asynchronous, i.e. the WAL
It should be noted that the log shipping is asynchronous, i.e., the WAL
records are shipped after transaction commit. As a result there is a
window for data loss should the primary server suffer a catastrophic
failure: transactions not yet shipped will be lost. The length of the
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.215 2009/04/23 00:23:45 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.216 2009/04/27 16:27:35 momjian Exp $ -->
<chapter Id="runtime-config">
<title>Server Configuration</title>
......@@ -1253,7 +1253,7 @@ SET ENABLE_SEQSCAN TO OFF;
function, which some operating systems lack. If the function is not
present then setting this parameter to anything but zero will result
in an error. On some operating systems the function is present but
does not actually do anything (e.g. Solaris).
does not actually do anything (e.g., Solaris).
</para>
</listitem>
</varlistentry>
......@@ -4333,7 +4333,7 @@ SET XML OPTION { DOCUMENT | CONTENT };
If a dynamically loadable module needs to be opened and the
file name specified in the <command>CREATE FUNCTION</command> or
<command>LOAD</command> command
does not have a directory component (i.e. the
does not have a directory component (i.e., the
name does not contain a slash), the system will search this
path for the required file.
</para>
......@@ -4503,7 +4503,7 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
The shared lock table is created to track locks on
<varname>max_locks_per_transaction</varname> * (<xref
linkend="guc-max-connections"> + <xref
linkend="guc-max-prepared-transactions">) objects (e.g. tables);
linkend="guc-max-prepared-transactions">) objects (e.g., tables);
hence, no more than this many distinct objects can be locked at
any one time. This parameter controls the average number of object
locks allocated for each transaction; individual transactions
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib.sgml,v 1.12 2009/03/25 23:20:01 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/contrib.sgml,v 1.13 2009/04/27 16:27:35 momjian Exp $ -->
<appendix id="contrib">
<title>Additional Supplied Modules</title>
......@@ -16,7 +16,7 @@
<para>
When building from the source distribution, these modules are not built
automatically. You can build and install all of them by running
automatically. You can build and install all of them by running:
<screen>
<userinput>gmake</userinput>
<userinput>gmake install</userinput>
......@@ -25,7 +25,7 @@
or to build and install
just one selected module, do the same in that module's subdirectory.
Many of the modules have regression tests, which can be executed by
running
running:
<screen>
<userinput>gmake installcheck</userinput>
</screen>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/datatype.sgml,v 1.236 2009/03/09 14:34:34 petere Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/datatype.sgml,v 1.237 2009/04/27 16:27:35 momjian Exp $ -->
<chapter id="datatype">
<title id="datatype-title">Data Types</title>
......@@ -25,7 +25,7 @@
<quote>Aliases</quote> column are the names used internally by
<productname>PostgreSQL</productname> for historical reasons. In
addition, some internally used or deprecated types are available,
but they are not listed here.
but are not listed here.
</para>
<table id="datatype-table">
......@@ -73,7 +73,7 @@
<row>
<entry><type>box</type></entry>
<entry></entry>
<entry>rectangular box in the plane</entry>
<entry>rectangular box on a plane</entry>
</row>
<row>
......@@ -103,7 +103,7 @@
<row>
<entry><type>circle</type></entry>
<entry></entry>
<entry>circle in the plane</entry>
<entry>circle on a plane</entry>
</row>
<row>
......@@ -115,7 +115,7 @@
<row>
<entry><type>double precision</type></entry>
<entry><type>float8</type></entry>
<entry>double precision floating-point number</entry>
<entry>double precision floating-point number (8 bytes)</entry>
</row>
<row>
......@@ -139,19 +139,19 @@
<row>
<entry><type>line</type></entry>
<entry></entry>
<entry>infinite line in the plane</entry>
<entry>infinite line on a plane</entry>
</row>
<row>
<entry><type>lseg</type></entry>
<entry></entry>
<entry>line segment in the plane</entry>
<entry>line segment on a plane</entry>
</row>
<row>
<entry><type>macaddr</type></entry>
<entry></entry>
<entry>MAC address</entry>
<entry>MAC (Media Access Control) address</entry>
</row>
<row>
......@@ -171,25 +171,25 @@
<row>
<entry><type>path</type></entry>
<entry></entry>
<entry>geometric path in the plane</entry>
<entry>geometric path on a plane</entry>
</row>
<row>
<entry><type>point</type></entry>
<entry></entry>
<entry>geometric point in the plane</entry>
<entry>geometric point on a plane</entry>
</row>
<row>
<entry><type>polygon</type></entry>
<entry></entry>
<entry>closed geometric path in the plane</entry>
<entry>closed geometric path on a plane</entry>
</row>
<row>
<entry><type>real</type></entry>
<entry><type>float4</type></entry>
<entry>single precision floating-point number</entry>
<entry>single precision floating-point number (4 bytes)</entry>
</row>
<row>
......@@ -213,7 +213,7 @@
<row>
<entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
<entry></entry>
<entry>time of day</entry>
<entry>time of day (no time zone)</entry>
</row>
<row>
......@@ -225,7 +225,7 @@
<row>
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
<entry></entry>
<entry>date and time</entry>
<entry>date and time (no time zone)</entry>
</row>
<row>
......@@ -288,9 +288,9 @@
and output functions. Many of the built-in types have
obvious external formats. However, several types are either unique
to <productname>PostgreSQL</productname>, such as geometric
paths, or have several possibilities for formats, such as the date
paths, or have several possible formats, such as the date
and time types.
Some of the input and output functions are not invertible. That is,
Some of the input and output functions are not invertible, i.e.,
the result of an output function might lose accuracy when compared to
the original input.
</para>
......@@ -332,7 +332,7 @@
<row>
<entry><type>integer</></entry>
<entry>4 bytes</entry>
<entry>usual choice for integer</entry>
<entry>typical choice for integer</entry>
<entry>-2147483648 to +2147483647</entry>
</row>
<row>
......@@ -431,21 +431,21 @@
</para>
<para>
The type <type>integer</type> is the usual choice, as it offers
The type <type>integer</type> is the common choice, as it offers
the best balance between range, storage size, and performance.
The <type>smallint</type> type is generally only used if disk
space is at a premium. The <type>bigint</type> type should only
be used if the <type>integer</type> range is not sufficient,
be used if the <type>integer</type> range is insufficient,
because the latter is definitely faster.
</para>
<para>
The <type>bigint</type> type might not function correctly on all
platforms, since it relies on compiler support for eight-byte
integers. On a machine without such support, <type>bigint</type>
On very minimal operating systems the <type>bigint</type> type
might not function correctly because it relies on compiler support
for eight-byte integers. On such machines, <type>bigint</type>
acts the same as <type>integer</type> (but still takes up eight
bytes of storage). However, we are not aware of any reasonable
platform where this is actually the case.
bytes of storage). (We are not aware of any
platform where this is true.)
</para>
<para>
......@@ -453,7 +453,7 @@
<type>integer</type> (or <type>int</type>),
<type>smallint</type>, and <type>bigint</type>. The
type names <type>int2</type>, <type>int4</type>, and
<type>int8</type> are extensions, which are shared with various
<type>int8</type> are extensions, which are also used by
other <acronym>SQL</acronym> database systems.
</para>
......@@ -481,11 +481,11 @@
especially recommended for storing monetary amounts and other
quantities where exactness is required. However, arithmetic on
<type>numeric</type> values is very slow compared to the integer
types, or to the floating-point types described in the next section.
and floating-point types described in the next section.
</para>
<para>
In what follows we use these terms: The
We use the following terms below: The
<firstterm>scale</firstterm> of a <type>numeric</type> is the
count of decimal digits in the fractional part, to the right of
the decimal point. The <firstterm>precision</firstterm> of a
......@@ -558,7 +558,7 @@ NUMERIC
type allows the special value <literal>NaN</>, meaning
<quote>not-a-number</quote>. Any operation on <literal>NaN</>
yields another <literal>NaN</>. When writing this value
as a constant in a SQL command, you must put quotes around it,
as a constant in an SQL command, you must put quotes around it,
for example <literal>UPDATE table SET x = 'NaN'</>. On input,
the string <literal>NaN</> is recognized in a case-insensitive manner.
</para>
......@@ -621,10 +621,10 @@ NUMERIC
<para>
Inexact means that some values cannot be converted exactly to the
internal format and are stored as approximations, so that storing
and printing back out a value might show slight discrepancies.
and retrieving a value might show slight discrepancies.
Managing these errors and how they propagate through calculations
is the subject of an entire branch of mathematics and computer
science and will not be discussed further here, except for the
science and will not be discussed here, except for the
following points:
<itemizedlist>
<listitem>
......@@ -645,8 +645,8 @@ NUMERIC
<listitem>
<para>
Comparing two floating-point values for equality might or might
not work as expected.
Comparing two floating-point values for equality might not
always work as expected.
</para>
</listitem>
</itemizedlist>
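The last point is not specific to PostgreSQL. As a sketch, the same effect can be seen in Python, whose `float` is an IEEE 754 double on typical platforms and so behaves like `double precision`; the tolerance used below is an arbitrary choice for illustration.

```python
import math

total = 0.1 + 0.2

# Exact comparison fails because neither 0.1 nor 0.2 is stored exactly.
exact_match = (total == 0.3)

# Comparing within a tolerance is the usual workaround.
close_match = math.isclose(total, 0.3, rel_tol=1e-9)

print(exact_match)   # False
print(close_match)   # True
```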
......@@ -702,7 +702,7 @@ NUMERIC
notations <type>float</type> and
<type>float(<replaceable>p</replaceable>)</type> for specifying
inexact numeric types. Here, <replaceable>p</replaceable> specifies
the minimum acceptable precision in binary digits.
the minimum acceptable precision in <emphasis>binary</> digits.
<productname>PostgreSQL</productname> accepts
<type>float(1)</type> to <type>float(24)</type> as selecting the
<type>real</type> type, while
......@@ -717,7 +717,7 @@ NUMERIC
<para>
Prior to <productname>PostgreSQL</productname> 7.4, the precision in
<type>float(<replaceable>p</replaceable>)</type> was taken to mean
so many decimal digits. This has been corrected to match the SQL
so many <emphasis>decimal</> digits. This has been corrected to match the SQL
standard, which specifies that the precision is measured in binary
digits. The assumption that <type>real</type> and
<type>double precision</type> have exactly 24 and 53 bits in the
......@@ -762,7 +762,7 @@ NUMERIC
<para>
The data types <type>serial</type> and <type>bigserial</type>
are not true types, but merely
a notational convenience for setting up unique identifier columns
a notational convenience for creating unique identifier columns
(similar to the <literal>AUTO_INCREMENT</literal> property
supported by some other databases). In the current
implementation, specifying:
......@@ -786,7 +786,7 @@ ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceab
Thus, we have created an integer column and arranged for its default
values to be assigned from a sequence generator. A <literal>NOT NULL</>
constraint is applied to ensure that a null value cannot be explicitly
inserted, either. (In most cases you would also want to attach a
inserted. (In most cases you would also want to attach a
<literal>UNIQUE</> or <literal>PRIMARY KEY</> constraint to prevent
duplicate values from being inserted by accident, but this is
not automatic.) Lastly, the sequence is marked as <quote>owned by</>
......@@ -797,8 +797,8 @@ ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceab
<para>
Prior to <productname>PostgreSQL</productname> 7.3, <type>serial</type>
implied <literal>UNIQUE</literal>. This is no longer automatic. If
you wish a serial column to be in a unique constraint or a
primary key, it must now be specified, same as with
you wish a serial column to have a unique constraint or be a
primary key, it must now be specified just like
any other data type.
</para>
</note>
......@@ -815,7 +815,7 @@ ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceab
<para>
The type names <type>serial</type> and <type>serial4</type> are
equivalent: both create <type>integer</type> columns. The type
names <type>bigserial</type> and <type>serial8</type> work just
names <type>bigserial</type> and <type>serial8</type> work
the same way, except that they create a <type>bigint</type>
column. <type>bigserial</type> should be used if you anticipate
the use of more than 2<superscript>31</> identifiers over the
......@@ -837,9 +837,10 @@ ALTER SEQUENCE <replaceable class="parameter">tablename</replaceable>_<replaceab
<para>
The <type>money</type> type stores a currency amount with a fixed
fractional precision; see <xref
linkend="datatype-money-table">.
linkend="datatype-money-table">. The fractional precision
is controlled by the database locale.
Input is accepted in a variety of formats, including integer and
floating-point literals, as well as <quote>typical</quote>
floating-point literals, as well as typical
currency formatting, such as <literal>'$1,000.00'</literal>.
Output is generally in the latter form but depends on the locale.
Non-quoted numeric values can be converted to <type>money</type> by
......@@ -859,10 +860,10 @@ SELECT regexp_replace('52093.89'::money::text, '[$,]', '', 'g')::numeric;
</para>
<para>
Since the output of this data type is locale-sensitive, it may not
Since the output of this data type is locale-sensitive, it might not
work to load <type>money</> data into a database that has a different
setting of <varname>lc_monetary</>. To avoid problems, before
restoring a dump make sure <varname>lc_monetary</> has the same or
restoring a dump into a new database make sure <varname>lc_monetary</> has the same or
equivalent value as in the database that was dumped.
</para>
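<para>
As a minimal sketch (not taken from the original text), the following
commands could be run in both the source and the target database
before restoring a dump, to compare the setting and see how a
<type>money</type> value is displayed under it. The displayed form
depends on the locale, so no output is shown:
<programlisting>
-- Show the locale setting that governs money input and output.
SHOW lc_monetary;

-- Display a sample value under the current setting.
SELECT '1000.00'::money;
</programlisting>
</para>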
......@@ -960,7 +961,7 @@ SELECT regexp_replace('52093.89'::money::text, '[$,]', '', 'g')::numeric;
<type>character varying(<replaceable>n</>)</type> and
<type>character(<replaceable>n</>)</type>, where <replaceable>n</>
is a positive integer. Both of these types can store strings up to
<replaceable>n</> characters in length. An attempt to store a
<replaceable>n</> characters in length (not bytes). An attempt to store a
longer string into a column of these types will result in an
error, unless the excess characters are all spaces, in which case
the string will be truncated to the maximum length. (This somewhat
......@@ -1015,16 +1016,16 @@ SELECT regexp_replace('52093.89'::money::text, '[$,]', '', 'g')::numeric;
<para>
The storage requirement for a short string (up to 126 bytes) is 1 byte
plus the actual string, which includes the space padding in the case of
<type>character</type>. Longer strings have 4 bytes overhead instead
<type>character</type>. Longer strings have 4 bytes of overhead instead
of 1. Long strings are compressed by the system automatically, so
the physical requirement on disk might be less. Very long values are also
stored in background tables so that they do not interfere with rapid
access to shorter column values. In any case, the longest
possible character string that can be stored is about 1 GB. (The
maximum value that will be allowed for <replaceable>n</> in the data
type declaration is less than that. It wouldn't be very useful to
type declaration is less than that. It wouldn't be useful to
change this because with multibyte character encodings the number of
characters and bytes can be quite different anyway. If you desire to
characters and bytes can be quite different. If you desire to
store long strings with no specific upper limit, use
<type>text</type> or <type>character varying</type> without a length
specifier, rather than making up an arbitrary length limit.)
......@@ -1032,12 +1033,12 @@ SELECT regexp_replace('52093.89'::money::text, '[$,]', '', 'g')::numeric;
<tip>
<para>
There are no performance differences between these three types,
apart from increased storage size when using the blank-padded
type, and a few extra cycles to check the length when storing into
There is no performance difference between these three types,
apart from increased storage space when using the blank-padded
type, and a few extra CPU cycles to check the length when storing into
a length-constrained column. While
<type>character(<replaceable>n</>)</type> has performance
advantages in some other database systems, it has no such advantages in
advantages in some other database systems, there is no such advantage in
<productname>PostgreSQL</productname>. In most situations
<type>text</type> or <type>character varying</type> should be used
instead.
......@@ -1095,16 +1096,17 @@ SELECT b, char_length(b) FROM test2;
There are two other fixed-length character types in
<productname>PostgreSQL</productname>, shown in <xref
linkend="datatype-character-special-table">. The <type>name</type>
type exists <emphasis>only</emphasis> for storage of identifiers
type exists <emphasis>only</emphasis> for the storage of identifiers
in the internal system catalogs and is not intended for use by the general user. Its
length is currently defined as 64 bytes (63 usable characters plus
terminator) but should be referenced using the constant
<symbol>NAMEDATALEN</symbol>. The length is set at compile time (and
<symbol>NAMEDATALEN</symbol> in <literal>C</> source code.
The length is set at compile time (and
is therefore adjustable for special uses); the default maximum
length might change in a future release. The type <type>"char"</type>
(note the quotes) is different from <type>char(1)</type> in that it
only uses one byte of storage. It is internally used in the system
catalogs as a poor-man's enumeration type.
catalogs as a simplistic enumeration type.
</para>
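<para>
As an illustrative sketch, the fixed storage sizes of these two types
can be inspected in the <structname>pg_type</structname> system
catalog. Note that the type written <type>"char"</type> in SQL is
listed there under the name <literal>char</literal>:
<programlisting>
-- typlen is the fixed storage size in bytes for these types.
SELECT typname, typlen
    FROM pg_type
    WHERE typname IN ('name', 'char')
    ORDER BY typname;
</programlisting>
</para>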
<table id="datatype-character-special-table">
......@@ -1172,8 +1174,8 @@ SELECT b, char_length(b) FROM test2;
<para>
A binary string is a sequence of octets (or bytes). Binary
strings are distinguished from character strings by two
characteristics: First, binary strings specifically allow storing
strings are distinguished from character strings in two
ways: First, binary strings specifically allow storing
octets of value zero and other <quote>non-printable</quote>
octets (usually, octets outside the range 32 to 126).
Character strings disallow zero octets, and also disallow any
......@@ -1191,8 +1193,8 @@ SELECT b, char_length(b) FROM test2;
values <emphasis>must</emphasis> be escaped (but all octet
values <emphasis>can</emphasis> be escaped) when used as part
of a string literal in an <acronym>SQL</acronym> statement. In
general, to escape an octet, it is converted into the three-digit
octal number equivalent of its decimal octet value, and preceded
general, to escape an octet, convert it into its three-digit
octal value and precede it
by two backslashes. <xref linkend="datatype-binary-sqlesc">
shows the characters that must be escaped, and gives the alternative
escape sequences where applicable.
......@@ -1249,16 +1251,16 @@ SELECT b, char_length(b) FROM test2;
</table>
<para>
The requirement to escape <quote>non-printable</quote> octets actually
The requirement to escape <emphasis>non-printable</emphasis> octets
varies depending on locale settings. In some instances you can get away
with leaving them unescaped. Note that the result in each of the examples
in <xref linkend="datatype-binary-sqlesc"> was exactly one octet in
length, even though the output representation of the zero octet and
backslash are more than one character.
length, even though the output representation is sometimes
more than one character.
</para>
<para>
The reason that you have to write so many backslashes, as shown
The reason multiple backslashes are required, as shown
in <xref linkend="datatype-binary-sqlesc">, is that an input
string written as a string literal must pass through two parse
phases in the <productname>PostgreSQL</productname> server.
......@@ -1280,12 +1282,12 @@ SELECT b, char_length(b) FROM test2;
</para>
<para>
<type>Bytea</type> octets are also escaped in the output. In general, each
<type>Bytea</type> octets are sometimes escaped when output. In general, each
<quote>non-printable</quote> octet is converted into
its equivalent three-digit octal value and preceded by one backslash.
Most <quote>printable</quote> octets are represented by their standard
representation in the client character set. The octet with decimal
value 92 (backslash) has a special alternative output representation.
value 92 (backslash) is doubled in the output.
Details are in <xref linkend="datatype-binary-resesc">.
</para>
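<para>
A short sketch of the input and output escaping described above,
assuming escape string literals (<literal>E'...'</literal>) and the
escaped output format discussed in this section:
<programlisting>
-- A zero octet, written with two backslashes in the string literal.
SELECT E'\\000'::bytea;

-- A single backslash octet (decimal value 92).
SELECT E'\\\\'::bytea;

-- Each of the values above is exactly one octet long.
SELECT octet_length(E'\\000'::bytea), octet_length(E'\\\\'::bytea);
</programlisting>
</para>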
......@@ -1406,7 +1408,7 @@ SELECT b, char_length(b) FROM test2;
<row>
<entry><type>timestamp [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
<entry>8 bytes</entry>
<entry>both date and time</entry>
<entry>both date and time (no time zone)</entry>
<entry>4713 BC</entry>
<entry>294276 AD</entry>
<entry>1 microsecond / 14 digits</entry>
......@@ -1422,7 +1424,7 @@ SELECT b, char_length(b) FROM test2;
<row>
<entry><type>date</type></entry>
<entry>4 bytes</entry>
<entry>dates only</entry>
<entry>date (no time of day)</entry>
<entry>4713 BC</entry>
<entry>5874897 AD</entry>
<entry>1 day</entry>
......@@ -1430,7 +1432,7 @@ SELECT b, char_length(b) FROM test2;
<row>
<entry><type>time [ (<replaceable>p</replaceable>) ] [ without time zone ]</type></entry>
<entry>8 bytes</entry>
<entry>times of day only</entry>
<entry>time of day (no date)</entry>
<entry>00:00:00</entry>
<entry>24:00:00</entry>
<entry>1 microsecond / 14 digits</entry>
......@@ -1446,7 +1448,7 @@ SELECT b, char_length(b) FROM test2;
<row>
<entry><type>interval [ <replaceable>fields</replaceable> ] [ (<replaceable>p</replaceable>) ]</type></entry>
<entry>12 bytes</entry>
<entry>time intervals</entry>
<entry>time interval</entry>
<entry>-178000000 years</entry>
<entry>178000000 years</entry>
<entry>1 microsecond / 14 digits</entry>
......@@ -1542,9 +1544,8 @@ SELECT b, char_length(b) FROM test2;
<para>
The types <type>abstime</type>
and <type>reltime</type> are lower precision types which are used internally.
You are discouraged from using these types in new
applications and are encouraged to move any old
ones over when appropriate. Any or all of these internal types
You are discouraged from using these types in
applications; these internal types
might disappear in a future release.
</para>
......@@ -1555,7 +1556,7 @@ SELECT b, char_length(b) FROM test2;
Date and time input is accepted in almost any reasonable format, including
ISO 8601, <acronym>SQL</acronym>-compatible,
traditional <productname>POSTGRES</productname>, and others.
For some formats, ordering of month, day, and year in date input is
For some formats, ordering of day, month, and year in date input is
ambiguous and there is support for specifying the expected
ordering of these fields. Set the <xref linkend="guc-datestyle"> parameter
to <literal>MDY</> to select month-day-year interpretation,
......@@ -1582,8 +1583,7 @@ SELECT b, char_length(b) FROM test2;
<synopsis>
<replaceable>type</replaceable> [ (<replaceable>p</replaceable>) ] '<replaceable>value</replaceable>'
</synopsis>
where <replaceable>p</replaceable> in the optional precision
specification is an integer corresponding to the number of
where <replaceable>p</replaceable> is an optional precision corresponding to the number of
fractional digits in the seconds field. Precision can be
specified for <type>time</type>, <type>timestamp</type>, and
<type>interval</type> types. The allowed values are mentioned
......@@ -1613,15 +1613,15 @@ SELECT b, char_length(b) FROM test2;
</row>
</thead>
<tbody>
<row>
<entry>January 8, 1999</entry>
<entry>unambiguous in any <varname>datestyle</varname> input mode</entry>
</row>
<row>
<entry>1999-01-08</entry>
<entry>ISO 8601; January 8 in any mode
(recommended format)</entry>
</row>
<row>
<entry>January 8, 1999</entry>
<entry>unambiguous in any <varname>datestyle</varname> input mode</entry>
</row>
<row>
<entry>1/8/1999</entry>
<entry>January 8 in <literal>MDY</> mode;
......@@ -1681,7 +1681,7 @@ SELECT b, char_length(b) FROM test2;
</row>
<row>
<entry>January 8, 99 BC</entry>
<entry>year 99 before the Common Era</entry>
<entry>year 99 BC</entry>
</row>
</tbody>
</tgroup>
......@@ -1705,7 +1705,7 @@ SELECT b, char_length(b) FROM test2;
The time-of-day types are <type>time [
(<replaceable>p</replaceable>) ] without time zone</type> and
<type>time [ (<replaceable>p</replaceable>) ] with time
zone</type>. Writing just <type>time</type> is equivalent to
zone</type>; <type>time</type> is equivalent to
<type>time without time zone</type>.
</para>
......@@ -1752,7 +1752,7 @@ SELECT b, char_length(b) FROM test2;
</row>
<row>
<entry><literal>04:05 AM</literal></entry>
<entry>same as 04:05; AM does not affect value</entry>
<entry>same as 04:05 (AM ignored)</entry>
</row>
<row>
<entry><literal>04:05 PM</literal></entry>
......@@ -1854,7 +1854,7 @@ SELECT b, char_length(b) FROM test2;
</indexterm>
<para>
Valid input for the time stamp types consists of a concatenation
Valid input for the time stamp types consists of the concatenation
of a date and a time, followed by an optional time zone,
followed by an optional <literal>AD</literal> or <literal>BC</literal>.
(Alternatively, <literal>AD</literal>/<literal>BC</literal> can appear
......@@ -1870,7 +1870,7 @@ SELECT b, char_length(b) FROM test2;
</programlisting>
are valid values, which follow the <acronym>ISO</acronym> 8601
standard. In addition, the wide-spread format:
standard. In addition, the common format:
<programlisting>
January 8 04:05:06 1999 PST
</programlisting>
......@@ -1880,18 +1880,25 @@ January 8 04:05:06 1999 PST
<para>
The <acronym>SQL</acronym> standard differentiates <type>timestamp without time zone</type>
and <type>timestamp with time zone</type> literals by the presence of a
<quote>+</quote> or <quote>-</quote>. Hence, according to the standard,
<quote>+</quote> or <quote>-</quote> symbol after the time
indicating the time zone offset. Hence, according to the standard:
<programlisting>TIMESTAMP '2004-10-19 10:23:54'</programlisting>
is a <type>timestamp without time zone</type>, while
is a <type>timestamp without time zone</type>, while:
<programlisting>TIMESTAMP '2004-10-19 10:23:54+02'</programlisting>
is a <type>timestamp with time zone</type>.
<productname>PostgreSQL</productname> never examines the content of a
literal string before determining its type, and therefore will treat
both of the above as <type>timestamp without time zone</type>. To
ensure that a literal is treated as <type>timestamp with time
zone</type>, give it the correct explicit type:
<programlisting>TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'</programlisting>
In a literal that has been decided to be <type>timestamp without time
In a literal that has been determined to be <type>timestamp without time
zone</type>, <productname>PostgreSQL</productname> will silently ignore
any time zone indication.
That is, the resulting value is derived from the date/time
......@@ -1923,7 +1930,7 @@ January 8 04:05:06 1999 PST
Conversions between <type>timestamp without time zone</type> and
<type>timestamp with time zone</type> normally assume that the
<type>timestamp without time zone</type> value should be taken or given
as <varname>timezone</> local time. A different zone reference can
as <varname>timezone</> local time. A different time zone can
be specified for the conversion using <literal>AT TIME ZONE</>.
</para>
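<para>
As a brief sketch of <literal>AT TIME ZONE</>, the first query below
interprets a <type>timestamp without time zone</type> value as UTC
time, and the second converts a <type>timestamp with time zone</type>
value to UTC local time:
<programlisting>
-- Yields timestamp with time zone; the display depends on the
-- session's timezone setting.
SELECT TIMESTAMP '2004-10-19 10:23:54' AT TIME ZONE 'UTC';

-- Yields timestamp without time zone: 2004-10-19 08:23:54
SELECT TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02' AT TIME ZONE 'UTC';
</programlisting>
</para>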
</sect3>
......@@ -1947,11 +1954,11 @@ January 8 04:05:06 1999 PST
linkend="datatype-datetime-special-table">. The values
<literal>infinity</literal> and <literal>-infinity</literal>
are specially represented inside the system and will be displayed
the same way; but the others are simply notational shorthands
unchanged; but the others are simply notational shorthands
that will be converted to ordinary date/time values when read.
(In particular, <literal>now</> and related strings are converted
to a specific time value as soon as they are read.)
All of these values need to be written in single quotes when used
All of these values need to be enclosed in single quotes when used
as constants in SQL commands.
</para>
......@@ -2018,8 +2025,8 @@ January 8 04:05:06 1999 PST
<literal>CURRENT_TIMESTAMP</literal>, <literal>LOCALTIME</literal>,
<literal>LOCALTIMESTAMP</literal>. The latter four accept an
optional subsecond precision specification. (See <xref
linkend="functions-datetime-current">.) Note however that these are
SQL functions and are <emphasis>not</> recognized as data input strings.
linkend="functions-datetime-current">.) Note that these are
SQL functions and are <emphasis>not</> recognized in data input strings.
</para>
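<para>
The distinction can be sketched as follows. The first query calls the
functions directly, with optional precision on the four that accept
it. The second passes a function name as a data input string, which
is expected to be rejected:
<programlisting>
-- Function calls; the precision argument is optional.
SELECT CURRENT_DATE, CURRENT_TIME(0), CURRENT_TIMESTAMP(2),
       LOCALTIME(0), LOCALTIMESTAMP(2);

-- Not a valid input string; this is expected to raise an error.
SELECT 'CURRENT_DATE'::date;
</programlisting>
</para>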
</sect3>
......@@ -2041,14 +2048,15 @@ January 8 04:05:06 1999 PST
</indexterm>
<para>
The output format of the date/time types can be set to one of the four
styles ISO 8601,
<acronym>SQL</acronym> (Ingres), traditional POSTGRES, and
German, using the command <literal>SET datestyle</literal>. The default
The output format of the date/time types can be one of the four
styles: ISO 8601,
<acronym>SQL</acronym> (Ingres), traditional <productname>POSTGRES</>
(Unix <application>date</> format), and
German. It can be set using the <literal>SET datestyle</literal> command. The default
is the <acronym>ISO</acronym> format. (The
<acronym>SQL</acronym> standard requires the use of the ISO 8601
format. The name of the <quote>SQL</quote> output format is a
historical accident.) <xref
format. The name of the <literal>SQL</> output format is poorly
chosen and a historical accident.) <xref
linkend="datatype-datetime-output-table"> shows examples of each
output style. The output of the <type>date</type> and
<type>time</type> types is of course only the date or time part
......@@ -2172,7 +2180,7 @@ January 8 04:05:06 1999 PST
<listitem>
<para>
Although the <type>date</type> type
does not have an associated time zone, the
cannot have an associated time zone, the
<type>time</type> type can.
Time zones in the real world have little meaning unless
associated with a date as well as a time,
......@@ -2184,7 +2192,7 @@ January 8 04:05:06 1999 PST
<listitem>
<para>
The default time zone is specified as a constant numeric offset
from <acronym>UTC</>. It is therefore not possible to adapt to
from <acronym>UTC</>. It is therefore impossible to adapt to
daylight-saving time when doing date/time arithmetic across
<acronym>DST</acronym> boundaries.
</para>
......@@ -2196,7 +2204,7 @@ January 8 04:05:06 1999 PST
<para>
To address these difficulties, we recommend using date/time types
that contain both date and time when using time zones. We
recommend <emphasis>not</emphasis> using the type <type>time with
do <emphasis>not</> recommend using the type <type>time with
time zone</type> (though it is supported by
<productname>PostgreSQL</productname> for legacy applications and
for compliance with the <acronym>SQL</acronym> standard).
......@@ -2230,12 +2238,12 @@ January 8 04:05:06 1999 PST
<para>
A time zone abbreviation, for example <literal>PST</>. Such a
specification merely defines a particular offset from UTC, in
contrast to full time zone names which might imply a set of daylight
contrast to full time zone names which can imply a set of daylight
savings transition-date rules as well. The recognized abbreviations
are listed in the <literal>pg_timezone_abbrevs</> view (see <xref
linkend="view-pg-timezone-abbrevs">). You cannot set the
configuration parameters <xref linkend="guc-timezone"> or
<xref linkend="guc-log-timezone"> using a time
<xref linkend="guc-log-timezone"> to a time
zone abbreviation, but you can use abbreviations in
date/time input values and with the <literal>AT TIME ZONE</>
operator.
......@@ -2252,7 +2260,7 @@ January 8 04:05:06 1999 PST
optional daylight-savings zone abbreviation, assumed to stand for one
hour ahead of the given offset. For example, if <literal>EST5EDT</>
were not already a recognized zone name, it would be accepted and would
be functionally equivalent to USA East Coast time. When a
be functionally equivalent to United States East Coast time. When a
daylight-savings zone name is present, it is assumed to be used
according to the same daylight-savings transition rules used in the
<literal>zoneinfo</> time zone database's <filename>posixrules</> entry.
......@@ -2265,10 +2273,10 @@ January 8 04:05:06 1999 PST
</listitem>
</itemizedlist>
There is a conceptual and practical difference between the abbreviations
and the full names: abbreviations always represent a fixed offset from
In summary, there is a difference between abbreviations
and full names: abbreviations always represent a fixed offset from
UTC, whereas most of the full names imply a local daylight-savings time
rule and so have two possible UTC offsets.
rule, and so have two possible UTC offsets.
</para>
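<para>
As a sketch of this difference, the queries below use a full time zone
name and an abbreviation for the same summer date. The zone name
<literal>America/New_York</> and the abbreviations
<literal>EST</> and <literal>EDT</> are assumed to be present in the
installed time zone data:
<programlisting>
-- Full name: the daylight-savings rule in effect on that date applies.
SELECT TIMESTAMP WITH TIME ZONE '2009-07-01 12:00:00 America/New_York';

-- Abbreviation: always the same fixed offset from UTC.
SELECT TIMESTAMP WITH TIME ZONE '2009-07-01 12:00:00 EST';

-- Inspect the fixed offsets assigned to the abbreviations.
SELECT abbrev, utc_offset, is_dst
    FROM pg_timezone_abbrevs
    WHERE abbrev IN ('EST', 'EDT');
</programlisting>
</para>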
<para>
......@@ -2287,7 +2295,7 @@ January 8 04:05:06 1999 PST
<para>
In all cases, timezone names are recognized case-insensitively.
(This is a change from <productname>PostgreSQL</productname> versions
prior to 8.2, which were case-sensitive in some contexts and not others.)
prior to 8.2, which were case-sensitive in some contexts but not others.)
</para>
<para>
......@@ -2308,7 +2316,7 @@ January 8 04:05:06 1999 PST
<listitem>
<para>
If <varname>timezone</> is not specified in
<filename>postgresql.conf</> nor as a server command-line option,
<filename>postgresql.conf</> or as a server command-line option,
the server attempts to use the value of the <envar>TZ</envar>
environment variable as the default time zone. If <envar>TZ</envar>
is not defined or is not any of the time zone names known to
......@@ -2318,7 +2326,7 @@ January 8 04:05:06 1999 PST
default time zone is selected as the closest match among
<productname>PostgreSQL</productname>'s known time zones.
(These rules are also used to choose the default value of
<xref linkend="guc-log-timezone">, if it is not specified.)
<xref linkend="guc-log-timezone">, if not specified.)
</para>
</listitem>
......@@ -2332,9 +2340,9 @@ January 8 04:05:06 1999 PST
<listitem>
<para>
The <envar>PGTZ</envar> environment variable, if set at the
client, is used by <application>libpq</application>
applications to send a <command>SET TIME ZONE</command>
The <envar>PGTZ</envar> environment variable is used by
<application>libpq</application> clients
to send a <command>SET TIME ZONE</command>
command to the server upon connection.
</para>
</listitem>
......@@ -2350,7 +2358,7 @@ January 8 04:05:06 1999 PST
</indexterm>
<para>
<type>interval</type> values can be written with the following
<type>interval</type> values can be written using the following
verbose syntax:
<synopsis>
......@@ -2366,7 +2374,7 @@ January 8 04:05:06 1999 PST
or abbreviations or plurals of these units;
<replaceable>direction</> can be <literal>ago</literal> or
empty. The at sign (<literal>@</>) is optional noise. The amounts
of different units are implicitly added up with appropriate
of the different units are implicitly added with appropriate
sign accounting. <literal>ago</literal> negates all the fields.
This syntax is also used for interval output, if
<xref linkend="guc-intervalstyle"> is set to
......@@ -2639,8 +2647,8 @@ P <optional> <replaceable>years</>-<replaceable>months</>-<replaceable>days</> <
<para>
<productname>PostgreSQL</productname> uses Julian dates
for all date/time calculations. They have the nice property of correctly
predicting/calculating any date more recent than 4713 BC
for all date/time calculations. This has the useful property of correctly
calculating dates from 4713 BC
to far into the future, using the assumption that the length of the
year is 365.2425 days.
</para>
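<para>
As an illustrative sketch, a Julian day number can be displayed for a
date, or converted back to a date, with the <literal>J</> pattern of
the formatting functions:
<programlisting>
-- Julian day number of a calendar date.
SELECT to_char(DATE '2009-04-27', 'J');

-- Calendar date for a Julian day number.
SELECT to_date('2451545', 'J');
</programlisting>
</para>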
......@@ -2700,9 +2708,9 @@ P <optional> <replaceable>years</>-<replaceable>months</>-<replaceable>days</> <
<member><literal>'off'</literal></member>
<member><literal>'0'</literal></member>
</simplelist>
Leading and trailing whitespace is ignored. Using the key words
<literal>TRUE</literal> and <literal>FALSE</literal> is preferred
(and <acronym>SQL</acronym>-compliant).
Leading and trailing whitespace is ignored, and case does not matter.
The key words <literal>TRUE</literal> and <literal>FALSE</literal> are
the preferred usage (and <acronym>SQL</acronym>-compliant).
</para>
<example id="datatype-boolean-example">
......@@ -2750,9 +2758,9 @@ SELECT * FROM test1 WHERE a;
<para>
Enumerated (enum) types are data types that
are comprised of a static, predefined set of values with a
specific order. They are equivalent to the <type>enum</type>
types in a number of programming languages. An example of an enum
comprise a static, ordered set of values.
They are equivalent to the <type>enum</type>
types supported in a number of programming languages. An example of an enum
type might be the days of the week, or a set of status values for
a piece of data.
</para>
......@@ -2796,7 +2804,7 @@ SELECT * FROM person WHERE current_mood = 'happy';
<para>
The ordering of the values in an enum type is the
order in which the values were listed when the type was declared.
order in which the values were listed when the type was created.
All standard comparison operators and related
aggregate functions are supported for enums. For example:
</para>
......@@ -2820,8 +2828,9 @@ SELECT * FROM person WHERE current_mood > 'sad' ORDER BY current_mood;
Moe | happy
(2 rows)
SELECT name FROM person
WHERE current_mood = (SELECT MIN(current_mood) FROM person);
SELECT name
FROM person
WHERE current_mood = (SELECT MIN(current_mood) FROM person);
name
-------
Larry
......@@ -2834,8 +2843,8 @@ SELECT name FROM person
<title>Type Safety</title>
<para>
Enumerated types are completely separate data types and may not
be compared with each other.
Each enumerated data type is separate and cannot
be compared with other enumerated types.
</para>
<example>
......@@ -2843,7 +2852,7 @@ SELECT name FROM person
<programlisting>
CREATE TYPE happiness AS ENUM ('happy', 'very happy', 'ecstatic');
CREATE TABLE holidays (
num_weeks int,
num_weeks integer,
happiness happiness
);
INSERT INTO holidays(num_weeks,happiness) VALUES (4, 'happy');
......@@ -2889,7 +2898,7 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
<para>
Enum labels are case sensitive, so
<type>'happy'</type> is not the same as <type>'HAPPY'</type>.
Spaces in the labels are significant, too.
White space in the labels is significant too.
</para>
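<para>
A minimal sketch of label case sensitivity, using a hypothetical type
named <type>mood_demo</type> that is created only for this
illustration:
<programlisting>
-- Hypothetical enum type used only in this sketch.
CREATE TYPE mood_demo AS ENUM ('sad', 'ok', 'happy');

-- Matches a declared label.
SELECT 'happy'::mood_demo;

-- Differs in case from every declared label; expected to raise an error.
SELECT 'HAPPY'::mood_demo;
</programlisting>
</para>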
<para>
......@@ -2928,7 +2937,7 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
<row>
<entry><type>point</type></entry>
<entry>16 bytes</entry>
<entry>Point on the plane</entry>
<entry>Point on a plane</entry>
<entry>(x,y)</entry>
</row>
<row>
......@@ -2971,7 +2980,7 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
<entry><type>circle</type></entry>
<entry>24 bytes</entry>
<entry>Circle</entry>
<entry>&lt;(x,y),r&gt; (center and radius)</entry>
<entry>&lt;(x,y),r&gt; (center point and radius)</entry>
</row>
</tbody>
</tgroup>
......@@ -3000,7 +3009,7 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
</synopsis>
where <replaceable>x</> and <replaceable>y</> are the respective
coordinates as floating-point numbers.
coordinates, as floating-point numbers.
</para>
</sect2>
......@@ -3063,11 +3072,9 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
</para>
<para>
Boxes are output using the first syntax.
The corners are reordered on input to store
the upper right corner, then the lower left corner.
Other corners of the box can be entered, but the lower
left and upper right corners are determined from the input and stored.
Boxes are output using the first syntax. Any two opposite corners
can be supplied; the corners are reordered on input to store the
upper right and lower left corners.
</para>
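<para>
For example, in the following sketch the lower left corner is given
first, and the corners are reordered when the value is stored:
<programlisting>
-- Displayed as (1,1),(0,0): upper right corner, then lower left.
SELECT '(0,0),(1,1)'::box;
</programlisting>
</para>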
</sect2>
......@@ -3081,7 +3088,7 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
<para>
Paths are represented by lists of connected points. Paths can be
<firstterm>open</firstterm>, where
the first and last points in the list are not considered connected, or
the first and last points in the list are considered not connected, or
<firstterm>closed</firstterm>,
where the first and last points are considered connected.
</para>
......@@ -3104,7 +3111,7 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
</para>
<para>
Paths are output using the first syntax.
Paths are output using the first appropriate syntax.
</para>
</sect2>
......@@ -3117,8 +3124,8 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
<para>
Polygons are represented by lists of points (the vertexes of the
polygon). Polygons should probably be
considered equivalent to closed paths, but are stored differently
polygon). Polygons are very similar to closed paths, but are
stored differently
and have their own set of support routines.
</para>
......@@ -3149,7 +3156,7 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
</indexterm>
<para>
Circles are represented by a center point and a radius.
Circles are represented by a center point and radius.
Values of type <type>circle</type> are specified using the following syntax:
<synopsis>
......@@ -3161,7 +3168,7 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
where
<literal>(<replaceable>x</replaceable>,<replaceable>y</replaceable>)</literal>
is the center and <replaceable>r</replaceable> is the radius of the circle.
is the center point and <replaceable>r</replaceable> is the radius of the circle.
</para>
<para>
......@@ -3182,9 +3189,9 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
<para>
<productname>PostgreSQL</> offers data types to store IPv4, IPv6, and MAC
addresses, as shown in <xref linkend="datatype-net-types-table">. It
is preferable to use these types instead of plain text types to store
network addresses, because
these types offer input error checking and several specialized
is better to use these types instead of plain text types to store
network addresses because
these types offer input error checking and specialized
operators and functions (see <xref linkend="functions-net">).
</para>
......@@ -3225,7 +3232,7 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
<para>
When sorting <type>inet</type> or <type>cidr</type> data types,
IPv4 addresses will always sort before IPv6 addresses, including
IPv4 addresses encapsulated or mapped into IPv6 addresses, such as
IPv4 addresses encapsulated or mapped to IPv6 addresses, such as
::10.2.3.4 or ::ffff:10.4.3.2.
</para>
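<para>
The ordering can be observed with a sketch such as the following,
which sorts one plain IPv4 address together with the two IPv6
addresses mentioned above:
<programlisting>
-- The plain IPv4 address is expected to sort before both IPv6 forms.
SELECT addr
    FROM (VALUES ('::10.2.3.4'::inet),
                 ('10.4.3.2'::inet),
                 ('::ffff:10.4.3.2'::inet)) AS t(addr)
    ORDER BY addr;
</programlisting>
</para>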
......@@ -3239,14 +3246,14 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
<para>
The <type>inet</type> type holds an IPv4 or IPv6 host address, and
optionally the identity of the subnet it is in, all in one field.
The subnet identity is represented by stating how many bits of
the host address represent the network address (the
optionally its subnet, all in one field.
The subnet is represented by the number of network address bits
present in the host address (the
<quote>netmask</quote>). If the netmask is 32 and the address is IPv4,
then the value does not indicate a subnet, only a single host.
In IPv6, the address length is 128 bits, so 128 bits specify a
unique host address. Note that if you
want to accept networks only, you should use the
want to accept only networks, you should use the
<type>cidr</type> type rather than <type>inet</type>.
</para>
......@@ -3259,7 +3266,7 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
<replaceable class="parameter">y</replaceable>
is the number of bits in the netmask. If the
<replaceable class="parameter">/y</replaceable>
part is left off, then the
is missing, the
netmask is 32 for IPv4 and 128 for IPv6, so the value represents
just a single host. On display, the
<replaceable class="parameter">/y</replaceable>
......@@ -3285,7 +3292,7 @@ SELECT person.name, holidays.num_weeks FROM person, holidays
class="parameter">y</> is the number of bits in the netmask. If
<replaceable class="parameter">y</> is omitted, it is calculated
using assumptions from the older classful network numbering system, except
that it will be at least large enough to include all of the octets
it will be at least large enough to include all of the octets
written in the input. It is an error to specify a network address
that has bits set to the right of the specified netmask.
</para>
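<para>
These rules can be sketched as follows. The addresses are arbitrary
sample values:
<programlisting>
-- Netmask omitted: it is derived from the octets written in the input.
SELECT '10.1.2'::cidr;

-- Bits set to the right of the netmask: expected to raise an error.
SELECT '192.168.0.1/24'::cidr;

-- The same input is accepted by inet, which allows host bits.
SELECT '192.168.0.1/24'::inet;
</programlisting>
</para>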
......@@ -3553,9 +3560,9 @@ SELECT * FROM test;
are designed to support full text search, which is the activity of
searching through a collection of natural-language <firstterm>documents</>
to locate those that best match a <firstterm>query</>.
The <type>tsvector</type> type represents a document in a form suited
for text search, while the <type>tsquery</type> type similarly represents
a query.
The <type>tsvector</type> type represents a document stored in a form optimized
for text search; the <type>tsquery</type> type similarly represents
a text query.
<xref linkend="textsearch"> provides a detailed explanation of this
facility, and <xref linkend="functions-textsearch"> summarizes the
related functions and operators.
......@@ -3570,9 +3577,9 @@ SELECT * FROM test;
<para>
A <type>tsvector</type> value is a sorted list of distinct
<firstterm>lexemes</>, which are words that have been
<firstterm>normalized</> to make different variants of the same word look
alike (see <xref linkend="textsearch"> for details). Sorting and
<firstterm>lexemes</>, which are words that have been
<firstterm>normalized</> to merge different variants of the same word
(see <xref linkend="textsearch"> for details). Sorting and
duplicate-elimination are done automatically during input, as shown in
this example:
......@@ -3593,8 +3600,8 @@ SELECT $$the lexeme ' ' contains spaces$$::tsvector;
' ' 'contains' 'lexeme' 'spaces' 'the'
</programlisting>
(We use dollar-quoted string literals in this example and the next one,
to avoid confusing matters by having to double quote marks within the
(We use dollar-quoted string literals in this example and the next one
to avoid the confusion of having to double quote marks within the
literals.) Embedded quotes and backslashes must be doubled:
<programlisting>
......@@ -3604,8 +3611,8 @@ SELECT $$the lexeme 'Joe''s' contains a quote$$::tsvector;
'Joe''s' 'a' 'contains' 'lexeme' 'quote' 'the'
</programlisting>
Optionally, integer <firstterm>position(s)</>
can be attached to any or all of the lexemes:
Optionally, integer <firstterm>positions</>
can be attached to lexemes:
<programlisting>
SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::tsvector;
......@@ -3617,7 +3624,7 @@ SELECT 'a:1 fat:2 cat:3 sat:4 on:5 a:6 mat:7 and:8 ate:9 a:10 fat:11 rat:12'::ts
A position normally indicates the source word's location in the
document. Positional information can be used for
<firstterm>proximity ranking</firstterm>. Position values can
range from 1 to 16383; larger numbers are silently clamped to 16383.
range from 1 to 16383; larger numbers are silently set to 16383.
Duplicate positions for the same lexeme are discarded.
</para>
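<para>
 As a minimal sketch of the position rules just described, the following
 statements can be run in any session; the lexemes used are arbitrary
 illustrations and do not come from the original examples:
</para>

```sql
-- A position larger than 16383 is accepted but stored as 16383.
SELECT 'cat:20000'::tsvector;

-- Repeated positions for the same lexeme are kept only once,
-- so position 3 appears a single time in the result.
SELECT 'rat:3 rat:3 rat:7'::tsvector;
```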
......@@ -3643,7 +3650,7 @@ SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;
<para>
It is important to understand that the
<type>tsvector</type> type itself does not perform any normalization;
it assumes that the words it is given are normalized appropriately
it assumes the words it is given are normalized appropriately
for the application. For example,
<programlisting>
......@@ -3680,7 +3687,7 @@ SELECT to_tsvector('english', 'The Fat Rats');
<para>
A <type>tsquery</type> value stores lexemes that are to be
searched for, and combines them using the boolean operators
searched for, and combines them by honoring the boolean operators
<literal>&amp;</literal> (AND), <literal>|</literal> (OR), and
<literal>!</> (NOT). Parentheses can be used to enforce grouping
of the operators:
......@@ -3710,7 +3717,7 @@ SELECT 'fat &amp; rat &amp; ! cat'::tsquery;
<para>
Optionally, lexemes in a <type>tsquery</type> can be labeled with
one or more weight letters, which restricts them to match only
<type>tsvector</> lexemes with one of those weights:
<type>tsvector</> lexemes with matching weights:
<programlisting>
SELECT 'fat:ab &amp; cat'::tsquery;
......@@ -3734,10 +3741,10 @@ SELECT 'super:*'::tsquery;
</para>
<para>
Quoting rules for lexemes are the same as described above for
Quoting rules for lexemes are the same as described previously for
lexemes in <type>tsvector</>; and, as with <type>tsvector</>,
any required normalization of words must be done before putting
them into the <type>tsquery</> type. The <function>to_tsquery</>
any required normalization of words must be done before converting
to the <type>tsquery</> type. The <function>to_tsquery</>
function is convenient for performing such normalization:
<programlisting>
......@@ -3762,13 +3769,13 @@ SELECT to_tsquery('Fat:ab &amp; Cats');
<para>
The data type <type>uuid</type> stores Universally Unique Identifiers
(UUID) as defined by RFC 4122, ISO/IEC 9834-8:2005, and related standards.
(Some systems refer to this data type as globally unique identifier, or
GUID,<indexterm><primary>GUID</primary></indexterm> instead.) Such an
(Some systems refer to this data type as a globally unique identifier, or
GUID,<indexterm><primary>GUID</primary></indexterm> instead.) This
identifier is a 128-bit quantity that is generated by an algorithm chosen
to make it very unlikely that the same identifier will be generated by
anyone else in the known universe using the same algorithm. Therefore,
for distributed systems, these identifiers provide a better uniqueness
guarantee than that which can be achieved using sequence generators, which
guarantee than sequence generators, which
are only unique within a single database.
</para>
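<para>
 A minimal sketch of using the type as a key follows. The table and column
 names (<literal>devices</literal>, <literal>device_id</literal>,
 <literal>label</literal>) are hypothetical, and the identifier is supplied
 by the client as a literal rather than generated by the server:
</para>

```sql
-- Hypothetical table keyed by a UUID instead of a sequence value.
CREATE TABLE devices (
    device_id uuid PRIMARY KEY,
    label     text
);

-- The identifier is generated elsewhere and passed in as a string literal.
INSERT INTO devices (device_id, label)
VALUES ('a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11', 'sensor 1');

SELECT label FROM devices
WHERE device_id = 'a0eebc99-9c0b-4ef8-bb6d-6bb9bd380a11';
```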
......@@ -3816,10 +3823,10 @@ a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
</indexterm>
<para>
The data type <type>xml</type> can be used to store XML data. Its
The <type>xml</type> data type can be used to store XML data. Its
advantage over storing XML data in a <type>text</type> field is that it
checks the input values for well-formedness, and there are support
functions to perform type-safe operations on it; see <xref
checks the input values for well-formedness, and support
functions can perform type-safe operations on it; see <xref
linkend="functions-xml">. Use of this data type requires the
installation to have been built with <command>configure
--with-libxml</>.
......@@ -3862,19 +3869,19 @@ xml '<foo>bar</foo>'
</para>
<para>
The <type>xml</type> type does not validate its input values
against a possibly included document type declaration
The <type>xml</type> type does not validate input values
against an optionally-supplied document type declaration
(DTD).<indexterm><primary>DTD</primary></indexterm>
</para>
<para>
The inverse operation, producing character string type values from
The inverse operation, producing a character string value from
<type>xml</type>, uses the function
<function>xmlserialize</function>:<indexterm><primary>xmlserialize</primary></indexterm>
<synopsis>
XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <replaceable>type</replaceable> )
</synopsis>
<replaceable>type</replaceable> can be one of
<replaceable>type</replaceable> can be
<type>character</type>, <type>character varying</type>, or
<type>text</type> (or an alias name for those). Again, according
to the SQL standard, this is the only way to convert between type
......@@ -3883,14 +3890,14 @@ XMLSERIALIZE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable> AS <repla
</para>
<para>
When character string values are cast to or from type
When a character string value is cast to or from type
<type>xml</type> without going through <type>XMLPARSE</type> or
<type>XMLSERIALIZE</type>, respectively, the choice of
<literal>DOCUMENT</literal> versus <literal>CONTENT</literal> is
determined by the <quote>XML option</quote>
<indexterm><primary>XML option</primary></indexterm>
session configuration parameter, which can be set using the
standard command
standard command:
<synopsis>
SET XML OPTION { DOCUMENT | CONTENT };
</synopsis>
......@@ -3915,38 +3922,38 @@ SET xmloption TO { DOCUMENT | CONTENT };
end; see <xref linkend="multibyte">. This includes string
representations of XML values, such as in the above examples.
This would ordinarily mean that encoding declarations contained in
XML data might become invalid as the character data is converted
to other encodings while travelling between client and server,
while the embedded encoding declaration is not changed. To cope
with this behavior, an encoding declaration contained in a
character string presented for input to the <type>xml</type> type
is <emphasis>ignored</emphasis>, and the content is always assumed
XML data can become invalid as the character data is converted
to other encodings while travelling between client and server
because the embedded encoding declaration is not changed. To cope
with this behavior, encoding declarations contained in
character strings presented for input to the <type>xml</type> type
are <emphasis>ignored</emphasis>, and content is assumed
to be in the current server encoding. Consequently, for correct
processing, such character strings of XML data must be sent off
processing, character strings of XML data must be sent
from the client in the current client encoding. It is the
responsibility of the client to either convert the document to the
current client encoding before sending it off to the server or to
responsibility of the client to either convert documents to the
current client encoding before sending them to the server or to
adjust the client encoding appropriately. On output, values of
type <type>xml</type> will not have an encoding declaration, and
clients must assume that the data is in the current client
clients should assume all data is in the current client
encoding.
</para>
<para>
When using the binary mode to pass query parameters to the server
When using binary mode to pass query parameters to the server
and query results back to the client, no character set conversion
is performed, so the situation is different. In this case, an
encoding declaration in the XML data will be observed, and if it
is absent, the data will be assumed to be in UTF-8 (as required by
the XML standard; note that PostgreSQL does not support UTF-16 at
all). On output, data will have an encoding declaration
the XML standard; note that PostgreSQL does not support UTF-16).
On output, data will have an encoding declaration
specifying the client encoding, unless the client encoding is
UTF-8, in which case it will be omitted.
</para>
<para>
Needless to say, processing XML data with PostgreSQL will be less
error-prone and more efficient if data encoding, client encoding,
error-prone and more efficient if the XML data encoding, client encoding,
and server encoding are the same. Since XML data is internally
processed in UTF-8, computations will be most efficient if the
server encoding is also UTF-8.
......@@ -3973,17 +3980,17 @@ SET xmloption TO { DOCUMENT | CONTENT };
Since there are no comparison operators for the <type>xml</type>
data type, it is not possible to create an index directly on a
column of this type. If speedy searches in XML data are desired,
possible workarounds would be casting the expression to a
possible workarounds include casting the expression to a
character string type and indexing that, or indexing an XPath
expression. The actual query would of course have to be adjusted
expression. Of course, the actual query would have to be adjusted
to search by the indexed expression.
</para>
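<para>
 The XPath workaround might look like the sketch below. The table
 <literal>docs</literal>, its columns, the index name, and the document
 structure (<literal>/book/title</literal>) are all assumptions made for
 illustration, and the installation must have been built with libxml
 support:
</para>

```sql
-- Hypothetical table holding XML documents.
CREATE TABLE docs (
    id   integer PRIMARY KEY,
    body xml
);

-- Index the first text node matched by the XPath expression, cast to text.
-- The extra parentheses are required around an index expression.
CREATE INDEX docs_title_idx
    ON docs (((xpath('/book/title/text()', body))[1]::text));

-- The query has to repeat the indexed expression for the index to be usable.
SELECT id
FROM docs
WHERE (xpath('/book/title/text()', body))[1]::text = 'PostgreSQL';
```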
<para>
The text-search functionality in PostgreSQL could also be used to speed
up full-document searches in XML data. The necessary
preprocessing support is, however, not available in the PostgreSQL
distribution in this release.
The text-search functionality in PostgreSQL can also be used to speed
up full-document searches of XML data. The necessary
preprocessing support is, however, not yet available in the PostgreSQL
distribution.
</para>
</sect2>
</sect1>
......@@ -4191,13 +4198,14 @@ SELECT * FROM pg_attribute
The <type>regproc</> and <type>regoper</> alias types will only
accept input names that are unique (not overloaded), so they are
of limited use; for most uses <type>regprocedure</> or
<type>regoperator</> is more appropriate. For <type>regoperator</>,
<type>regoperator</> is more appropriate. For <type>regoperator</>,
unary operators are identified by writing <literal>NONE</> for the unused
operand.
</para>
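<para>
 A short sketch of the input syntax for these alias types, using built-in
 functions and operators:
</para>

```sql
-- regprocedure takes the argument types, so overloaded names are unambiguous.
SELECT 'abs(integer)'::regprocedure;

-- regoperator takes both operand types.
SELECT '+(integer,integer)'::regoperator;

-- A unary (prefix) operator is written with NONE for the unused operand.
SELECT '-(NONE,integer)'::regoperator;
```

<para>
 By contrast, a cast such as <literal>'abs'::regproc</literal> is expected
 to be rejected, because several functions named <function>abs</function>
 exist.
</para>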
<para>
An additional property of the OID alias types is that if a
An additional property of the OID alias types is the creation of
dependencies. If a
constant of one of these types appears in a stored expression
(such as a column default expression or view), it creates a dependency
on the referenced object. For example, if a column has a default
......@@ -4311,7 +4319,7 @@ SELECT * FROM pg_attribute
<tbody>
<row>
<entry><type>any</></entry>
<entry>Indicates that a function accepts any input data type whatever.</entry>
<entry>Indicates that a function accepts any input data type.</entry>
</row>
<row>
......@@ -4398,7 +4406,7 @@ SELECT * FROM pg_attribute
<para>
The <type>internal</> pseudo-type is used to declare functions
that are meant only to be called internally by the database
system, and not by direct invocation in a <acronym>SQL</acronym>
system, and not by direct invocation in an <acronym>SQL</acronym>
query. If a function has at least one <type>internal</>-type
argument then it cannot be called from <acronym>SQL</acronym>. To
preserve the type safety of this restriction it is important to
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/ddl.sgml,v 1.85 2009/01/08 12:47:58 petere Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/ddl.sgml,v 1.86 2009/04/27 16:27:35 momjian Exp $ -->
<chapter id="ddl">
<title>Data Definition</title>
......@@ -153,7 +153,7 @@ DROP TABLE products;
</para>
<para>
If you need to modify a table that already exists look into <xref
If you need to modify a table that already exists, see <xref
linkend="ddl-alter"> later in this chapter.
</para>
......@@ -206,7 +206,7 @@ CREATE TABLE products (
The default value can be an expression, which will be
evaluated whenever the default value is inserted
(<emphasis>not</emphasis> when the table is created). A common example
is that a <type>timestamp</type> column can have a default of <literal>now()</>,
is for a <type>timestamp</type> column to have a default of <literal>CURRENT_TIMESTAMP</>,
so that it gets set to the time of row insertion. Another common
example is generating a <quote>serial number</> for each row.
In <productname>PostgreSQL</productname> this is typically done by
......@@ -374,8 +374,8 @@ CREATE TABLE products (
</para>
<para>
Names can be assigned to table constraints in just the same way as
for column constraints:
Names can be assigned to table constraints in the same way as
column constraints:
<programlisting>
CREATE TABLE products (
product_no integer,
......@@ -550,15 +550,15 @@ CREATE TABLE products (
</indexterm>
<para>
In general, a unique constraint is violated when there are two or
more rows in the table where the values of all of the
In general, a unique constraint is violated when there is more than
one row in the table where the values of all of the
columns included in the constraint are equal.
However, two null values are not considered equal in this
comparison. That means even in the presence of a
unique constraint it is possible to store duplicate
rows that contain a null value in at least one of the constrained
columns. This behavior conforms to the SQL standard, but we have
heard that other SQL databases might not follow this rule. So be
columns. This behavior conforms to the SQL standard, but there
might be other SQL databases that do not follow this rule. So be
careful when developing applications that are intended to be
portable.
</para>
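<para>
 The null-handling rule can be seen in a small sketch. The table
 <literal>contacts</literal> and its columns are hypothetical:
</para>

```sql
-- Hypothetical table with a two-column unique constraint.
CREATE TABLE contacts (
    email text,
    phone text,
    UNIQUE (email, phone)
);

-- Both rows are accepted: the null phone values are not considered equal,
-- so the constraint does not treat the rows as duplicates.
INSERT INTO contacts VALUES ('a@example.com', NULL);
INSERT INTO contacts VALUES ('a@example.com', NULL);

-- With no nulls involved, the second insert violates the constraint
-- and is rejected with an error.
INSERT INTO contacts VALUES ('b@example.com', '555-0100');
INSERT INTO contacts VALUES ('b@example.com', '555-0100');
```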
......@@ -857,7 +857,7 @@ CREATE TABLE order_items (
restrictions are separate from whether the name is a key word or
not; quoting a name will not allow you to escape these
restrictions.) You do not really need to be concerned about these
columns, just know they exist.
columns; just know they exist.
</para>
<indexterm>
......@@ -1037,8 +1037,8 @@ CREATE TABLE order_items (
Command identifiers are also 32-bit quantities. This creates a hard limit
of 2<superscript>32</> (4 billion) <acronym>SQL</acronym> commands
within a single transaction. In practice this limit is not a
problem &mdash; note that the limit is on number of
<acronym>SQL</acronym> commands, not number of rows processed.
problem &mdash; note that the limit is on the number of
<acronym>SQL</acronym> commands, not the number of rows processed.
Also, as of <productname>PostgreSQL</productname> 8.3, only commands
that actually modify the database contents will consume a command
identifier.
......@@ -1055,7 +1055,7 @@ CREATE TABLE order_items (
<para>
When you create a table and you realize that you made a mistake, or
the requirements of the application change, then you can drop the
the requirements of the application change, you can drop the
table and create it again. But this is not a convenient option if
the table is already filled with data, or if the table is
referenced by other database objects (for instance a foreign key
......@@ -1067,31 +1067,31 @@ CREATE TABLE order_items (
</para>
<para>
You can
You can:
<itemizedlist spacing="compact">
<listitem>
<para>Add columns,</para>
<para>Add columns</para>
</listitem>
<listitem>
<para>Remove columns,</para>
<para>Remove columns</para>
</listitem>
<listitem>
<para>Add constraints,</para>
<para>Add constraints</para>
</listitem>
<listitem>
<para>Remove constraints,</para>
<para>Remove constraints</para>
</listitem>
<listitem>
<para>Change default values,</para>
<para>Change default values</para>
</listitem>
<listitem>
<para>Change column data types,</para>
<para>Change column data types</para>
</listitem>
<listitem>
<para>Rename columns,</para>
<para>Rename columns</para>
</listitem>
<listitem>
<para>Rename tables.</para>
<para>Rename tables</para>
</listitem>
</itemizedlist>
......@@ -1110,7 +1110,7 @@ CREATE TABLE order_items (
</indexterm>
<para>
To add a column, use a command like this:
To add a column, use a command like:
<programlisting>
ALTER TABLE products ADD COLUMN description text;
</programlisting>
......@@ -1154,7 +1154,7 @@ ALTER TABLE products ADD COLUMN description text CHECK (description &lt;&gt; '')
</indexterm>
<para>
To remove a column, use a command like this:
To remove a column, use a command like:
<programlisting>
ALTER TABLE products DROP COLUMN description;
</programlisting>
......@@ -1250,7 +1250,7 @@ ALTER TABLE products ALTER COLUMN product_no DROP NOT NULL;
</indexterm>
<para>
To set a new default for a column, use a command like this:
To set a new default for a column, use a command like:
<programlisting>
ALTER TABLE products ALTER COLUMN price SET DEFAULT 7.77;
</programlisting>
......@@ -1279,7 +1279,7 @@ ALTER TABLE products ALTER COLUMN price DROP DEFAULT;
</indexterm>
<para>
To convert a column to a different data type, use a command like this:
To convert a column to a different data type, use a command like:
<programlisting>
ALTER TABLE products ALTER COLUMN price TYPE numeric(10,2);
</programlisting>
......@@ -1488,7 +1488,7 @@ REVOKE ALL ON accounts FROM PUBLIC;
<listitem>
<para>
Third-party applications can be put into separate schemas so
they cannot collide with the names of other objects.
they do not collide with the names of other objects.
</para>
</listitem>
</itemizedlist>
......@@ -1603,7 +1603,7 @@ CREATE SCHEMA <replaceable>schemaname</replaceable> AUTHORIZATION <replaceable>u
<para>
In the previous sections we created tables without specifying any
schema names. By default, such tables (and other objects) are
schema names. By default such tables (and other objects) are
automatically put into a schema named <quote>public</quote>. Every new
database contains such a schema. Thus, the following are equivalent:
<programlisting>
......@@ -1746,7 +1746,7 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;
<para>
By default, users cannot access any objects in schemas they do not
own. To allow that, the owner of the schema needs to grant the
own. To allow that, the owner of the schema must grant the
<literal>USAGE</literal> privilege on the schema. To allow users
to make use of the objects in the schema, additional privileges
might need to be granted, as appropriate for the object.
......@@ -1802,7 +1802,7 @@ REVOKE CREATE ON SCHEMA public FROM PUBLIC;
such names, to ensure that you won't suffer a conflict if some
future version defines a system table named the same as your
table. (With the default search path, an unqualified reference to
your table name would be resolved as the system table instead.)
your table name would be resolved as a system table instead.)
System tables will continue to follow the convention of having
names beginning with <literal>pg_</>, so that they will not
conflict with unqualified user-table names so long as users avoid
......@@ -2024,7 +2024,7 @@ WHERE c.altitude &gt; 500;
<programlisting>
SELECT p.relname, c.name, c.altitude
FROM cities c, pg_class p
WHERE c.altitude &gt; 500 and c.tableoid = p.oid;
WHERE c.altitude &gt; 500 AND c.tableoid = p.oid;
</programlisting>
which returns:
......@@ -2130,7 +2130,7 @@ VALUES ('New York', NULL, NULL, 'NY');
<para>
Table access permissions are not automatically inherited. Therefore,
a user attempting to access a parent table must either have permissions
to do the operation on all its child tables as well, or must use the
to do the same operation on all its child tables as well, or must use the
<literal>ONLY</literal> notation. When adding a new child table to
an existing inheritance hierarchy, be careful to grant all the needed
permissions on it.
......@@ -2197,7 +2197,7 @@ VALUES ('New York', NULL, NULL, 'NY');
These deficiencies will probably be fixed in some future release,
but in the meantime considerable care is needed in deciding whether
inheritance is useful for your problem.
inheritance is useful for your application.
</para>
<note>
......@@ -2374,7 +2374,7 @@ CHECK ( outletID &gt;= 100 AND outletID &lt; 200 )
</programlisting>
Ensure that the constraints guarantee that there is no overlap
between the key values permitted in different partitions. A common
mistake is to set up range constraints like this:
mistake is to set up range constraints like:
<programlisting>
CHECK ( outletID BETWEEN 100 AND 200 )
CHECK ( outletID BETWEEN 200 AND 300 )
......@@ -2424,7 +2424,7 @@ CHECK ( outletID BETWEEN 200 AND 300 )
For example, suppose we are constructing a database for a large
ice cream company. The company measures peak temperatures every
day as well as ice cream sales in each region. Conceptually,
we want a table like this:
we want a table like:
<programlisting>
CREATE TABLE measurement (
......@@ -2571,12 +2571,15 @@ CREATE TRIGGER insert_measurement_trigger
CREATE OR REPLACE FUNCTION measurement_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
IF ( NEW.logdate &gt;= DATE '2006-02-01' AND NEW.logdate &lt; DATE '2006-03-01' ) THEN
IF ( NEW.logdate &gt;= DATE '2006-02-01' AND
NEW.logdate &lt; DATE '2006-03-01' ) THEN
INSERT INTO measurement_y2006m02 VALUES (NEW.*);
ELSIF ( NEW.logdate &gt;= DATE '2006-03-01' AND NEW.logdate &lt; DATE '2006-04-01' ) THEN
ELSIF ( NEW.logdate &gt;= DATE '2006-03-01' AND
NEW.logdate &lt; DATE '2006-04-01' ) THEN
INSERT INTO measurement_y2006m03 VALUES (NEW.*);
...
ELSIF ( NEW.logdate &gt;= DATE '2008-01-01' AND NEW.logdate &lt; DATE '2008-02-01' ) THEN
ELSIF ( NEW.logdate &gt;= DATE '2008-01-01' AND
NEW.logdate &lt; DATE '2008-02-01' ) THEN
INSERT INTO measurement_y2008m01 VALUES (NEW.*);
ELSE
RAISE EXCEPTION 'Date out of range. Fix the measurement_insert_trigger() function!';
......@@ -2706,9 +2709,9 @@ SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
Without constraint exclusion, the above query would scan each of
the partitions of the <structname>measurement</> table. With constraint
exclusion enabled, the planner will examine the constraints of each
partition and try to prove that the partition need not
be scanned because it could not contain any rows meeting the query's
<literal>WHERE</> clause. When the planner can prove this, it
partition and try to determine which partitions need not
be scanned because they cannot contain any rows meeting the query's
<literal>WHERE</> clause. When the planner can determine this, it
excludes the partition from the query plan.
</para>
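<para>
 One way to observe this, assuming the <structname>measurement</> table and
 its partitions from the earlier examples exist, is to compare the plans
 produced with the setting off and on:
</para>

```sql
-- With constraint exclusion off, the plan scans every partition.
SET constraint_exclusion = off;
EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';

-- With it on, partitions whose CHECK constraints rule out the
-- WHERE clause should no longer appear in the plan.
SET constraint_exclusion = on;
EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
```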
......@@ -2875,7 +2878,7 @@ UNION ALL SELECT * FROM measurement_y2008m01;
<para>
If you are using manual <command>VACUUM</command> or
<command>ANALYZE</command> commands, don't forget that
you need to run them on each partition individually. A command like
you need to run them on each partition individually. A command like:
<programlisting>
ANALYZE measurement;
</programlisting>
......@@ -2903,7 +2906,7 @@ ANALYZE measurement;
<listitem>
<para>
Keep the partitioning constraints simple, else the planner may not be
Keep the partitioning constraints simple or else the planner may not be
able to prove that partitions don't need to be visited. Use simple
equality conditions for list partitioning, or simple
range tests for range partitioning, as illustrated in the preceding
......@@ -2937,7 +2940,7 @@ ANALYZE measurement;
that exist in a database. Many other kinds of objects can be
created to make the use and management of the data more efficient
or convenient. They are not discussed in this chapter, but we give
you a list here so that you are aware of what is possible.
you a list here so that you are aware of what is possible:
</para>
<itemizedlist>
......@@ -2988,7 +2991,7 @@ ANALYZE measurement;
<para>
When you create complex database structures involving many tables
with foreign key constraints, views, triggers, functions, etc. you
will implicitly create a net of dependencies between the objects.
implicitly create a net of dependencies between the objects.
For instance, a table with a foreign key constraint depends on the
table it references.
</para>
......@@ -3008,7 +3011,7 @@ ERROR: cannot drop table products because other objects depend on it
HINT: Use DROP ... CASCADE to drop the dependent objects too.
</screen>
The error message contains a useful hint: if you do not want to
bother deleting all the dependent objects individually, you can run
bother deleting all the dependent objects individually, you can run:
<screen>
DROP TABLE products CASCADE;
</screen>
......@@ -3024,7 +3027,7 @@ DROP TABLE products CASCADE;
the possible dependencies varies with the type of the object. You
can also write <literal>RESTRICT</literal> instead of
<literal>CASCADE</literal> to get the default behavior, which is to
prevent drops of objects that other objects depend on.
prevent the dropping of objects that other objects depend on.
</para>
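<para>
 As a brief sketch, the two behaviors can be requested explicitly for the
 <literal>products</literal> table used above:
</para>

```sql
-- Explicit form of the default: the command fails if any other
-- object (for example a foreign key constraint) depends on products.
DROP TABLE products RESTRICT;

-- Drops the table together with the objects that depend on it.
DROP TABLE products CASCADE;
```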
<note>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/dml.sgml,v 1.17 2007/12/03 23:49:50 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/dml.sgml,v 1.18 2009/04/27 16:27:35 momjian Exp $ -->
<chapter id="dml">
<title>Data Manipulation</title>
......@@ -14,7 +14,7 @@
table data. We also introduce ways to effect automatic data changes
when certain events occur: triggers and rewrite rules. The chapter
after this will finally explain how to extract your long-lost data
back out of the database.
from the database.
</para>
<sect1 id="dml-insert">
......@@ -33,14 +33,14 @@
do before a database can be of much use is to insert data. Data is
conceptually inserted one row at a time. Of course you can also
insert more than one row, but there is no way to insert less than
one row at a time. Even if you know only some column values, a
one row. Even if you know only some column values, a
complete row must be created.
</para>
<para>
To create a new row, use the <xref linkend="sql-insert"
endterm="sql-insert-title"> command. The command requires the
table name and a value for each of the columns of the table. For
table name and column values. For
example, consider the products table from <xref linkend="ddl">:
<programlisting>
CREATE TABLE products (
......@@ -60,7 +60,7 @@ INSERT INTO products VALUES (1, 'Cheese', 9.99);
<para>
The above syntax has the drawback that you need to know the order
of the columns in the table. To avoid that you can also list the
of the columns in the table. To avoid this you can also list the
columns explicitly. For example, both of the following commands
have the same effect as the one above:
<programlisting>
......@@ -137,15 +137,15 @@ INSERT INTO products (product_no, name, price) VALUES
To perform an update, you need three pieces of information:
<orderedlist spacing="compact">
<listitem>
<para>The name of the table and column to update,</para>
<para>The name of the table and column to update</para>
</listitem>
<listitem>
<para>The new value of the column,</para>
<para>The new value of the column</para>
</listitem>
<listitem>
<para>Which row(s) to update.</para>
<para>Which row(s) to update</para>
</listitem>
</orderedlist>
</para>
......@@ -153,10 +153,10 @@ INSERT INTO products (product_no, name, price) VALUES
<para>
Recall from <xref linkend="ddl"> that SQL does not, in general,
provide a unique identifier for rows. Therefore it is not
necessarily possible to directly specify which row to update.
always possible to directly specify which row to update.
Instead, you specify which conditions a row must meet in order to
be updated. Only if you have a primary key in the table (no matter
whether you declared it or not) can you reliably address individual rows,
be updated. Only if you have a primary key in the table (independent of
whether you declared it or not) can you reliably address individual rows
by choosing a condition that matches the primary key.
Graphical database access tools rely on this fact to allow you to
update rows individually.
......@@ -177,7 +177,7 @@ UPDATE products SET price = 10 WHERE price = 5;
<literal>UPDATE</literal> followed by the table name. As usual,
the table name can be schema-qualified, otherwise it is looked up
in the path. Next is the key word <literal>SET</literal> followed
by the column name, an equals sign and the new column value. The
by the column name, an equal sign, and the new column value. The
new column value can be any scalar expression, not just a constant.
For example, if you want to raise the price of all products by 10%
you could use:
......@@ -248,7 +248,10 @@ DELETE FROM products WHERE price = 10;
<programlisting>
DELETE FROM products;
</programlisting>
then all rows in the table will be deleted! Caveat programmer.
then all rows in the table will be deleted! (<xref
linkend="sql-truncate" endterm="sql-truncate-title"> can also be used
to delete all rows.)
Caveat programmer.
</para>
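<para>
 For comparison, a minimal sketch of the two ways to empty the
 <literal>products</literal> table:
</para>

```sql
-- Removes every row; a WHERE clause could be added to restrict the deletion.
DELETE FROM products;

-- Also removes every row, without scanning the table.
-- TRUNCATE does not accept a WHERE clause.
TRUNCATE products;
```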
</sect1>
</chapter>
<!-- $PostgreSQL: pgsql/doc/src/sgml/docguide.sgml,v 1.74 2008/11/03 15:39:38 alvherre Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/docguide.sgml,v 1.75 2009/04/27 16:27:35 momjian Exp $ -->
<appendix id="docguide">
<title>Documentation</title>
......@@ -358,7 +358,7 @@ CATALOG "dsssl/catalog"
Create the directory
<filename>/usr/local/share/sgml/docbook-4.2</filename> and change
to it. (The exact location is irrelevant, but this one is
reasonable within the layout we are following here.)
reasonable within the layout we are following here.):
<screen>
<prompt>$ </prompt><userinput>mkdir /usr/local/share/sgml/docbook-4.2</userinput>
<prompt>$ </prompt><userinput>cd /usr/local/share/sgml/docbook-4.2</userinput>
......@@ -368,7 +368,7 @@ CATALOG "dsssl/catalog"
<step>
<para>
Unpack the archive.
Unpack the archive:
<screen>
<prompt>$ </prompt><userinput>unzip -a ...../docbook-4.2.zip</userinput>
</screen>
......@@ -392,7 +392,7 @@ CATALOG "docbook-4.2/docbook.cat"
<para>
Download the <ulink url="http://www.oasis-open.org/cover/ISOEnts.zip">
ISO 8879 character entities archive</ulink>, unpack it, and put the
files in the same directory you put the DocBook files in.
files in the same directory you put the DocBook files in:
<screen>
<prompt>$ </prompt><userinput>cd /usr/local/share/sgml/docbook-4.2</userinput>
<prompt>$ </prompt><userinput>unzip ...../ISOEnts.zip</userinput>
......@@ -421,7 +421,7 @@ perl -pi -e 's/iso-(.*).gml/ISO\1/g' docbook.cat
To install the style sheets, unzip and untar the distribution and
move it to a suitable place, for example
<filename>/usr/local/share/sgml</filename>. (The archive will
automatically create a subdirectory.)
automatically create a subdirectory.):
<screen>
<prompt>$</prompt> <userinput>gunzip docbook-dsssl-1.<replaceable>xx</>.tar.gz</userinput>
<prompt>$</prompt> <userinput>tar -C /usr/local/share/sgml -xf docbook-dsssl-1.<replaceable>xx</>.tar</userinput>
......@@ -652,7 +652,7 @@ gmake man.tar.gz D2MDIR=<replaceable>directory</replaceable>
<screen>
<prompt>doc/src/sgml$ </prompt><userinput>gmake postgres-A4.pdf</userinput>
</screen>
or
or:
<screen>
<prompt>doc/src/sgml$ </prompt><userinput>gmake postgres-US.pdf</userinput>
</screen>
......@@ -738,7 +738,6 @@ save_size.pdfjadetex = 15000
following one. A utility, <command>fixrtf</command>, is
available in <filename>doc/src/sgml</filename> to accomplish
these repairs:
<screen>
<prompt>doc/src/sgml$ </prompt><userinput>./fixrtf --refentry postgres.rtf</userinput>
</screen>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/ecpg.sgml,v 1.87 2008/12/07 23:46:39 alvherre Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/ecpg.sgml,v 1.88 2009/04/27 16:27:35 momjian Exp $ -->
<chapter id="ecpg">
<title><application>ECPG</application> - Embedded <acronym>SQL</acronym> in C</title>
......@@ -750,7 +750,7 @@ EXEC SQL DEALLOCATE PREPARE <replaceable>name</replaceable>;
<para>
The pgtypes library maps <productname>PostgreSQL</productname> database
types to C equivalents that can be used in C programs. It also offers
functions to do basic calculations with those types within C, i.e. without
functions to do basic calculations with those types within C, i.e., without
the help of the <productname>PostgreSQL</productname> server. See the
following example:
<programlisting><![CDATA[
......@@ -1232,7 +1232,7 @@ date PGTYPESdate_from_asc(char *str, char **endptr);
char *PGTYPESdate_to_asc(date dDate);
</synopsis>
The function receives the date <literal>dDate</> as its only parameter.
It will output the date in the form <literal>1999-01-18</>, i.e. in the
It will output the date in the form <literal>1999-01-18</>, i.e., in the
<literal>YYYY-MM-DD</> format.
</para>
</listitem>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.34 2008/11/19 04:46:37 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.35 2009/04/27 16:27:35 momjian Exp $ -->
<chapter id="high-availability">
<title>High Availability, Load Balancing, and Replication</title>
......@@ -414,7 +414,7 @@ protocol to make nodes agree on a serializable transactional order.
<para>
Data partitioning splits tables into data sets. Each set can
be modified by only one server. For example, data can be
partitioned by offices, e.g. London and Paris, with a server
partitioned by offices, e.g., London and Paris, with a server
in each office. If queries combining London and Paris data
are necessary, an application can query both servers, or
master/slave replication can be used to keep a read-only copy
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/history.sgml,v 1.30 2007/10/30 23:06:06 petere Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/history.sgml,v 1.31 2009/04/27 16:27:35 momjian Exp $ -->
<sect1 id="history">
<title>A Brief History of <productname>PostgreSQL</productname></title>
......@@ -12,7 +12,7 @@
The object-relational database management system now known as
<productname>PostgreSQL</productname> is derived from the
<productname>POSTGRES</productname> package written at the
University of California at Berkeley. With over a decade of
University of California at Berkeley. With over two decades of
development behind it, <productname>PostgreSQL</productname> is now
the most advanced open-source database available anywhere.
</para>
......@@ -93,7 +93,7 @@
</indexterm>
<para>
In 1994, Andrew Yu and Jolly Chen added a SQL language interpreter
In 1994, Andrew Yu and Jolly Chen added an SQL language interpreter
to <productname>POSTGRES</productname>. Under a new name,
<productname>Postgres95</productname> was subsequently released to
the web to find its own way in the world as an open-source
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.76 2009/02/07 20:05:44 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.77 2009/04/27 16:27:35 momjian Exp $ -->
<chapter id="indexes">
<title id="indexes-title">Indexes</title>
......@@ -27,35 +27,35 @@ CREATE TABLE test1 (
content varchar
);
</programlisting>
and the application requires a lot of queries of the form:
and the application issues many queries of the form:
<programlisting>
SELECT content FROM test1 WHERE id = <replaceable>constant</replaceable>;
</programlisting>
With no advance preparation, the system would have to scan the entire
<structname>test1</structname> table, row by row, to find all
matching entries. If there are a lot of rows in
<structname>test1</structname> and only a few rows (perhaps only zero
or one) that would be returned by such a query, then this is clearly an
inefficient method. But if the system has been instructed to maintain an
index on the <structfield>id</structfield> column, then it can use a more
matching entries. If there are many rows in
<structname>test1</structname> and only a few rows (perhaps zero
or one) that would be returned by such a query, this is clearly an
inefficient method. But if the system maintains an
index on the <structfield>id</structfield> column, it can use a more
efficient method for locating matching rows. For instance, it
might only have to walk a few levels deep into a search tree.
</para>
<para>
A similar approach is used in most books of non-fiction: terms and
A similar approach is used in most non-fiction books: terms and
concepts that are frequently looked up by readers are collected in
an alphabetic index at the end of the book. The interested reader
can scan the index relatively quickly and flip to the appropriate
page(s), rather than having to read the entire book to find the
material of interest. Just as it is the task of the author to
anticipate the items that the readers are likely to look up,
anticipate the items that readers are likely to look up,
it is the task of the database programmer to foresee which indexes
will be of advantage.
will be useful.
</para>
<para>
The following command would be used to create the index on the
The following command can be used to create an index on the
<structfield>id</structfield> column, as discussed:
<programlisting>
CREATE INDEX test1_id_index ON test1 (id);
......@@ -73,7 +73,7 @@ CREATE INDEX test1_id_index ON test1 (id);
<para>
Once an index is created, no further intervention is required: the
system will update the index when the table is modified, and it will
use the index in queries when it thinks this would be more efficient
use the index in queries when it thinks it would be more efficient
than a sequential table scan. But you might have to run the
<command>ANALYZE</command> command regularly to update
statistics to allow the query planner to make educated decisions.
......@@ -87,14 +87,14 @@ CREATE INDEX test1_id_index ON test1 (id);
<command>DELETE</command> commands with search conditions.
Indexes can moreover be used in join searches. Thus,
an index defined on a column that is part of a join condition can
significantly speed up queries with joins.
also significantly speed up queries with joins.
</para>
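<para>
As a minimal sketch of the join case, assume a hypothetical second
table <structname>test1_notes</structname> that refers to
<structname>test1</structname> by <structfield>id</structfield> (this
table is an assumption made for illustration and is not part of the
tutorial data). An index on the join column gives the planner the
option of looking up matching rows directly; whether it uses the index
depends on its cost estimates:
<programlisting>
-- Hypothetical table, for illustration only.
CREATE TABLE test1_notes (
    test1_id integer,
    note varchar
);

-- Index on the column used in the join condition.
CREATE INDEX test1_notes_test1_id_idx ON test1_notes (test1_id);

SELECT t.content, n.note
    FROM test1 t JOIN test1_notes n ON n.test1_id = t.id
    WHERE t.id = 42;
</programlisting>
</para>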
<para>
Creating an index on a large table can take a long time. By default,
<productname>PostgreSQL</productname> allows reads (selects) to occur
on the table in parallel with creation of an index, but writes (inserts,
updates, deletes) are blocked until the index build is finished.
on the table in parallel with index creation, but writes (INSERTs,
UPDATEs, DELETEs) are blocked until the index build is finished.
In production environments this is often unacceptable.
It is possible to allow writes to occur in parallel with index
creation, but there are several caveats to be aware of &mdash;
......@@ -118,8 +118,8 @@ CREATE INDEX test1_id_index ON test1 (id);
<productname>PostgreSQL</productname> provides several index types:
B-tree, Hash, GiST and GIN. Each index type uses a different
algorithm that is best suited to different types of queries.
By default, the <command>CREATE INDEX</command> command will create a
B-tree index, which fits the most common situations.
By default, the <command>CREATE INDEX</command> command creates
B-tree indexes, which fit the most common situations.
</para>
<para>
......@@ -159,11 +159,11 @@ CREATE INDEX test1_id_index ON test1 (id);
'foo%'</literal> or <literal>col ~ '^foo'</literal>, but not
<literal>col LIKE '%bar'</literal>. However, if your database does not
use the C locale you will need to create the index with a special
operator class to support indexing of pattern-matching queries. See
operator class to support indexing of pattern-matching queries; see
<xref linkend="indexes-opclass"> below. It is also possible to use
B-tree indexes for <literal>ILIKE</literal> and
<literal>~*</literal>, but only if the pattern starts with
non-alphabetic characters, i.e. characters that are not affected by
non-alphabetic characters, i.e., characters that are not affected by
upper/lower case conversion.
</para>
......@@ -180,7 +180,7 @@ CREATE INDEX test1_id_index ON test1 (id);
Hash indexes can only handle simple equality comparisons.
The query planner will consider using a hash index whenever an
indexed column is involved in a comparison using the
<literal>=</literal> operator. (But hash indexes do not support
<literal>=</literal> operator. (Hash indexes do not support
<literal>IS NULL</> searches.)
The following command is used to create a hash index:
<synopsis>
......@@ -290,11 +290,11 @@ CREATE TABLE test2 (
);
</programlisting>
(say, you keep your <filename class="directory">/dev</filename>
directory in a database...) and you frequently make queries like:
directory in a database...) and you frequently issue queries like:
<programlisting>
SELECT name FROM test2 WHERE major = <replaceable>constant</replaceable> AND minor = <replaceable>constant</replaceable>;
</programlisting>
then it might be appropriate to define an index on the columns
then it might be appropriate to define an index on columns
<structfield>major</structfield> and
<structfield>minor</structfield> together, e.g.:
<programlisting>
......@@ -359,7 +359,7 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
Indexes with more than three columns are unlikely to be helpful
unless the usage of the table is extremely stylized. See also
<xref linkend="indexes-bitmap-scans"> for some discussion of the
merits of different index setups.
merits of different index configurations.
</para>
</sect1>
......@@ -375,7 +375,7 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
<para>
In addition to simply finding the rows to be returned by a query,
an index may be able to deliver them in a specific sorted order.
This allows a query's <literal>ORDER BY</> specification to be met
This allows a query's <literal>ORDER BY</> specification to be honored
without a separate sorting step. Of the index types currently
supported by <productname>PostgreSQL</productname>, only B-tree
can produce sorted output &mdash; the other index types return
......@@ -384,22 +384,23 @@ CREATE INDEX test2_mm_idx ON test2 (major, minor);
<para>
The planner will consider satisfying an <literal>ORDER BY</> specification
either by scanning any available index that matches the specification,
by either scanning an available index that matches the specification,
or by scanning the table in physical order and doing an explicit
sort. For a query that requires scanning a large fraction of the
table, the explicit sort is likely to be faster because it requires
less disk I/O due to a better-ordered access pattern. Indexes are
table, the explicit sort is likely to be faster than using an index
because it requires
less disk I/O due to a sequential access pattern. Indexes are
more useful when only a few rows need be fetched. An important
special case is <literal>ORDER BY</> in combination with
<literal>LIMIT</> <replaceable>n</>: an explicit sort will have to process
all the data to identify the first <replaceable>n</> rows, but if there is
an index matching the <literal>ORDER BY</> then the first <replaceable>n</>
all data to identify the first <replaceable>n</> rows, but if there is
an index matching the <literal>ORDER BY</>, the first <replaceable>n</>
rows can be retrieved directly, without scanning the remainder at all.
</para>
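<para>
The <literal>LIMIT</literal> case can be sketched as follows, assuming
a hypothetical table <structname>events</structname> with a
<structfield>created</structfield> column (both names are assumptions
made for illustration). With a B-tree index on that column, the planner
can read the first ten entries from the index rather than sorting the
whole table; <command>EXPLAIN</command> shows which plan it actually
chooses:
<programlisting>
-- Hypothetical table and index, for illustration only.
CREATE TABLE events (
    id integer,
    created timestamp
);
CREATE INDEX events_created_idx ON events (created);

-- A forward scan of events_created_idx can satisfy this ordering.
EXPLAIN SELECT * FROM events ORDER BY created LIMIT 10;
</programlisting>
</para>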
<para>
By default, B-tree indexes store their entries in ascending order
with nulls last. This means that a forward scan of an index on a
with nulls last. This means that a forward scan of an index on
column <literal>x</> produces output satisfying <literal>ORDER BY x</>
(or more verbosely, <literal>ORDER BY x ASC NULLS LAST</>). The
index can also be scanned backward, producing output satisfying
......@@ -432,14 +433,14 @@ CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST);
<literal>ORDER BY x DESC, y DESC</> if we scan backward.
But it might be that the application frequently needs to use
<literal>ORDER BY x ASC, y DESC</>. There is no way to get that
ordering from a regular index, but it is possible if the index is defined
ordering from a simpler index, but it is possible if the index is defined
as <literal>(x ASC, y DESC)</> or <literal>(x DESC, y ASC)</>.
</para>
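<para>
A sketch of such a mixed-order index, assuming a hypothetical table
<structname>test4</structname> with columns
<structfield>x</structfield> and <structfield>y</structfield> (the
table and index names are assumptions made for illustration):
<programlisting>
-- Hypothetical table, for illustration only.
CREATE TABLE test4 (
    x integer,
    y integer
);
CREATE INDEX test4_x_asc_y_desc_idx ON test4 (x ASC, y DESC);

-- Matches a forward scan of the index.
SELECT * FROM test4 ORDER BY x ASC, y DESC;

-- Matches a backward scan of the same index.
SELECT * FROM test4 ORDER BY x DESC, y ASC;
</programlisting>
</para>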
<para>
Obviously, indexes with non-default sort orderings are a fairly
specialized feature, but sometimes they can produce tremendous
speedups for certain queries. Whether it's worth keeping such an
speedups for certain queries. Whether it's worth creating such an
index depends on how often you use queries that require a special
sort ordering.
</para>
......@@ -468,7 +469,7 @@ CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST);
</para>
<para>
Beginning in release 8.1,
Fortunately,
<productname>PostgreSQL</> has the ability to combine multiple indexes
(including multiple uses of the same index) to handle cases that cannot
be implemented by single index scans. The system can form <literal>AND</>
......@@ -513,7 +514,7 @@ CREATE INDEX test3_desc_index ON test3 (id DESC NULLS LAST);
more efficient than index combination for queries involving both
columns, but as discussed in <xref linkend="indexes-multicolumn">, it
would be almost useless for queries involving only <literal>y</>, so it
could not be the only index. A combination of the multicolumn index
should not be the only index. A combination of the multicolumn index
and a separate index on <literal>y</> would serve reasonably well. For
queries involving only <literal>x</>, the multicolumn index could be
used, though it would be larger and hence slower than an index on
......@@ -547,16 +548,16 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
<para>
When an index is declared unique, multiple table rows with equal
indexed values will not be allowed. Null values are not considered
indexed values are not allowed. Null values are not considered
equal. A multicolumn unique index will only reject cases where all
of the indexed columns are equal in two rows.
indexed columns are equal in multiple rows.
</para>
<para>
<productname>PostgreSQL</productname> automatically creates a unique
index when a unique constraint or a primary key is defined for a table.
index when a unique constraint or primary key is defined for a table.
The index covers the columns that make up the primary key or unique
columns (a multicolumn index, if appropriate), and is the mechanism
constraint (a multicolumn index, if appropriate), and is the mechanism
that enforces the constraint.
</para>
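<para>
One way to observe the automatically created indexes is to query the
<structname>pg_indexes</structname> system view after creating a table
with constraints. The table <structname>customers</structname> below is
a hypothetical example and not part of the tutorial data:
<programlisting>
-- Hypothetical table, for illustration only.
CREATE TABLE customers (
    customer_id integer PRIMARY KEY,
    email varchar UNIQUE
);

-- Lists the indexes created to enforce the two constraints.
SELECT indexname, indexdef
    FROM pg_indexes
    WHERE tablename = 'customers';
</programlisting>
</para>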
......@@ -583,9 +584,9 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
</indexterm>
<para>
An index column need not be just a column of the underlying table,
An index column need not be just a column of an underlying table,
but can be a function or scalar expression computed from one or
more columns of the table. This feature is useful to obtain fast
more columns of a table. This feature is useful to obtain fast
access to tables based on the results of computations.
</para>
......@@ -595,9 +596,9 @@ CREATE UNIQUE INDEX <replaceable>name</replaceable> ON <replaceable>table</repla
<programlisting>
SELECT * FROM test1 WHERE lower(col1) = 'value';
</programlisting>
This query can use an index, if one has been
This query can use an index if one has been
defined on the result of the <literal>lower(col1)</literal>
operation:
function:
<programlisting>
CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));
</programlisting>
......@@ -612,7 +613,7 @@ CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));
</para>
<para>
As another example, if one often does queries like this:
As another example, if one often does queries like:
<programlisting>
SELECT * FROM people WHERE (first_name || ' ' || last_name) = 'John Smith';
</programlisting>
......@@ -655,7 +656,7 @@ CREATE INDEX people_names ON people ((first_name || ' ' || last_name));
A <firstterm>partial index</firstterm> is an index built over a
subset of a table; the subset is defined by a conditional
expression (called the <firstterm>predicate</firstterm> of the
partial index). The index contains entries for only those table
partial index). The index contains entries only for those table
rows that satisfy the predicate. Partial indexes are a specialized
feature, but there are several situations in which they are useful.
</para>
......@@ -665,8 +666,8 @@ CREATE INDEX people_names ON people ((first_name || ' ' || last_name));
values. Since a query searching for a common value (one that
accounts for more than a few percent of all the table rows) will not
use the index anyway, there is no point in keeping those rows in the
index at all. This reduces the size of the index, which will speed
up queries that do use the index. It will also speed up many table
index. A partial index reduces the size of the index, which speeds
up queries that use the index. It will also speed up many table
update operations because the index does not need to be
updated in all cases. <xref linkend="indexes-partial-ex1"> shows a
possible application of this idea.
......@@ -700,39 +701,43 @@ CREATE TABLE access_log (
such as this:
<programlisting>
CREATE INDEX access_log_client_ip_ix ON access_log (client_ip)
WHERE NOT (client_ip &gt; inet '192.168.100.0' AND client_ip &lt; inet '192.168.100.255');
WHERE NOT (client_ip &gt; inet '192.168.100.0' AND
client_ip &lt; inet '192.168.100.255');
</programlisting>
</para>
<para>
A typical query that can use this index would be:
<programlisting>
SELECT * FROM access_log WHERE url = '/index.html' AND client_ip = inet '212.78.10.32';
SELECT *
FROM access_log
WHERE url = '/index.html' AND client_ip = inet '212.78.10.32';
</programlisting>
A query that cannot use this index is:
<programlisting>
SELECT * FROM access_log WHERE client_ip = inet '192.168.100.23';
SELECT *
FROM access_log
WHERE client_ip = inet '192.168.100.23';
</programlisting>
</para>
<para>
Observe that this kind of partial index requires that the common
values be predetermined. If the distribution of values is
inherent (due to the nature of the application) and static (not
changing over time), this is not difficult, but if the common values are
merely due to the coincidental data load this can require a lot of
maintenance work to change the index definition from time to time.
values be predetermined, so such partial indexes are best used for
data distributions that do not change.  The indexes can be recreated
occasionally to adjust for new data distributions, but this adds
maintenance overhead.
</para>
</example>
<para>
Another possible use for a partial index is to exclude values from the
Another possible use for partial indexes is to exclude values from the
index that the
typical query workload is not interested in; this is shown in <xref
linkend="indexes-partial-ex2">. This results in the same
advantages as listed above, but it prevents the
<quote>uninteresting</quote> values from being accessed via that
index at all, even if an index scan might be profitable in that
index, even if an index scan might be profitable in that
case. Obviously, setting up partial indexes for this kind of
scenario will require a lot of care and experimentation.
</para>
......@@ -774,7 +779,7 @@ SELECT * FROM orders WHERE billed is not true AND amount &gt; 5000.00;
<programlisting>
SELECT * FROM orders WHERE order_nr = 3501;
</programlisting>
The order 3501 might be among the billed or among the unbilled
The order 3501 might be among the billed or unbilled
orders.
</para>
</example>
......@@ -799,9 +804,9 @@ SELECT * FROM orders WHERE order_nr = 3501;
<quote>x &lt; 1</quote> implies <quote>x &lt; 2</quote>; otherwise
the predicate condition must exactly match part of the query's
<literal>WHERE</> condition
or the index will not be recognized to be usable. Matching takes
or the index will not be recognized as usable. Matching takes
place at query planning time, not at run time. As a result,
parameterized query clauses will not work with a partial index. For
parameterized query clauses do not work with a partial index. For
example, a prepared query with a parameter might specify
<quote>x &lt; ?</quote> which will never imply
<quote>x &lt; 2</quote> for all possible values of the parameter.
......@@ -835,7 +840,7 @@ CREATE TABLE tests (
CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
WHERE success;
</programlisting>
This is a particularly efficient way of doing it when there are few
This is a particularly efficient approach when there are few
successful tests and many unsuccessful ones.
</para>
</example>
......@@ -859,7 +864,7 @@ CREATE UNIQUE INDEX tests_success_constraint ON tests (subject, target)
know when an index might be profitable. Forming this knowledge
requires experience and understanding of how indexes in
<productname>PostgreSQL</> work. In most cases, the advantage of a
partial index over a regular index will not be much.
partial index over a regular index will be minimal.
</para>
<para>
......@@ -892,7 +897,7 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
would use the <literal>int4_ops</literal> class; this operator
class includes comparison functions for values of type <type>int4</type>.
In practice the default operator class for the column's data type is
usually sufficient. The main point of having operator classes is
usually sufficient. The main reason for having operator classes is
that for some data types, there could be more than one meaningful
index behavior. For example, we might want to sort a complex-number data
type either by absolute value or by real part. We could do this by
......@@ -931,7 +936,7 @@ CREATE INDEX test_index ON test_table (col varchar_pattern_ops);
to use an index. Such queries cannot use the
<literal><replaceable>xxx</replaceable>_pattern_ops</literal>
operator classes. (Ordinary equality comparisons can use these
operator classes, however.) It is allowed to create multiple
operator classes, however.) It is possible to create multiple
indexes on the same column with different operator classes.
If you do use the C locale, you do not need the
<literal><replaceable>xxx</replaceable>_pattern_ops</literal>
......@@ -990,7 +995,7 @@ SELECT am.amname AS index_method,
<para>
Although indexes in <productname>PostgreSQL</> do not need
maintenance and tuning, it is still important to check
maintenance or tuning, it is still important to check
which indexes are actually used by the real-life query workload.
Examining index usage for an individual query is done with the
<xref linkend="sql-explain" endterm="sql-explain-title">
......@@ -1002,10 +1007,10 @@ SELECT am.amname AS index_method,
<para>
It is difficult to formulate a general procedure for determining
which indexes to set up. There are a number of typical cases that
which indexes to create. There are a number of typical cases that
have been shown in the examples throughout the previous sections.
A good deal of experimentation will be necessary in most cases.
The rest of this section gives some tips for that.
A good deal of experimentation is often necessary.
The rest of this section gives some tips for that:
</para>
<itemizedlist>
......@@ -1014,7 +1019,7 @@ SELECT am.amname AS index_method,
Always run <xref linkend="sql-analyze" endterm="sql-analyze-title">
first. This command
collects statistics about the distribution of the values in the
table. This information is required to guess the number of rows
table. This information is required to estimate the number of rows
returned by a query, which is needed by the planner to assign
realistic costs to each possible query plan. In the absence of any
real statistics, some default values are assumed, which are
......@@ -1035,13 +1040,13 @@ SELECT am.amname AS index_method,
It is especially fatal to use very small test data sets.
While selecting 1000 out of 100000 rows could be a candidate for
an index, selecting 1 out of 100 rows will hardly be, because the
100 rows will probably fit within a single disk page, and there
100 rows probably fit within a single disk page, and there
is no plan that can beat sequentially fetching 1 disk page.
</para>
<para>
Also be careful when making up test data, which is often
unavoidable when the application is not in production use yet.
unavoidable when the application is not yet in production.
Values that are very similar, completely random, or inserted in
sorted order will skew the statistics away from the distribution
that real data would have.
......@@ -1058,7 +1063,7 @@ SELECT am.amname AS index_method,
(<varname>enable_nestloop</>), which are the most basic plans,
will force the system to use a different plan. If the system
still chooses a sequential scan or nested-loop join then there is
probably a more fundamental reason why the index is not
probably a more fundamental reason why the index is not being
used; for example, the query condition does not match the index.
(What kind of query can use what kind of index is explained in
the previous sections.)
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/info.sgml,v 1.26 2008/01/09 02:37:45 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/info.sgml,v 1.27 2009/04/27 16:27:35 momjian Exp $ -->
<sect1 id="resources">
<title>Further Information</title>
......@@ -8,12 +8,17 @@
resources about <productname>PostgreSQL</productname>:
<variablelist>
<varlistentry>
<term>FAQs</term>
<term>Wiki</term>
<listitem>
<para>
The FAQ list <indexterm><primary>FAQ</></> contains
continuously updated answers to frequently asked questions.
The <productname>PostgreSQL</productname> <ulink
url="http://wiki.postgresql.org">wiki</ulink> contains the project's <ulink
url="http://wiki.postgresql.org/wiki/Frequently_Asked_Questions">FAQ</>
(Frequently Asked Questions) list, <ulink
url="http://wiki.postgresql.org/wiki/Todo">TODO</> list, and
detailed information about many more topics.
</para>
</listitem>
</varlistentry>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/install-win32.sgml,v 1.51 2009/01/09 13:37:18 petere Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/install-win32.sgml,v 1.52 2009/04/27 16:27:35 momjian Exp $ -->
<chapter id="install-win32">
<title>Installation from Source Code on <productname>Windows</productname></title>
......@@ -383,7 +383,7 @@
<para>
To build the <application>libpq</application> client library using
<productname>Visual Studio 7.1 or later</productname>, change into the
<filename>src</filename> directory and type the command
<filename>src</filename> directory and type the command:
<screen>
<userinput>nmake /f win32.mak</userinput>
</screen>
......@@ -392,7 +392,7 @@
To build a 64-bit version of the <application>libpq</application>
client library using <productname>Visual Studio 8.0 or
later</productname>, change into the <filename>src</filename>
directory and type in the command
directory and type in the command:
<screen>
<userinput>nmake /f win32.mak CPU=AMD64</userinput>
</screen>
......@@ -403,7 +403,7 @@
<para>
To build the <application>libpq</application> client library using
<productname>Borland C++</productname>, change into the
<filename>src</filename> directory and type the command
<filename>src</filename> directory and type the command:
<screen>
<userinput>make -N -DCFG=Release /f bcc32.mak</userinput>
</screen>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/installation.sgml,v 1.320 2009/03/23 01:52:38 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/installation.sgml,v 1.321 2009/04/27 16:27:35 momjian Exp $ -->
<chapter id="installation">
<title><![%standalone-include[<productname>PostgreSQL</>]]>
......@@ -11,7 +11,7 @@
<para>
This <![%standalone-include;[document]]>
<![%standalone-ignore;[chapter]]> describes the installation of
<productname>PostgreSQL</productname> from the source code
<productname>PostgreSQL</productname> using the source code
distribution. (If you are installing a pre-packaged distribution,
such as an RPM or Debian package, ignore this
<![%standalone-include;[document]]>
......@@ -75,7 +75,7 @@ su - postgres
refer to it by that name. (On some systems
<acronym>GNU</acronym> <application>make</> is the default tool with the name
<filename>make</>.) To test for <acronym>GNU</acronym>
<application>make</application> enter
<application>make</application> enter:
<screen>
<userinput>gmake --version</userinput>
</screen>
......@@ -85,9 +85,10 @@ su - postgres
<listitem>
<para>
You need an <acronym>ISO</>/<acronym>ANSI</> C compiler. Recent
You need an <acronym>ISO</>/<acronym>ANSI</> C compiler (minimum
C89-compliant). Recent
versions of <productname>GCC</> are recommended, but
<productname>PostgreSQL</> is known to build with a wide variety
<productname>PostgreSQL</> is known to build using a wide variety
of compilers from different vendors.
</para>
</listitem>
......@@ -95,7 +96,7 @@ su - postgres
<listitem>
<para>
<application>tar</> is required to unpack the source
distribution in the first place, in addition to either
distribution, in addition to either
<application>gzip</> or <application>bzip2</>. In
addition, <application>gzip</> is required to install the
documentation.
......@@ -117,7 +118,7 @@ su - postgres
command you type, and allows you to use arrow keys to recall and
edit previous commands. This is very helpful and is strongly
recommended. If you don't want to use it then you must specify
the <option>--without-readline</option> option for
the <option>--without-readline</option> option of
<filename>configure</>. As an alternative, you can often use the
BSD-licensed <filename>libedit</filename> library, originally
developed on <productname>NetBSD</productname>. The
......@@ -140,7 +141,7 @@ su - postgres
The <productname>zlib</productname> compression library will be
used by default. If you don't want to use it then you must
specify the <option>--without-zlib</option> option for
specify the <option>--without-zlib</option> option to
<filename>configure</filename>. Using this option disables
support for compressed archives in <application>pg_dump</> and
<application>pg_restore</>.
......@@ -152,7 +153,7 @@ su - postgres
<para>
The following packages are optional. They are not required in the
default configuration, but they are needed when certain build
options are enabled, as explained below.
options are enabled, as explained below:
<itemizedlist>
<listitem>
......@@ -172,7 +173,8 @@ su - postgres
<para>
If you don't have the shared library but you need one, a message
like this will appear during the build to point out this fact:
like this will appear during the <productname>PostgreSQL</>
build to point out this fact:
<screen>
*** Cannot build PL/Perl because libperl is not a shared library.
*** You might have to rebuild your Perl installation. Refer to
......@@ -206,7 +208,7 @@ su - postgres
<filename>libpython</filename> library must be a shared library
also on most platforms. This is not the case in a default
<productname>Python</productname> installation. If after
building and installing you have a file called
building and installing <productname>PostgreSQL</> you have a file called
<filename>plpython.so</filename> (possibly a different
extension), then everything went well. Otherwise you should
have seen a notice like this flying by:
......@@ -216,7 +218,7 @@ su - postgres
*** the documentation for details.
</screen>
That means you have to rebuild (part of) your
<productname>Python</productname> installation to supply this
<productname>Python</productname> installation to create this
shared library.
</para>
......@@ -272,7 +274,7 @@ su - postgres
<para>
If you are building from a <acronym>CVS</acronym> tree instead of
using a released source package, or if you want to do development,
using a released source package, or if you want to do server development,
you also need the following packages:
<itemizedlist>
......@@ -314,7 +316,7 @@ su - postgres
Also check that you have sufficient disk space. You will need about
65 MB for the source tree during compilation and about 15 MB for
the installation directory. An empty database cluster takes about
25 MB, databases take about five times the amount of space that a
25 MB; databases take about five times the amount of space that a
flat text file with the same data would take. If you are going to
run the regression tests you will temporarily need up to an extra
90 MB. Use the <command>df</command> command to check free disk
......@@ -420,7 +422,7 @@ su - postgres
On systems that have <productname>PostgreSQL</> started at boot time,
there is probably a start-up file that will accomplish the same thing. For
example, on a <systemitem class="osname">Red Hat Linux</> system one
might find that
might find that:
<screen>
<userinput>/etc/rc.d/init.d/postgresql stop</userinput>
</screen>
......@@ -469,7 +471,7 @@ su - postgres
<step>
<para>
Start the database server, again from the special database user
Start the database server, again using the special database user
account:
<programlisting>
<userinput>/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data</>
......@@ -479,7 +481,7 @@ su - postgres
<step>
<para>
Finally, restore your data from backup with
Finally, restore your data from backup with:
<screen>
<userinput>/usr/local/pgsql/bin/psql -d postgres -f <replaceable>outputfile</></userinput>
</screen>
......@@ -514,12 +516,12 @@ su - postgres
The first step of the installation procedure is to configure the
source tree for your system and choose the options you would like.
This is done by running the <filename>configure</> script. For a
default installation simply enter
default installation simply enter:
<screen>
<userinput>./configure</userinput>
</screen>
This script will run a number of tests to guess values for various
system dependent variables and detect some quirks of your
This script will run a number of tests to determine values for various
system dependent variables and detect any quirks of your
operating system, and finally will create several files in the
build tree to record what it found. (You can also run
<filename>configure</filename> in a directory outside the source
......@@ -719,7 +721,7 @@ su - postgres
internal header files and the server header files are installed
into private directories under <varname>includedir</varname>. See
the documentation of each interface for information about how to
get at the its header files. Finally, a private subdirectory will
access its header files. Finally, a private subdirectory will
also be created, if appropriate, under <varname>libdir</varname>
for dynamically loadable modules.
</para>
......@@ -769,7 +771,7 @@ su - postgres
Enables Native Language Support (<acronym>NLS</acronym>),
that is, the ability to display a program's messages in a
language other than English.
<replaceable>LANGUAGES</replaceable> is a space-separated
<replaceable>LANGUAGES</replaceable> is an optional space-separated
list of codes of the languages that you want supported, for
example <literal>--enable-nls='de fr'</>. (The intersection
between your list and the set of actually provided
......@@ -927,11 +929,11 @@ su - postgres
and libpq]]><![%standalone-ignore[<xref linkend="libpq-ldap"> and
<xref linkend="auth-ldap">]]> for more information). On Unix,
this requires the <productname>OpenLDAP</> package to be
installed. <filename>configure</> will check for the required
installed. On Windows, the default <productname>WinLDAP</>
library is used. <filename>configure</> will check for the required
header files and libraries to make sure that your
<productname>OpenLDAP</> installation is sufficient before
proceeding. On Windows, the default <productname>WinLDAP</>
library is used.
proceeding.
</para>
</listitem>
</varlistentry>
......@@ -1225,7 +1227,7 @@ su - postgres
<listitem>
<para>
Compiles all programs and libraries with debugging symbols.
This means that you can run the programs through a debugger
This means that you can run the programs in a debugger
to analyze problems. This enlarges the size of the installed
executables considerably, and on non-GCC compilers it usually
also disables compiler optimization, causing slowdowns. However,
......@@ -1293,7 +1295,7 @@ su - postgres
be rebuilt when any header file is changed. This is useful
if you are doing development work, but is just wasted overhead
if you intend only to compile once and install. At present,
this option will work only if you use GCC.
this option only works with GCC.
</para>
</listitem>
</varlistentry>
......@@ -1510,13 +1512,13 @@ su - postgres
<title>Build</title>
<para>
To start the build, type
To start the build, type:
<screen>
<userinput>gmake</userinput>
</screen>
(Remember to use <acronym>GNU</> <application>make</>.) The build
will take a few minutes depending on your
hardware. The last line displayed should be
hardware. The last line displayed should be:
<screen>
All of PostgreSQL is successfully made. Ready to install.
</screen>
......@@ -1535,7 +1537,7 @@ All of PostgreSQL is successfully made. Ready to install.
you can run the regression tests at this point. The regression
tests are a test suite to verify that <productname>PostgreSQL</>
runs on your machine in the way the developers expected it
to. Type
to. Type:
<screen>
<userinput>gmake check</userinput>
</screen>
......@@ -1550,7 +1552,7 @@ All of PostgreSQL is successfully made. Ready to install.
</step>
<step id="install">
<title>Installing The Files</title>
<title>Installing the Files</title>
<note>
<para>
......@@ -1562,14 +1564,14 @@ All of PostgreSQL is successfully made. Ready to install.
</note>
<para>
To install <productname>PostgreSQL</> enter
To install <productname>PostgreSQL</> enter:
<screen>
<userinput>gmake install</userinput>
</screen>
This will install files into the directories that were specified
in <xref linkend="configure">. Make sure that you have appropriate
permissions to write into that area. Normally you need to do this
step as root. Alternatively, you could create the target
step as root. Alternatively, you can create the target
directories in advance and arrange for appropriate permissions to
be granted.
</para>
......@@ -1639,14 +1641,14 @@ All of PostgreSQL is successfully made. Ready to install.
<title>Cleaning:</title>
<para>
After the installation you can make room by removing the built
After the installation you can free disk space by removing the built
files from the source tree with the command <command>gmake
clean</>. This will preserve the files made by the <command>configure</command>
program, so that you can rebuild everything with <command>gmake</>
later on. To reset the source tree to the state in which it was
distributed, use <command>gmake distclean</>. If you are going to
build for several platforms within the same source tree you must do
this and re-configure for each build. (Alternatively, use
this and rebuild for each platform. (Alternatively, use
a separate build tree for each platform, so that the source tree
remains unmodified.)
</para>
......@@ -1673,8 +1675,8 @@ All of PostgreSQL is successfully made. Ready to install.
</indexterm>
<para>
On some systems that have shared libraries (which most systems do)
you need to tell your system how to find the newly installed
On several systems with shared libraries
you need to tell the system how to find the newly installed
shared libraries. The systems on which this is
<emphasis>not</emphasis> necessary include <systemitem
class="osname">BSD/OS</>, <systemitem class="osname">FreeBSD</>,
......@@ -1688,7 +1690,7 @@ All of PostgreSQL is successfully made. Ready to install.
<para>
The method to set the shared library search path varies between
platforms, but the most widely usable method is to set the
platforms, but the most widely-used method is to set the
environment variable <envar>LD_LIBRARY_PATH</> like so: In Bourne
shells (<command>sh</>, <command>ksh</>, <command>bash</>, <command>zsh</>):
<programlisting>
......@@ -1724,7 +1726,7 @@ setenv LD_LIBRARY_PATH /usr/local/pgsql/lib
<para>
If in doubt, refer to the manual pages of your system (perhaps
<command>ld.so</command> or <command>rld</command>). If you later
on get a message like
get a message like:
<screen>
psql: error in loading shared libraries
libpq.so.2.1: cannot open shared object file: No such file or directory
......@@ -1776,7 +1778,7 @@ libpq.so.2.1: cannot open shared object file: No such file or directory
<para>
To do this, add the following to your shell start-up file, such as
<filename>~/.bash_profile</> (or <filename>/etc/profile</>, if you
want it to affect every user):
want it to affect all users):
<programlisting>
PATH=/usr/local/pgsql/bin:$PATH
export PATH
......@@ -1807,7 +1809,7 @@ export MANPATH
server, overriding the compiled-in defaults. If you are going to
run client applications remotely then it is convenient if every
user that plans to use the database sets <envar>PGHOST</>. This
is not required, however: the settings can be communicated via command
is not required, however; the settings can be communicated via command
line options to most client programs.
</para>
</sect2>
......@@ -1902,7 +1904,7 @@ kill `cat /usr/local/pgsql/data/postmaster.pid`
<screen>
<userinput>createdb testdb</>
</screen>
Then enter
Then enter:
<screen>
<userinput>psql testdb</>
</screen>
......@@ -2950,7 +2952,7 @@ LIBOBJS = snprintf.o
<para>
If you see the linking of the postgres executable abort with an
error message like
error message like:
<screen>
Undefined first referenced
symbol in file
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/intro.sgml,v 1.34 2009/01/27 12:40:14 petere Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/intro.sgml,v 1.35 2009/04/27 16:27:36 momjian Exp $ -->
<preface id="preface">
<title>Preface</title>
<para>
This book is the official documentation of
<productname>PostgreSQL</productname>. It is being written by the
<productname>PostgreSQL</productname>. It has been written by the
<productname>PostgreSQL</productname> developers and other
volunteers in parallel to the development of the
<productname>PostgreSQL</productname> software. It describes all
......@@ -58,7 +58,7 @@
<para>
<xref linkend="server-programming"> contains information for
advanced users about the extensibility capabilities of the
server. Topics are, for instance, user-defined data types and
server. Topics include user-defined data types and
functions.
</para>
</listitem>
......@@ -148,7 +148,7 @@
<para>
And because of the liberal license,
<productname>PostgreSQL</productname> can be used, modified, and
distributed by everyone free of charge for any purpose, be it
distributed by anyone free of charge for any purpose, be it
private, commercial, or academic.
</para>
</sect1>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/libpq.sgml,v 1.287 2009/04/24 14:10:41 mha Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/libpq.sgml,v 1.288 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="libpq">
<title><application>libpq</application> - C Library</title>
......@@ -6633,7 +6633,7 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
#include &lt;libpq-fe.h&gt;
</programlisting>
If you failed to do that then you will normally get error messages
from your compiler similar to
from your compiler similar to:
<screen>
foo.c: In function `main':
foo.c:34: `PGconn' undeclared (first use in this function)
......@@ -6679,7 +6679,7 @@ CPPFLAGS += -I/usr/local/pgsql/include
<para>
Failure to specify the correct option to the compiler will
result in an error message such as
result in an error message such as:
<screen>
testlibpq.c:8:22: libpq-fe.h: No such file or directory
</screen>
......@@ -6713,7 +6713,7 @@ cc -o testprog testprog1.o testprog2.o -L/usr/local/pgsql/lib -lpq
<para>
Error messages that point to problems in this area could look like
the following.
the following:
<screen>
testlibpq.o: In function `main':
testlibpq.o(.text+0x60): undefined reference to `PQsetdbLogin'
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/monitoring.sgml,v 1.68 2009/04/10 03:13:36 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/monitoring.sgml,v 1.69 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="monitoring">
<title>Monitoring Database Activity</title>
......@@ -929,7 +929,7 @@ postgres: <replaceable>user</> <replaceable>database</> <replaceable>host</> <re
<function>read()</> calls issued for the table, index, or
database; the number of actual physical reads is usually
lower due to kernel-level buffering. The <literal>*_blks_read</>
statistics columns uses this subtraction, i.e. fetched minus hit.
statistics columns use this subtraction, i.e., fetched minus hit.
</para>
</note>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.70 2009/02/04 16:05:50 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.71 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="mvcc">
<title>Concurrency Control</title>
......@@ -43,7 +43,7 @@
</para>
<para>
The main advantage to using the <acronym>MVCC</acronym> model of
The main advantage of using the <acronym>MVCC</acronym> model of
concurrency control rather than locking is that in
<acronym>MVCC</acronym> locks acquired for querying (reading) data
do not conflict with locks acquired for writing data, and so
......@@ -246,7 +246,7 @@
committed before the query began; it never sees either uncommitted
data or changes committed during query execution by concurrent
transactions. In effect, a <command>SELECT</command> query sees
a snapshot of the database as of the instant the query begins to
a snapshot of the database at the instant the query begins to
run. However, <command>SELECT</command> does see the effects
of previous updates executed within its own transaction, even
though they are not yet committed. Also note that two successive
......@@ -260,7 +260,7 @@
FOR UPDATE</command>, and <command>SELECT FOR SHARE</command> commands
behave the same as <command>SELECT</command>
in terms of searching for target rows: they will only find target rows
that were committed as of the command start time. However, such a target
that were committed before the command start time. However, such a target
row might have already been updated (or deleted or locked) by
another concurrent transaction by the time it is found. In this case, the
would-be updater will wait for the first updating transaction to commit or
......@@ -296,7 +296,7 @@ COMMIT;
</screen>
If two such transactions concurrently try to change the balance of account
12345, we clearly want the second transaction to start from the updated
12345, we clearly want the second transaction to start with the updated
version of the account's row. Because each command is affecting only a
predetermined row, letting it see the updated version of the row does
not create any troublesome inconsistency.
......@@ -306,7 +306,7 @@ COMMIT;
More complex usage can produce undesirable results in Read Committed
mode. For example, consider a <command>DELETE</command> command
operating on data that is being both added and removed from its
restriction criteria by another command, e.g. assume
restriction criteria by another command, e.g., assume
<literal>website</literal> is a two-row table with
<literal>website.hits</literal> equaling <literal>9</literal> and
<literal>10</literal>:
......@@ -354,7 +354,7 @@ COMMIT;
</indexterm>
<para>
The level <firstterm>Serializable</firstterm> provides the strictest transaction
The <firstterm>Serializable</firstterm> isolation level provides the strictest transaction
isolation. This level emulates serial transaction execution,
as if transactions had been executed one after another, serially,
rather than concurrently. However, applications using this level must
......@@ -362,19 +362,21 @@ COMMIT;
</para>
<para>
When a transaction is on the serializable level,
a <command>SELECT</command> query sees only data committed before the
When a transaction is using the serializable level,
a <command>SELECT</command> query only sees data committed before the
transaction began; it never sees either uncommitted data or changes
committed
during transaction execution by concurrent transactions. (However, the
during transaction execution by concurrent transactions. (However,
<command>SELECT</command> does see the effects of previous updates
executed within its own transaction, even though they are not yet
committed.) This is different from Read Committed in that the
<command>SELECT</command>
sees a snapshot as of the start of the transaction, not as of the start
committed.) This is different from Read Committed in that
<command>SELECT</command> in a serializable transaction
sees a snapshot as of the start of the <emphasis>transaction</>, not as of the start
of the current query within the transaction. Thus, successive
<command>SELECT</command> commands within a single transaction always see the same
data.
<command>SELECT</command> commands within a <emphasis>single</>
transaction see the same data, i.e., they never see changes made by
transactions that committed after their own transaction started. (This
behavior can be ideal for reporting applications.)
</para>
<para>
......@@ -382,7 +384,7 @@ COMMIT;
FOR UPDATE</command>, and <command>SELECT FOR SHARE</command> commands
behave the same as <command>SELECT</command>
in terms of searching for target rows: they will only find target rows
that were committed as of the transaction start time. However, such a
that were committed before the transaction start time. However, such a
target
row might have already been updated (or deleted or locked) by
another concurrent transaction by the time it is found. In this case, the
......@@ -402,9 +404,9 @@ ERROR: could not serialize access due to concurrent update
</para>
<para>
When the application receives this error message, it should abort
the current transaction and then retry the whole transaction from
the beginning. The second time through, the transaction sees the
When an application receives this error message, it should abort
the current transaction and retry the whole transaction from
the beginning. The second time through, the transaction will see the
previously-committed change as part of its initial view of the database,
so there is no logical conflict in using the new version of the row
as the starting point for the new transaction's update.
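The retry rule described above can be sketched in application code. This is a minimal sketch, not a driver API: the SerializationFailure class is a hypothetical stand-in for whatever error a client library raises for SQLSTATE 40001, and run_with_retry and make_flaky_transaction are illustrative names. The point it shows is that the whole transaction is rerun from the beginning, not just the failed statement.

```python
class SerializationFailure(Exception):
    """Hypothetical stand-in for the driver error raised on SQLSTATE 40001."""


def run_with_retry(transaction, max_attempts=5):
    """Run transaction() from the beginning until it succeeds.

    Returns (result, attempts_used). Re-raises the failure if every
    attempt up to max_attempts is rejected.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # The callable is expected to BEGIN, do its work, and COMMIT.
            return transaction(), attempt
        except SerializationFailure:
            # The aborted transaction is discarded; nothing is reused.
            if attempt == max_attempts:
                raise


def make_flaky_transaction(failures):
    """Build a fake transaction that fails `failures` times, then commits."""
    state = {"remaining": failures}

    def transaction():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise SerializationFailure(
                "could not serialize access due to concurrent update")
        return "committed"

    return transaction
```

With a real client library, the callable passed to run_with_retry would open the transaction, issue its commands, and commit, so that each attempt starts with a fresh snapshot.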
......@@ -420,8 +422,8 @@ ERROR: could not serialize access due to concurrent update
transaction sees a wholly consistent view of the database. However,
the application has to be prepared to retry transactions when concurrent
updates make it impossible to sustain the illusion of serial execution.
Since the cost of redoing complex transactions might be significant,
this mode is recommended only when updating transactions contain logic
Since the cost of redoing complex transactions can be significant,
serializable mode is recommended only when updating transactions contain logic
sufficiently complex that they might give wrong answers in Read
Committed mode. Most commonly, Serializable mode is necessary when
a transaction executes several successive commands that must see
......@@ -449,7 +451,7 @@ ERROR: could not serialize access due to concurrent update
is not sufficient to guarantee true serializability, and in fact
<productname>PostgreSQL</productname>'s Serializable mode <emphasis>does
not guarantee serializable execution in this sense</>. As an example,
consider a table <structname>mytab</>, initially containing
consider a table <structname>mytab</>, initially containing:
<screen>
class | value
-------+-------
......@@ -458,18 +460,18 @@ ERROR: could not serialize access due to concurrent update
2 | 100
2 | 200
</screen>
Suppose that serializable transaction A computes
Suppose that serializable transaction A computes:
<screen>
SELECT SUM(value) FROM mytab WHERE class = 1;
</screen>
and then inserts the result (30) as the <structfield>value</> in a
new row with <structfield>class</> = 2. Concurrently, serializable
transaction B computes
new row with <structfield>class</><literal> = 2</>. Concurrently, serializable
transaction B computes:
<screen>
SELECT SUM(value) FROM mytab WHERE class = 2;
</screen>
and obtains the result 300, which it inserts in a new row with
<structfield>class</> = 1. Then both transactions commit. None of
<structfield>class</><literal> = 1</>. Then both transactions commit. None of
the listed undesirable behaviors have occurred, yet we have a result
that could not have occurred in either order serially. If A had
executed before B, B would have computed the sum 330, not 300, and
......@@ -505,7 +507,7 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
</para>
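<para>
 The <structname>mytab</> example can be replayed outside the database to
 see why neither serial order reproduces the concurrent result. The
 following is a minimal simulation, not a model of the server: each
 transaction is a plain function that reads a list of rows as its snapshot.
 The two <literal>class = 1</> rows (10 and 20) are an assumption chosen to
 match the stated sum of 30, and all function names are illustrative.
</para>

```python
# Rows are (class, value) pairs. The class 1 values are assumed; they
# are chosen so that their sum is 30, as stated in the text.
INITIAL = [(1, 10), (1, 20), (2, 100), (2, 200)]


def sum_class(rows, cls):
    """SELECT SUM(value) FROM mytab WHERE class = cls."""
    return sum(value for c, value in rows if c == cls)


def transaction_a(snapshot):
    """Sum class 1 and produce a new row with class = 2."""
    return (2, sum_class(snapshot, 1))


def transaction_b(snapshot):
    """Sum class 2 and produce a new row with class = 1."""
    return (1, sum_class(snapshot, 2))


def run_concurrently(rows):
    """Both transactions read the same starting snapshot, then commit."""
    snapshot = list(rows)
    return rows + [transaction_a(snapshot), transaction_b(snapshot)]


def run_serially(rows, first, second):
    """The second transaction sees the row inserted by the first."""
    rows = rows + [first(rows)]
    return rows + [second(rows)]


concurrent = run_concurrently(INITIAL)
a_then_b = run_serially(INITIAL, transaction_a, transaction_b)
b_then_a = run_serially(INITIAL, transaction_b, transaction_a)
```

<para>
 In this sketch the concurrent run inserts the sums 30 and 300, while either
 serial order makes the second transaction compute 330, so the concurrent
 outcome matches neither serial order.
</para>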
<para>
In those cases where the possibility of nonserializable execution
In cases where the possibility of non-serializable execution
is a real hazard, problems can be prevented by appropriate use of
explicit locking. Further discussion appears in the following
sections.
......@@ -588,7 +590,7 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
<para>
The <command>SELECT</command> command acquires a lock of this mode on
referenced tables. In general, any query that only reads a table
referenced tables. In general, any query that only <emphasis>reads</> a table
and does not modify it will acquire this lock mode.
</para>
</listitem>
......@@ -632,7 +634,7 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
acquire this lock mode on the target table (in addition to
<literal>ACCESS SHARE</literal> locks on any other referenced
tables). In general, this lock mode will be acquired by any
command that modifies the data in a table.
command that <emphasis>modifies data</> in a table.
</para>
</listitem>
</varlistentry>
......@@ -664,10 +666,9 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
</term>
<listitem>
<para>
Conflicts with the <literal>ROW EXCLUSIVE</literal>,
<literal>SHARE UPDATE EXCLUSIVE</literal>, <literal>SHARE ROW
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
<literal>ACCESS EXCLUSIVE</literal> lock modes.
Conflicts with all lock modes except <literal>ACCESS SHARE</literal>,
<literal>ROW SHARE</literal>, and <literal>SHARE</literal> (it
does not conflict with itself).
This mode protects a table against concurrent data changes.
</para>
......@@ -684,11 +685,8 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
</term>
<listitem>
<para>
Conflicts with the <literal>ROW EXCLUSIVE</literal>,
<literal>SHARE UPDATE EXCLUSIVE</literal>,
<literal>SHARE</literal>, <literal>SHARE ROW
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
<literal>ACCESS EXCLUSIVE</literal> lock modes.
Conflicts with all lock modes except <literal>ACCESS SHARE</literal>
and <literal>ROW SHARE</literal>.
</para>
<para>
......@@ -704,11 +702,7 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
</term>
<listitem>
<para>
Conflicts with the <literal>ROW SHARE</literal>, <literal>ROW
EXCLUSIVE</literal>, <literal>SHARE UPDATE
EXCLUSIVE</literal>, <literal>SHARE</literal>, <literal>SHARE
ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
<literal>ACCESS EXCLUSIVE</literal> lock modes.
Conflicts with all lock modes except <literal>ACCESS SHARE</literal>.
This mode allows only concurrent <literal>ACCESS SHARE</literal> locks,
i.e., only reads from the table can proceed in parallel with a
transaction holding this lock mode.
......@@ -717,7 +711,7 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
<para>
This lock mode is not automatically acquired on user tables by any
<productname>PostgreSQL</productname> command. However it is
acquired on certain system catalogs in some operations.
acquired during certain internal system catalog operations.
</para>
</listitem>
</varlistentry>
......@@ -728,12 +722,7 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
</term>
<listitem>
<para>
Conflicts with locks of all modes (<literal>ACCESS
SHARE</literal>, <literal>ROW SHARE</literal>, <literal>ROW
EXCLUSIVE</literal>, <literal>SHARE UPDATE
EXCLUSIVE</literal>, <literal>SHARE</literal>, <literal>SHARE
ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
<literal>ACCESS EXCLUSIVE</literal>).
Conflicts with all lock modes.
This mode guarantees that the
holder is the only transaction accessing the table in any way.
</para>
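<para>
 The <quote>all lock modes except</quote> wording can be checked against the
 longer lists it replaces by writing the conflicts down as data. This is a
 sketch for illustration only: the entries for <literal>SHARE</literal>,
 <literal>SHARE ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
 <literal>ACCESS EXCLUSIVE</literal> follow the text above, the entries for
 the other four modes are taken from the standard table-level lock conflict
 table, and the helper names are hypothetical.
</para>

```python
MODES = (
    "ACCESS SHARE", "ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
    "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE",
)


def all_except(*compatible):
    """Every lock mode other than the ones named as compatible."""
    return frozenset(MODES) - frozenset(compatible)


# For each held mode, the set of requested modes that must wait.
CONFLICTS = {
    "ACCESS SHARE": frozenset({"ACCESS EXCLUSIVE"}),
    "ROW SHARE": frozenset({"EXCLUSIVE", "ACCESS EXCLUSIVE"}),
    "ROW EXCLUSIVE": frozenset({"SHARE", "SHARE ROW EXCLUSIVE",
                                "EXCLUSIVE", "ACCESS EXCLUSIVE"}),
    "SHARE UPDATE EXCLUSIVE": all_except("ACCESS SHARE", "ROW SHARE",
                                         "ROW EXCLUSIVE"),
    "SHARE": all_except("ACCESS SHARE", "ROW SHARE", "SHARE"),
    "SHARE ROW EXCLUSIVE": all_except("ACCESS SHARE", "ROW SHARE"),
    "EXCLUSIVE": all_except("ACCESS SHARE"),
    "ACCESS EXCLUSIVE": all_except(),
}


def conflicts(held, requested):
    """True if a request for `requested` must wait while `held` is held."""
    return requested in CONFLICTS[held]


def is_symmetric():
    """A conflicts with B exactly when B conflicts with A."""
    return all(conflicts(a, b) == conflicts(b, a)
               for a in MODES for b in MODES)
```

<para>
 For example, <literal>conflicts("SHARE", "SHARE")</literal> is false, which
 matches the note that <literal>SHARE</literal> does not conflict with
 itself.
</para>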
......@@ -760,7 +749,7 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
<para>
Once acquired, a lock is normally held till end of transaction. But if a
lock is acquired after establishing a savepoint, the lock is released
immediately if the savepoint is rolled back to. This is consistent with
immediately if the savepoint is rolled back. This is consistent with
the principle that <command>ROLLBACK</> cancels all effects of the
commands since the savepoint. The same holds for locks acquired within a
<application>PL/pgSQL</> exception block: an error escape from the block
......@@ -893,9 +882,9 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
can be exclusive or shared locks. An exclusive row-level lock on a
specific row is automatically acquired when the row is updated or
deleted. The lock is held until the transaction commits or rolls
back, in just the same way as for table-level locks. Row-level locks do
not affect data querying; they block <emphasis>writers to the same
row</emphasis> only.
back, like table-level locks. Row-level locks do
not affect data querying; they only block <emphasis>writers to the same
row</emphasis>.
</para>
<para>
......@@ -917,10 +906,10 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
<para>
<productname>PostgreSQL</productname> doesn't remember any
information about modified rows in memory, so it has no limit to
information about modified rows in memory, so there is no limit on
the number of rows locked at one time. However, locking a row
might cause a disk write; thus, for example, <command>SELECT FOR
UPDATE</command> will modify selected rows to mark them locked, and so
might cause a disk write, e.g., <command>SELECT FOR
UPDATE</command> modifies selected rows to mark them locked, and so
will result in disk writes.
</para>
......@@ -929,7 +918,7 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
used to control read/write access to table pages in the shared buffer
pool. These locks are released immediately after a row is fetched or
updated. Application developers normally need not be concerned with
page-level locks, but we mention them for completeness.
page-level locks, but they are mentioned for completeness.
</para>
</sect2>
......@@ -953,14 +942,14 @@ SELECT SUM(value) FROM mytab WHERE class = 2;
deadlock situations and resolves them by aborting one of the
transactions involved, allowing the other(s) to complete.
(Exactly which transaction will be aborted is difficult to
predict and should not be relied on.)
predict and should not be relied upon.)
</para>
<para>
Note that deadlocks can also occur as the result of row-level
locks (and thus, they can occur even if explicit locking is not
used). Consider the case in which there are two concurrent
transactions modifying a table. The first transaction executes:
used). Consider the case in which two concurrent
transactions modify a table. The first transaction executes:
<screen>
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 11111;
......@@ -1003,10 +992,10 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
above, if both transactions
had updated the rows in the same order, no deadlock would have
occurred. One should also ensure that the first lock acquired on
an object in a transaction is the highest mode that will be
an object in a transaction is the most restrictive mode that will be
needed for that object. If it is not feasible to verify this in
advance, then deadlocks can be handled on-the-fly by retrying
transactions that are aborted due to deadlock.
transactions that abort due to deadlocks.
</para>
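<para>
 The advice to acquire locks in a consistent order can be illustrated with
 an in-process analogy. In this sketch, Python mutexes stand in for the
 row-level locks on accounts 11111 and 22222; it does not talk to a
 database, and the function names are hypothetical. Each transfer locks the
 lower account number first, whichever direction the money moves, so two
 opposing transfers cannot each hold one lock while waiting for the other.
</para>

```python
import threading


def transfer(locks, source, target, log):
    """Lock both accounts in ascending key order, then record the transfer."""
    first, second = sorted((source, target))
    with locks[first]:
        with locks[second]:
            log.append((source, target))


def run_opposing_transfers(rounds=200):
    """Run transfers in both directions at once.

    Returns (completed_transfers, any_thread_still_running).
    """
    locks = {11111: threading.Lock(), 22222: threading.Lock()}
    log = []

    def worker(source, target):
        for _ in range(rounds):
            transfer(locks, source, target, log)

    threads = [
        threading.Thread(target=worker, args=(11111, 22222), daemon=True),
        threading.Thread(target=worker, args=(22222, 11111), daemon=True),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join(timeout=10)
    return len(log), any(thread.is_alive() for thread in threads)
```

<para>
 If <function>transfer</> instead locked <literal>source</> before
 <literal>target</>, the two workers could block each other, which is the
 pattern shown in the deadlock example above.
</para>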
<para>
......@@ -1055,7 +1044,7 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
<xref linkend="guc-max-locks-per-transaction"> and
<xref linkend="guc-max-connections">.
Care must be taken not to exhaust this
memory or the server will not be able to grant any locks at all.
memory or the server will be unable to grant any locks at all.
This imposes an upper limit on the number of advisory locks
grantable by the server, typically in the tens to hundreds of thousands
depending on how the server is configured.
......@@ -1068,7 +1057,7 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
While a flag stored in a table could be used for the same purpose,
advisory locks are faster, avoid MVCC bloat, and are automatically
cleaned up by the server at the end of the session.
In certain cases using this method, especially in queries
In certain cases using this advisory locking method, especially in queries
involving explicit ordering and <literal>LIMIT</> clauses, care must be
taken to control the locks acquired because of the order in which SQL
expressions are evaluated. For example:
......@@ -1109,9 +1098,9 @@ SELECT pg_advisory_lock(q.id) FROM
if a row is returned by <command>SELECT</command> it doesn't mean that
the row is still current at the instant it is returned (i.e., sometime
after the current query began). The row might have been modified or
deleted by an already-committed transaction that committed after this one
started.
Even if the row is still valid <quote>now</quote>, it could be changed or
deleted by an already-committed transaction that committed after
the <command>SELECT</command> started.
Even if the row is still valid <emphasis>now</>, it could be changed or
deleted
before the current transaction does a commit or rollback.
</para>
......@@ -1132,7 +1121,7 @@ SELECT pg_advisory_lock(q.id) FROM
concurrent updates one must use <command>SELECT FOR UPDATE</command>,
<command>SELECT FOR SHARE</command>, or an appropriate <command>LOCK
TABLE</command> statement. (<command>SELECT FOR UPDATE</command>
or <command>SELECT FOR SHARE</command> locks just the
or <command>SELECT FOR SHARE</command> lock just the
returned rows against concurrent updates, while <command>LOCK
TABLE</command> locks the whole table.) This should be taken into
account when porting applications to
......@@ -1144,10 +1133,10 @@ SELECT pg_advisory_lock(q.id) FROM
For example, a banking application might wish to check that the sum of
all credits in one table equals the sum of debits in another table,
when both tables are being actively updated. Comparing the results of two
successive <literal>SELECT sum(...)</literal> commands will not work reliably under
successive <literal>SELECT sum(...)</literal> commands will not work reliably in
Read Committed mode, since the second query will likely include the results
of transactions not counted by the first. Doing the two sums in a
single serializable transaction will give an accurate picture of the
single serializable transaction will give an accurate picture of only the
effects of transactions that committed before the serializable transaction
started &mdash; but one might legitimately wonder whether the answer is still
relevant by the time it is delivered. If the serializable transaction
......@@ -1164,8 +1153,8 @@ SELECT pg_advisory_lock(q.id) FROM
<para>
Note also that if one is
relying on explicit locking to prevent concurrent changes, one should use
Read Committed mode, or in Serializable mode be careful to obtain the
lock(s) before performing queries. A lock obtained by a
either Read Committed mode, or in Serializable mode be careful to obtain
locks before performing queries. A lock obtained by a
serializable transaction guarantees that no other transactions modifying
the table are still running, but if the snapshot seen by the
transaction predates obtaining the lock, it might predate some now-committed
......@@ -1173,7 +1162,7 @@ SELECT pg_advisory_lock(q.id) FROM
frozen at the start of its first query or data-modification command
(<literal>SELECT</>, <literal>INSERT</>,
<literal>UPDATE</>, or <literal>DELETE</>), so
it's possible to obtain locks explicitly before the snapshot is
it is often desirable to obtain locks explicitly before the snapshot is
frozen.
</para>
</sect1>
......@@ -1189,7 +1178,7 @@ SELECT pg_advisory_lock(q.id) FROM
<para>
Though <productname>PostgreSQL</productname>
provides nonblocking read/write access to table
data, nonblocking read/write access is not currently offered for every
data, nonblocking read/write access is currently not offered for every
index access method implemented
in <productname>PostgreSQL</productname>.
The various index types are handled as follows:
......@@ -1232,8 +1221,8 @@ SELECT pg_advisory_lock(q.id) FROM
<para>
Short-term share/exclusive page-level locks are used for
read/write access. Locks are released immediately after each
index row is fetched or inserted. But note that a GIN-indexed
value insertion usually produces several index key insertions
index row is fetched or inserted. But note that insertion of a GIN-indexed
value usually produces several index key insertions
per row, so GIN might do substantial work for a single value's
insertion.
</para>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.69 2008/12/13 19:13:43 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.70 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="performance-tips">
<title>Performance Tips</title>
......@@ -9,7 +9,7 @@
<para>
Query performance can be affected by many things. Some of these can
be manipulated by the user, while others are fundamental to the underlying
be controlled by the user, while others are fundamental to the underlying
design of the system. This chapter provides some hints about understanding
and tuning <productname>PostgreSQL</productname> performance.
</para>
......@@ -27,10 +27,10 @@
<para>
<productname>PostgreSQL</productname> devises a <firstterm>query
plan</firstterm> for each query it is given. Choosing the right
plan</firstterm> for each query it receives. Choosing the right
plan to match the query structure and the properties of the data
is absolutely critical for good performance, so the system includes
a complex <firstterm>planner</> that tries to select good plans.
a complex <firstterm>planner</> that tries to choose good plans.
You can use the
<xref linkend="sql-explain" endterm="sql-explain-title"> command
to see what query plan the planner creates for any query.
......@@ -40,14 +40,13 @@
<para>
The structure of a query plan is a tree of <firstterm>plan nodes</>.
Nodes at the bottom level are table scan nodes: they return raw rows
Nodes at the bottom level of the tree are table scan nodes: they return raw rows
from a table. There are different types of scan nodes for different
table access methods: sequential scans, index scans, and bitmap index
scans. If the query requires joining, aggregation, sorting, or other
operations on the raw rows, then there will be additional nodes
<quote>atop</> the scan nodes to perform these operations. Again,
there is usually more than one possible way to do these operations,
so different node types can appear here too. The output
above the scan nodes to perform these operations. Other node types
are also supported. The output
of <command>EXPLAIN</command> has one line for each node in the plan
tree, showing the basic node type plus the cost estimates that the planner
made for the execution of that plan node. The first line (topmost node)
......@@ -56,15 +55,15 @@
</para>
<para>
Here is a trivial example, just to show what the output looks like.
Here is a trivial example, just to show what the output looks like:
<footnote>
<para>
Examples in this section are drawn from the regression test database
after doing a <command>VACUUM ANALYZE</>, using 8.2 development sources.
You should be able to get similar results if you try the examples yourself,
but your estimated costs and row counts will probably vary slightly
but your estimated costs and row counts might vary slightly
because <command>ANALYZE</>'s statistics are random samples rather
than being exact.
than exact.
</para>
</footnote>
......@@ -78,22 +77,23 @@ EXPLAIN SELECT * FROM tenk1;
</para>
<para>
The numbers that are quoted by <command>EXPLAIN</command> are:
The numbers that are quoted by <command>EXPLAIN</command> are (left
to right):
<itemizedlist>
<listitem>
<para>
Estimated start-up cost (Time expended before output scan can start,
e.g., time to do the sorting in a sort node.)
Estimated start-up cost, i.e., time expended before the output scan can start,
e.g., time to do the sorting in a sort node
</para>
</listitem>
<listitem>
<para>
Estimated total cost (If all rows were to be retrieved, though they might
not be: for example, a query with a <literal>LIMIT</> clause will stop
short of paying the total cost of the <literal>Limit</> plan node's
input node.)
Estimated total cost if all rows were to be retrieved (though they might
not be, e.g., a query with a <literal>LIMIT</> clause will stop
short of paying the total cost of the <literal>Limit</> node's
input node)
</para>
</listitem>
......@@ -119,8 +119,8 @@ EXPLAIN SELECT * FROM tenk1;
Traditional practice is to measure the costs in units of disk page
fetches; that is, <xref linkend="guc-seq-page-cost"> is conventionally
set to <literal>1.0</> and the other cost parameters are set relative
to that. The examples in this section are run with the default cost
parameters.
to that. (The examples in this section are run with the default cost
parameters.)
</para>
<para>
......@@ -129,17 +129,18 @@ EXPLAIN SELECT * FROM tenk1;
the cost only reflects things that the planner cares about.
In particular, the cost does not consider the time spent transmitting
result rows to the client, which could be an important
factor in the true elapsed time; but the planner ignores it because
factor in the total elapsed time; but the planner ignores it because
it cannot change it by altering the plan. (Every correct plan will
output the same row set, we trust.)
</para>
<para>
Rows output is a little tricky because it is <emphasis>not</emphasis> the
The <command>EXPLAIN</command> <literal>rows=</> value is a little tricky
because it is <emphasis>not</emphasis> the
number of rows processed or scanned by the plan node. It is usually less,
reflecting the estimated selectivity of any <literal>WHERE</>-clause
conditions that are being
applied at the node. Ideally the top-level rows estimate will
applied to the node. Ideally the top-level rows estimate will
approximate the number of rows actually returned, updated, or deleted
by the query.
</para>
......@@ -163,16 +164,16 @@ EXPLAIN SELECT * FROM tenk1;
SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';
</programlisting>
you will find out that <classname>tenk1</classname> has 358 disk
pages and 10000 rows. The estimated cost is (disk pages read *
you will find that <classname>tenk1</classname> has 358 disk
pages and 10000 rows. The estimated cost is computed as (disk pages read *
<xref linkend="guc-seq-page-cost">) + (rows scanned *
<xref linkend="guc-cpu-tuple-cost">). By default,
<varname>seq_page_cost</> is 1.0 and <varname>cpu_tuple_cost</> is 0.01.
So the estimated cost is (358 * 1.0) + (10000 * 0.01) = 458.
<varname>seq_page_cost</> is 1.0 and <varname>cpu_tuple_cost</> is 0.01,
so the estimated cost is (358 * 1.0) + (10000 * 0.01) = 458.
</para>
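<para>
The same arithmetic can be reproduced from the catalog. This is only a
rough sketch: it applies to a plain sequential scan with no
<literal>WHERE</> clause, and the column alias is illustrative:
<programlisting>
SELECT relpages * current_setting('seq_page_cost')::float8
     + reltuples * current_setting('cpu_tuple_cost')::float8
       AS estimated_seq_scan_cost
FROM pg_class
WHERE relname = 'tenk1';
</programlisting>
With 358 pages, 10000 rows, and the default settings, the result should
come to about 458.
</para>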
<para>
Now let's modify the query to add a <literal>WHERE</> condition:
Now let's modify the original query to add a <literal>WHERE</> condition:
<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 7000;
......@@ -187,7 +188,7 @@ EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 7000;
clause being applied as a <quote>filter</> condition; this means that
the plan node checks the condition for each row it scans, and outputs
only the ones that pass the condition.
The estimate of output rows has gone down because of the <literal>WHERE</>
The estimate of output rows has been reduced because of the <literal>WHERE</>
clause.
However, the scan will still have to visit all 10000 rows, so the cost
hasn't decreased; in fact it has gone up a bit (by 10000 * <xref
......@@ -196,7 +197,7 @@ EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 7000;
</para>
<para>
The actual number of rows this query would select is 7000, but the rows
The actual number of rows this query would select is 7000, but the <literal>rows=</>
estimate is only approximate. If you try to duplicate this experiment,
you will probably get a slightly different estimate; moreover, it will
change after each <command>ANALYZE</command> command, because the
......@@ -224,16 +225,16 @@ EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 100;
from the table itself. Fetching the rows separately is much more
expensive than sequentially reading them, but because not all the pages
of the table have to be visited, this is still cheaper than a sequential
scan. (The reason for using two levels of plan is that the upper plan
scan. (The reason for using two plan levels is that the upper plan
node sorts the row locations identified by the index into physical order
before reading them, so as to minimize the costs of the separate fetches.
before reading them, to minimize the cost of separate fetches.
The <quote>bitmap</> mentioned in the node names is the mechanism that
does the sorting.)
</para>
<para>
If the <literal>WHERE</> condition is selective enough, the planner might
switch to a <quote>simple</> index scan plan:
switch to a <emphasis>simple</> index scan plan:
<programlisting>
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 3;
......@@ -247,8 +248,8 @@ EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 3;
In this case the table rows are fetched in index order, which makes them
even more expensive to read, but there are so few that the extra cost
of sorting the row locations is not worth it. You'll most often see
this plan type for queries that fetch just a single row, and for queries
that request an <literal>ORDER BY</> condition that matches the index
this plan type in queries that fetch just a single row, and in queries
with an <literal>ORDER BY</> condition that matches the index
order.
</para>
......@@ -271,11 +272,11 @@ EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 3 AND stringu1 = 'xxx';
cannot be applied as an index condition (since this index is only on
the <literal>unique1</> column). Instead it is applied as a filter on
the rows retrieved by the index. Thus the cost has actually gone up
a little bit to reflect this extra checking.
slightly to reflect this extra checking.
</para>
<para>
If there are indexes on several columns used in <literal>WHERE</>, the
If there are indexes on several columns referenced in <literal>WHERE</>, the
planner might choose to use an AND or OR combination of the indexes:
<programlisting>
......@@ -302,7 +303,9 @@ EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 100 AND unique2 &gt; 9000;
Let's try joining two tables, using the columns we have been discussing:
<programlisting>
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
QUERY PLAN
--------------------------------------------------------------------------------------
......@@ -317,12 +320,12 @@ EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t1.unique
</para>
<para>
In this nested-loop join, the outer scan is the same bitmap index scan we
In this nested-loop join, the outer (upper) scan is the same bitmap index scan we
saw earlier, and so its cost and row count are the same because we are
applying the <literal>WHERE</> clause <literal>unique1 &lt; 100</literal>
at that node.
The <literal>t1.unique2 = t2.unique2</literal> clause is not relevant yet,
so it doesn't affect row count of the outer scan. For the inner scan, the
so it doesn't affect the row count of the outer scan. For the inner (lower) scan, the
<literal>unique2</> value of the current outer-scan row is plugged into
the inner index scan to produce an index condition like
<literal>t2.unique2 = <replaceable>constant</replaceable></literal>.
......@@ -335,8 +338,8 @@ EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t1.unique
<para>
In this example the join's output row count is the same as the product
of the two scans' row counts, but that's not true in general, because
in general you can have <literal>WHERE</> clauses that mention both tables
of the two scans' row counts, but that's not true in all cases because
you can have <literal>WHERE</> clauses that mention both tables
and so can only be applied at the join point, not to either input scan.
For example, if we added
<literal>WHERE ... AND t1.hundred &lt; t2.hundred</literal>,
......@@ -346,14 +349,16 @@ EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t1.unique
<para>
One way to look at variant plans is to force the planner to disregard
whatever strategy it thought was the winner, using the enable/disable
whatever strategy it thought was the cheapest, using the enable/disable
flags described in <xref linkend="runtime-config-query-enable">.
(This is a crude tool, but useful. See
also <xref linkend="explicit-joins">.)
<programlisting>
SET enable_nestloop = off;
EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
EXPLAIN SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
QUERY PLAN
------------------------------------------------------------------------------------------
......@@ -370,9 +375,9 @@ EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t1.unique
This plan proposes to extract the 100 interesting rows of <classname>tenk1</classname>
using that same old index scan, stash them into an in-memory hash table,
and then do a sequential scan of <classname>tenk2</classname>, probing into the hash table
for possible matches of <literal>t1.unique2 = t2.unique2</literal> at each <classname>tenk2</classname> row.
The cost to read <classname>tenk1</classname> and set up the hash table is entirely start-up
cost for the hash join, since we won't get any rows out until we can
for possible matches of <literal>t1.unique2 = t2.unique2</literal> for each <classname>tenk2</classname> row.
The cost to read <classname>tenk1</classname> and set up the hash table is a start-up
cost for the hash join, since there will be no output until we can
start reading <classname>tenk2</classname>. The total time estimate for the join also
includes a hefty charge for the CPU time to probe the hash table
10000 times. Note, however, that we are <emphasis>not</emphasis> charging 10000 times 232.35;
......@@ -380,14 +385,16 @@ EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t1.unique
</para>
<para>
It is possible to check on the accuracy of the planner's estimated costs
It is possible to check the accuracy of the planner's estimated costs
by using <command>EXPLAIN ANALYZE</>. This command actually executes the query,
and then displays the true run time accumulated within each plan node
along with the same estimated costs that a plain <command>EXPLAIN</command> shows.
For example, we might get a result like this:
<screen>
EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
EXPLAIN ANALYZE SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 &lt; 100 AND t1.unique2 = t2.unique2;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
......@@ -402,7 +409,7 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t
</screen>
Note that the <quote>actual time</quote> values are in milliseconds of
real time, whereas the <quote>cost</quote> estimates are expressed in
real time, whereas the <literal>cost=</> estimates are expressed in
arbitrary units; so they are unlikely to match up.
The thing to pay attention to is whether the ratios of actual time and
estimated costs are consistent.
......@@ -412,11 +419,11 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t
In some query plans, it is possible for a subplan node to be executed more
than once. For example, the inner index scan is executed once per outer
row in the above nested-loop plan. In such cases, the
<quote>loops</quote> value reports the
<literal>loops=</> value reports the
total number of executions of the node, and the actual time and rows
values shown are averages per-execution. This is done to make the numbers
comparable with the way that the cost estimates are shown. Multiply by
the <quote>loops</quote> value to get the total time actually spent in
the <literal>loops=</> value to get the total time actually spent in
the node.
</para>
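<para>
As an illustration with made-up numbers (not taken from any plan shown
here), a node whose actual time ends at 0.020 milliseconds with
<literal>loops=100</> spent about 2 milliseconds in total:
<programlisting>
-- hypothetical figures: 0.020 ms per execution, 100 executions
SELECT 0.020 * 100 AS total_ms_in_node;
</programlisting>
</para>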
......@@ -429,9 +436,9 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t
reported for the top-level plan node. For <command>INSERT</>,
<command>UPDATE</>, and <command>DELETE</> commands, the total run time
might be considerably larger, because it includes the time spent processing
the result rows. In these commands, the time for the top plan node
essentially is the time spent computing the new rows and/or locating the
old ones, but it doesn't include the time spent applying the changes.
the result rows. For these commands, the time for the top plan node is
essentially the time spent locating the old rows and/or computing
the new ones, but it doesn't include the time spent applying the changes.
Time spent firing triggers, if any, is also outside the top plan node,
and is shown separately for each trigger.
</para>
......@@ -475,7 +482,9 @@ EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 &lt; 100 AND t
queries similar to this one:
<screen>
SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE relname LIKE 'tenk1%';
SELECT relname, relkind, reltuples, relpages
FROM pg_class
WHERE relname LIKE 'tenk1%';
relname | relkind | reltuples | relpages
----------------------+---------+-----------+----------
......@@ -512,7 +521,7 @@ SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE relname LIKE 't
<para>
Most queries retrieve only a fraction of the rows in a table, due
to having <literal>WHERE</> clauses that restrict the rows to be
to <literal>WHERE</> clauses that restrict the rows to be
examined. The planner thus needs to make an estimate of the
<firstterm>selectivity</> of <literal>WHERE</> clauses, that is,
the fraction of rows that match each condition in the
......@@ -544,7 +553,9 @@ SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE relname LIKE 't
For example, we might do:
<screen>
SELECT attname, n_distinct, most_common_vals FROM pg_stats WHERE tablename = 'road';
SELECT attname, n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'road';
attname | n_distinct | most_common_vals
---------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
......@@ -769,7 +780,8 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
</indexterm>
<para>
Turn off autocommit and just do one commit at the end. (In plain
When doing <command>INSERT</>s, turn off autocommit and just do
one commit at the end. (In plain
SQL, this means issuing <command>BEGIN</command> at the start and
<command>COMMIT</command> at the end. Some client libraries might
do this behind your back, in which case you need to make sure the
......@@ -812,7 +824,7 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
<para>
Note that loading a large number of rows using
<command>COPY</command> is almost always faster than using
<command>INSERT</command>, even if <command>PREPARE</> is used and
<command>INSERT</command>, even if <command>PREPARE ... INSERT</> is used and
multiple insertions are batched into a single transaction.
</para>
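<para>
A minimal sketch of the two approaches, assuming a hypothetical
two-column table <literal>measurements</> and a hypothetical server-side
file path:
<programlisting>
-- Batched inserts in one transaction, using a prepared statement
BEGIN;
PREPARE ins (integer, numeric) AS
    INSERT INTO measurements VALUES ($1, $2);
EXECUTE ins(1, 20.5);
EXECUTE ins(2, 21.0);
COMMIT;

-- The same kind of load done as a single COPY command
COPY measurements FROM '/tmp/measurements.csv' WITH CSV;
</programlisting>
Reading a server-side file with <command>COPY</command> normally requires
elevated privileges; <application>psql</>'s <literal>\copy</> reads a file
on the client instead.
</para>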
......@@ -823,7 +835,7 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
needs to be written, because in case of an error, the files
containing the newly loaded data will be removed anyway.
However, this consideration does not apply when
<xref linkend="guc-archive-mode"> is set, as all commands
<xref linkend="guc-archive-mode"> is on, as all commands
must write WAL in that case.
</para>
......@@ -833,7 +845,7 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
<title>Remove Indexes</title>
<para>
If you are loading a freshly created table, the fastest way is to
If you are loading a freshly created table, the fastest method is to
create the table, bulk load the table's data using
<command>COPY</command>, then create any indexes needed for the
table. Creating an index on pre-existing data is quicker than
......@@ -844,8 +856,8 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
If you are adding large amounts of data to an existing table,
it might be a win to drop the index,
load the table, and then recreate the index. Of course, the
database performance for other users might be adversely affected
during the time that the index is missing. One should also think
database performance for other users might suffer
during the time the index is missing. One should also think
twice before dropping unique indexes, since the error checking
afforded by the unique constraint will be lost while the index is
missing.
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgbuffercache.sgml,v 2.3 2008/08/14 12:56:41 heikki Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/pgbuffercache.sgml,v 2.4 2009/04/27 16:27:36 momjian Exp $ -->
<sect1 id="pgbuffercache">
<title>pg_buffercache</title>
......@@ -141,7 +141,8 @@
b.reldatabase IN (0, (SELECT oid FROM pg_database
WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY 2 DESC LIMIT 10;
ORDER BY 2 DESC
LIMIT 10;
relname | buffers
---------------------------------+---------
tenk2 | 345
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/postgres.sgml,v 1.86 2008/05/07 16:36:43 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/postgres.sgml,v 1.87 2009/04/27 16:27:36 momjian Exp $ -->
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
......@@ -78,7 +78,7 @@
chapters individually as they choose. The information in this
part is presented in a narrative fashion in topical units.
Readers looking for a complete description of a particular command
should look into <xref linkend="reference">.
should see <xref linkend="reference">.
</para>
<para>
......@@ -127,14 +127,14 @@
self-contained and can be read individually as desired. The
information in this part is presented in a narrative fashion in
topical units. Readers looking for a complete description of a
particular command should look into <xref linkend="reference">.
particular command should see <xref linkend="reference">.
</para>
<para>
The first few chapters are written so that they can be understood
without prerequisite knowledge, so that new users who need to set
The first few chapters are written so they can be understood
without prerequisite knowledge, so new users who need to set
up their own server can begin their exploration with this part.
The rest of this part is about tuning and management; that material
The rest of this part is about tuning and management; the material
assumes that the reader is familiar with the general use of
the <productname>PostgreSQL</> database system. Readers are
encouraged to look at <xref linkend="tutorial"> and <xref
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/problems.sgml,v 2.29 2009/01/06 17:27:06 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/problems.sgml,v 2.30 2009/04/27 16:27:36 momjian Exp $ -->
<sect1 id="bug-reporting">
<title>Bug Reporting Guidelines</title>
......@@ -136,7 +136,7 @@
file that can be run through the <application>psql</application>
frontend that shows the problem. (Be sure to not have anything
in your <filename>~/.psqlrc</filename> start-up file.) An easy
start at this file is to use <application>pg_dump</application>
way to create this file is to use <application>pg_dump</application>
to dump out the table declarations and data needed to set the
scene, then add the problem query. You are encouraged to
minimize the size of your example, but this is not absolutely
......@@ -252,7 +252,7 @@
C library, processor, memory information, and so on. In most
cases it is sufficient to report the vendor and version, but do
not assume everyone knows what exactly <quote>Debian</quote>
contains or that everyone runs on Pentiums. If you have
contains or that everyone runs on i386s. If you have
installation problems then information about the toolchain on
your machine (compiler, <application>make</application>, and so
on) is also necessary.
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.53 2009/02/07 20:11:16 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/queries.sgml,v 1.54 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="queries">
<title>Queries</title>
......@@ -14,7 +14,7 @@
<para>
The previous chapters explained how to create tables, how to fill
them with data, and how to manipulate that data. Now we finally
discuss how to retrieve the data out of the database.
discuss how to retrieve the data from the database.
</para>
......@@ -63,7 +63,7 @@ SELECT a, b + c FROM table1;
</para>
<para>
<literal>FROM table1</literal> is a particularly simple kind of
<literal>FROM table1</literal> is a simple kind of
table expression: it reads just one table. In general, table
expressions can be complex constructs of base tables, joins, and
subqueries. But you can also omit the table expression entirely and
......@@ -133,8 +133,8 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
<para>
When a table reference names a table that is the parent of a
table inheritance hierarchy, the table reference produces rows of
not only that table but all of its descendant tables, unless the
table inheritance hierarchy, the table reference produces rows
not only of that table but also of all of its descendant tables, unless the
key word <literal>ONLY</> precedes the table name. However, the
reference produces only the columns that appear in the named table
&mdash; any columns added in subtables are ignored.
......@@ -174,11 +174,12 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
</synopsis>
<para>
For each combination of rows from
Produce every possible combination of rows from
<replaceable>T1</replaceable> and
<replaceable>T2</replaceable>, the derived table will contain a
row consisting of all columns in <replaceable>T1</replaceable>
followed by all columns in <replaceable>T2</replaceable>. If
<replaceable>T2</replaceable> (i.e., a Cartesian product),
with output columns consisting of
all <replaceable>T1</replaceable> columns
followed by all <replaceable>T2</replaceable> columns. If
the tables have N and M rows respectively, the joined
table will have N * M rows.
</para>
......@@ -242,14 +243,15 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
comma-separated list of column names, which the joined tables
must have in common, and forms a join condition specifying
equality of each of these pairs of columns. Furthermore, the
output of a <literal>JOIN USING</> has one column for each of
the equated pairs of input columns, followed by all of the
output of <literal>JOIN USING</> has one column for each of
the equated pairs of input columns, followed by the
other columns from each table. Thus, <literal>USING (a, b,
c)</literal> is equivalent to <literal>ON (t1.a = t2.a AND
t1.b = t2.b AND t1.c = t2.c)</literal> with the exception that
if <literal>ON</> is used there will be two columns
<literal>a</>, <literal>b</>, and <literal>c</> in the result,
whereas with <literal>USING</> there will be only one of each.
whereas with <literal>USING</> there will be only one of each
(and they will appear first if <command>SELECT *</> is used).
</para>
<para>
......@@ -262,7 +264,7 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
</indexterm>
Finally, <literal>NATURAL</> is a shorthand form of
<literal>USING</>: it forms a <literal>USING</> list
consisting of exactly those column names that appear in both
consisting of all column names that appear in both
input tables. As with <literal>USING</>, these columns appear
only once in the output table.
</para>
......@@ -298,8 +300,8 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
<para>
First, an inner join is performed. Then, for each row in
T1 that does not satisfy the join condition with any row in
T2, a joined row is added with null values in columns of
T2. Thus, the joined table unconditionally has at least
T2, a row is added with null values in columns of
T2. Thus, the joined table always has at least
one row for each row in T1.
</para>
</listitem>
......@@ -321,9 +323,9 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
<para>
First, an inner join is performed. Then, for each row in
T2 that does not satisfy the join condition with any row in
T1, a joined row is added with null values in columns of
T1, a row is added with null values in columns of
T1. This is the converse of a left join: the result table
will unconditionally have a row for each row in T2.
will always have a row for each row in T2.
</para>
</listitem>
</varlistentry>
......@@ -335,9 +337,9 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
<para>
First, an inner join is performed. Then, for each row in
T1 that does not satisfy the join condition with any row in
T2, a joined row is added with null values in columns of
T2, a row is added with null values in columns of
T2. Also, for each row of T2 that does not satisfy the
join condition with any row in T1, a joined row with null
join condition with any row in T1, a row with null
values in the columns of T1 is added.
</para>
</listitem>
......@@ -350,8 +352,8 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
<para>
Joins of all types can be chained together or nested: either or
both of <replaceable>T1</replaceable> and
<replaceable>T2</replaceable> might be joined tables. Parentheses
both <replaceable>T1</replaceable> and
<replaceable>T2</replaceable> can be joined tables. Parentheses
can be used around <literal>JOIN</> clauses to control the join
order. In the absence of parentheses, <literal>JOIN</> clauses
nest left-to-right.
......@@ -460,6 +462,19 @@ FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_r
3 | c | |
(3 rows)
</screen>
Notice that placing the restriction in the <literal>WHERE</> clause
produces a different result:
<screen>
<prompt>=&gt;</> <userinput>SELECT * FROM t1 LEFT JOIN t2 ON t1.num = t2.num WHERE t2.value = 'xxx';</>
num | name | num | value
-----+------+-----+-------
1 | a | 1 | xxx
(1 row)
</screen>
This is because a restriction placed in the <literal>ON</>
clause is processed <emphasis>before</> the join, while
a restriction placed in the <literal>WHERE</> clause is processed
<emphasis>after</> the join.
</para>
</sect3>
......@@ -513,7 +528,7 @@ SELECT * FROM some_very_long_table_name s JOIN another_fairly_long_name a ON s.i
SELECT * FROM my_table AS m WHERE my_table.a &gt; 5;
</programlisting>
is not valid according to the SQL standard. In
<productname>PostgreSQL</productname> this will draw an error if the
<productname>PostgreSQL</productname> this will draw an error, assuming the
<xref linkend="guc-add-missing-from"> configuration variable is
<literal>off</> (as it is by default). If it is <literal>on</>,
an implicit table reference will be added to the
......@@ -559,8 +574,8 @@ FROM <replaceable>table_reference</replaceable> <optional>AS</optional> <replace
<para>
When an alias is applied to the output of a <literal>JOIN</>
clause, using any of these forms, the alias hides the original
names within the <literal>JOIN</>. For example:
clause, the alias hides the original
name referenced in the <literal>JOIN</>. For example:
<programlisting>
SELECT a.* FROM my_table AS a JOIN your_table AS b ON ...
</programlisting>
......@@ -568,7 +583,7 @@ SELECT a.* FROM my_table AS a JOIN your_table AS b ON ...
<programlisting>
SELECT a.* FROM (my_table AS a JOIN your_table AS b ON ...) AS c
</programlisting>
is not valid: the table alias <literal>a</> is not visible
is not valid; the table alias <literal>a</> is not visible
outside the alias <literal>c</>.
</para>
</sect3>
......@@ -631,7 +646,7 @@ FROM (VALUES ('anne', 'smith'), ('bob', 'jones'), ('joe', 'blow'))
<para>
If a table function returns a base data type, the single result
column is named like the function. If the function returns a
column name matches the function name. If the function returns a
composite type, the result columns get the same names as the
individual attributes of the type.
</para>
......@@ -655,8 +670,11 @@ $$ LANGUAGE SQL;
SELECT * FROM getfoo(1) AS t1;
SELECT * FROM foo
WHERE foosubid IN (select foosubid from getfoo(foo.fooid) z
where z.fooid = foo.fooid);
WHERE foosubid IN (
SELECT foosubid
FROM getfoo(foo.fooid) z
WHERE z.fooid = foo.fooid
);
CREATE VIEW vw_getfoo AS SELECT * FROM getfoo(1);
......@@ -668,13 +686,14 @@ SELECT * FROM vw_getfoo;
In some cases it is useful to define table functions that can
return different column sets depending on how they are invoked.
To support this, the table function can be declared as returning
the pseudotype <type>record</>. When such a function is used in
 the pseudotype <type>record</>, rather than <literal>SETOF</>.
When such a function is used in
a query, the expected row structure must be specified in the
query itself, so that the system can know how to parse and plan
the query. Consider this example:
<programlisting>
SELECT *
FROM dblink('dbname=mydb', 'select proname, prosrc from pg_proc')
FROM dblink('dbname=mydb', 'SELECT proname, prosrc FROM pg_proc')
AS t1(proname name, prosrc text)
WHERE proname LIKE 'bytea%';
</programlisting>
......@@ -710,9 +729,9 @@ WHERE <replaceable>search_condition</replaceable>
After the processing of the <literal>FROM</> clause is done, each
row of the derived virtual table is checked against the search
condition. If the result of the condition is true, the row is
kept in the output table, otherwise (that is, if the result is
kept in the output table, otherwise (i.e., if the result is
false or null) it is discarded. The search condition typically
references at least some column of the table generated in the
references at least one column of the table generated in the
<literal>FROM</> clause; this is not required, but otherwise the
<literal>WHERE</> clause will be fairly useless.
</para>
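The rule that a row is kept only when the search condition is true, and discarded when it is false or null, is easiest to see with a null value in the data. The following is a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table fdt with its single column c1 is hypothetical sample data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fdt (c1 INTEGER);
    -- Hypothetical sample data: one small value, one large value, one null.
    INSERT INTO fdt VALUES (3), (8), (NULL);
""")

# For the null row, "c1 > 5" evaluates to null, so the row is discarded.
kept = [row[0] for row in conn.execute(
    "SELECT c1 FROM fdt WHERE c1 > 5")]

# Negating the condition does not bring the null row back:
# NOT null is still null, which is not true.
kept_negated = [row[0] for row in conn.execute(
    "SELECT c1 FROM fdt WHERE NOT (c1 > 5)")]

print(kept, kept_negated)
```

The null row appears in neither result, so a condition and its negation do not necessarily partition the table between them.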
......@@ -735,11 +754,12 @@ FROM a NATURAL JOIN b WHERE b.val &gt; 5
</programlisting>
Which one of these you use is mainly a matter of style. The
<literal>JOIN</> syntax in the <literal>FROM</> clause is
probably not as portable to other SQL database management systems. For
outer joins there is no choice in any case: they must be done in
the <literal>FROM</> clause. An <literal>ON</>/<literal>USING</>
probably not as portable to other SQL database management systems,
even though it is in the SQL standard. For
outer joins there is no choice: they must be done in
the <literal>FROM</> clause. The <literal>ON</>/<literal>USING</>
clause of an outer join is <emphasis>not</> equivalent to a
<literal>WHERE</> condition, because it determines the addition
<literal>WHERE</> condition, because it affects the addition
of rows (for unmatched input rows) as well as the removal of rows
from the final result.
</para>
......@@ -760,7 +780,7 @@ SELECT ... FROM fdt WHERE c1 BETWEEN (SELECT c3 FROM t2 WHERE c2 = fdt.c1 + 10)
SELECT ... FROM fdt WHERE EXISTS (SELECT c1 FROM t2 WHERE c2 &gt; fdt.c1)
</programlisting>
<literal>fdt</literal> is the table derived in the
<literal>fdt</literal> is the table used in the
<literal>FROM</> clause. Rows that do not meet the search
condition of the <literal>WHERE</> clause are eliminated from
<literal>fdt</literal>. Notice the use of scalar subqueries as
......@@ -803,11 +823,11 @@ SELECT <replaceable>select_list</replaceable>
<para>
The <xref linkend="sql-groupby" endterm="sql-groupby-title"> is
used to group together those rows in a table that share the same
used to group together those rows in a table that have the same
values in all the columns listed. The order in which the columns
are listed does not matter. The effect is to combine each set
of rows sharing common values into one group row that is
representative of all rows in the group. This is done to
of rows having common values into one group row that
represents all rows in the group. This is done to
eliminate redundancy in the output and/or compute aggregates that
apply to these groups. For instance:
<screen>
......@@ -840,7 +860,7 @@ SELECT <replaceable>select_list</replaceable>
<para>
In general, if a table is grouped, columns that are not
used in the grouping cannot be referenced except in aggregate
the same in the group cannot be referenced except in aggregate
expressions. An example with aggregate expressions is:
<screen>
<prompt>=&gt;</> <userinput>SELECT x, sum(y) FROM test1 GROUP BY x;</>
......@@ -860,7 +880,7 @@ SELECT <replaceable>select_list</replaceable>
<tip>
<para>
Grouping without aggregate expressions effectively calculates the
set of distinct values in a column. This can also be achieved
set of distinct values in a column. This can more clearly be achieved
using the <literal>DISTINCT</> clause (see <xref
linkend="queries-distinct">).
</para>
......@@ -868,7 +888,7 @@ SELECT <replaceable>select_list</replaceable>
<para>
Here is another example: it calculates the total sales for each
product (rather than the total sales on all products):
product (rather than the total sales of all products):
<programlisting>
SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
FROM products p LEFT JOIN sales s USING (product_id)
......@@ -877,10 +897,10 @@ SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
In this example, the columns <literal>product_id</literal>,
<literal>p.name</literal>, and <literal>p.price</literal> must be
in the <literal>GROUP BY</> clause since they are referenced in
the query select list. (Depending on how exactly the products
the query select list. (Depending on how the products
table is set up, name and price might be fully dependent on the
product ID, so the additional groupings could theoretically be
unnecessary, but this is not implemented yet.) The column
unnecessary, though this is not implemented.) The column
<literal>s.units</> does not have to be in the <literal>GROUP
BY</> list since it is only used in an aggregate expression
(<literal>sum(...)</literal>), which represents the sales
......@@ -901,11 +921,11 @@ SELECT product_id, p.name, (sum(s.units) * p.price) AS sales
</indexterm>
<para>
If a table has been grouped using a <literal>GROUP BY</literal>
clause, but then only certain groups are of interest, the
If a table has been grouped using <literal>GROUP BY</literal>,
but only certain groups are of interest, the
<literal>HAVING</literal> clause can be used, much like a
<literal>WHERE</> clause, to eliminate groups from a grouped
table. The syntax is:
<literal>WHERE</> clause, to eliminate groups from the result.
The syntax is:
<synopsis>
SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY ... HAVING <replaceable>boolean_expression</replaceable>
</synopsis>
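The effect of <literal>HAVING</> on a grouped table can be sketched as follows. This is an illustration only: it assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the rows loaded into test1 are assumed sample data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test1 (x TEXT, y INTEGER);
    -- Assumed sample data, for illustration only.
    INSERT INTO test1 VALUES ('a', 3), ('c', 2), ('b', 5), ('a', 1);
""")

# GROUP BY alone: one output row per distinct value of x.
all_groups = conn.execute(
    "SELECT x, sum(y) FROM test1 GROUP BY x ORDER BY x"
).fetchall()

# HAVING filters whole groups after aggregation, here by their sum.
big_groups = conn.execute(
    "SELECT x, sum(y) FROM test1 GROUP BY x HAVING sum(y) > 3 ORDER BY x"
).fetchall()

print(all_groups)
print(big_groups)
```

The group for 'c' is computed in both queries but is removed from the second result because its sum does not satisfy the condition.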
......@@ -1068,8 +1088,7 @@ SELECT tbl1.*, tbl2.a FROM ...
the row's values substituted for any column references. But the
expressions in the select list do not have to reference any
columns in the table expression of the <literal>FROM</> clause;
they could be constant arithmetic expressions as well, for
instance.
they can be constant arithmetic expressions as well.
</para>
</sect2>
......@@ -1083,9 +1102,8 @@ SELECT tbl1.*, tbl2.a FROM ...
<para>
The entries in the select list can be assigned names for further
processing. The <quote>further processing</quote> in this case is
an optional sort specification and the client application (e.g.,
column headers for display). For example:
processing, perhaps for reference in an <literal>ORDER BY</> clause
or for display by the client application. For example:
<programlisting>
SELECT a AS value, b + c AS sum FROM ...
</programlisting>
......@@ -1122,8 +1140,8 @@ SELECT a "value", b + c AS sum FROM ...
<para>
The naming of output columns here is different from that done in
the <literal>FROM</> clause (see <xref
linkend="queries-table-aliases">). This pipeline will in fact
allow you to rename the same column twice, but the name chosen in
linkend="queries-table-aliases">). It is possible
to rename the same column twice, but the name used in
the select list is the one that will be passed on.
</para>
</note>
......@@ -1181,7 +1199,7 @@ SELECT DISTINCT ON (<replaceable>expression</replaceable> <optional>, <replaceab
The <literal>DISTINCT ON</> clause is not part of the SQL standard
and is sometimes considered bad style because of the potentially
indeterminate nature of its results. With judicious use of
<literal>GROUP BY</> and subqueries in <literal>FROM</> the
<literal>GROUP BY</> and subqueries in <literal>FROM</>, this
construct can be avoided, but it is often the most convenient
alternative.
</para>
......@@ -1229,7 +1247,7 @@ SELECT DISTINCT ON (<replaceable>expression</replaceable> <optional>, <replaceab
<synopsis>
<replaceable>query1</replaceable> UNION <replaceable>query2</replaceable> UNION <replaceable>query3</replaceable>
</synopsis>
which really says
which is executed as:
<synopsis>
(<replaceable>query1</replaceable> UNION <replaceable>query2</replaceable>) UNION <replaceable>query3</replaceable>
</synopsis>
......@@ -1328,9 +1346,9 @@ SELECT a, b FROM table1 ORDER BY a + b, c;
<para>
The <literal>NULLS FIRST</> and <literal>NULLS LAST</> options can be
used to determine whether nulls appear before or after non-null values
in the sort ordering. By default, null values sort as if larger than any
non-null value; that is, <literal>NULLS FIRST</> is the default for
<literal>DESC</> order, and <literal>NULLS LAST</> otherwise.
 in the sort ordering. By default, null values sort as if larger than all
 non-null values; that is, <literal>NULLS LAST</> is the default for ascending
 order, and <literal>NULLS FIRST</> is the default for <literal>DESC</> order.
</para>
<para>
......@@ -1341,15 +1359,14 @@ SELECT a, b FROM table1 ORDER BY a + b, c;
</para>
<para>
For backwards compatibility with the SQL92 version of the standard,
a <replaceable>sort_expression</> can instead be the name or number
A <replaceable>sort_expression</> can also be the column label or number
of an output column, as in:
<programlisting>
SELECT a + b AS sum, c FROM table1 ORDER BY sum;
SELECT a, max(b) FROM table1 GROUP BY a ORDER BY 1;
</programlisting>
both of which sort by the first output column. Note that an output
column name has to stand alone, it's not allowed as part of an expression
 column name has to stand alone, that is, it cannot be used in an expression
&mdash; for example, this is <emphasis>not</> correct:
<programlisting>
SELECT a + b AS sum, c FROM table1 ORDER BY sum + c; -- wrong
......@@ -1412,16 +1429,16 @@ SELECT <replaceable>select_list</replaceable>
<para>
When using <literal>LIMIT</>, it is important to use an
<literal>ORDER BY</> clause that constrains the result rows into a
<literal>ORDER BY</> clause that constrains the result rows in a
unique order. Otherwise you will get an unpredictable subset of
the query's rows. You might be asking for the tenth through
twentieth rows, but tenth through twentieth in what ordering? The
twentieth rows, but tenth through twentieth using what ordering? The
ordering is unknown, unless you specified <literal>ORDER BY</>.
</para>
<para>
The query optimizer takes <literal>LIMIT</> into account when
generating a query plan, so you are very likely to get different
generating query plans, so you are very likely to get different
plans (yielding different row orders) depending on what you give
for <literal>LIMIT</> and <literal>OFFSET</>. Thus, using
different <literal>LIMIT</>/<literal>OFFSET</> values to select
......@@ -1455,7 +1472,7 @@ SELECT <replaceable>select_list</replaceable>
<synopsis>
VALUES ( <replaceable class="PARAMETER">expression</replaceable> [, ...] ) [, ...]
</synopsis>
Each parenthesized list of expressions generates a row in the table.
Each parenthesized list of expressions generates a row in the table expression.
The lists must all have the same number of elements (i.e., the number
of columns in the table), and corresponding entries in each list must
have compatible data types. The actual data type assigned to each column
......@@ -1489,12 +1506,12 @@ SELECT 3, 'three';
<para>
Syntactically, <literal>VALUES</> followed by expression lists is
treated as equivalent to
treated as equivalent to:
<synopsis>
SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable>
</synopsis>
and can appear anywhere a <literal>SELECT</> can. For example, you can
use it as an arm of a <literal>UNION</>, or attach a
use it as part of a <literal>UNION</>, or attach a
<replaceable>sort_specification</replaceable> (<literal>ORDER BY</>,
<literal>LIMIT</>, and/or <literal>OFFSET</>) to it. <literal>VALUES</>
is most commonly used as the data source in an <command>INSERT</> command,
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.51 2008/12/28 18:53:54 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/query.sgml,v 1.52 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="tutorial-sql">
<title>The <acronym>SQL</acronym> Language</title>
......@@ -38,7 +38,7 @@
functions and types. (If you installed a pre-packaged version of
<productname>PostgreSQL</productname> rather than building from source,
look for a directory named <filename>tutorial</> within the
<productname>PostgreSQL</productname> documentation. The <quote>make</>
<productname>PostgreSQL</productname> distribution. The <quote>make</>
part should already have been done for you.)
Then, to start the tutorial, do the following:
......@@ -53,7 +53,7 @@
</screen>
The <literal>\i</literal> command reads in commands from the
specified file. The <literal>-s</literal> option puts you in
specified file. The <command>psql</command> <literal>-s</> option puts you in
single step mode which pauses before sending each statement to the
server. The commands used in this section are in the file
<filename>basics.sql</filename>.
......@@ -165,7 +165,7 @@ CREATE TABLE weather (
and a rich set of geometric types.
<productname>PostgreSQL</productname> can be customized with an
arbitrary number of user-defined data types. Consequently, type
names are not syntactical key words, except where required to
names are not special key words in the syntax except where required to
support special cases in the <acronym>SQL</acronym> standard.
</para>
......@@ -421,7 +421,7 @@ SELECT DISTINCT city
<literal>DISTINCT</literal> automatically orders the rows and
so <literal>ORDER BY</literal> is unnecessary. But this is not
required by the SQL standard, and current
<productname>PostgreSQL</productname> doesn't guarantee that
<productname>PostgreSQL</productname> does not guarantee that
<literal>DISTINCT</literal> causes the rows to be ordered.
</para>
</footnote>
......@@ -451,8 +451,8 @@ SELECT DISTINCT city
<firstterm>join</firstterm> query. As an example, say you wish to
list all the weather records together with the location of the
associated city. To do that, we need to compare the city column of
each row of the weather table with the name column of all rows in
the cities table, and select the pairs of rows where these values match.
each row of the <literal>weather</> table with the name column of all rows in
the <literal>cities</> table, and select the pairs of rows where these values match.
<note>
<para>
This is only a conceptual model. The join is usually performed
......@@ -486,7 +486,7 @@ SELECT *
There is no result row for the city of Hayward. This is
because there is no matching entry in the
<classname>cities</classname> table for Hayward, so the join
ignores the unmatched rows in the weather table. We will see
ignores the unmatched rows in the <literal>weather</> table. We will see
shortly how this can be fixed.
</para>
</listitem>
......@@ -494,9 +494,9 @@ SELECT *
<listitem>
<para>
There are two columns containing the city name. This is
correct because the lists of columns of the
correct because the columns from the
<classname>weather</classname> and the
<classname>cities</classname> table are concatenated. In
<classname>cities</classname> tables are concatenated. In
practice this is undesirable, though, so you will probably want
to list the output columns explicitly rather than using
<literal>*</literal>:
......@@ -514,14 +514,14 @@ SELECT city, temp_lo, temp_hi, prcp, date, location
<title>Exercise:</title>
<para>
Attempt to find out the semantics of this query when the
Attempt to determine the semantics of this query when the
<literal>WHERE</literal> clause is omitted.
</para>
</formalpara>
<para>
Since the columns all had different names, the parser
automatically found out which table they belong to. If there
automatically found which table they belong to. If there
were duplicate column names in the two tables you'd need to
<firstterm>qualify</> the column names to show which one you
meant, as in:
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/regress.sgml,v 1.62 2009/02/12 13:26:03 petere Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/regress.sgml,v 1.63 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="regress">
<title id="regress-title">Regression Tests</title>
......@@ -37,7 +37,7 @@
<para>
To run the regression tests after building but before installation,
type
type:
<screen>
gmake check
</screen>
......@@ -45,7 +45,7 @@ gmake check
<filename>src/test/regress</filename> and run the command there.)
This will first build several auxiliary files, such as
some sample user-defined trigger functions, and then run the test driver
script. At the end you should see something like
script. At the end you should see something like:
<screen>
<computeroutput>
=======================
......@@ -64,7 +64,7 @@ gmake check
If you already did the build as root, you do not have to start all
over. Instead, make the regression test directory writable by
some other user, log in as that user, and restart the tests.
For example
For example:
<screen>
<prompt>root# </prompt><userinput>chmod -R a+w src/test/regress</userinput>
<prompt>root# </prompt><userinput>su - joeuser</userinput>
......@@ -101,7 +101,7 @@ gmake check
make sure this limit is at least fifty or so, else you might get
random-seeming failures in the parallel test. If you are not in
a position to raise the limit, you can cut down the degree of parallelism
by setting the <literal>MAX_CONNECTIONS</> parameter. For example,
by setting the <literal>MAX_CONNECTIONS</> parameter. For example:
<screen>
gmake MAX_CONNECTIONS=10 check
</screen>
......@@ -111,11 +111,11 @@ gmake MAX_CONNECTIONS=10 check
<para>
To run the tests after installation<![%standalone-ignore;[ (see <xref linkend="installation">)]]>,
initialize a data area and start the
server, <![%standalone-ignore;[as explained in <xref linkend="runtime">, ]]> then type
server, <![%standalone-ignore;[as explained in <xref linkend="runtime">, ]]> then type:
<screen>
gmake installcheck
</screen>
or for a parallel test
or for a parallel test:
<screen>
gmake installcheck-parallel
</screen>
......@@ -130,14 +130,14 @@ gmake installcheck-parallel
At present, these tests can be used only against an already-installed
server. To run the tests for all procedural languages that have been
built and installed, change to the <filename>src/pl</> directory of the
build tree and type
build tree and type:
<screen>
gmake installcheck
</screen>
You can also do this in any of the subdirectories of <filename>src/pl</>
to run tests for just one procedural language. To run the tests for all
<filename>contrib</> modules that have them, change to the
<filename>contrib</> directory of the build tree and type
<filename>contrib</> directory of the build tree and type:
<screen>
gmake installcheck
</screen>
......@@ -479,7 +479,7 @@ gmake coverage-html
</para>
<para>
To reset the execution counts between test runs, run
To reset the execution counts between test runs, run:
<screen>
gmake coverage-clean
</screen>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/rowtypes.sgml,v 2.9 2007/02/01 00:28:18 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/rowtypes.sgml,v 2.10 2009/04/27 16:27:36 momjian Exp $ -->
<sect1 id="rowtypes">
<title>Composite Types</title>
......@@ -12,9 +12,9 @@
</indexterm>
<para>
A <firstterm>composite type</> describes the structure of a row or record;
it is in essence just a list of field names and their data types.
<productname>PostgreSQL</productname> allows values of composite types to be
A <firstterm>composite type</> represents the structure of a row or record;
it is essentially just a list of field names and their data types.
<productname>PostgreSQL</productname> allows composite types to be
used in many of the same ways that simple types can be used. For example, a
column of a table can be declared to be of a composite type.
</para>
......@@ -39,9 +39,9 @@ CREATE TYPE inventory_item AS (
The syntax is comparable to <command>CREATE TABLE</>, except that only
field names and types can be specified; no constraints (such as <literal>NOT
NULL</>) can presently be included. Note that the <literal>AS</> keyword
is essential; without it, the system will think a quite different kind
of <command>CREATE TYPE</> command is meant, and you'll get odd syntax
errors.
is essential; without it, the system will think a different kind
of <command>CREATE TYPE</> command is meant, and you will get odd syntax
 errors.
</para>
<para>
......@@ -68,8 +68,8 @@ SELECT price_extension(item, 10) FROM on_hand;
</para>
<para>
Whenever you create a table, a composite type is also automatically
created, with the same name as the table, to represent the table's
Whenever you create a table, a composite type is automatically
created also, with the same name as the table, to represent the table's
row type. For example, had we said:
<programlisting>
CREATE TABLE inventory_item (
......@@ -135,7 +135,7 @@ CREATE TABLE inventory_item (
<para>
The <literal>ROW</literal> expression syntax can also be used to
construct composite values. In most cases this is considerably
simpler to use than the string-literal syntax, since you don't have
simpler to use than the string-literal syntax since you don't have
to worry about multiple layers of quoting. We already used this
method above:
<programlisting>
......@@ -169,7 +169,8 @@ SELECT item.name FROM on_hand WHERE item.price &gt; 9.99;
</programlisting>
This will not work since the name <literal>item</> is taken to be a table
name, not a field name, per SQL syntax rules. You must write it like this:
name, not a column name of <literal>on_hand</>, per SQL syntax rules.
You must write it like this:
<programlisting>
SELECT (item).name FROM on_hand WHERE (item).price &gt; 9.99;
......@@ -195,7 +196,7 @@ SELECT (on_hand.item).name FROM on_hand WHERE (on_hand.item).price &gt; 9.99;
SELECT (my_func(...)).field FROM ...
</programlisting>
Without the extra parentheses, this will provoke a syntax error.
Without the extra parentheses, this will generate a syntax error.
</para>
</sect2>
......@@ -249,7 +250,7 @@ INSERT INTO mytab (complex_col.r, complex_col.i) VALUES(1.1, 2.2);
The external text representation of a composite value consists of items that
are interpreted according to the I/O conversion rules for the individual
field types, plus decoration that indicates the composite structure.
The decoration consists of parentheses (<literal>(</> and <literal>)</>)
The decoration consists of parentheses
around the whole value, plus commas (<literal>,</>) between adjacent
items. Whitespace outside the parentheses is ignored, but within the
parentheses it is considered part of the field value, and might or might not be
......@@ -263,7 +264,7 @@ INSERT INTO mytab (complex_col.r, complex_col.i) VALUES(1.1, 2.2);
</para>
<para>
As shown previously, when writing a composite value you can write double
As shown previously, when writing a composite value you can use double
quotes around any individual field value.
You <emphasis>must</> do so if the field value would otherwise
confuse the composite-value parser. In particular, fields containing
......@@ -272,7 +273,8 @@ INSERT INTO mytab (complex_col.r, complex_col.i) VALUES(1.1, 2.2);
precede it with a backslash. (Also, a pair of double quotes within a
double-quoted field value is taken to represent a double quote character,
analogously to the rules for single quotes in SQL literal strings.)
Alternatively, you can use backslash-escaping to protect all data characters
Alternatively, you can avoid quoting and use backslash-escaping to
protect all data characters
that would otherwise be taken as composite syntax.
</para>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/runtime.sgml,v 1.427 2009/04/24 20:46:16 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/runtime.sgml,v 1.428 2009/04/27 16:27:36 momjian Exp $ -->
<chapter Id="runtime">
<title>Server Setup and Operation</title>
......@@ -76,7 +76,7 @@
linkend="app-initdb">,<indexterm><primary>initdb</></> which is
installed with <productname>PostgreSQL</productname>. The desired
file system location of your database cluster is indicated by the
<option>-D</option> option, for example
<option>-D</option> option, for example:
<screen>
<prompt>$</> <userinput>initdb -D /usr/local/pgsql/data</userinput>
</screen>
......@@ -382,7 +382,7 @@ FATAL: could not create TCP/IP listen socket
</para>
<para>
A message like
A message like:
<screen>
FATAL: could not create shared memory segment: Invalid argument
DETAIL: Failed system call was shmget(key=5440001, size=4011376640, 03600).
......@@ -401,7 +401,7 @@ DETAIL: Failed system call was shmget(key=5440001, size=4011376640, 03600).
</para>
<para>
An error like
An error like:
<screen>
FATAL: could not create semaphores: No space left on device
DETAIL: Failed system call was semget(5440126, 17, 03600).
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/sources.sgml,v 2.32 2008/10/27 19:37:21 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/sources.sgml,v 2.33 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="source">
<title>PostgreSQL Coding Conventions</title>
......@@ -661,10 +661,10 @@ BETTER: unrecognized node type: 42
<formalpara>
<title>May vs. Can vs. Might</title>
<para>
<quote>May</quote> suggests permission (e.g. "You may borrow my rake."),
<quote>May</quote> suggests permission (e.g., "You may borrow my rake."),
and has little use in documentation or error messages.
<quote>Can</quote> suggests ability (e.g. "I can lift that log."),
and <quote>might</quote> suggests possibility (e.g. "It might rain
<quote>Can</quote> suggests ability (e.g., "I can lift that log."),
and <quote>might</quote> suggests possibility (e.g., "It might rain
today."). Using the proper word clarifies meaning and assists
translation.
</para>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/sql.sgml,v 1.47 2008/02/15 22:17:06 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/sql.sgml,v 1.48 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="sql-intro">
<title>SQL</title>
......@@ -95,7 +95,7 @@
as <firstterm><acronym>SQL3</acronym></firstterm>
is under development. It is planned to make <acronym>SQL</acronym>
a Turing-complete
language, i.e. all computable queries (e.g. recursive queries) will be
language, i.e., all computable queries (e.g., recursive queries) will be
possible. This has now been completed as SQL:2003.
</para>
......@@ -761,7 +761,7 @@ x(A) &mid; F(x)
<para>
The relational algebra and the relational calculus have the same
<firstterm>expressive power</firstterm>; i.e. all queries that
<firstterm>expressive power</firstterm>; i.e., all queries that
can be formulated using relational algebra can also be formulated
using the relational calculus and vice versa.
This was first proved by E. F. Codd in
......@@ -811,7 +811,7 @@ x(A) &mid; F(x)
<para>
Arithmetic capability: In <acronym>SQL</acronym> it is possible
to involve
arithmetic operations as well as comparisons, e.g.
arithmetic operations as well as comparisons, e.g.:
<programlisting>
A &lt; B + 3.
......@@ -1027,7 +1027,7 @@ SELECT S.SNAME, P.PNAME
SUPPLIER &times; PART &times; SELLS
is derived. Now only those tuples satisfying the
conditions given in the WHERE clause are selected (i.e. the common
conditions given in the WHERE clause are selected (i.e., the common
named attributes have to be equal). Finally we project out all
columns but S.SNAME and P.PNAME.
</para>
......@@ -1312,7 +1312,7 @@ SELECT COUNT(PNO)
<acronym>SQL</acronym> allows one to partition the tuples of a table
into groups. Then the
aggregate functions described above can be applied to the groups &mdash;
i.e. the value of the aggregate function is no longer calculated over
i.e., the value of the aggregate function is no longer calculated over
all the values of the specified column but over all values of a
group. Thus the aggregate function is evaluated separately for every
group.
......@@ -1517,7 +1517,7 @@ SELECT *
<para>
If we want to know all suppliers that do not sell any part
(e.g. to be able to remove these suppliers from the database) we use:
(e.g., to be able to remove these suppliers from the database) we use:
<programlisting>
SELECT *
......@@ -1533,7 +1533,7 @@ SELECT *
sells at least one part. Note that we use S.SNO from the outer
<command>SELECT</command> within the WHERE clause of the inner
<command>SELECT</command>. Here the subquery must be evaluated
afresh for each tuple from the outer query, i.e. the value for
afresh for each tuple from the outer query, i.e., the value for
S.SNO is always taken from the current tuple of the outer
<command>SELECT</command>.
</para>
......@@ -1811,7 +1811,7 @@ CREATE INDEX I ON SUPPLIER (SNAME);
</para>
<para>
The created index is maintained automatically, i.e. whenever a new
The created index is maintained automatically, i.e., whenever a new
tuple is inserted into the relation SUPPLIER the index I is
adapted. Note that the only changes a user can perceive when an
index is present are increased speed for <command>SELECT</command>
......@@ -1826,7 +1826,7 @@ CREATE INDEX I ON SUPPLIER (SNAME);
<para>
A view can be regarded as a <firstterm>virtual table</firstterm>,
i.e. a table that
i.e., a table that
does not <emphasis>physically</emphasis> exist in the database
but looks to the user
as if it does. By contrast, when we talk of a
......@@ -1838,7 +1838,7 @@ CREATE INDEX I ON SUPPLIER (SNAME);
<para>
Views do not have their own, physically separate, distinguishable
stored data. Instead, the system stores the definition of the
view (i.e. the rules about how to access physically stored base
view (i.e., the rules about how to access physically stored base
tables in order to materialize the view) somewhere in the system
catalogs (see
<xref linkend="tutorial-catalogs-title" endterm="tutorial-catalogs-title">). For a
......@@ -2082,7 +2082,7 @@ DELETE FROM SUPPLIER
<para>
In this section we will sketch how <acronym>SQL</acronym> can be
embedded into a host language (e.g. <literal>C</literal>).
embedded into a host language (e.g., <literal>C</literal>).
There are two main reasons why we want to use <acronym>SQL</acronym>
from a host language:
......@@ -2090,7 +2090,7 @@ DELETE FROM SUPPLIER
<listitem>
<para>
There are queries that cannot be formulated using pure <acronym>SQL</acronym>
(i.e. recursive queries). To be able to perform such queries we need a
(e.g., recursive queries). To be able to perform such queries we need a
host language with a greater expressive power than
<acronym>SQL</acronym>.
</para>
......@@ -2099,7 +2099,7 @@ DELETE FROM SUPPLIER
<listitem>
<para>
We simply want to access a database from some application that
is written in the host language (e.g. a ticket reservation system
is written in the host language (e.g., a ticket reservation system
with a graphical user interface is written in C and the information
about which tickets are still left is stored in a database that can be
accessed using embedded <acronym>SQL</acronym>).
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/start.sgml,v 1.48 2009/01/06 03:05:23 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/start.sgml,v 1.49 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="tutorial-start">
<title>Getting Started</title>
......@@ -74,7 +74,7 @@
<para>
A server process, which manages the database files, accepts
connections to the database from client applications, and
performs actions on the database on behalf of the clients. The
performs database actions on behalf of the clients. The
database server program is called
<filename>postgres</filename>.
<indexterm><primary>postgres</primary></indexterm>
......@@ -108,7 +108,7 @@
<para>
The <productname>PostgreSQL</productname> server can handle
multiple concurrent connections from clients. For that purpose it
multiple concurrent connections from clients. To achieve this it
starts (<quote>forks</quote>) a new process for each connection.
From that point on, the client and the new server process
communicate without intervention by the original
......@@ -159,25 +159,26 @@
</para>
<para>
If you see a message similar to
If you see a message similar to:
<screen>
createdb: command not found
</screen>
then <productname>PostgreSQL</> was not installed properly. Either it was not
installed at all or the search path was not set correctly. Try
installed at all or your shell's search path was not set correctly. Try
calling the command with an absolute path instead:
<screen>
<prompt>$</prompt> <userinput>/usr/local/pgsql/bin/createdb mydb</userinput>
</screen>
The path at your site might be different. Contact your site
administrator or check back in the installation instructions to
administrator or check the installation instructions to
correct the situation.
</para>
<para>
Another response could be this:
<screen>
createdb: could not connect to database postgres: could not connect to server: No such file or directory
createdb: could not connect to database postgres: could not connect
to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
</screen>
......@@ -246,7 +247,7 @@ createdb: database creation failed: ERROR: permission denied to create database
length. A convenient choice is to create a database with the same
name as your current user name. Many tools assume that database
name as the default, so it can save you some typing. To create
that database, simply type
that database, simply type:
<screen>
<prompt>$</prompt> <userinput>createdb</userinput>
</screen>
......@@ -299,7 +300,7 @@ createdb: database creation failed: ERROR: permission denied to create database
<para>
Using an existing graphical frontend tool like
<application>pgAdmin</application> or an office suite with
<acronym>ODBC</acronym> support to create and manipulate a
<acronym>ODBC</> or <acronym>JDBC</> support to create and manipulate a
database. These possibilities are not covered in this
tutorial.
</para>
......@@ -314,15 +315,15 @@ createdb: database creation failed: ERROR: permission denied to create database
</listitem>
</itemizedlist>
You probably want to start up <command>psql</command>, to try out
You probably want to start up <command>psql</command> to try
the examples in this tutorial. It can be activated for the
<literal>mydb</literal> database by typing the command:
<screen>
<prompt>$</prompt> <userinput>psql mydb</userinput>
</screen>
If you leave off the database name then it will default to your
If you do not supply the database name then it will default to your
user account name. You already discovered this scheme in the
previous section.
previous section using <command>createdb</command>.
</para>
<para>
......@@ -335,15 +336,15 @@ Type "help" for help.
mydb=&gt;
</screen>
<indexterm><primary>superuser</primary></indexterm>
The last line could also be
The last line could also be:
<screen>
mydb=#
</screen>
That would mean you are a database superuser, which is most likely
the case if you installed <productname>PostgreSQL</productname>
yourself. Being a superuser means that you are not subject to
access controls. For the purposes of this tutorial that is not of
importance.
access controls. For the purposes of this tutorial that is not
important.
</para>
<para>
......@@ -395,7 +396,7 @@ mydb=#
</para>
<para>
To get out of <command>psql</command>, type
To get out of <command>psql</command>, type:
<screen>
<prompt>mydb=&gt;</prompt> <userinput>\q</userinput>
</screen>
......@@ -407,7 +408,7 @@ mydb=#
installed correctly you can also type <literal>man psql</literal>
at the operating system shell prompt to see the documentation. In
this tutorial we will not use these features explicitly, but you
can use them yourself when you see fit.
can use them yourself when it is helpful.
</para>
</sect1>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.130 2009/02/04 21:30:41 alvherre Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/syntax.sgml,v 1.131 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="sql-syntax">
<title>SQL Syntax</title>
......@@ -11,12 +11,12 @@
<para>
This chapter describes the syntax of SQL. It forms the foundation
for understanding the following chapters which will go into detail
about how the SQL commands are applied to define and modify data.
about how SQL commands are applied to define and modify data.
</para>
<para>
We also advise users who are already familiar with SQL to read this
chapter carefully because there are several rules and concepts that
chapter carefully because it contains several rules and concepts that
are implemented inconsistently among SQL databases or that are
specific to <productname>PostgreSQL</productname>.
</para>
......@@ -293,7 +293,7 @@ U&amp;"d!0061t!+000061" UESCAPE '!'
bounded by single quotes (<literal>'</literal>), for example
<literal>'This is a string'</literal>. To include
a single-quote character within a string constant,
write two adjacent single quotes, e.g.
write two adjacent single quotes, e.g.,
<literal>'Dianne''s horse'</literal>.
Note that this is <emphasis>not</> the same as a double-quote
character (<literal>"</>). <!-- font-lock sanity: " -->
......@@ -337,7 +337,7 @@ SELECT 'foo' 'bar';
string constants, which are an extension to the SQL standard.
An escape string constant is specified by writing the letter
<literal>E</literal> (upper or lower case) just before the opening single
quote, e.g. <literal>E'foo'</>. (When continuing an escape string
quote, e.g., <literal>E'foo'</>. (When continuing an escape string
constant across lines, write <literal>E</> only before the first opening
quote.)
Within an escape string, a backslash character (<literal>\</>) begins a
......@@ -422,14 +422,14 @@ SELECT 'foo' 'bar';
<xref linkend="guc-standard-conforming-strings"> is <literal>off</>,
then <productname>PostgreSQL</productname> recognizes backslash escapes
in both regular and escape string constants. This is for backward
compatibility with the historical behavior, in which backslash escapes
compatibility with the historical behavior, where backslash escapes
were always recognized.
Although <varname>standard_conforming_strings</> currently defaults to
<literal>off</>, the default will change to <literal>on</> in a future
release for improved standards compliance. Applications are therefore
encouraged to migrate away from using backslash escapes. If you need
to use a backslash escape to represent a special character, write the
constant with an <literal>E</> to be sure it will be handled the same
string constant with an <literal>E</> to be sure it will be handled the same
way in future releases.
</para>
......@@ -442,7 +442,7 @@ SELECT 'foo' 'bar';
</caution>
<para>
The character with the code zero cannot be in a string constant.
The zero-byte (null byte) character cannot be in a string constant.
</para>
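<para>
The following statements are a small sketch of the quoting rules
discussed in this section. They are illustrative only; the string
contents are arbitrary.
<programlisting>
-- A doubled single quote embeds a quote character in any string constant.
SELECT 'Dianne''s horse';

-- Backslash escapes are requested explicitly with the E prefix, so these
-- constants do not depend on the standard_conforming_strings setting.
SELECT E'first line\nsecond line';
SELECT E'a literal backslash: \\';
</programlisting>
</para>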
</sect3>
......@@ -896,7 +896,7 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
</indexterm>
<para>
A comment is an arbitrary sequence of characters beginning with
A comment is a sequence of characters beginning with
double dashes and extending to the end of the line, e.g.:
<programlisting>
-- This is a standard SQL comment
......@@ -918,8 +918,8 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
</para>
<para>
A comment is removed from the input stream before further syntax
analysis and is effectively replaced by whitespace.
Comments are removed from the input stream before further syntax
analysis and are effectively replaced by whitespace.
</para>
</sect2>
......@@ -1112,7 +1112,7 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;
</programlisting>
the <literal>OPERATOR</> construct is taken to have the default precedence
shown in <xref linkend="sql-precedence-table"> for <quote>any other</> operator. This is true no matter
which specific operator name appears inside <literal>OPERATOR()</>.
which specific operator appears inside <literal>OPERATOR()</>.
</para>
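<para>
A short sketch of the consequence, assuming the precedence table
places <quote>any other</quote> operators below <literal>+</literal>
and <literal>-</literal>:
<programlisting>
-- The bare operator: * binds more tightly than +, so this groups as (2 * 3) + 4.
SELECT 2 * 3 + 4;

-- The same operator written with OPERATOR() takes the "any other"
-- precedence, so by the rule above it is expected to group as 2 * (3 + 4).
SELECT 2 OPERATOR(pg_catalog.*) 3 + 4;

-- Parentheses make the intended grouping explicit.
SELECT (2 OPERATOR(pg_catalog.*) 3) + 4;
</programlisting>
When in doubt, parenthesize expressions that use the
<literal>OPERATOR</literal> construct.
</para>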
</sect2>
</sect1>
......@@ -1154,80 +1154,80 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;
<itemizedlist>
<listitem>
<para>
A constant or literal value.
A constant or literal value
</para>
</listitem>
<listitem>
<para>
A column reference.
A column reference
</para>
</listitem>
<listitem>
<para>
A positional parameter reference, in the body of a function definition
or prepared statement.
or prepared statement
</para>
</listitem>
<listitem>
<para>
A subscripted expression.
A subscripted expression
</para>
</listitem>
<listitem>
<para>
A field selection expression.
A field selection expression
</para>
</listitem>
<listitem>
<para>
An operator invocation.
An operator invocation
</para>
</listitem>
<listitem>
<para>
A function call.
A function call
</para>
</listitem>
<listitem>
<para>
An aggregate expression.
An aggregate expression
</para>
</listitem>
<listitem>
<para>
A window function call.
A window function call
</para>
</listitem>
<listitem>
<para>
A type cast.
A type cast
</para>
</listitem>
<listitem>
<para>
A scalar subquery.
A scalar subquery
</para>
</listitem>
<listitem>
<para>
An array constructor.
An array constructor
</para>
</listitem>
<listitem>
<para>
A row constructor.
A row constructor
</para>
</listitem>
......@@ -1264,7 +1264,7 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;
</indexterm>
<para>
A column can be referenced in the form
A column can be referenced in the form:
<synopsis>
<replaceable>correlation</replaceable>.<replaceable>columnname</replaceable>
</synopsis>
......@@ -1426,7 +1426,7 @@ $1.somecolumn
where the <replaceable>operator</replaceable> token follows the syntax
rules of <xref linkend="sql-syntax-operators">, or is one of the
key words <token>AND</token>, <token>OR</token>, and
<token>NOT</token>, or is a qualified operator name in the form
<token>NOT</token>, or is a qualified operator name in the form:
<synopsis>
<literal>OPERATOR(</><replaceable>schema</><literal>.</><replaceable>operatorname</><literal>)</>
</synopsis>
......@@ -1714,7 +1714,7 @@ CAST ( <replaceable>expression</replaceable> AS <replaceable>type</replaceable>
casts that are marked <quote>OK to apply implicitly</>
in the system catalogs. Other casts must be invoked with
explicit casting syntax. This restriction is intended to prevent
surprising conversions from being applied silently.
surprising conversions from being silently applied.
</para>
<para>
......@@ -1730,7 +1730,7 @@ CAST ( <replaceable>expression</replaceable> AS <replaceable>type</replaceable>
<literal>timestamp</> can only be used in this fashion if they are
double-quoted, because of syntactic conflicts. Therefore, the use of
the function-like cast syntax leads to inconsistencies and should
probably be avoided in new applications.
probably be avoided.
</para>
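<para>
For comparison, here is one conversion written in each of the three
syntaxes; the value <literal>42</literal> is arbitrary:
<programlisting>
SELECT CAST(42 AS float8);   -- SQL-standard syntax
SELECT 42::float8;           -- historical PostgreSQL syntax
SELECT float8(42);           -- function-like syntax; works only for some type names
</programlisting>
The first two forms accept any type name, which is one reason to
prefer them.
</para>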
<note>
......@@ -1794,7 +1794,7 @@ SELECT name, (SELECT max(pop) FROM cities WHERE cities.state = states.name)
<para>
An array constructor is an expression that builds an
array value from values for its member elements. A simple array
array using values for its member elements. A simple array
constructor
consists of the key word <literal>ARRAY</literal>, a left square bracket
<literal>[</>, a list of expressions (separated by commas) for the
......@@ -1925,8 +1925,8 @@ SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
</indexterm>
<para>
A row constructor is an expression that builds a row value (also
called a composite value) from values
A row constructor is an expression that builds a row (also
called a composite value) using values
for its member fields. A row constructor consists of the key word
<literal>ROW</literal>, a left parenthesis, zero or more
expressions (separated by commas) for the row field values, and finally
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.50 2009/04/19 20:36:06 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.51 2009/04/27 16:27:36 momjian Exp $ -->
<chapter id="textsearch">
<title id="textsearch-title">Full Text Search</title>
......@@ -74,7 +74,7 @@
<listitem>
<para>
<emphasis>Parsing documents into <firstterm>tokens</></emphasis>. It is
useful to identify various classes of tokens, e.g. numbers, words,
useful to identify various classes of tokens, e.g., numbers, words,
complex words, email addresses, so that they can be processed
differently. In principle token classes depend on the specific
application, but for most purposes it is adequate to use a predefined
......@@ -323,7 +323,7 @@ text @@ text
The above are all simple text search examples. As mentioned before, full
text search functionality includes the ability to do many more things:
skip indexing certain words (stop words), process synonyms, and use
sophisticated parsing, e.g. parse based on more than just white space.
sophisticated parsing, e.g., parse based on more than just white space.
This functionality is controlled by <firstterm>text search
configurations</>. <productname>PostgreSQL</> comes with predefined
configurations for many languages, and you can easily create your own
......@@ -389,7 +389,7 @@ text @@ text
<para>
Text search parsers and templates are built from low-level C functions;
therefore it requires C programming ability to develop new ones, and
therefore C programming ability is required to develop new ones, and
superuser privileges to install one into a database. (There are examples
of add-on parsers and templates in the <filename>contrib/</> area of the
<productname>PostgreSQL</> distribution.) Since dictionaries and
......@@ -416,7 +416,7 @@ text @@ text
<title>Searching a Table</title>
<para>
It is possible to do full text search with no index. A simple query
It is possible to do a full text search without an index. A simple query
to print the <structname>title</> of each row that contains the word
<literal>friend</> in its <structfield>body</> field is:
......@@ -455,7 +455,8 @@ WHERE to_tsvector(body) @@ to_tsquery('friend');
SELECT title
FROM pgweb
WHERE to_tsvector(title || ' ' || body) @@ to_tsquery('create &amp; table')
ORDER BY last_mod_date DESC LIMIT 10;
ORDER BY last_mod_date DESC
LIMIT 10;
</programlisting>
For clarity we omitted the <function>coalesce</function> function calls
......@@ -518,7 +519,7 @@ CREATE INDEX pgweb_idx ON pgweb USING gin(to_tsvector(config_name, body));
recording which configuration was used for each index entry. This
would be useful, for example, if the document collection contained
documents in different languages. Again,
queries that are to use the index must be phrased to match, e.g.
queries that are meant to use the index must be phrased to match, e.g.,
<literal>WHERE to_tsvector(config_name, body) @@ 'a &amp; b'</>.
</para>
......@@ -555,7 +556,8 @@ CREATE INDEX textsearch_idx ON pgweb USING gin(textsearchable_index_col);
SELECT title
FROM pgweb
WHERE textsearchable_index_col @@ to_tsquery('create &amp; table')
ORDER BY last_mod_date DESC LIMIT 10;
ORDER BY last_mod_date DESC
LIMIT 10;
</programlisting>
</para>
......@@ -840,7 +842,7 @@ SELECT plainto_tsquery('english', 'The Fat &amp; Rats:C');
document, and how important is the part of the document where they occur.
However, the concept of relevancy is vague and very application-specific.
Different applications might require additional information for ranking,
e.g. document modification time. The built-in ranking functions are only
e.g., document modification time. The built-in ranking functions are only
examples. You can write your own ranking functions and/or combine their
results with additional factors to fit your specific needs.
</para>
......@@ -877,7 +879,8 @@ SELECT plainto_tsquery('english', 'The Fat &amp; Rats:C');
<term>
<synopsis>
ts_rank_cd(<optional> <replaceable class="PARAMETER">weights</replaceable> <type>float4[]</>, </optional> <replaceable class="PARAMETER">vector</replaceable> <type>tsvector</>, <replaceable class="PARAMETER">query</replaceable> <type>tsquery</> <optional>, <replaceable class="PARAMETER">normalization</replaceable> <type>integer</> </optional>) returns <type>float4</>
ts_rank_cd(<optional> <replaceable class="PARAMETER">weights</replaceable> <type>float4[]</>, </optional> <replaceable class="PARAMETER">vector</replaceable> <type>tsvector</>,
<replaceable class="PARAMETER">query</replaceable> <type>tsquery</> <optional>, <replaceable class="PARAMETER">normalization</replaceable> <type>integer</> </optional>) returns <type>float4</>
</synopsis>
</term>
......@@ -921,13 +924,13 @@ SELECT plainto_tsquery('english', 'The Fat &amp; Rats:C');
</programlisting>
Typically weights are used to mark words from special areas of the
document, like the title or an initial abstract, so that they can be
treated as more or less important than words in the document body.
document, like the title or an initial abstract, so they can be
treated with more or less importance than words in the document body.
</para>
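<para>
As a sketch of this idea, the query below labels title lexemes
<literal>A</literal> and body lexemes <literal>D</literal>, then ranks
with an explicit weights array given in the order D, C, B, A. It
reuses the <structname>pgweb</structname> table from the earlier
examples; the weight values and the <literal>english</literal>
configuration are assumptions chosen for illustration.
<programlisting>
SELECT title,
       ts_rank('{0.1, 0.2, 0.4, 1.0}',
               setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
               setweight(to_tsvector('english', coalesce(body, '')), 'D'),
               to_tsquery('english', 'create &amp; table')) AS rank
FROM pgweb
ORDER BY rank DESC
LIMIT 10;
</programlisting>
With these weights a match in the title contributes ten times as much
to the rank as a match in the body.
</para>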
<para>
Since a longer document has a greater chance of containing a query term
it is reasonable to take into account document size, e.g. a hundred-word
it is reasonable to take into account document size, e.g., a hundred-word
document with five instances of a search word is probably more relevant
than a thousand-word document with five instances. Both ranking functions
take an integer <replaceable>normalization</replaceable> option that
......@@ -996,7 +999,8 @@ SELECT plainto_tsquery('english', 'The Fat &amp; Rats:C');
SELECT title, ts_rank_cd(textsearch, query) AS rank
FROM apod, to_tsquery('neutrino|(dark &amp; matter)') query
WHERE query @@ textsearch
ORDER BY rank DESC LIMIT 10;
ORDER BY rank DESC
LIMIT 10;
title | rank
-----------------------------------------------+----------
Neutrinos in the Sun | 3.1
......@@ -1017,7 +1021,8 @@ ORDER BY rank DESC LIMIT 10;
SELECT title, ts_rank_cd(textsearch, query, 32 /* rank/(rank+1) */ ) AS rank
FROM apod, to_tsquery('neutrino|(dark &amp; matter)') query
WHERE query @@ textsearch
ORDER BY rank DESC LIMIT 10;
ORDER BY rank DESC
LIMIT 10;
title | rank
-----------------------------------------------+-------------------
Neutrinos in the Sun | 0.756097569485493
......@@ -1037,7 +1042,7 @@ ORDER BY rank DESC LIMIT 10;
Ranking can be expensive since it requires consulting the
<type>tsvector</type> of each matching document, which can be I/O bound and
therefore slow. Unfortunately, it is almost impossible to avoid since
practical queries often result in large numbers of matches.
practical queries often result in a large number of matches.
</para>
</sect2>
......@@ -1063,7 +1068,7 @@ ORDER BY rank DESC LIMIT 10;
<para>
<function>ts_headline</function> accepts a document along
with a query, and returns an excerpt from
with a query, and returns an excerpt of
the document in which terms from the query are highlighted. The
configuration to be used to parse the document can be specified by
<replaceable>config</replaceable>; if <replaceable>config</replaceable>
......@@ -1080,8 +1085,8 @@ ORDER BY rank DESC LIMIT 10;
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>
<literal>StartSel</>, <literal>StopSel</literal>: the strings with which
query words appearing in the document should be delimited to distinguish
<literal>StartSel</>, <literal>StopSel</literal>: the strings to delimit
query words appearing in the document, to distinguish
them from other excerpted words. You must double-quote these strings
if they contain spaces or commas.
</para>
......@@ -1183,7 +1188,8 @@ SELECT id, ts_headline(body, q), rank
FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank
FROM apod, to_tsquery('stars') q
WHERE ti @@ q
ORDER BY rank DESC LIMIT 10) AS foo;
ORDER BY rank DESC
LIMIT 10) AS foo;
</programlisting>
</para>
......@@ -1267,7 +1273,7 @@ FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank
<listitem>
<para>
This function returns a copy of the input vector in which every
<function>setweight</> returns a copy of the input vector in which every
position has been labeled with the given <replaceable>weight</>, either
<literal>A</literal>, <literal>B</literal>, <literal>C</literal>, or
<literal>D</literal>. (<literal>D</literal> is the default for new
......@@ -1467,7 +1473,7 @@ SELECT querytree(to_tsquery('!defined'));
<para>
The <function>ts_rewrite</function> family of functions search a
given <type>tsquery</> for occurrences of a target
subquery, and replace each occurrence with another
subquery, and replace each occurrence with a
substitute subquery. In essence this operation is a
<type>tsquery</>-specific version of substring replacement.
A target and substitute combination can be
......@@ -1567,7 +1573,9 @@ SELECT ts_rewrite(to_tsquery('supernovae &amp; crab'), 'SELECT * FROM aliases');
We can change the rewriting rules just by updating the table:
<programlisting>
UPDATE aliases SET s = to_tsquery('supernovae|sn &amp; !nebulae') WHERE t = to_tsquery('supernovae');
UPDATE aliases
SET s = to_tsquery('supernovae|sn &amp; !nebulae')
WHERE t = to_tsquery('supernovae');
SELECT ts_rewrite(to_tsquery('supernovae &amp; crab'), 'SELECT * FROM aliases');
ts_rewrite
......@@ -1578,7 +1586,7 @@ SELECT ts_rewrite(to_tsquery('supernovae &amp; crab'), 'SELECT * FROM aliases');
<para>
Rewriting can be slow when there are many rewriting rules, since it
checks every rule for a possible hit. To filter out obvious non-candidate
checks every rule for a possible match. To filter out obvious non-candidate
rules we can use the containment operators for the <type>tsquery</type>
type. In the example below, we select only those rules which might match
the original query:
......@@ -1670,9 +1678,9 @@ SELECT title, body FROM messages WHERE tsv @@ to_tsquery('title &amp; body');
</para>
<para>
A limitation of the built-in triggers is that they treat all the
A limitation of built-in triggers is that they treat all the
input columns alike. To process columns differently &mdash; for
example, to weight title differently from body &mdash; it is necessary
example, to weight title differently from body &mdash; it is necessary
to write a custom trigger. Here is an example using
<application>PL/pgSQL</application> as the trigger language:
......@@ -1714,11 +1722,13 @@ ON messages FOR EACH ROW EXECUTE PROCEDURE messages_trigger();
</para>
<synopsis>
ts_stat(<replaceable class="PARAMETER">sqlquery</replaceable> <type>text</>, <optional> <replaceable class="PARAMETER">weights</replaceable> <type>text</>, </optional> OUT <replaceable class="PARAMETER">word</replaceable> <type>text</>, OUT <replaceable class="PARAMETER">ndoc</replaceable> <type>integer</>, OUT <replaceable class="PARAMETER">nentry</replaceable> <type>integer</>) returns <type>setof record</>
ts_stat(<replaceable class="PARAMETER">sqlquery</replaceable> <type>text</>, <optional> <replaceable class="PARAMETER">weights</replaceable> <type>text</>,
</optional> OUT <replaceable class="PARAMETER">word</replaceable> <type>text</>, OUT <replaceable class="PARAMETER">ndoc</replaceable> <type>integer</>,
OUT <replaceable class="PARAMETER">nentry</replaceable> <type>integer</>) returns <type>setof record</>
</synopsis>
<para>
<replaceable>sqlquery</replaceable> is a text value containing a SQL
<replaceable>sqlquery</replaceable> is a text value containing an SQL
query which must return a single <type>tsvector</type> column.
<function>ts_stat</> executes the query and returns statistics about
each distinct lexeme (word) contained in the <type>tsvector</type>
......@@ -1930,7 +1940,7 @@ LIMIT 10;
only the basic ASCII letters are reported as a separate token type,
since it is sometimes useful to distinguish them. In most European
languages, token types <literal>word</> and <literal>asciiword</>
should always be treated alike.
should be treated alike.
</para>
</note>
......@@ -2077,7 +2087,7 @@ SELECT alias, description, token FROM ts_debug('http://example.com/stuff/index.h
by the parser, each dictionary in the list is consulted in turn,
until some dictionary recognizes it as a known word. If it is identified
as a stop word, or if no dictionary recognizes the token, it will be
discarded and not indexed or searched for.
discarded and not indexed or searched.
The general rule for configuring a list of dictionaries
is to place first the most narrow, most specific dictionary, then the more
general dictionaries, finishing with a very general dictionary, like
......@@ -2268,7 +2278,8 @@ CREATE TEXT SEARCH DICTIONARY my_synonym (
);
ALTER TEXT SEARCH CONFIGURATION english
ALTER MAPPING FOR asciiword WITH my_synonym, english_stem;
ALTER MAPPING FOR asciiword
WITH my_synonym, english_stem;
SELECT * FROM ts_debug('english', 'Paris');
alias | description | token | dictionaries | dictionary | lexemes
......@@ -2428,7 +2439,8 @@ CREATE TEXT SEARCH DICTIONARY thesaurus_simple (
<programlisting>
ALTER TEXT SEARCH CONFIGURATION russian
ALTER MAPPING FOR asciiword, asciihword, hword_asciipart WITH thesaurus_simple;
ALTER MAPPING FOR asciiword, asciihword, hword_asciipart
WITH thesaurus_simple;
</programlisting>
</para>
......@@ -2457,7 +2469,8 @@ CREATE TEXT SEARCH DICTIONARY thesaurus_astro (
);
ALTER TEXT SEARCH CONFIGURATION russian
ALTER MAPPING FOR asciiword, asciihword, hword_asciipart WITH thesaurus_astro, english_stem;
ALTER MAPPING FOR asciiword, asciihword, hword_asciipart
WITH thesaurus_astro, english_stem;
</programlisting>
Now we can see how it works.
......@@ -2520,7 +2533,7 @@ SELECT plainto_tsquery('supernova star');
<firstterm>morphological dictionaries</>, which can normalize many
different linguistic forms of a word into the same lexeme. For example,
an English <application>Ispell</> dictionary can match all declensions and
conjugations of the search term <literal>bank</literal>, e.g.
conjugations of the search term <literal>bank</literal>, e.g.,
<literal>banking</>, <literal>banked</>, <literal>banks</>,
<literal>banks'</>, and <literal>bank's</>.
</para>
......@@ -2567,9 +2580,8 @@ CREATE TEXT SEARCH DICTIONARY english_ispell (
</para>
<para>
Ispell dictionaries support splitting compound words.
This is a nice feature and
<productname>PostgreSQL</productname> supports it.
Ispell dictionaries support splitting compound words,
which is a useful feature.
Notice that the affix file should specify a special flag using the
<literal>compoundwords controlled</literal> statement that marks dictionary
words that can participate in compound formation:
......@@ -2603,8 +2615,8 @@ SELECT ts_lexize('norwegian_ispell', 'sjokoladefabrikk');
<title><application>Snowball</> Dictionary</title>
<para>
The <application>Snowball</> dictionary template is based on the project
of Martin Porter, inventor of the popular Porter's stemming algorithm
The <application>Snowball</> dictionary template is based on a project
by Martin Porter, inventor of the popular Porter's stemming algorithm
for the English language. Snowball now provides stemming algorithms for
many languages (see the <ulink url="http://snowball.tartarus.org">Snowball
site</ulink> for more information). Each algorithm understands how to
......@@ -2668,7 +2680,7 @@ CREATE TEXT SEARCH DICTIONARY english_stem (
<para>
As an example, we will create a configuration
<literal>pg</literal>, starting from a duplicate of the built-in
<literal>pg</literal> by duplicating the built-in
<literal>english</> configuration.
<programlisting>
......@@ -2767,7 +2779,7 @@ SHOW default_text_search_config;
<para>
The behavior of a custom text search configuration can easily become
complicated enough to be confusing or undesirable. The functions described
confusing. The functions described
in this section are useful for testing text search objects. You can
test a complete configuration, or test parsers and dictionaries separately.
</para>
......@@ -2938,7 +2950,7 @@ SELECT * FROM ts_debug('public.english','The Brightest supernovaes');
</para>
<para>
You can reduce the volume of output by explicitly specifying which columns
You can reduce the width of the output by explicitly specifying which columns
you want to see:
<programlisting>
......@@ -2968,8 +2980,10 @@ FROM ts_debug('public.english','The Brightest supernovaes');
</indexterm>
<synopsis>
ts_parse(<replaceable class="PARAMETER">parser_name</replaceable> <type>text</>, <replaceable class="PARAMETER">document</replaceable> <type>text</>, OUT <replaceable class="PARAMETER">tokid</> <type>integer</>, OUT <replaceable class="PARAMETER">token</> <type>text</>) returns <type>setof record</>
ts_parse(<replaceable class="PARAMETER">parser_oid</replaceable> <type>oid</>, <replaceable class="PARAMETER">document</replaceable> <type>text</>, OUT <replaceable class="PARAMETER">tokid</> <type>integer</>, OUT <replaceable class="PARAMETER">token</> <type>text</>) returns <type>setof record</>
ts_parse(<replaceable class="PARAMETER">parser_name</replaceable> <type>text</>, <replaceable class="PARAMETER">document</replaceable> <type>text</>,
OUT <replaceable class="PARAMETER">tokid</> <type>integer</>, OUT <replaceable class="PARAMETER">token</> <type>text</>) returns <type>setof record</>
ts_parse(<replaceable class="PARAMETER">parser_oid</replaceable> <type>oid</>, <replaceable class="PARAMETER">document</replaceable> <type>text</>,
OUT <replaceable class="PARAMETER">tokid</> <type>integer</>, OUT <replaceable class="PARAMETER">token</> <type>text</>) returns <type>setof record</>
</synopsis>
<para>
......@@ -2997,8 +3011,10 @@ SELECT * FROM ts_parse('default', '123 - a number');
</indexterm>
<synopsis>
ts_token_type(<replaceable class="PARAMETER">parser_name</> <type>text</>, OUT <replaceable class="PARAMETER">tokid</> <type>integer</>, OUT <replaceable class="PARAMETER">alias</> <type>text</>, OUT <replaceable class="PARAMETER">description</> <type>text</>) returns <type>setof record</>
ts_token_type(<replaceable class="PARAMETER">parser_oid</> <type>oid</>, OUT <replaceable class="PARAMETER">tokid</> <type>integer</>, OUT <replaceable class="PARAMETER">alias</> <type>text</>, OUT <replaceable class="PARAMETER">description</> <type>text</>) returns <type>setof record</>
ts_token_type(<replaceable class="PARAMETER">parser_name</> <type>text</>, OUT <replaceable class="PARAMETER">tokid</> <type>integer</>,
OUT <replaceable class="PARAMETER">alias</> <type>text</>, OUT <replaceable class="PARAMETER">description</> <type>text</>) returns <type>setof record</>
ts_token_type(<replaceable class="PARAMETER">parser_oid</> <type>oid</>, OUT <replaceable class="PARAMETER">tokid</> <type>integer</>,
OUT <replaceable class="PARAMETER">alias</> <type>text</>, OUT <replaceable class="PARAMETER">description</> <type>text</>) returns <type>setof record</>
</synopsis>
<para>
......@@ -3121,11 +3137,11 @@ SELECT plainto_tsquery('supernovae stars');
</indexterm>
<para>
There are two kinds of indexes that can be used to speed up full text
There are two kinds of indexes which can be used to speed up full text
searches.
Note that indexes are not mandatory for full text searching, but in
cases where a column is searched on a regular basis, an index will
usually be desirable.
cases where a column is searched on a regular basis, an index is
usually desirable.
<variablelist>
......@@ -3179,7 +3195,7 @@ SELECT plainto_tsquery('supernovae stars');
<para>
There are substantial performance differences between the two index types,
so it is important to understand which to use.
so it is important to understand their characteristics.
</para>
<para>
......@@ -3188,7 +3204,7 @@ SELECT plainto_tsquery('supernovae stars');
to check the actual table row to eliminate such false matches.
(<productname>PostgreSQL</productname> does this automatically when needed.)
GiST indexes are lossy because each document is represented in the
index by a fixed-length signature. The signature is generated by hashing
index using a fixed-length signature. The signature is generated by hashing
each word into a random bit in an n-bit string, with all these bits OR-ed
together to produce an n-bit document signature. When two words hash to
the same bit position there will be a false match. If all words in
......@@ -3197,7 +3213,7 @@ SELECT plainto_tsquery('supernovae stars');
</para>
<para>
Lossiness causes performance degradation due to useless fetches of table
Lossiness causes performance degradation due to unnecessary fetches of table
records that turn out to be false matches. Since random access to table
records is slow, this limits the usefulness of GiST indexes. The
likelihood of false matches depends on several factors, in particular the
......@@ -3284,7 +3300,7 @@ SELECT plainto_tsquery('supernovae stars');
</para>
<para>
The optional parameter <literal>PATTERN</literal> should be the name of
The optional parameter <literal>PATTERN</literal> can be the name of
a text search object, optionally schema-qualified. If
<literal>PATTERN</literal> is omitted then information about all
visible objects will be displayed. <literal>PATTERN</literal> can be a
......@@ -3565,7 +3581,7 @@ Parser: "pg_catalog.default"
Text search configuration setup is completely different now.
Instead of manually inserting rows into configuration tables,
search is configured through the specialized SQL commands shown
earlier in this chapter. There is not currently any automated
earlier in this chapter. There is no automated
support for converting an existing custom configuration for 8.3;
you're on your own here.
</para>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/typeconv.sgml,v 1.58 2008/12/18 18:20:33 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/typeconv.sgml,v 1.59 2009/04/27 16:27:36 momjian Exp $ -->
<chapter Id="typeconv">
<title>Type Conversion</title>
......@@ -10,15 +10,15 @@
<para>
<acronym>SQL</acronym> statements can, intentionally or not, require
mixing of different data types in the same expression.
the mixing of different data types in the same expression.
<productname>PostgreSQL</productname> has extensive facilities for
evaluating mixed-type expressions.
</para>
<para>
In many cases a user will not need
In many cases a user does not need
to understand the details of the type conversion mechanism.
However, the implicit conversions done by <productname>PostgreSQL</productname>
However, implicit conversions done by <productname>PostgreSQL</productname>
can affect the results of a query. When necessary, these results
can be tailored by using <emphasis>explicit</emphasis> type conversion.
</para>
......@@ -38,21 +38,21 @@ operators.
<acronym>SQL</acronym> is a strongly typed language. That is, every data item
has an associated data type which determines its behavior and allowed usage.
<productname>PostgreSQL</productname> has an extensible type system that is
much more general and flexible than other <acronym>SQL</acronym> implementations.
more general and flexible than other <acronym>SQL</acronym> implementations.
Hence, most type conversion behavior in <productname>PostgreSQL</productname>
is governed by general rules rather than by <foreignphrase>ad hoc</>
heuristics. This allows
mixed-type expressions to be meaningful even with user-defined types.
heuristics. This allows the use of mixed-type expressions even with
user-defined types.
</para>
<para>
The <productname>PostgreSQL</productname> scanner/parser divides lexical
elements into only five fundamental categories: integers, non-integer numbers,
elements into five fundamental categories: integers, non-integer numbers,
strings, identifiers, and key words. Constants of most non-numeric types are
first classified as strings. The <acronym>SQL</acronym> language definition
allows specifying type names with strings, and this mechanism can be used in
<productname>PostgreSQL</productname> to start the parser down the correct
path. For example, the query
path. For example, the query:
<screen>
SELECT text 'Origin' AS "label", point '(0,0)' AS "value";
......@@ -99,7 +99,7 @@ Operators
<productname>PostgreSQL</productname> allows expressions with
prefix and postfix unary (one-argument) operators,
as well as binary (two-argument) operators. Like functions, operators can
be overloaded, and so the same problem of selecting the right operator
be overloaded, so the same problem of selecting the right operator
exists.
</para>
</listitem>
......@@ -136,13 +136,13 @@ and for the <function>GREATEST</> and <function>LEAST</> functions.
</para>
<para>
The system catalogs store information about which conversions, called
<firstterm>casts</firstterm>, between data types are valid, and how to
The system catalogs store information about which conversions, or
<firstterm>casts</firstterm>, exist between which data types, and how to
perform those conversions. Additional casts can be added by the user
with the <xref linkend="sql-createcast" endterm="sql-createcast-title">
command. (This is usually
done in conjunction with defining new data types. The set of casts
between the built-in types has been carefully crafted and is best not
between built-in types has been carefully crafted and is best not
altered.)
</para>
......@@ -152,8 +152,8 @@ altered.)
</indexterm>
<para>
An additional heuristic is provided in the parser to allow better guesses
at proper casting behavior among groups of types that have implicit casts.
An additional heuristic provided by the parser allows improved determination
of the proper casting behavior among groups of types that have implicit casts.
Data types are divided into several basic <firstterm>type
categories</firstterm>, including <type>boolean</type>, <type>numeric</type>,
<type>string</type>, <type>bitstring</type>, <type>datetime</type>,
......@@ -161,7 +161,7 @@ categories</firstterm>, including <type>boolean</type>, <type>numeric</type>,
user-defined. (For a list see <xref linkend="catalog-typcategory-table">;
but note it is also possible to create custom type categories.) Within each
category there can be one or more <firstterm>preferred types</firstterm>, which
are preferentially selected when there is ambiguity. With careful selection
are selected when there is ambiguity. With careful selection
of preferred types and available implicit casts, it is possible to ensure that
ambiguous expressions (those with multiple candidate parsing solutions) can be
resolved in a useful way.
......@@ -179,17 +179,17 @@ Implicit conversions should never have surprising or unpredictable outcomes.
<listitem>
<para>
There should be no extra overhead from the parser or executor
There should be no extra overhead in the parser or executor
if a query does not need implicit type conversion.
That is, if a query is well formulated and the types already match up, then the query should proceed
That is, if a query is well-formed and the types already match, then the query should execute
without spending extra time in the parser and without introducing unnecessary implicit conversion
calls into the query.
calls in the query.
</para>
<para>
Additionally, if a query usually requires an implicit conversion for a function, and
if then the user defines a new function with the correct argument types, the parser
should use this new function and will no longer do the implicit conversion using the old function.
should use this new function and no longer do implicit conversion using the old function.
</para>
</listitem>
</itemizedlist>
......@@ -206,9 +206,8 @@ should use this new function and will no longer do the implicit conversion using
</indexterm>
<para>
The specific operator to be used in an operator invocation is determined
by following
the procedure below. Note that this procedure is indirectly affected
The specific operator invoked is determined by the following
steps. Note that this procedure is affected
by the precedence of the involved operators. See <xref
linkend="sql-precedence"> for more information.
</para>
......@@ -219,9 +218,9 @@ should use this new function and will no longer do the implicit conversion using
<step performance="required">
<para>
Select the operators to be considered from the
<classname>pg_operator</classname> system catalog. If an unqualified
<classname>pg_operator</classname> system catalog. If a non-schema-qualified
operator name was used (the usual case), the operators
considered are those of the right name and argument count that are
considered are those with a matching name and argument count that are
visible in the current search path (see <xref linkend="ddl-schemas-path">).
If a qualified operator name was given, only operators in the specified
schema are considered.
......@@ -230,8 +229,8 @@ schema are considered.
<substeps>
<step performance="optional">
<para>
If the search path finds multiple operators of identical argument types,
only the one appearing earliest in the path is considered. But operators of
If the search path finds multiple operators with identical argument types,
only the one appearing earliest in the path is considered. Operators with
different argument types are considered on an equal footing regardless of
search path position.
</para>
......@@ -251,7 +250,7 @@ operators considered), use it.
<para>
If one argument of a binary operator invocation is of the <type>unknown</type> type,
then assume it is the same type as the other argument for this check.
Other cases involving <type>unknown</type> will never find a match at
Cases involving two <type>unknown</type> types will never find a match at
this step.
</para>
</step>
......@@ -276,7 +275,7 @@ candidate remains, use it; else continue to the next step.
<para>
Run through all candidates and keep those with the most exact matches
on input types. (Domains are considered the same as their base type
for this purpose.) Keep all candidates if none have any exact matches.
for this purpose.) Keep all candidates if none have exact matches.
If only one candidate remains, use it; else continue to the next step.
</para>
</step>
......@@ -296,7 +295,7 @@ categories accepted at those argument positions by the remaining
candidates. At each position, select the <type>string</type> category
if any
candidate accepts that category. (This bias towards string is appropriate
since an unknown-type literal does look like a string.) Otherwise, if
since an unknown-type literal looks like a string.) Otherwise, if
all the remaining candidates accept the same type category, select that
category; otherwise fail because the correct choice cannot be deduced
without more clues. Now discard
......@@ -339,7 +338,7 @@ SELECT 40 ! AS "40 factorial";
</screen>
So the parser does a type conversion on the operand and the query
is equivalent to
is equivalent to:
<screen>
SELECT CAST(40 AS bigint) ! AS "40 factorial";
......@@ -351,7 +350,7 @@ SELECT CAST(40 AS bigint) ! AS "40 factorial";
<title>String Concatenation Operator Type Resolution</title>
<para>
A string-like syntax is used for working with string types as well as for
A string-like syntax is used for working with string types and for
working with complex extension types.
Strings with unspecified type are matched with likely operator candidates.
</para>
......@@ -371,7 +370,7 @@ SELECT text 'abc' || 'def' AS "text and unknown";
<para>
In this case the parser looks to see if there is an operator taking <type>text</type>
for both arguments. Since there is, it assumes that the second argument should
be interpreted as of type <type>text</type>.
be interpreted as type <type>text</type>.
</para>
<para>
......@@ -391,9 +390,9 @@ In this case there is no initial hint for which type to use, since no types
are specified in the query. So, the parser looks for all candidate operators
and finds that there are candidates accepting both string-category and
bit-string-category inputs. Since string category is preferred when available,
that category is selected, and then the
that category is selected, and the
preferred type for strings, <type>text</type>, is used as the specific
type to resolve the unknown literals to.
type to resolve the unknown literals.
</para>
</example>
......@@ -460,7 +459,7 @@ SELECT ~ CAST('20' AS int8) AS "negation";
</indexterm>
<para>
The specific function to be used in a function invocation is determined
The specific function to be invoked is determined
according to the following steps.
</para>
......@@ -470,9 +469,9 @@ SELECT ~ CAST('20' AS int8) AS "negation";
<step performance="required">
<para>
Select the functions to be considered from the
<classname>pg_proc</classname> system catalog. If an unqualified
<classname>pg_proc</classname> system catalog. If a non-schema-qualified
function name was used, the functions
considered are those of the right name and argument count that are
considered are those with a matching name and argument count that are
visible in the current search path (see <xref linkend="ddl-schemas-path">).
If a qualified function name was given, only functions in the specified
schema are considered.
......@@ -482,7 +481,7 @@ schema are considered.
<step performance="optional">
<para>
If the search path finds multiple functions of identical argument types,
only the one appearing earliest in the path is considered. But functions of
only the one appearing earliest in the path is considered. Functions of
different argument types are considered on an equal footing regardless of
search path position.
</para>
......@@ -527,7 +526,7 @@ this step.)
<step performance="required">
<para>
If no exact match is found, see whether the function call appears
If no exact match is found, see if the function call appears
to be a special type conversion request. This happens if the function call
has just one argument and the function name is the same as the (internal)
name of some data type. Furthermore, the function argument must be either
......@@ -555,7 +554,7 @@ Look for the best match.
<substeps>
<step performance="required">
<para>
Discard candidate functions for which the input types do not match
Discard candidate functions in which the input types do not match
and cannot be converted (using an implicit conversion) to match.
<type>unknown</type> literals are
assumed to be convertible to anything for this purpose. If only one
......@@ -566,7 +565,7 @@ candidate remains, use it; else continue to the next step.
<para>
Run through all candidates and keep those with the most exact matches
on input types. (Domains are considered the same as their base type
for this purpose.) Keep all candidates if none have any exact matches.
for this purpose.) Keep all candidates if none have exact matches.
If only one candidate remains, use it; else continue to the next step.
</para>
</step>
......@@ -586,7 +585,7 @@ accepted
at those argument positions by the remaining candidates. At each position,
select the <type>string</type> category if any candidate accepts that category.
(This bias towards string
is appropriate since an unknown-type literal does look like a string.)
is appropriate since an unknown-type literal looks like a string.)
Otherwise, if all the remaining candidates accept the same type category,
select that category; otherwise fail because
the correct choice cannot be deduced without more clues.
......@@ -616,9 +615,9 @@ Some examples follow.
<title>Rounding Function Argument Type Resolution</title>
<para>
There is only one <function>round</function> function with two
arguments. (The first is <type>numeric</type>, the second is
<type>integer</type>.) So the following query automatically converts
There is only one <function>round</function> function which takes two
arguments; it takes a first argument of <type>numeric</type> and
a second argument of <type>integer</type>. So the following query automatically converts
the first argument of type <type>integer</type> to
<type>numeric</type>:
......@@ -631,7 +630,7 @@ SELECT round(4, 4);
(1 row)
</screen>
That query is actually transformed by the parser to
That query is actually transformed by the parser to:
<screen>
SELECT round(CAST (4 AS numeric), 4);
</screen>
......@@ -640,7 +639,7 @@ SELECT round(CAST (4 AS numeric), 4);
<para>
Since numeric constants with decimal points are initially assigned the
type <type>numeric</type>, the following query will require no type
conversion and might therefore be slightly more efficient:
conversion and therefore might be slightly more efficient:
<screen>
SELECT round(4.0, 4);
</screen>
......@@ -679,7 +678,7 @@ SELECT substr(varchar '1234', 3);
(1 row)
</screen>
This is transformed by the parser to effectively become
This is transformed by the parser to effectively become:
<screen>
SELECT substr(CAST (varchar '1234' AS text), 3);
</screen>
......@@ -863,7 +862,7 @@ their underlying base types.
<para>
If all inputs are of type <type>unknown</type>, resolve as type
<type>text</type> (the preferred type of the string category).
Otherwise, the <type>unknown</type> inputs will be ignored.
Otherwise, <type>unknown</type> inputs are ignored.
</para>
</step>
......@@ -914,7 +913,7 @@ SELECT text 'a' AS "text" UNION SELECT 'b';
b
(2 rows)
</screen>
Here, the unknown-type literal <literal>'b'</literal> will be resolved as type <type>text</type>.
Here, the unknown-type literal <literal>'b'</literal> will be resolved to type <type>text</type>.
</para>
</example>
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.136 2008/12/18 18:20:33 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.137 2009/04/27 16:27:36 momjian Exp $ -->
<sect1 id="xfunc">
<title>User-Defined Functions</title>
......@@ -2866,7 +2866,7 @@ typedef struct
/*
* OPTIONAL pointer to struct containing tuple description
*
* tuple_desc is for use when returning tuples (i.e. composite data types)
* tuple_desc is for use when returning tuples (i.e., composite data types)
* and is only needed if you are going to build the tuples with
* heap_form_tuple() rather than with BuildTupleFromCStrings(). Note that
* the TupleDesc pointer stored here should usually have been run through
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/xml2.sgml,v 1.5 2008/05/08 16:49:37 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/xml2.sgml,v 1.6 2009/04/27 16:27:36 momjian Exp $ -->
<sect1 id="xml2">
<title>xml2</title>
......@@ -173,7 +173,7 @@
<entry>
<para>
the name of the <quote>key</> field &mdash; this is just a field to be used as
the first column of the output table, i.e. it identifies the record from
the first column of the output table, i.e., it identifies the record from
which each output row came (see note below about multiple values)
</para>
</entry>
......