Commit c3d583dd authored by Tom Lane

More updates and copy-editing. Rearrange order of sections a little bit to put more widely useful info before less widely useful info.
parent 1ade4b33
<!--
$PostgreSQL: pgsql/doc/src/sgml/extend.sgml,v 1.28 2004/06/07 04:04:47 tgl Exp $
$PostgreSQL: pgsql/doc/src/sgml/extend.sgml,v 1.29 2004/12/30 03:13:56 tgl Exp $
-->
<chapter id="extend">
......@@ -152,8 +152,8 @@ $PostgreSQL: pgsql/doc/src/sgml/extend.sgml,v 1.28 2004/06/07 04:04:47 tgl Exp $
<para>
Domains can be created using the <acronym>SQL</> command
<command>CREATE DOMAIN</command>. Their creation and use is not
discussed in this chapter.
<xref linkend="sql-createdomain" endterm="sql-createdomain-title">.
Their creation and use are not discussed in this chapter.
</para>
</sect2>
......@@ -221,7 +221,7 @@ $PostgreSQL: pgsql/doc/src/sgml/extend.sgml,v 1.28 2004/06/07 04:04:47 tgl Exp $
Thus, when more than one argument position is declared with a polymorphic
type, the net effect is that only certain combinations of actual argument
types are allowed. For example, a function declared as
<literal>foo(anyelement, anyelement)</> will take any two input values,
<literal>equal(anyelement, anyelement)</> will take any two input values,
so long as they are of the same data type.
</para>
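<para>
As a rough illustration of the same-type requirement, a polymorphic
<literal>equal</literal> could be written and called as sketched here.
The function body is an assumption made for this example; the text above
only names the signature.
<programlisting>
-- Hypothetical body: only the signature appears in the documentation text.
CREATE FUNCTION equal(anyelement, anyelement) RETURNS boolean AS $$
    SELECT $1 = $2;
$$ LANGUAGE SQL;

SELECT equal(1, 2);                   -- accepted: both arguments are integer
SELECT equal('a'::text, 'b'::text);   -- accepted: both arguments are text
SELECT equal(1, 'a'::text);           -- rejected: the argument types differ
</programlisting>
The last call should fail at function lookup, because no single data type
can be substituted for <literal>anyelement</literal> in both positions.
</para>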
......
<!--
$PostgreSQL: pgsql/doc/src/sgml/postgres.sgml,v 1.71 2004/12/29 23:36:47 tgl Exp $
$PostgreSQL: pgsql/doc/src/sgml/postgres.sgml,v 1.72 2004/12/30 03:13:56 tgl Exp $
-->
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook V4.2//EN" [
......@@ -192,18 +192,19 @@ $PostgreSQL: pgsql/doc/src/sgml/postgres.sgml,v 1.71 2004/12/29 23:36:47 tgl Exp
user-defined functions, data types, triggers, etc. These are
advanced topics which should probably be approached only after all
the other user documentation about <productname>PostgreSQL</> has
been understood. This part also describes the server-side
been understood. Later chapters in this part describe the server-side
programming languages available in the
<productname>PostgreSQL</productname> distribution as well as
general issues concerning server-side programming languages. This
information is only useful to readers that have read at least the
first few chapters of this part.
general issues concerning server-side programming languages. It
is essential to read at least the earlier sections of <xref
linkend="extend"> (covering functions) before diving into the
material about server-side programming languages.
</para>
</partintro>
&extend;
&rules;
&trigger;
&rules;
&xplang;
&plsql;
......
<!-- $PostgreSQL: pgsql/doc/src/sgml/rules.sgml,v 1.36 2004/11/15 06:32:14 neilc Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/rules.sgml,v 1.37 2004/12/30 03:13:56 tgl Exp $ -->
<Chapter Id="rules">
<Title>The Rule System</Title>
......@@ -104,19 +104,19 @@
<ListItem>
<Para>
The range table is a list of relations that are used in the query.
In a <command>SELECT</command> statement these are the relations given after
the <literal>FROM</literal> key word.
In a <command>SELECT</command> statement these are the relations given after
the <literal>FROM</literal> key word.
</Para>
<Para>
Every range table entry identifies a table or view and tells
by which name it is called in the other parts of the query.
In the query tree, the range table entries are referenced by
number rather than by name, so here it doesn't matter if there
are duplicate names as it would in an <Acronym>SQL</Acronym>
statement. This can happen after the range tables of rules
have been merged in. The examples in this chapter will not have
this situation.
by which name it is called in the other parts of the query.
In the query tree, the range table entries are referenced by
number rather than by name, so here it doesn't matter if there
are duplicate names as it would in an <Acronym>SQL</Acronym>
statement. This can happen after the range tables of rules
have been merged in. The examples in this chapter will not have
this situation.
</Para>
</ListItem>
</VarListEntry>
......@@ -128,21 +128,21 @@
<ListItem>
<Para>
This is an index into the range table that identifies the
relation where the results of the query go.
relation where the results of the query go.
</Para>
<Para>
<command>SELECT</command> queries normally don't have a result
relation. The special case of a <command>SELECT INTO</command> is
mostly identical to a <command>CREATE TABLE</command> followed by a
<literal>INSERT ... SELECT</literal> and is not discussed
separately here.
<command>SELECT</command> queries normally don't have a result
relation. The special case of a <command>SELECT INTO</command> is
mostly identical to a <command>CREATE TABLE</command> followed by a
<literal>INSERT ... SELECT</literal> and is not discussed
separately here.
</Para>
<Para>
For <command>INSERT</command>, <command>UPDATE</command>, and
<command>DELETE</command> commands, the result relation is the table
(or view!) where the changes take effect.
<command>DELETE</command> commands, the result relation is the table
(or view!) where the changes are to take effect.
</Para>
</ListItem>
</VarListEntry>
......@@ -167,39 +167,39 @@
<Para>
<command>DELETE</command> commands don't need a target list
because they don't produce any result. In fact, the planner will
add a special <acronym>CTID</> entry to the empty target list, but
this is after the rule system and will be discussed later; for the
rule system, the target list is empty.
because they don't produce any result. In fact, the planner will
add a special <acronym>CTID</> entry to the empty target list, but
this is after the rule system and will be discussed later; for the
rule system, the target list is empty.
</Para>
<Para>
For <command>INSERT</command> commands, the target list describes
the new rows that should go into the result relation. It consists of the
expressions in the <literal>VALUES</> clause or the ones from the
<command>SELECT</command> clause in <literal>INSERT
... SELECT</literal>. The first step of the rewrite process adds
target list entries for any columns that were not assigned to by
the original command but have defaults. Any remaining columns (with
neither a given value nor a default) will be filled in by the
planner with a constant null expression.
the new rows that should go into the result relation. It consists of the
expressions in the <literal>VALUES</> clause or the ones from the
<command>SELECT</command> clause in <literal>INSERT
... SELECT</literal>. The first step of the rewrite process adds
target list entries for any columns that were not assigned to by
the original command but have defaults. Any remaining columns (with
neither a given value nor a default) will be filled in by the
planner with a constant null expression.
</Para>
<Para>
For <command>UPDATE</command> commands, the target list
describes the new rows that should replace the old ones. In the
rule system, it contains just the expressions from the <literal>SET
column = expression</literal> part of the command. The planner will handle
missing columns by inserting expressions that copy the values from
the old row into the new one. And it will add the special
<acronym>CTID</> entry just as for <command>DELETE</command>, too.
describes the new rows that should replace the old ones. In the
rule system, it contains just the expressions from the <literal>SET
column = expression</literal> part of the command. The planner will handle
missing columns by inserting expressions that copy the values from
the old row into the new one. And it will add the special
<acronym>CTID</> entry just as for <command>DELETE</command>, too.
</Para>
<Para>
Every entry in the target list contains an expression that can
be a constant value, a variable pointing to a column of one
of the relations in the range table, a parameter, or an expression
tree made of function calls, constants, variables, operators, etc.
be a constant value, a variable pointing to a column of one
of the relations in the range table, a parameter, or an expression
tree made of function calls, constants, variables, operators, etc.
</Para>
</ListItem>
</VarListEntry>
......@@ -211,12 +211,12 @@
<ListItem>
<Para>
The query's qualification is an expression much like one of
those contained in the target list entries. The result value of
this expression is a Boolean that tells whether the operation
(<command>INSERT</command>, <command>UPDATE</command>,
<command>DELETE</command>, or <command>SELECT</command>) for the
final result row should be executed or not. It corresponds to the <literal>WHERE</> clause
of an <Acronym>SQL</Acronym> statement.
those contained in the target list entries. The result value of
this expression is a Boolean that tells whether the operation
(<command>INSERT</command>, <command>UPDATE</command>,
<command>DELETE</command>, or <command>SELECT</command>) for the
final result row should be executed or not. It corresponds to the <literal>WHERE</> clause
of an <Acronym>SQL</Acronym> statement.
</Para>
</ListItem>
</VarListEntry>
......@@ -228,17 +228,17 @@
<ListItem>
<Para>
The query's join tree shows the structure of the <literal>FROM</> clause.
For a simple query like <literal>SELECT ... FROM a, b, c</literal>, the join tree is just
a list of the <literal>FROM</> items, because we are allowed to join them in
any order. But when <literal>JOIN</> expressions, particularly outer joins,
are used, we have to join in the order shown by the joins.
In that case, the join tree shows the structure of the <literal>JOIN</> expressions. The
restrictions associated with particular <literal>JOIN</> clauses (from <literal>ON</> or
<literal>USING</> expressions) are stored as qualification expressions attached
to those join-tree nodes. It turns out to be convenient to store
the top-level <literal>WHERE</> expression as a qualification attached to the
top-level join-tree item, too. So really the join tree represents
both the <literal>FROM</> and <literal>WHERE</> clauses of a <command>SELECT</command>.
For a simple query like <literal>SELECT ... FROM a, b, c</literal>, the join tree is just
a list of the <literal>FROM</> items, because we are allowed to join them in
any order. But when <literal>JOIN</> expressions, particularly outer joins,
are used, we have to join in the order shown by the joins.
In that case, the join tree shows the structure of the <literal>JOIN</> expressions. The
restrictions associated with particular <literal>JOIN</> clauses (from <literal>ON</> or
<literal>USING</> expressions) are stored as qualification expressions attached
to those join-tree nodes. It turns out to be convenient to store
the top-level <literal>WHERE</> expression as a qualification attached to the
top-level join-tree item, too. So really the join tree represents
both the <literal>FROM</> and <literal>WHERE</> clauses of a <command>SELECT</command>.
</Para>
</ListItem>
</VarListEntry>
......@@ -250,10 +250,10 @@
<ListItem>
<Para>
The other parts of the query tree like the <literal>ORDER BY</>
clause aren't of interest here. The rule system
substitutes some entries there while applying rules, but that
doesn't have much to do with the fundamentals of the rule
system.
clause aren't of interest here. The rule system
substitutes some entries there while applying rules, but that
doesn't have much to do with the fundamentals of the rule
system.
</Para>
</ListItem>
</VarListEntry>
......@@ -322,7 +322,7 @@ CREATE RULE "_RETURN" AS ON SELECT TO myview DO INSTEAD
Currently, there can be only one action in an <literal>ON SELECT</> rule, and it must
be an unconditional <command>SELECT</> action that is <literal>INSTEAD</>. This restriction was
required to make rules safe enough to open them for ordinary users, and
it restricts <literal>ON SELECT</> rules to real view rules.
it restricts <literal>ON SELECT</> rules to act like views.
</Para>
<Para>
......@@ -695,29 +695,29 @@ UPDATE t1 SET b = t2.b WHERE t1.a = t2.a;
<ItemizedList>
<ListItem>
<Para>
The range tables contain entries for the tables <literal>t1</> and <literal>t2</>.
</Para>
<Para>
The range tables contain entries for the tables <literal>t1</> and <literal>t2</>.
</Para>
</ListItem>
<ListItem>
<Para>
The target lists contain one variable that points to column
<literal>b</> of the range table entry for table <literal>t2</>.
</Para>
<Para>
The target lists contain one variable that points to column
<literal>b</> of the range table entry for table <literal>t2</>.
</Para>
</ListItem>
<ListItem>
<Para>
The qualification expressions compare the columns <literal>a</> of both
range-table entries for equality.
</Para>
<Para>
The qualification expressions compare the columns <literal>a</> of both
range-table entries for equality.
</Para>
</ListItem>
<ListItem>
<Para>
The join trees show a simple join between <literal>t1</> and <literal>t2</>.
</Para>
<Para>
The join trees show a simple join between <literal>t1</> and <literal>t2</>.
</Para>
</ListItem>
</ItemizedList>
</para>
......@@ -860,34 +860,34 @@ SELECT t1.a, t2.b, t1.ctid FROM t1, t2 WHERE t1.a = t2.a;
<ItemizedList>
<ListItem>
<Para>
They are allowed to have no action.
</Para>
</ListItem>
<Para>
They are allowed to have no action.
</Para>
</ListItem>
<ListItem>
<Para>
They can have multiple actions.
</Para>
</ListItem>
<Para>
They can have multiple actions.
</Para>
</ListItem>
<ListItem>
<Para>
They can be <literal>INSTEAD</> or <literal>ALSO</> (default).
</Para>
</ListItem>
<Para>
They can be <literal>INSTEAD</> or <literal>ALSO</> (default).
</Para>
</ListItem>
<ListItem>
<Para>
The pseudorelations <literal>NEW</> and <literal>OLD</> become useful.
</Para>
</ListItem>
<Para>
The pseudorelations <literal>NEW</> and <literal>OLD</> become useful.
</Para>
</ListItem>
<ListItem>
<Para>
They can have rule qualifications.
</Para>
</ListItem>
<Para>
They can have rule qualifications.
</Para>
</ListItem>
</ItemizedList>
Second, they don't modify the query tree in place. Instead they
......@@ -1875,14 +1875,15 @@ GRANT SELECT ON phone_number TO secretary;
</Para>
<Para>
For the things that can be implemented by both,
it depends on the usage of the database, which is the best.
For the things that can be implemented by both, which is best
depends on the usage of the database.
A trigger is fired once for each affected row. A rule manipulates
the query tree or generates an additional one. So if many
the query or generates an additional query. So if many
rows are affected in one statement, a rule issuing one extra
command would usually do a better job than a trigger that is
command is likely to be faster than a trigger that is
called for every single row and must execute its operations
many times.
many times. However, the trigger approach is conceptually far
simpler than the rule approach, and is easier for novices to get right.
</Para>
<Para>
......
<!--
$PostgreSQL: pgsql/doc/src/sgml/trigger.sgml,v 1.38 2004/12/13 18:05:09 petere Exp $
$PostgreSQL: pgsql/doc/src/sgml/trigger.sgml,v 1.39 2004/12/30 03:13:56 tgl Exp $
-->
<chapter id="triggers">
......@@ -58,6 +58,15 @@ $PostgreSQL: pgsql/doc/src/sgml/trigger.sgml,v 1.38 2004/12/13 18:05:09 petere E
respectively.
</para>
<para>
Statement-level <quote>before</> triggers naturally fire before the
statement starts to do anything, while statement-level <quote>after</>
triggers fire at the very end of the statement. Row-level <quote>before</>
triggers fire immediately before a particular row is operated on,
while row-level <quote>after</> triggers fire at the end of the statement
(but before any statement-level <quote>after</> triggers).
</para>
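<para>
The firing order described above can be observed with a small experiment
along the following lines. This is only a sketch: the table, function, and
trigger names are hypothetical, and it assumes that
<application>PL/pgSQL</application> is installed in the database.
<programlisting>
-- All names here are hypothetical; PL/pgSQL is assumed to be installed.
CREATE TABLE ledger (amount integer);

-- One function serves all four triggers and reports which one fired.
CREATE FUNCTION note_firing() RETURNS trigger AS $$
BEGIN
    RAISE NOTICE '% fired (% %)', TG_NAME, TG_WHEN, TG_LEVEL;
    IF TG_LEVEL = 'ROW' THEN
        RETURN NEW;   -- row-level: let the row through unchanged
    END IF;
    RETURN NULL;      -- statement-level: always return NULL
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER stmt_before BEFORE INSERT ON ledger
    FOR EACH STATEMENT EXECUTE PROCEDURE note_firing();
CREATE TRIGGER row_before BEFORE INSERT ON ledger
    FOR EACH ROW EXECUTE PROCEDURE note_firing();
CREATE TRIGGER row_after AFTER INSERT ON ledger
    FOR EACH ROW EXECUTE PROCEDURE note_firing();
CREATE TRIGGER stmt_after AFTER INSERT ON ledger
    FOR EACH STATEMENT EXECUTE PROCEDURE note_firing();

INSERT INTO ledger VALUES (100);
</programlisting>
Going by the rules stated above, the notices for this single-row insert
should name <literal>stmt_before</literal>, <literal>row_before</literal>,
<literal>row_after</literal>, and <literal>stmt_after</literal>, in that order.
</para>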
<para>
Trigger functions invoked by per-statement triggers should always
return <symbol>NULL</symbol>. Trigger functions invoked by per-row
......@@ -110,6 +119,21 @@ $PostgreSQL: pgsql/doc/src/sgml/trigger.sgml,v 1.38 2004/12/13 18:05:09 petere E
triggers are not fired.
</para>
<para>
Typically, row before triggers are used for checking or
modifying the data that will be inserted or updated. For example,
a before trigger might be used to insert the current time into a
timestamp column, or to check that two elements of the row are
consistent. Row after triggers are most sensibly
used to propagate the updates to other tables, or make consistency
checks against other tables. The reason for this division of labor is
that an after trigger can be certain it is seeing the final value of the
row, while a before trigger cannot; there might be other before triggers
firing after it. If you have no specific reason to make a trigger before
or after, the before case is more efficient, since the information about
the operation doesn't have to be saved until the end of the statement.
</para>
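<para>
A minimal sketch of the typical row-level <quote>before</quote> usage follows.
The table, column, and function names are hypothetical, and
<application>PL/pgSQL</application> is assumed to be installed; the trigger
checks one column and stamps the row with the current time before it is stored.
<programlisting>
-- Hypothetical table and names, for illustration only.
CREATE TABLE items (
    id       integer,
    qty      integer,
    modified timestamp
);

CREATE FUNCTION items_stamp() RETURNS trigger AS $$
BEGIN
    -- Check the incoming row before it is stored.
    IF NEW.qty IS NULL THEN
        RAISE EXCEPTION 'qty must not be null';
    END IF;
    -- Modify the incoming row: record when it was last changed.
    NEW.modified := current_timestamp;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_stamp BEFORE INSERT OR UPDATE ON items
    FOR EACH ROW EXECUTE PROCEDURE items_stamp();
</programlisting>
Because the function returns the modified <varname>NEW</varname> row, the
stamped value is what gets stored. Work that must see the final row, such as
propagating the change to another table, fits better in an
<quote>after</quote> trigger, for the reason given above.
</para>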
<para>
If a trigger function executes SQL commands then these
commands may fire triggers again. This is known as cascading
......@@ -140,6 +164,20 @@ $PostgreSQL: pgsql/doc/src/sgml/trigger.sgml,v 1.38 2004/12/13 18:05:09 petere E
trigger.
</para>
<para>
Each programming language that supports triggers has its own method
for making the trigger input data available to the trigger function.
This input data includes the type of trigger event (e.g.,
<command>INSERT</command> or <command>UPDATE</command>) as well as any
arguments that were listed in <command>CREATE TRIGGER</>.
For a row-level trigger, the input data also includes the
<varname>NEW</varname> row for <command>INSERT</command> and
<command>UPDATE</command> triggers, and/or the <varname>OLD</varname> row
for <command>UPDATE</command> and <command>DELETE</command> triggers.
Statement-level triggers do not currently have any way to examine the
individual row(s) modified by the statement.
</para>
</sect1>
<sect1 id="trigger-datachanges">
......@@ -277,73 +315,73 @@ typedef struct TriggerData
<term><structfield>tg_event</></term>
<listitem>
<para>
Describes the event for which the function is called. You may use the
following macros to examine <literal>tg_event</literal>:
<variablelist>
<varlistentry>
<term><literal>TRIGGER_FIRED_BEFORE(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger fired before the operation.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_AFTER(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger fired after the operation.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_FOR_ROW(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger fired for a row-level event.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_FOR_STATEMENT(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger fired for a statement-level event.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_BY_INSERT(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger was fired by an <command>INSERT</command> command.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_BY_UPDATE(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger was fired by an <command>UPDATE</command> command.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_BY_DELETE(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger was fired by a <command>DELETE</command> command.
</para>
</listitem>
</varlistentry>
</variablelist>
Describes the event for which the function is called. You may use the
following macros to examine <literal>tg_event</literal>:
<variablelist>
<varlistentry>
<term><literal>TRIGGER_FIRED_BEFORE(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger fired before the operation.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_AFTER(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger fired after the operation.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_FOR_ROW(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger fired for a row-level event.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_FOR_STATEMENT(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger fired for a statement-level event.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_BY_INSERT(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger was fired by an <command>INSERT</command> command.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_BY_UPDATE(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger was fired by an <command>UPDATE</command> command.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TRIGGER_FIRED_BY_DELETE(tg_event)</literal></term>
<listitem>
<para>
Returns true if the trigger was fired by a <command>DELETE</command> command.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
</listitem>
</varlistentry>
......@@ -352,15 +390,15 @@ typedef struct TriggerData
<term><structfield>tg_relation</></term>
<listitem>
<para>
A pointer to a structure describing the relation that the trigger fired for.
Look at <filename>utils/rel.h</> for details about
this structure. The most interesting things are
<literal>tg_relation->rd_att</> (descriptor of the relation
tuples) and <literal>tg_relation->rd_rel->relname</>
(relation name; the type is not <type>char*</> but
<type>NameData</>; use
<literal>SPI_getrelname(tg_relation)</> to get a <type>char*</> if you
need a copy of the name).
A pointer to a structure describing the relation that the trigger fired for.
Look at <filename>utils/rel.h</> for details about
this structure. The most interesting things are
<literal>tg_relation->rd_att</> (descriptor of the relation
tuples) and <literal>tg_relation->rd_rel->relname</>
(relation name; the type is not <type>char*</> but
<type>NameData</>; use
<literal>SPI_getrelname(tg_relation)</> to get a <type>char*</> if you
need a copy of the name).
</para>
</listitem>
</varlistentry>
......@@ -369,13 +407,13 @@ typedef struct TriggerData
<term><structfield>tg_trigtuple</></term>
<listitem>
<para>
A pointer to the row for which the trigger was fired. This is
the row being inserted, updated, or deleted. If this trigger
was fired for an <command>INSERT</command> or
<command>DELETE</command> then this is what you should return
to from the function if you don't want to replace the row with
a different one (in the case of <command>INSERT</command>) or
skip the operation.
A pointer to the row for which the trigger was fired. This is
the row being inserted, updated, or deleted. If this trigger
was fired for an <command>INSERT</command> or
<command>DELETE</command> then this is what you should return
from the function if you don't want to replace the row with
a different one (in the case of <command>INSERT</command>) or
skip the operation.
</para>
</listitem>
</varlistentry>
......@@ -384,13 +422,13 @@ typedef struct TriggerData
<term><structfield>tg_newtuple</></term>
<listitem>
<para>
A pointer to the new version of the row, if the trigger was
fired for an <command>UPDATE</command>, and <symbol>NULL</> if
it is for an <command>INSERT</command> or a
<command>DELETE</command>. This is what you have to return
from the function if the event is an <command>UPDATE</command>
and you don't want to replace this row by a different one or
skip the operation.
A pointer to the new version of the row, if the trigger was
fired for an <command>UPDATE</command>, and <symbol>NULL</> if
it is for an <command>INSERT</command> or a
<command>DELETE</command>. This is what you have to return
from the function if the event is an <command>UPDATE</command>
and you don't want to replace this row by a different one or
skip the operation.
</para>
</listitem>
</varlistentry>
......@@ -399,8 +437,8 @@ typedef struct TriggerData
<term><structfield>tg_trigger</></term>
<listitem>
<para>
A pointer to a structure of type <structname>Trigger</>,
defined in <filename>utils/rel.h</>:
A pointer to a structure of type <structname>Trigger</>,
defined in <filename>utils/rel.h</>:
<programlisting>
typedef struct Trigger
......
<!--
$PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.90 2004/12/13 18:05:09 petere Exp $
$PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.91 2004/12/30 03:13:56 tgl Exp $
-->
<sect1 id="xfunc">
......@@ -24,7 +24,7 @@ $PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.90 2004/12/13 18:05:09 petere Exp
<listitem>
<para>
procedural language functions (functions written in, for
example, <application>PL/Tcl</> or <application>PL/pgSQL</>)
example, <application>PL/pgSQL</> or <application>PL/Tcl</>)
(<xref linkend="xfunc-pl">)
</para>
</listitem>
......@@ -44,9 +44,10 @@ $PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.90 2004/12/13 18:05:09 petere Exp
<para>
Every kind
of function can take base types, composite types, or
combinations of these as arguments (parameters). In addition,
combinations of these as arguments (parameters). In addition,
every kind of function can return a base type or
a composite type.
a composite type. Functions may also be defined to return
sets of base or composite values.
</para>
<para>
......@@ -64,7 +65,8 @@ $PostgreSQL: pgsql/doc/src/sgml/xfunc.sgml,v 1.90 2004/12/13 18:05:09 petere Exp
<para>
Throughout this chapter, it can be useful to look at the reference
page of the <xref linkend="sql-createfunction"> command to
page of the <xref linkend="sql-createfunction"
endterm="sql-createfunction-title"> command to
understand the examples better. Some examples from this chapter
can be found in <filename>funcs.sql</filename> and
<filename>funcs.c</filename> in the <filename>src/tutorial</>
......@@ -141,7 +143,7 @@ CREATE FUNCTION one() RETURNS integer AS $$
SELECT 1 AS result;
$$ LANGUAGE SQL;
-- Alternative syntax:
-- Alternative syntax for string literal:
CREATE FUNCTION one() RETURNS integer AS '
SELECT 1 AS result;
' LANGUAGE SQL;
......@@ -335,16 +337,16 @@ $$ LANGUAGE SQL;
<itemizedlist>
<listitem>
<para>
The select list order in the query must be exactly the same as
that in which the columns appear in the table associated
with the composite type. (Naming the columns, as we did above,
is irrelevant to the system.)
The select list order in the query must be exactly the same as
that in which the columns appear in the table associated
with the composite type. (Naming the columns, as we did above,
is irrelevant to the system.)
</para>
</listitem>
<listitem>
<para>
You must typecast the expressions to match the
definition of the composite type, or you will get errors like this:
You must typecast the expressions to match the
definition of the composite type, or you will get errors like this:
<screen>
<computeroutput>
ERROR: function declared to return emp returns varchar instead of text at column 1
......@@ -356,15 +358,9 @@ ERROR: function declared to return emp returns varchar instead of text at colum
</para>
<para>
A function that returns a row (composite type) can be used as a table
function, as described below. It can also be called in the context
of an SQL expression, but only when you
extract a single attribute out of the row or pass the entire row into
another function that accepts the same composite type.
</para>
<para>
This is an example of extracting an attribute out of a row type:
When you call a function that returns a row (composite type) in an
SQL expression, you might want only one field (attribute) from its
result. You can do that with syntax like this:
<screen>
SELECT (new_emp()).name;
......@@ -374,11 +370,14 @@ SELECT (new_emp()).name;
None
</screen>
We need the extra parentheses to keep the parser from getting confused:
The extra parentheses are needed to keep the parser from getting
confused. If you try to do it without them, you get something like this:
<screen>
SELECT new_emp().name;
ERROR: syntax error at or near "." at character 17
LINE 1: SELECT new_emp().name;
^
</screen>
</para>
......@@ -412,9 +411,8 @@ SELECT name(emp) AS youngster
</para>
<para>
The other way to use a function returning a row result is to declare a
second function accepting a row type argument and pass the
result of the first function to it:
Another way to use a function returning a row result is to pass the
result to another function that accepts the correct row type as input:
<screen>
CREATE FUNCTION getname(emp) RETURNS text AS $$
......@@ -428,6 +426,11 @@ SELECT getname(new_emp());
(1 row)
</screen>
</para>
<para>
Another way to use a function that returns a composite type is to
call it as a table function, as described below.
</para>
</sect2>
<sect2>
......@@ -469,7 +472,7 @@ SELECT *, upper(fooname) FROM getfoo(1) AS t1;
<para>
Note that we only got one row out of the function. This is because
we did not use <literal>SETOF</>. This is described in the next section.
we did not use <literal>SETOF</>. That is described in the next section.
</para>
</sect2>
......@@ -598,7 +601,7 @@ ERROR: could not determine "anyarray"/"anyelement" type because input has type
</para>
<para>
It is permitted to have polymorphic arguments with a deterministic
It is permitted to have polymorphic arguments with a fixed
return type, but the converse is not. For example:
<screen>
CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS boolean AS $$
......@@ -621,6 +624,201 @@ DETAIL: A function returning "anyarray" or "anyelement" must have at least one
</sect2>
</sect1>
<sect1 id="xfunc-overload">
<title>Function Overloading</title>
<indexterm zone="xfunc-overload">
<primary>overloading</primary>
<secondary>functions</secondary>
</indexterm>
<para>
More than one function may be defined with the same SQL name, so long
as the arguments they take are different. In other words,
function names can be <firstterm>overloaded</firstterm>. When a
query is executed, the server will determine which function to
call from the data types and the number of the provided arguments.
Overloading can also be used to simulate functions with a variable
number of arguments, up to a finite maximum number.
</para>
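<para>
As a small sketch of simulating a variable number of arguments, the two
definitions below share one SQL name and differ only in argument count.
The name <literal>sum_of</literal> is hypothetical and chosen just for
this illustration.
<programlisting>
-- Hypothetical function family: same SQL name, different argument lists.
CREATE FUNCTION sum_of(integer, integer) RETURNS integer AS $$
    SELECT $1 + $2;
$$ LANGUAGE SQL;

CREATE FUNCTION sum_of(integer, integer, integer) RETURNS integer AS $$
    SELECT $1 + $2 + $3;
$$ LANGUAGE SQL;

SELECT sum_of(1, 2);      -- resolves to the two-argument version
SELECT sum_of(1, 2, 3);   -- resolves to the three-argument version
</programlisting>
Each call is matched by the number and types of its arguments, so callers
see what looks like one function accepting two or three integers.
</para>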
<para>
When creating a family of overloaded functions, one should be
careful not to create ambiguities. For instance, given the
functions
<programlisting>
CREATE FUNCTION test(int, real) RETURNS ...
CREATE FUNCTION test(smallint, double precision) RETURNS ...
</programlisting>
it is not immediately clear which function would be called with
some trivial input like <literal>test(1, 1.5)</literal>. The
currently implemented resolution rules are described in
<xref linkend="typeconv">, but it is unwise to design a system that subtly
relies on this behavior.
</para>
<para>
A function that takes a single argument of a composite type should
generally not have the same name as any attribute (field) of that type.
Recall that <literal>attribute(table)</literal> is considered equivalent
to <literal>table.attribute</literal>. In the case that there is an
ambiguity between a function on a composite type and an attribute of
the composite type, the attribute will always be used. It is possible
to override that choice by schema-qualifying the function name
(that is, <literal>schema.func(table)</literal>) but it's better to
avoid the problem by not choosing conflicting names.
</para>
<para>
When overloading C-language functions, there is an additional
constraint: The C name of each function in the family of
overloaded functions must be different from the C names of all
other functions, either internal or dynamically loaded. If this
rule is violated, the behavior is not portable. You might get a
run-time linker error, or one of the functions will get called
(usually the internal one). The alternative form of the
<literal>AS</> clause for the SQL <command>CREATE
FUNCTION</command> command decouples the SQL function name from
the function name in the C source code. For instance,
<programlisting>
CREATE FUNCTION test(int) RETURNS int
AS '<replaceable>filename</>', 'test_1arg'
LANGUAGE C;
CREATE FUNCTION test(int, int) RETURNS int
AS '<replaceable>filename</>', 'test_2arg'
LANGUAGE C;
</programlisting>
The names of the C functions here reflect one of many possible conventions.
</para>
</sect1>
<sect1 id="xfunc-volatility">
<title>Function Volatility Categories</title>
<indexterm zone="xfunc-volatility">
<primary>volatility</primary>
<secondary>functions</secondary>
</indexterm>
<para>
Every function has a <firstterm>volatility</> classification, with
the possibilities being <literal>VOLATILE</>, <literal>STABLE</>, or
<literal>IMMUTABLE</>. <literal>VOLATILE</> is the default if the
<command>CREATE FUNCTION</command> command does not specify a category.
The volatility category is a promise to the optimizer about the behavior
of the function:
<itemizedlist>
<listitem>
<para>
A <literal>VOLATILE</> function can do anything, including modifying
the database. It can return different results on successive calls with
the same arguments. The optimizer makes no assumptions about the
behavior of such functions. A query using a volatile function will
re-evaluate the function at every row where its value is needed.
</para>
</listitem>
<listitem>
<para>
A <literal>STABLE</> function cannot modify the database and is
guaranteed to return the same results given the same arguments
for all calls within a single surrounding query. This category
allows the optimizer to optimize away multiple calls of the function
within a single query. In particular, it is safe to use an expression
containing such a function in an index scan condition. (Since an
index scan will evaluate the comparison value only once, not once at
each row, it is not valid to use a <literal>VOLATILE</> function in
an index scan condition.)
</para>
</listitem>
<listitem>
<para>
An <literal>IMMUTABLE</> function cannot modify the database and is
guaranteed to return the same results given the same arguments forever.
This category allows the optimizer to pre-evaluate the function when
a query calls it with constant arguments. For example, a query like
<literal>SELECT ... WHERE x = 2 + 2</> can be simplified on sight to
<literal>SELECT ... WHERE x = 4</>, because the function underlying
the integer addition operator is marked <literal>IMMUTABLE</>.
</para>
</listitem>
</itemizedlist>
</para>
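<para>
 As an illustration, the category is written as an attribute of
 <command>CREATE FUNCTION</command>. All object names in this sketch
 (<literal>tax_rate</literal>, <literal>audit_seq</literal>, and the three
 functions) are hypothetical:
<programlisting>
CREATE TABLE tax_rate (rate numeric);
CREATE SEQUENCE audit_seq;

-- Result depends only on the arguments: IMMUTABLE
CREATE FUNCTION with_tax(numeric, numeric) RETURNS numeric
    AS 'SELECT $1 * (1 + $2)'
    LANGUAGE SQL IMMUTABLE;

-- Reads a table but modifies nothing: STABLE
CREATE FUNCTION current_tax_rate() RETURNS numeric
    AS 'SELECT rate FROM tax_rate LIMIT 1'
    LANGUAGE SQL STABLE;

-- Advances a sequence, which is a side effect: VOLATILE
CREATE FUNCTION next_audit_id() RETURNS bigint
    AS 'SELECT nextval(''audit_seq'')'
    LANGUAGE SQL VOLATILE;
</programlisting>
</para>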
<para>
For best optimization results, you should label your functions with the
strictest volatility category that is valid for them.
</para>
<para>
Any function with side-effects <emphasis>must</> be labeled
<literal>VOLATILE</>, so that calls to it cannot be optimized away.
Even a function with no side-effects needs to be labeled
<literal>VOLATILE</> if its value can change within a single query;
 some examples are <literal>random()</>, <literal>currval()</>, and
 <literal>timeofday()</>.
</para>
<para>
There is relatively little difference between <literal>STABLE</> and
<literal>IMMUTABLE</> categories when considering simple interactive
queries that are planned and immediately executed: it doesn't matter
a lot whether a function is executed once during planning or once during
query execution startup. But there is a big difference if the plan is
saved and reused later. Labeling a function <literal>IMMUTABLE</> when
it really isn't may allow it to be prematurely folded to a constant during
planning, resulting in a stale value being re-used during subsequent uses
of the plan. This is a hazard when using prepared statements or when
using function languages that cache plans (such as
<application>PL/pgSQL</>).
</para>
<para>
 Because of the snapshotting behavior of MVCC (see <xref linkend="mvcc">),
a function containing only <command>SELECT</> commands can safely be
marked <literal>STABLE</>, even if it selects from tables that might be
undergoing modifications by concurrent queries.
<productname>PostgreSQL</productname> will execute a <literal>STABLE</>
function using the snapshot established for the calling query, and so it
will see a fixed view of the database throughout that query.
 Also note that the <function>current_timestamp</> family of functions
 qualifies as stable, since their values do not change within a transaction.
</para>
<para>
The same snapshotting behavior is used for <command>SELECT</> commands
within <literal>IMMUTABLE</> functions. It is generally unwise to select
from database tables within an <literal>IMMUTABLE</> function at all,
since the immutability will be broken if the table contents ever change.
 However, <productname>PostgreSQL</productname> does not prevent you from
 doing so.
</para>
<para>
A common error is to label a function <literal>IMMUTABLE</> when its
results depend on a configuration parameter. For example, a function
that manipulates timestamps might well have results that depend on the
<xref linkend="guc-timezone"> setting. For safety, such functions should
be labeled <literal>STABLE</> instead.
</para>
<note>
<para>
Before <productname>PostgreSQL</productname> release 8.0, the requirement
that <literal>STABLE</> and <literal>IMMUTABLE</> functions cannot modify
the database was not enforced by the system. Release 8.0 enforces it
by requiring SQL functions and procedural language functions of these
categories to contain no SQL commands other than <command>SELECT</>.
(This is not a completely bulletproof test, since such functions could
still call <literal>VOLATILE</> functions that modify the database.
If you do that, you will find that the <literal>STABLE</> or
<literal>IMMUTABLE</> function does not notice the database changes
applied by the called function.)
</para>
</note>
</sect1>
<sect1 id="xfunc-pl">
<title>Procedural Language Functions</title>
......@@ -754,7 +952,7 @@ CREATE FUNCTION square_root(double precision) RETURNS double precision
<para>
If the name starts with the string <literal>$libdir</literal>,
that part is replaced by the <productname>PostgreSQL</> package
library directory
name, which is determined at build time.<indexterm><primary>$libdir</></>
</para>
</listitem>
......@@ -864,17 +1062,17 @@ CREATE FUNCTION square_root(double precision) RETURNS double precision
<itemizedlist>
<listitem>
<para>
pass by value, fixed-length
</para>
</listitem>
<listitem>
<para>
pass by reference, fixed-length
</para>
</listitem>
<listitem>
<para>
pass by reference, variable-length
</para>
</listitem>
</itemizedlist>
......@@ -993,169 +1191,169 @@ memcpy(destination-&gt;data, buffer, 40);
<title>Equivalent C Types for Built-In SQL Types</title>
<tgroup cols="3">
<thead>
<row>
<entry>
SQL Type
</entry>
<entry>
C Type
</entry>
<entry>
Defined In
</entry>
</row>
</thead>
<tbody>
<row>
<entry><type>abstime</type></entry>
<entry><type>AbsoluteTime</type></entry>
<entry><filename>utils/nabstime.h</filename></entry>
</row>
<row>
<entry><type>boolean</type></entry>
<entry><type>bool</type></entry>
<entry><filename>postgres.h</filename> (maybe compiler built-in)</entry>
</row>
<row>
<entry><type>box</type></entry>
<entry><type>BOX*</type></entry>
<entry><filename>utils/geo_decls.h</filename></entry>
</row>
<row>
<entry><type>bytea</type></entry>
<entry><type>bytea*</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>"char"</type></entry>
<entry><type>char</type></entry>
<entry>(compiler built-in)</entry>
</row>
<row>
<entry><type>character</type></entry>
<entry><type>BpChar*</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>cid</type></entry>
<entry><type>CommandId</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>date</type></entry>
<entry><type>DateADT</type></entry>
<entry><filename>utils/date.h</filename></entry>
</row>
<row>
<entry><type>smallint</type> (<type>int2</type>)</entry>
<entry><type>int2</type> or <type>int16</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>int2vector</type></entry>
<entry><type>int2vector*</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>integer</type> (<type>int4</type>)</entry>
<entry><type>int4</type> or <type>int32</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>real</type> (<type>float4</type>)</entry>
<entry><type>float4*</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>double precision</type> (<type>float8</type>)</entry>
<entry><type>float8*</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>interval</type></entry>
<entry><type>Interval*</type></entry>
<entry><filename>utils/timestamp.h</filename></entry>
</row>
<row>
<entry><type>lseg</type></entry>
<entry><type>LSEG*</type></entry>
<entry><filename>utils/geo_decls.h</filename></entry>
</row>
<row>
<entry><type>name</type></entry>
<entry><type>Name</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>oid</type></entry>
<entry><type>Oid</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>oidvector</type></entry>
<entry><type>oidvector*</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>path</type></entry>
<entry><type>PATH*</type></entry>
<entry><filename>utils/geo_decls.h</filename></entry>
</row>
<row>
<entry><type>point</type></entry>
<entry><type>POINT*</type></entry>
<entry><filename>utils/geo_decls.h</filename></entry>
</row>
<row>
<entry><type>regproc</type></entry>
<entry><type>regproc</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>reltime</type></entry>
<entry><type>RelativeTime</type></entry>
<entry><filename>utils/nabstime.h</filename></entry>
</row>
<row>
<entry><type>text</type></entry>
<entry><type>text*</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>tid</type></entry>
<entry><type>ItemPointer</type></entry>
<entry><filename>storage/itemptr.h</filename></entry>
</row>
<row>
<entry><type>time</type></entry>
<entry><type>TimeADT</type></entry>
<entry><filename>utils/date.h</filename></entry>
</row>
<row>
<entry><type>time with time zone</type></entry>
<entry><type>TimeTzADT</type></entry>
<entry><filename>utils/date.h</filename></entry>
</row>
<row>
<entry><type>timestamp</type></entry>
<entry><type>Timestamp*</type></entry>
<entry><filename>utils/timestamp.h</filename></entry>
</row>
<row>
<entry><type>tinterval</type></entry>
<entry><type>TimeInterval</type></entry>
<entry><filename>utils/nabstime.h</filename></entry>
</row>
<row>
<entry><type>varchar</type></entry>
<entry><type>VarChar*</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
<row>
<entry><type>xid</type></entry>
<entry><type>TransactionId</type></entry>
<entry><filename>postgres.h</filename></entry>
</row>
</tbody>
</tgroup>
</table>
......@@ -1567,9 +1765,9 @@ concat_text(PG_FUNCTION_ARGS)
<listitem>
<para>
Always zero the bytes of your structures using
<function>memset</function>. Without this, it's difficult to
support hash indexes or hash joins, as you must pick out only
the significant bits of your data structure to compute a hash.
Even if you initialize all fields of your structure, there may be
alignment padding (holes in the structure) that may contain
garbage values.
......@@ -1618,7 +1816,7 @@ concat_text(PG_FUNCTION_ARGS)
&dfunc;
<sect2 id="xfunc-c-pgxs">
<title>Extension Building Infrastructure</title>
<indexterm zone="xfunc-c-pgxs">
<primary>pgxs</primary>
......@@ -1868,14 +2066,14 @@ c_overpaid(PG_FUNCTION_ARGS)
HeapTupleHeader t = PG_GETARG_HEAPTUPLEHEADER(0);
int32 limit = PG_GETARG_INT32(1);
bool isnull;
 Datum salary;
 salary = GetAttributeByName(t, "salary", &amp;isnull);
 if (isnull)
     PG_RETURN_BOOL(false);
 /* Alternatively, we might prefer to do PG_RETURN_NULL() for null salary. */
 PG_RETURN_BOOL(DatumGetInt32(salary) &gt; limit);
}
</programlisting>
</para>
......@@ -1890,7 +2088,10 @@ c_overpaid(PG_FUNCTION_ARGS)
return parameter that tells whether the attribute
is null. <function>GetAttributeByName</function> returns a <type>Datum</type>
value that you can convert to the proper data type by using the
appropriate <function>DatumGet<replaceable>XXX</replaceable>()</function>
macro. Note that the return value is meaningless if the null flag is
set; always check the null flag before trying to do anything with the
result.
</para>
<para>
......@@ -2222,7 +2423,7 @@ testpassbyval(PG_FUNCTION_ARGS)
/* stuff done only on the first call of the function */
if (SRF_IS_FIRSTCALL())
{
MemoryContext oldcontext;
/* create a function context for cross-call persistence */
funcctx = SRF_FIRSTCALL_INIT();
......@@ -2393,196 +2594,6 @@ CREATE FUNCTION make_array(anyelement) RETURNS anyarray
</sect2>
</sect1>
<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
......
<!--
$PostgreSQL: pgsql/doc/src/sgml/xplang.sgml,v 1.27 2004/12/30 03:13:56 tgl Exp $
-->
<chapter id="xplang">
......@@ -29,10 +29,16 @@ $PostgreSQL: pgsql/doc/src/sgml/xplang.sgml,v 1.26 2003/11/29 19:51:38 pgsql Exp
<para>
Writing a handler for a new procedural language is described in
<xref linkend="plhandler">. Several procedural languages are
available in the core <productname>PostgreSQL</productname>
distribution, which can serve as examples.
</para>
<para>
There are additional procedural languages available that are not
included in the core distribution. <xref linkend="external-projects">
has information about finding them.
</para>
<sect1 id="xplang-install">
<title>Installing Procedural Languages</title>
......