Commit 5bfb0540 authored by Tom Lane's avatar Tom Lane

Update docs for 7.4 array features and polymorphic functions.

This is Joe Conway's patch of 7-Aug plus further editorializing
of my own.
parent 329a1b72
This diff is collapsed.
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.121 2003/07/29 00:03:17 tgl Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/datatype.sgml,v 1.122 2003/08/09 22:50:21 tgl Exp $
-->
<chapter id="datatype">
......@@ -2993,6 +2993,10 @@ SELECT * FROM test;
<primary>anyarray</primary>
</indexterm>
<indexterm zone="datatype-pseudo">
<primary>anyelement</primary>
</indexterm>
<indexterm zone="datatype-pseudo">
<primary>void</primary>
</indexterm>
......@@ -3053,7 +3057,14 @@ SELECT * FROM test;
<row>
<entry><type>anyarray</></entry>
<entry>Indicates that a function accepts any array data type.</entry>
<entry>Indicates that a function accepts any array data type
(see <xref linkend="types-polymorphic">).</entry>
</row>
<row>
<entry><type>anyelement</></entry>
<entry>Indicates that a function accepts any data type
(see <xref linkend="types-polymorphic">).</entry>
</row>
<row>
......@@ -3101,8 +3112,10 @@ SELECT * FROM test;
Functions coded in procedural languages may use pseudo-types only as
allowed by their implementation languages. At present the procedural
languages all forbid use of a pseudo-type as argument type, and allow
only <type>void</> as a result type (plus <type>trigger</> when the
function is used as a trigger).
only <type>void</> and <type>record</> as a result type (plus
<type>trigger</> when the function is used as a trigger). Some also
support polymorphic functions using the types <type>anyarray</> and
<type>anyelement</>.
</para>
<para>
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.22 2003/04/13 09:57:35 petere Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.23 2003/08/09 22:50:21 tgl Exp $
-->
<chapter id="extend">
......@@ -20,6 +20,11 @@ $Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.22 2003/04/13 09:57:35 pete
functions (starting in <xref linkend="xfunc">)
</para>
</listitem>
<listitem>
<para>
aggregates (starting in <xref linkend="xaggr">)
</para>
</listitem>
<listitem>
<para>
data types (starting in <xref linkend="xtypes">)
......@@ -32,7 +37,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.22 2003/04/13 09:57:35 pete
</listitem>
<listitem>
<para>
aggregates (starting in <xref linkend="xaggr">)
operator classes for indexes (starting in <xref linkend="xindex">)
</para>
</listitem>
</itemizedlist>
......@@ -47,7 +52,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.22 2003/04/13 09:57:35 pete
relational database systems, you know that they store information
about databases, tables, columns, etc., in what are
commonly known as system catalogs. (Some systems call
this the data dictionary). The catalogs appear to the
this the data dictionary.) The catalogs appear to the
user as tables like any other, but the <acronym>DBMS</acronym> stores
its internal bookkeeping in them. One key difference
between <productname>PostgreSQL</productname> and standard relational database systems is
......@@ -88,24 +93,113 @@ $Header: /cvsroot/pgsql/doc/src/sgml/extend.sgml,v 1.22 2003/04/13 09:57:35 pete
</indexterm>
<para>
Data types are divided into base types and composite types.
<productname>PostgreSQL</productname> data types are divided into base
types, composite types, domain types, and pseudo-types.
</para>
<para>
Base types are those, like <type>int4</type>, that are implemented
in a language such as C. They generally correspond to
what are often known as abstract data types. <productname>PostgreSQL</productname>
can only operate on such types through methods provided
below the level of the <acronym>SQL</> language (typically in a low-level
language such as C). They generally correspond to
what are often known as abstract data types.
<productname>PostgreSQL</productname>
can only operate on such types through functions provided
by the user and only understands the behavior of such
types to the extent that the user describes them.
Composite types are created whenever the user creates a
table. The
user can <quote>look inside</quote> at the attributes of these types
from the query language.
types to the extent that the user describes them. Base types are
further subdivided into scalar and array types. For each scalar type,
a corresponding array type is automatically created that can hold
variable-size arrays of that scalar type.
</para>
<para>
Composite types, or row types, are created whenever the user creates a
table; it's also possible to define a <quote>stand-alone</> composite
type with no associated table. A composite type is simply a list of
base types with associated field names. A value of a composite type
is a row or record of field values. The user can access the component
fields from <acronym>SQL</> queries.
</para>
<para>
A domain type is based on a particular base
type and for many purposes is interchangeable with its base type.
However, a domain may have constraints that restrict its valid values
to a subset of what the underlying base type would allow. Domains can
be created by simple <acronym>SQL</> commands.
</para>
<para>
Finally, there are a few <quote>pseudo-types</> for special purposes.
Pseudo-types cannot appear as fields of tables or composite types, but
they can be used to declare the argument and result types of functions.
This provides a mechanism within the type system to identify special
classes of functions. <xref
linkend="datatype-pseudotypes-table"> lists the existing
pseudo-types.
</para>
<sect2 id="types-polymorphic">
<title>Polymorphic Types and Functions</title>
<indexterm>
<primary>polymorphic types</primary>
</indexterm>
<indexterm>
<primary>polymorphic functions</primary>
</indexterm>
<para>
Two pseudo-types of special interest are <type>anyelement</> and
<type>anyarray</>, which are collectively called <firstterm>polymorphic
types</>. Any function declared using these types is said to be
a <firstterm>polymorphic function</>. A polymorphic function can
operate on many different data types, with the specific data type(s)
being determined by the data types actually passed to it in a particular
call.
</para>
<para>
Polymorphic arguments and results are tied to each other and are resolved
to a specific data type when a query calling a polymorphic function is
parsed. Each position (either argument or return value) declared as
<type>anyelement</type> is allowed to have any specific actual
data type, but in any given call they must all be the
<emphasis>same</emphasis> actual type. Each
position declared as <type>anyarray</type> can have any array data type,
but similarly they must all be the same type. If there are
positions declared <type>anyarray</type> and others declared
<type>anyelement</type>, the actual array type in the
<type>anyarray</type> positions must be an array whose elements are
the same type appearing in the <type>anyelement</type> positions.
</para>
<para>
Thus, when more than one argument position is declared with a polymorphic
type, the net effect is that only certain combinations of actual argument
types are allowed. For example, a function declared as
<literal>foo(anyelement, anyelement)</> will take any two input values,
so long as they are of the same data type.
</para>
<para>
When the return value of a function is declared as a polymorphic type,
there must be at least one argument position that is also polymorphic,
and the actual data type supplied as the argument determines the actual
result type for that call. For example, if there were not already
an array subscripting mechanism, one could define a function that
implements subscripting as <literal>subscript(anyarray, integer)
returns anyelement</>. This declaration constrains the actual first
argument to be an array type, and allows the parser to infer the correct
result type from the actual first argument's type.
</para>
</sect2>
</sect1>
&xfunc;
&xaggr;
&xtypes;
&xoper;
&xaggr;
&xindex;
</chapter>
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/func.sgml,v 1.164 2003/08/04 14:00:13 petere Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/func.sgml,v 1.165 2003/08/09 22:50:21 tgl Exp $
PostgreSQL documentation
-->
......@@ -7044,28 +7044,67 @@ SELECT pg_type_is_visible('myschema.widget'::regtype);
<tbody>
<row>
<entry> <literal>=</literal> </entry>
<entry>equals</entry>
<entry>equal</entry>
<entry><literal>ARRAY[1.1,2.1,3.1]::int[] = ARRAY[1,2,3]</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry> <literal>&lt;&gt;</literal> </entry>
<entry>not equal</entry>
<entry><literal>ARRAY[1,2,3] &lt;&gt; ARRAY[1,2,4]</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry> <literal>&lt;</literal> </entry>
<entry>less than</entry>
<entry><literal>ARRAY[1,2,3] &lt; ARRAY[1,2,4]</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry> <literal>&gt;</literal> </entry>
<entry>greater than</entry>
<entry><literal>ARRAY[1,4,3] &gt; ARRAY[1,2,4]</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry> <literal>&lt;=</literal> </entry>
<entry>less than or equal</entry>
<entry><literal>ARRAY[1,2,3] &lt;= ARRAY[1,2,3]</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry> <literal>&gt;=</literal> </entry>
<entry>greater than or equal</entry>
<entry><literal>ARRAY[1,4,3] &gt;= ARRAY[1,4,3]</literal></entry>
<entry><literal>t</literal></entry>
</row>
<row>
<entry> <literal>||</literal> </entry>
<entry>array-to-array concatenation</entry>
<entry><literal>ARRAY[1,2,3] || ARRAY[4,5,6]</literal></entry>
<entry><literal>{{1,2,3},{4,5,6}}</literal></entry>
</row>
<row>
<entry> <literal>||</literal> </entry>
<entry>array-to-array concatenation</entry>
<entry><literal>ARRAY[1,2,3] || ARRAY[[4,5,6],[7,8,9]]</literal></entry>
<entry><literal>{{1,2,3},{4,5,6},{7,8,9}}</literal></entry>
</row>
<row>
<entry> <literal>||</literal> </entry>
<entry>element-to-array concatenation</entry>
<entry><literal>3 || ARRAY[4,5,6]</literal></entry>
<entry><literal>{3,4,5,6}</literal></entry>
</row>
<row>
<entry> <literal>||</literal> </entry>
<entry>array-to-element concatenation</entry>
......
This diff is collapsed.
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.80 2003/08/04 14:00:14 petere Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.81 2003/08/09 22:50:22 tgl Exp $
-->
<chapter id="sql-syntax">
......@@ -867,7 +867,8 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;
<listitem>
<para>
A positional parameter reference, in the body of a function definition.
A positional parameter reference, in the body of a function definition
or prepared statement.
</para>
</listitem>
......@@ -901,6 +902,12 @@ SELECT 3 OPERATOR(pg_catalog.+) 4;
</para>
</listitem>
<listitem>
<para>
An array constructor.
</para>
</listitem>
<listitem>
<para>
Another value expression in parentheses, useful to group subexpressions and override precedence.
......@@ -1216,8 +1223,86 @@ SELECT name, (SELECT max(pop) FROM cities WHERE cities.state = states.name)
</para>
</sect2>
<sect2 id="sql-syntax-array-constructors">
<title>Array Constructors</title>
<indexterm>
<primary>arrays</primary>
<secondary>constructors</secondary>
</indexterm>
<para>
An <firstterm>array constructor</> is an expression that builds an
array value from values for its member elements. A simple array
constructor
consists of the keyword <literal>ARRAY</literal>, a left square bracket
<literal>[</>, one or more expressions (separated by commas) for the
array element values, and finally a right square bracket <literal>]</>.
For example,
<programlisting>
SELECT ARRAY[1,2,3+4];
array
---------
{1,2,7}
(1 row)
</programlisting>
The array element type is the common type of the member expressions,
determined using the same rules as for <literal>UNION</> or
<literal>CASE</> constructs (see <xref linkend="typeconv-union-case">).
</para>
<para>
Multidimensional array values can be built by nesting array
constructors.
In the inner constructors, the keyword <literal>ARRAY</literal> may
be omitted. For example, these produce the same result:
<programlisting>
SELECT ARRAY[ARRAY[1,2], ARRAY[3,4]];
array
---------------
{{1,2},{3,4}}
(1 row)
SELECT ARRAY[[1,2],[3,4]];
array
---------------
{{1,2},{3,4}}
(1 row)
</programlisting>
Since multidimensional arrays must be rectangular, inner constructors
at the same level must produce sub-arrays of identical dimensions.
</para>
<para>
It is also possible to construct an array from the results of a
subquery. In this form, the array constructor is written with the
keyword <literal>ARRAY</literal> followed by a parenthesized (not
bracketed) subquery. For example:
<programlisting>
SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
?column?
-------------------------------------------------------------
{2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31}
(1 row)
</programlisting>
The sub-select must return a single column. The
resulting one-dimensional array will have an element for each row in the
sub-select result, with an element type matching that of the sub-select's
output column.
</para>
<para>
The subscripts of an array value built with <literal>ARRAY</literal>
always begin with one. For more information about arrays, see
<xref linkend="arrays">.
</para>
</sect2>
<sect2 id="syntax-express-eval">
<title>Expression Evaluation</title>
<title>Expression Evaluation Rules</title>
<para>
The order of evaluation of subexpressions is not defined. In
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/xaggr.sgml,v 1.20 2003/04/10 01:22:44 petere Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/xaggr.sgml,v 1.21 2003/08/09 22:50:22 tgl Exp $
-->
<sect1 id="xaggr">
......@@ -72,8 +72,9 @@ SELECT complex_sum(a) FROM test_complex;
omitting the <literal>initcond</literal> phrase, so that the initial state
condition is null. Ordinarily this would mean that the <literal>sfunc</literal>
would need to check for a null state-condition input, but for
<function>sum</function> and some other simple aggregates like <function>max</> and <function>min</>,
it would be sufficient to insert the first nonnull input value into
<function>sum</function> and some other simple aggregates like
<function>max</> and <function>min</>,
it is sufficient to insert the first nonnull input value into
the state variable and then start applying the transition function
at the second nonnull input value. <productname>PostgreSQL</productname>
will do that automatically if the initial condition is null and
......@@ -111,8 +112,55 @@ CREATE AGGREGATE avg (
</para>
<para>
For further details see the description of the <command>CREATE
AGGREGATE</command> command in <xref linkend="reference">.
Aggregate functions may use polymorphic
state transition functions or final functions, so that the same functions
can be used to implement multiple aggregates.
See <xref linkend="types-polymorphic">
for an explanation of polymorphic functions.
Going a step further, the aggregate function itself may be specified
with a polymorphic base type and state type, allowing a single
aggregate definition to serve for multiple input data types.
Here is an example of a polymorphic aggregate:
<programlisting>
CREATE AGGREGATE array_accum (
sfunc = array_append,
basetype = anyelement,
stype = anyarray,
initcond = '{}'
);
</programlisting>
Here, the actual state type for any aggregate call is the array type
having the actual input type as elements.
</para>
<para>
Here's the output using two different actual data types as arguments:
<programlisting>
SELECT attrelid::regclass, array_accum(attname)
FROM pg_attribute WHERE attnum > 0
AND attrelid = 'pg_user'::regclass GROUP BY attrelid;
attrelid | array_accum
----------+-----------------------------------------------------------------------------
pg_user | {usename,usesysid,usecreatedb,usesuper,usecatupd,passwd,valuntil,useconfig}
(1 row)
SELECT attrelid::regclass, array_accum(atttypid)
FROM pg_attribute WHERE attnum > 0
AND attrelid = 'pg_user'::regclass GROUP BY attrelid;
attrelid | array_accum
----------+------------------------------
pg_user | {19,23,16,16,16,25,702,1009}
(1 row)
</programlisting>
</para>
<para>
For further details see the
<xref linkend="sql-createaggregate" endterm="sql-createaggregate-title">
command.
</para>
</sect1>
......
<!--
$Header: /cvsroot/pgsql/doc/src/sgml/xfunc.sgml,v 1.70 2003/07/25 20:17:49 tgl Exp $
$Header: /cvsroot/pgsql/doc/src/sgml/xfunc.sgml,v 1.71 2003/08/09 22:50:22 tgl Exp $
-->
<sect1 id="xfunc">
......@@ -41,22 +41,29 @@ $Header: /cvsroot/pgsql/doc/src/sgml/xfunc.sgml,v 1.70 2003/07/25 20:17:49 tgl E
<para>
Every kind
of function can take base types, composite types, or
some combination as arguments (parameters). In addition,
combinations of these as arguments (parameters). In addition,
every kind of function can return a base type or
a composite type.
</para>
<para>
Many kinds of functions can take or return certain pseudo-types
(such as polymorphic types), but the available facilities vary.
Consult the description of each kind of function for more details.
</para>
<para>
It's easiest to define <acronym>SQL</acronym>
functions, so we'll start with those. Examples in this section
can also be found in <filename>funcs.sql</filename>
and <filename>funcs.c</filename> in the tutorial directory.
functions, so we'll start by discussing those.
</para>
<para>
Throughout this chapter, it can be useful to look at the reference
page of the <command>CREATE FUNCTION</command> command to
understand the examples better.
Some examples from this chapter
can be found in <filename>funcs.sql</filename>
and <filename>funcs.c</filename> in the tutorial directory.
</para>
</sect1>
......@@ -67,8 +74,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/xfunc.sgml,v 1.70 2003/07/25 20:17:49 tgl E
<para>
SQL functions execute an arbitrary list of SQL statements, returning
the result of the last query in the list, which must be a
<literal>SELECT</>.
the result of the last query in the list.
In the simple (non-set)
case, the first row of the last query's result will be returned.
(Bear in mind that <quote>the first row</quote> of a multirow
......@@ -276,7 +282,7 @@ CREATE FUNCTION new_emp() RETURNS emp AS '
' LANGUAGE SQL;
</programlisting>
In this case we have specified each of the attributes
In this example we have specified each of the attributes
with a constant value, but any computation
could have been substituted for these constants.
</para>
......@@ -316,7 +322,7 @@ ERROR: function declared to return emp returns varchar instead of text at colum
</para>
<para>
This is an example for how to extract an attribute out of a row type:
This is an example of extracting an attribute out of a row type:
<screen>
SELECT (new_emp()).name;
......@@ -330,7 +336,7 @@ SELECT (new_emp()).name;
<screen>
SELECT new_emp().name;
ERROR: syntax error at or near "."
ERROR: syntax error at or near "." at character 17
</screen>
</para>
......@@ -509,6 +515,68 @@ SELECT name, listchildren(name) FROM nodes;
for those arguments, so no result rows are generated.
</para>
</sect2>
<sect2>
<title>Polymorphic <acronym>SQL</acronym> Functions</title>
<para>
<acronym>SQL</acronym> functions may be declared to accept and
return the <quote>polymorphic</> types
<type>anyelement</type> and <type>anyarray</type>.
See <xref linkend="types-polymorphic"> for a more detailed explanation
of polymorphic functions. Here is a polymorphic function
<function>make_array</function> that builds up an array from two
arbitrary data type elements:
<screen>
CREATE FUNCTION make_array(anyelement, anyelement) RETURNS anyarray AS '
SELECT ARRAY[$1, $2];
' LANGUAGE SQL;
SELECT make_array(1, 2) AS intarray, make_array('a'::text, 'b') AS textarray;
intarray | textarray
----------+-----------
{1,2} | {a,b}
(1 row)
</screen>
</para>
<para>
Notice the use of the typecast <literal>'a'::text</literal>
to specify that the argument is of type <type>text</type>. This is
required if the argument is just a string literal, since otherwise
it would be treated as type
<type>unknown</type>, and array of <type>unknown</type> is not a valid
type.
Without the typecast, you will get errors like this:
<screen>
<computeroutput>
ERROR: could not determine ANYARRAY/ANYELEMENT type because input is UNKNOWN
</computeroutput>
</screen>
</para>
<para>
It is permitted to have polymorphic arguments with a deterministic
return type, but the converse is not. For example:
<screen>
CREATE FUNCTION is_greater(anyelement, anyelement) RETURNS bool AS '
SELECT $1 > $2;
' LANGUAGE SQL;
SELECT is_greater(1, 2);
is_greater
------------
f
(1 row)
CREATE FUNCTION invalid_func() RETURNS anyelement AS '
SELECT 1;
' LANGUAGE SQL;
ERROR: cannot determine result datatype
DETAIL: A function returning ANYARRAY or ANYELEMENT must have at least one argument of either type.
</screen>
</para>
</sect2>
</sect1>
<sect1 id="xfunc-pl">
......@@ -1999,6 +2067,89 @@ CREATE OR REPLACE FUNCTION testpassbyval(integer, integer) RETURNS SETOF __testp
distribution contains more examples of set-returning functions.
</para>
</sect2>
<sect2>
<title>Polymorphic Arguments and Return Types</title>
<para>
C-language functions may be declared to accept and
return the <quote>polymorphic</> types
<type>anyelement</type> and <type>anyarray</type>.
See <xref linkend="types-polymorphic"> for a more detailed explanation
of polymorphic functions. When function arguments or return types
are defined as polymorphic types, the function author cannot know
in advance what data type it will be called with, or
need to return. There are two routines provided in <filename>fmgr.h</>
to allow a version-1 C function to discover the actual data types
of its arguments and the type it is expected to return. The routines are
called <literal>get_fn_expr_rettype(FmgrInfo *flinfo)</> and
<literal>get_fn_expr_argtype(FmgrInfo *flinfo, int argnum)</>.
They return the result or argument type OID, or InvalidOid if the
information is not available.
The structure <literal>flinfo</> is normally accessed as
<literal>fcinfo-&gt;flinfo</>. The parameter <literal>argnum</>
is zero based.
</para>
<para>
For example, suppose we want to write a function to accept a single
element of any type, and return a one-dimensional array of that type:
<programlisting>
PG_FUNCTION_INFO_V1(make_array);
Datum
make_array(PG_FUNCTION_ARGS)
{
ArrayType *result;
Oid element_type = get_fn_expr_argtype(fcinfo-&gt;flinfo, 0);
Datum element;
int16 typlen;
bool typbyval;
char typalign;
int ndims;
int dims[MAXDIM];
int lbs[MAXDIM];
if (!OidIsValid(element_type))
elog(ERROR, "could not determine data type of input");
/* get the provided element */
element = PG_GETARG_DATUM(0);
/* we have one dimension */
ndims = 1;
/* and one element */
dims[0] = 1;
/* and lower bound is 1 */
lbs[0] = 1;
/* get required info about the element type */
get_typlenbyvalalign(element_type, &amp;typlen, &amp;typbyval, &amp;typalign);
/* now build the array */
result = construct_md_array(&amp;element, ndims, dims, lbs,
element_type, typlen, typbyval, typalign);
PG_RETURN_ARRAYTYPE_P(result);
}
</programlisting>
</para>
<para>
The following command declares the function
<function>make_array</function> in SQL:
<programlisting>
CREATE FUNCTION make_array(anyelement)
RETURNS anyarray
AS '<replaceable>DIRECTORY</replaceable>/funcs', 'make_array'
LANGUAGE 'C' STRICT;
</programlisting>
Note the use of STRICT; this is essential since the code is not
bothering to test for a NULL input.
</para>
</sect2>
</sect1>
<sect1 id="xfunc-overload">
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment