Commit 3d5ddc0b authored by Tom Lane

Clean up wrong, misleading, or obsolete documentation about array types,
particularly in the CREATE TYPE reference page.  Fix some other errors
in the CREATE TYPE page, too.
parent f008976b
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/array.sgml,v 1.12 2001/09/09 17:21:44 petere Exp $ --> <!-- $Header: /cvsroot/pgsql/doc/src/sgml/array.sgml,v 1.13 2001/11/03 21:42:47 tgl Exp $ -->
<chapter id="arrays"> <chapter id="arrays">
<title>Arrays</title> <title>Arrays</title>
...@@ -23,15 +23,15 @@ CREATE TABLE sal_emp ( ...@@ -23,15 +23,15 @@ CREATE TABLE sal_emp (
<structname>sal_emp</structname> with a <type>text</type> string <structname>sal_emp</structname> with a <type>text</type> string
(<structfield>name</structfield>), a one-dimensional array of type (<structfield>name</structfield>), a one-dimensional array of type
<type>integer</type> (<structfield>pay_by_quarter</structfield>), <type>integer</type> (<structfield>pay_by_quarter</structfield>),
which shall represent the employee's salary by quarter, and a which represents the employee's salary by quarter, and a
two-dimensional array of <type>text</type> two-dimensional array of <type>text</type>
(<structfield>schedule</structfield>), which represents the (<structfield>schedule</structfield>), which represents the
employee's weekly schedule. employee's weekly schedule.
</para> </para>
<para> <para>
Now we do some <command>INSERT</command>s; note that when appending Now we do some <command>INSERT</command>s. Observe that to write an array
to an array, we enclose the values within braces and separate them value, we enclose the element values within braces and separate them
by commas. If you know C, this is not unlike the syntax for by commas. If you know C, this is not unlike the syntax for
initializing structures. initializing structures.
...@@ -200,8 +200,7 @@ SELECT * FROM sal_emp WHERE pay_by_quarter[1] = 10000 OR ...@@ -200,8 +200,7 @@ SELECT * FROM sal_emp WHERE pay_by_quarter[1] = 10000 OR
However, this quickly becomes tedious for large arrays, and is not However, this quickly becomes tedious for large arrays, and is not
helpful if the size of the array is unknown. Although it is not part helpful if the size of the array is unknown. Although it is not part
of the primary <productname>PostgreSQL</productname> distribution, of the primary <productname>PostgreSQL</productname> distribution,
in the contributions directory, there is an extension to there is an extension available that defines new functions and
<productname>PostgreSQL</productname> that defines new functions and
operators for iterating over array values. Using this, the above operators for iterating over array values. Using this, the above
query could be: query could be:
......
<!-- <!--
Documentation of the system catalogs, directed toward PostgreSQL developers Documentation of the system catalogs, directed toward PostgreSQL developers
$Header: /cvsroot/pgsql/doc/src/sgml/catalogs.sgml,v 2.26 2001/10/15 22:47:47 tgl Exp $ $Header: /cvsroot/pgsql/doc/src/sgml/catalogs.sgml,v 2.27 2001/11/03 21:42:47 tgl Exp $
--> -->
<chapter id="catalogs"> <chapter id="catalogs">
...@@ -420,7 +420,9 @@ ...@@ -420,7 +420,9 @@
<entry><type>int4</type></entry> <entry><type>int4</type></entry>
<entry></entry> <entry></entry>
<entry> <entry>
Number of dimensions, if the column is an array; otherwise 0. Number of dimensions, if the column is an array type; otherwise 0.
(Presently, the number of dimensions of an array is not enforced,
so any nonzero value effectively means <quote>it's an array</>.)
</entry> </entry>
</row> </row>
...@@ -1064,7 +1066,7 @@ ...@@ -1064,7 +1066,7 @@
<entry><type>int2vector</type></entry> <entry><type>int2vector</type></entry>
<entry>pg_attribute.attnum</entry> <entry>pg_attribute.attnum</entry>
<entry> <entry>
This is an vector (array) of up to This is a vector (array) of up to
<symbol>INDEX_MAX_KEYS</symbol> values that indicate which <symbol>INDEX_MAX_KEYS</symbol> values that indicate which
table columns this index pertains to. For example a value of table columns this index pertains to. For example a value of
<literal>1 3</literal> would mean that the first and the third <literal>1 3</literal> would mean that the first and the third
...@@ -2336,7 +2338,9 @@ ...@@ -2336,7 +2338,9 @@
<entry>typdelim</entry> <entry>typdelim</entry>
<entry><type>char</type></entry> <entry><type>char</type></entry>
<entry></entry> <entry></entry>
<entry>Character that separates two values of this type when parsing array input</entry> <entry>Character that separates two values of this type when parsing
array input. Note that the delimiter is associated with the array
element datatype, not the array datatype.</entry>
</row> </row>
<row> <row>
...@@ -2360,14 +2364,17 @@ ...@@ -2360,14 +2364,17 @@
If <structfield>typelem</structfield> is not 0 then it If <structfield>typelem</structfield> is not 0 then it
identifies another row in <structname>pg_type</structname>. identifies another row in <structname>pg_type</structname>.
The current type can then be subscripted like an array yielding The current type can then be subscripted like an array yielding
values of type <structfield>typelem</structfield>. A non-zero values of type <structfield>typelem</structfield>. A
<structfield>typelem</structfield> does not guarantee this type <quote>true</quote> array type is variable length
to be a <quote>real</quote> array type; some ordinary (<structfield>typlen</structfield> = -1),
fixed-length types can also be subscripted (e.g., but some fixed-length (<structfield>typlen</structfield> &gt; 0) types
<type>oidvector</type>). Variable-length types can also have nonzero <structfield>typelem</structfield>, for example
<emphasis>not</emphasis> be turned into pseudo-arrays like <type>name</type> and <type>oidvector</type>.
that. Hence, the way to determine whether a type is a If a fixed-length type has a <structfield>typelem</structfield> then
<quote>true</quote> array type is typelem != 0 and typlen < 0. its internal representation must be N values of the
<structfield>typelem</structfield> datatype with no other data.
Variable-length array types have a header defined by the array
subroutines.
</entry> </entry>
</row> </row>
......
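The revised <structfield>typelem</structfield> description above can be read as a small decision rule. The following is a rough sketch of that rule only, not code from PostgreSQL: the helper name `classify_pg_type` and the sample OID values are assumptions made for illustration, and the "unclassified" branch covers combinations the description does not discuss.

```python
# Illustrative sketch only: classify_pg_type is a hypothetical helper,
# not a PostgreSQL function. It restates the typelem/typlen rule from
# the revised pg_type description.

def classify_pg_type(typelem: int, typlen: int) -> str:
    """Classify a pg_type row from its typelem and typlen columns."""
    if typelem == 0:
        # No element type recorded: the type cannot be subscripted.
        return "not subscriptable"
    if typlen == -1:
        # Variable length with an element type: a "true" array.
        return "true array"
    if typlen > 0:
        # Fixed length with an element type (for example name, oidvector).
        return "fixed-length subscriptable"
    # Combination not covered by the description.
    return "unclassified"


if __name__ == "__main__":
    # The numeric values below are placeholders, not verified catalog OIDs.
    print(classify_pg_type(typelem=23, typlen=-1))
    print(classify_pg_type(typelem=18, typlen=32))
    print(classify_pg_type(typelem=0, typlen=4))
```

Note that the old wording tested `typlen < 0`, while the new wording names -1 specifically; the sketch follows the new wording.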
<!-- <!--
$Header: /cvsroot/pgsql/doc/src/sgml/ref/create_type.sgml,v 1.23 2001/09/13 19:05:29 petere Exp $ $Header: /cvsroot/pgsql/doc/src/sgml/ref/create_type.sgml,v 1.24 2001/11/03 21:42:47 tgl Exp $
Postgres documentation Postgres documentation
--> -->
...@@ -27,7 +27,7 @@ CREATE TYPE <replaceable class="parameter">typename</replaceable> ( INPUT = <rep ...@@ -27,7 +27,7 @@ CREATE TYPE <replaceable class="parameter">typename</replaceable> ( INPUT = <rep
, INTERNALLENGTH = { <replaceable , INTERNALLENGTH = { <replaceable
class="parameter">internallength</replaceable> | VARIABLE } class="parameter">internallength</replaceable> | VARIABLE }
[ , EXTERNALLENGTH = { <replaceable class="parameter">externallength</replaceable> | VARIABLE } ] [ , EXTERNALLENGTH = { <replaceable class="parameter">externallength</replaceable> | VARIABLE } ]
[ , DEFAULT = "<replaceable class="parameter">default</replaceable>" ] [ , DEFAULT = <replaceable class="parameter">default</replaceable> ]
[ , ELEMENT = <replaceable class="parameter">element</replaceable> ] [ , DELIMITER = <replaceable class="parameter">delimiter</replaceable> ] [ , ELEMENT = <replaceable class="parameter">element</replaceable> ] [ , DELIMITER = <replaceable class="parameter">delimiter</replaceable> ]
[ , SEND = <replaceable class="parameter">send_function</replaceable> ] [ , RECEIVE = <replaceable class="parameter">receive_function</replaceable> ] [ , SEND = <replaceable class="parameter">send_function</replaceable> ] [ , RECEIVE = <replaceable class="parameter">receive_function</replaceable> ]
[ , PASSEDBYVALUE ] [ , PASSEDBYVALUE ]
...@@ -113,7 +113,8 @@ CREATE TYPE <replaceable class="parameter">typename</replaceable> ( INPUT = <rep ...@@ -113,7 +113,8 @@ CREATE TYPE <replaceable class="parameter">typename</replaceable> ( INPUT = <rep
<term><replaceable class="parameter">delimiter</replaceable></term> <term><replaceable class="parameter">delimiter</replaceable></term>
<listitem> <listitem>
<para> <para>
The delimiter character for the array elements. The delimiter character to be used between values in arrays made
of this type.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
...@@ -219,82 +220,101 @@ CREATE ...@@ -219,82 +220,101 @@ CREATE
<para> <para>
<command>CREATE TYPE</command> requires the registration of two functions <command>CREATE TYPE</command> requires the registration of two functions
(using create function) before defining the type. The (using CREATE FUNCTION) before defining the type. The
representation of a new base type is determined by representation of a new base type is determined by
<replaceable class="parameter">input_function</replaceable>, which <replaceable class="parameter">input_function</replaceable>, which
converts the type's external representation to an internal converts the type's external representation to an internal
representation usable by the representation usable by the
operators and functions defined for the type. Naturally, operators and functions defined for the type. Naturally,
<replaceable class="parameter">output_function</replaceable> <replaceable class="parameter">output_function</replaceable>
performs the reverse transformation. Both performs the reverse transformation. The input function may be
the input and output functions must be declared to take declared as taking one argument of type <type>opaque</type>,
one or two arguments of type <type>opaque</type>. or as taking three arguments of types
<type>opaque</type>, <type>OID</type>, <type>int4</type>.
(The first argument is the input text as a C string, the second
argument is the element type in case this is an array type,
and the third is the typmod of the destination column, if known.)
The output function may be
declared as taking one argument of type <type>opaque</type>,
or as taking two arguments of types
<type>opaque</type>, <type>OID</type>.
(The first argument is actually of the datatype itself, but since the
output function must be declared first, it's easier to declare it as
accepting type <type>opaque</type>. The second argument is again
the array element type for array types.)
</para> </para>
<para> <para>
New base data types can be fixed length, in which case New base data types can be fixed length, in which case
<replaceable class="parameter">internallength</replaceable> is a <replaceable class="parameter">internallength</replaceable> is a
positive integer, or variable length, positive integer, or variable length, indicated by setting
in which case PostgreSQL assumes that the new type has the
same format
as the PostgreSQL-supplied data type, <type>text</type>.
To indicate that a type is variable length, set
<replaceable class="parameter">internallength</replaceable> <replaceable class="parameter">internallength</replaceable>
to <option>VARIABLE</option>. to <option>VARIABLE</option>. (Internally, this is represented
The external representation is similarly specified using the by setting typlen to -1.) The internal representation of all
variable-length types must start with an integer giving the total
length of this value of the type.
</para>
<para>
The external representation length is similarly specified using the
<replaceable class="parameter">externallength</replaceable> <replaceable class="parameter">externallength</replaceable>
keyword. keyword. (This value is not presently used, and is typically omitted,
letting it default to <option>VARIABLE</option>.)
</para> </para>
<para> <para>
To indicate that a type is an array and to indicate that a To indicate that a type is an array,
type has array elements, indicate the type of the array specify the type of the array
element using the element keyword. For example, to define elements using the <option>ELEMENT</> keyword. For example, to define
an array of 4-byte integers ("int4"), specify an array of 4-byte integers ("int4"), specify
<programlisting>ELEMENT = int4</programlisting> <programlisting>ELEMENT = int4</programlisting>
More details about array types appear below.
</para> </para>
<para> <para>
To indicate the delimiter to be used on arrays of this To indicate the delimiter to be used between values in the external
type, <replaceable class="parameter">delimiter</replaceable> representation of arrays of this type, <replaceable
can be class="parameter">delimiter</replaceable> can be
set to a specific character. The default delimiter is the comma set to a specific character. The default delimiter is the comma
("<literal>,</literal>"). ('<literal>,</literal>'). Note that the delimiter is associated
with the array element type, not the array type itself.
</para> </para>
<para> <para>
A default value is optionally available in case a user A default value may be specified, in case a user wants columns of the
wants some specific bit pattern to mean <quote>data not present</quote>. datatype to default to something other than NULL.
Specify the default with the <literal>DEFAULT</literal> keyword. Specify the default with the <option>DEFAULT</option> keyword.
<comment>How does the user specify that bit pattern and associate (Such a default may be overridden by an explicit <option>DEFAULT</option>
it with the fact that the data is not present></comment> clause attached to a particular column.)
</para> </para>
<para> <para>
The optional arguments The optional arguments
<replaceable class="parameter">send_function</replaceable> and <replaceable class="parameter">send_function</replaceable> and
<replaceable class="parameter">receive_function</replaceable> <replaceable class="parameter">receive_function</replaceable>
are used when the application program requesting PostgreSQL are not currently used, and are usually omitted (allowing them
services resides on a different machine. In this case, to default to the
the machine on which PostgreSQL runs may use a format for the data <replaceable class="parameter">output_function</replaceable> and
type different from that used on the remote machine. <replaceable class="parameter">input_function</replaceable>
In this case it is appropriate to convert data items to a respectively). These functions may someday be resurrected for use
standard form when sending from the server to the client in specifying machine-independent binary representations.
and converting from the standard format to the machine </para>
specific format when the server receives the data from the
client. If these functions are not specified, then it is <para>
assumed that the internal format of the type is acceptable The optional flag, <option>PASSEDBYVALUE</option>, indicates that
on all relevant machine architectures. For example, single values of this data type are passed
characters do not have to be converted if passed from by value rather than by reference. Note that you
a Sun-4 to a DECstation, but many other types do.
</para>
<para>
The optional flag, <option>PASSEDBYVALUE</option>, indicates that operators
and functions which use this data type should be passed an
argument by value rather than by reference. Note that you
may not pass by value types whose internal representation is may not pass by value types whose internal representation is
more than four bytes. longer than the width of the <type>Datum</> type (four bytes on
most machines, eight bytes on a few).
</para>
<para>
The <replaceable class="parameter">alignment</replaceable> keyword
specifies the storage alignment required for the datatype. The
allowed values equate to alignment on 1, 2, 4, or 8 byte boundaries.
Note that variable-length types must have an alignment of at least
4, since they necessarily contain an <type>int4</> as their first component.
</para> </para>
<para> <para>
...@@ -315,19 +335,40 @@ CREATE ...@@ -315,19 +335,40 @@ CREATE
<literal>extended</literal> and <literal>external</literal> items.) <literal>extended</literal> and <literal>external</literal> items.)
</para> </para>
<refsect2>
<title>Array Types</title>
<para> <para>
For new base types, a user can define operators, functions Whenever a user-defined datatype is created,
and aggregates using the appropriate facilities described <productname>PostgreSQL</productname> automatically creates an
in this section. associated array type, whose name consists of the base type's
name prepended with an underscore. The parser understands this
naming convention, and translates requests for columns of type
<literal>foo[]</> into requests for type <literal>_foo</>.
The implicitly-created array type is variable length and uses the
built-in input and output functions <literal>array_in</> and
<literal>array_out</>.
</para> </para>
<refsect2>
<title>Array Types</title>
<para> <para>
Two generalized built-in functions, array_in and You might reasonably ask <quote>why is there an <option>ELEMENT</>
array_out, exist for quick creation of variable-length option, if the system makes the correct array type automatically?</quote>
array types. These functions operate on arrays of any The only case where it's useful to use <option>ELEMENT</> is when you are
existing PostgreSQL type. making a fixed-length type that happens to be internally an array of N
identical things, and you want to allow the N things to be accessed
directly by subscripting, in addition to whatever operations you plan
to provide for the type as a whole. For example, type <type>name</>
allows its constituent <type>char</>s to be accessed this way.
A 2-D <type>point</> type could allow its two component floats to be
accessed like <literal>point[0]</> and <literal>point[1]</>.
Note that
this facility only works for fixed-length types whose internal form
is exactly a sequence of N identical fields. A subscriptable
variable-length type must have the generalized internal representation
used by <literal>array_in</> and <literal>array_out</>.
For historical reasons (i.e., this is clearly wrong but it's far too
late to change it), subscripting of fixed-length array types starts from
zero, rather than from one as for variable-length arrays.
</para> </para>
</refsect2> </refsect2>
</refsect1> </refsect1>
...@@ -336,41 +377,42 @@ CREATE ...@@ -336,41 +377,42 @@ CREATE
<title>Notes</title> <title>Notes</title>
<para> <para>
Type names cannot begin with the underscore character User-defined type names cannot begin with the underscore character
(<quote><literal>_</literal></quote>) and can only be 31 (<quote><literal>_</literal></quote>) and can only be 30
characters long. This is because PostgreSQL silently creates an characters long (or in general <literal>NAMEDATALEN-2</>, rather than
array type for each base type with a name consisting of the base the <literal>NAMEDATALEN-1</> characters allowed for other names).
type's name prepended with an underscore. Type names beginning with underscore are
reserved for internally-created array type names.
</para> </para>
</refsect1> </refsect1>
<refsect1> <refsect1>
<title>Examples</title> <title>Examples</title>
<para> <para>
This command creates the <type>box</type> data type and then uses the This example creates the <type>box</type> data type and then uses the
type in a table definition: type in a table definition:
<programlisting> <programlisting>
CREATE TYPE box (INTERNALLENGTH = 8, CREATE TYPE box (INTERNALLENGTH = 16,
INPUT = my_procedure_1, OUTPUT = my_procedure_2); INPUT = my_procedure_1, OUTPUT = my_procedure_2);
CREATE TABLE myboxes (id INT4, description box); CREATE TABLE myboxes (id INT4, description box);
</programlisting> </programlisting>
</para> </para>
<para> <para>
This command creates a variable length array type with If <type>box</type>'s internal structure were an array of four
<type>integer</type> elements: <type>float4</>s, we might instead say
<programlisting> <programlisting>
CREATE TYPE int4array (INPUT = array_in, OUTPUT = array_out, CREATE TYPE box (INTERNALLENGTH = 16,
INTERNALLENGTH = VARIABLE, ELEMENT = int4); INPUT = my_procedure_1, OUTPUT = my_procedure_2,
CREATE TABLE myarrays (id int4, numbers int4array); ELEMENT = float4);
</programlisting> </programlisting>
which would allow a box value's component floats to be accessed
by subscripting. Otherwise the type behaves the same as before.
</para> </para>
<para> <para>
This command creates a large object type and uses it in This example creates a large object type and uses it in
a table definition: a table definition:
<programlisting> <programlisting>
CREATE TYPE bigobj (INPUT = lo_filein, OUTPUT = lo_fileout, CREATE TYPE bigobj (INPUT = lo_filein, OUTPUT = lo_fileout,
INTERNALLENGTH = VARIABLE); INTERNALLENGTH = VARIABLE);
......
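The naming rule in the revised Notes section of the CREATE TYPE page (an implicit array type named with a leading underscore, and user-defined type names limited to <literal>NAMEDATALEN-2</> characters) can be sketched as below. This is an illustration under stated assumptions: `implicit_array_type_name` is a hypothetical helper, and `NAMEDATALEN = 32` is inferred from the 30-character limit given in the text rather than taken from the server source.

```python
# Illustrative sketch only: implicit_array_type_name is a hypothetical
# helper. NAMEDATALEN = 32 is an assumption inferred from the documented
# 30-character limit (NAMEDATALEN - 2).

NAMEDATALEN = 32


def implicit_array_type_name(base_name: str, namedatalen: int = NAMEDATALEN) -> str:
    """Return the name of the array type created alongside base_name."""
    if base_name.startswith("_"):
        # Leading underscores are reserved for internally created array types.
        raise ValueError("user-defined type names cannot begin with '_'")
    if len(base_name) > namedatalen - 2:
        # One character is needed for the underscore prefix, and ordinary
        # names are already limited to namedatalen - 1 characters.
        raise ValueError("type name is longer than NAMEDATALEN-2 characters")
    return "_" + base_name


if __name__ == "__main__":
    print(implicit_array_type_name("box"))
```

Under these assumptions, a column declared as `box[]` would be resolved by the parser to the type name this helper returns for `box`.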
<!-- <!--
$Header: /cvsroot/pgsql/doc/src/sgml/ref/drop_type.sgml,v 1.11 2001/09/13 19:05:29 petere Exp $ $Header: /cvsroot/pgsql/doc/src/sgml/ref/drop_type.sgml,v 1.12 2001/11/03 21:42:47 tgl Exp $
Postgres documentation Postgres documentation
--> -->
...@@ -105,7 +105,9 @@ ERROR: RemoveType: type '<replaceable class="parameter">typename</replaceable>' ...@@ -105,7 +105,9 @@ ERROR: RemoveType: type '<replaceable class="parameter">typename</replaceable>'
<para> <para>
It is the user's responsibility to remove any operators, It is the user's responsibility to remove any operators,
functions, aggregates, access methods, subtypes, and tables that functions, aggregates, access methods, subtypes, and tables that
use a deleted type. use a deleted type. However, the associated array datatype
(which was automatically created by <command>CREATE TYPE</command>)
will be removed automatically.
</para> </para>
</listitem> </listitem>
......