<!-- $PostgreSQL: pgsql/doc/src/sgml/extend.sgml,v 1.39 2010/06/01 02:31:36 momjian Exp $ -->

 <chapter id="extend">
  <title>Extending <acronym>SQL</acronym></title>

  <indexterm zone="extend">
   <primary>extending SQL</primary>
  </indexterm>

  <para>
   In the sections that follow, we will discuss how you
   can extend the <productname>PostgreSQL</productname>
   <acronym>SQL</acronym> query language by adding:

   <itemizedlist spacing="compact" mark="bullet">
    <listitem>
     <para>
      functions (starting in <xref linkend="xfunc">)
     </para>
    </listitem>
    <listitem>
     <para>
      aggregates (starting in <xref linkend="xaggr">)
     </para>
    </listitem>
    <listitem>
     <para>
      data types (starting in <xref linkend="xtypes">)
     </para>
    </listitem>
    <listitem>
     <para>
      operators (starting in <xref linkend="xoper">)
     </para>
    </listitem>
    <listitem>
     <para>
      operator classes for indexes (starting in <xref linkend="xindex">)
     </para>
    </listitem>
   </itemizedlist>
  </para>

  <sect1 id="extend-how">
   <title>How Extensibility Works</title>

   <para>
    <productname>PostgreSQL</productname> is extensible because its operation is
    catalog-driven.  If you are familiar with standard
    relational database systems, you know that they store information
    about databases, tables, columns, etc., in what are
    commonly known as system catalogs.  (Some systems call
    this the data dictionary.)  The catalogs appear to the
    user as tables like any other, but the <acronym>DBMS</acronym> stores
    its internal bookkeeping in them.  One key difference
    between <productname>PostgreSQL</productname> and standard relational database systems is
    that <productname>PostgreSQL</productname> stores much more information in its
    catalogs: not only information about tables and columns,
    but also information about data types, functions, access
    methods, and so on.  These tables can be modified by
    the user, and since <productname>PostgreSQL</productname> bases its operation
    on these tables, this means that <productname>PostgreSQL</productname> can be
    extended by users.  By comparison, conventional
    database systems can only be extended by changing hardcoded
    procedures in the source code or by loading modules
    specially written by the <acronym>DBMS</acronym> vendor.
   </para>
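
   <para>
    As a minimal sketch of what <quote>catalog-driven</quote> means, the
    following queries read two of the system catalogs just like ordinary
    tables.  The catalogs <structname>pg_type</structname> and
    <structname>pg_proc</structname> are standard; the particular rows
    selected here are only illustrative.
<programlisting>
-- Look up the catalog entry that describes the built-in type int4.
SELECT typname, typlen, typbyval
  FROM pg_type
 WHERE typname = 'int4';

-- List the functions named "abs" along with their result types.
SELECT proname, prorettype::regtype AS result_type
  FROM pg_proc
 WHERE proname = 'abs';
</programlisting>
   </para>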

   <para>
    The <productname>PostgreSQL</productname> server can moreover
    incorporate user-written code into itself through dynamic loading.
    That is, the user can specify an object code file (e.g., a shared
    library) that implements a new type or function, and
    <productname>PostgreSQL</productname> will load it as required.
    Code written in <acronym>SQL</acronym> is even easier to add
    to the server.  This ability to modify its operation <quote>on the
    fly</quote> makes <productname>PostgreSQL</productname> uniquely
    suited for rapid prototyping of new applications and storage
    structures.
   </para>
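
   <para>
    The two routes described above might look as follows.  This is only a
    sketch: the function names <function>add_one</function> and
    <function>add_one_sql</function> and the library path
    <filename>DIRECTORY/funcs</filename> are hypothetical, and the first
    command assumes that a shared library exporting a suitable C function
    named <function>add_one</function> has already been built.
<programlisting>
-- A function implemented in a dynamically loaded shared library.
-- The server loads the library when the function is first needed.
CREATE FUNCTION add_one(integer) RETURNS integer
    AS 'DIRECTORY/funcs', 'add_one'
    LANGUAGE C STRICT;

-- The same operation written directly in SQL; no object file is involved.
CREATE FUNCTION add_one_sql(integer) RETURNS integer AS $$
    SELECT $1 + 1;
$$ LANGUAGE SQL;
</programlisting>
   </para>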
  </sect1>

  <sect1 id="extend-type-system">
   <title>The <productname>PostgreSQL</productname> Type System</title>

   <indexterm zone="extend-type-system">
    <primary>base type</primary>
   </indexterm>

   <indexterm zone="extend-type-system">
    <primary>data type</primary>
    <secondary>base</secondary>
   </indexterm>

   <indexterm zone="extend-type-system">
    <primary>composite type</primary>
   </indexterm>

   <indexterm zone="extend-type-system">
    <primary>data type</primary>
    <secondary>composite</secondary>
   </indexterm>

   <para>
    <productname>PostgreSQL</productname> data types are divided into base
    types, composite types, domains, and pseudo-types.
   </para>

   <sect2>
    <title>Base Types</title>

    <para>
     Base types are those, like <type>int4</type>, that are
     implemented below the level of the <acronym>SQL</> language
     (typically in a low-level language such as C).  They generally
     correspond to what are often known as abstract data types.
     <productname>PostgreSQL</productname> can only operate on such
     types through functions provided by the user and only understands
     the behavior of such types to the extent that the user describes
     them.  Base types are further subdivided into scalar and array
     types.  For each scalar type, a corresponding array type is
     automatically created that can hold variable-size arrays of that
     scalar type.
    </para>
   </sect2>

   <sect2>
    <title>Composite Types</title>

    <para>
     Composite types, or row types, are created whenever the user
     creates a table. It is also possible to use <xref
     linkend="sql-createtype"> to
     define a <quote>stand-alone</> composite type with no associated
     table.  A composite type is simply a list of types with
     associated field names.  A value of a composite type is a row or
     record of field values.  The user can access the component fields
     from <acronym>SQL</> queries. Refer to <xref linkend="rowtypes">
     for more information on composite types.
    </para>
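
    <para>
     For illustration, a stand-alone composite type and access to one of
     its fields might look like this.  The type name
     <type>inventory_item</type> and its fields are hypothetical.
<programlisting>
-- Define a composite type with no associated table.
CREATE TYPE inventory_item AS (
    name        text,
    supplier_id integer,
    price       numeric
);

-- Build a value of the type and extract one component field.
SELECT (ROW('fuzzy dice', 42, 1.99)::inventory_item).price;
</programlisting>
    </para>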
   </sect2>

   <sect2>
    <title>Domains</title>

    <para>
     A domain is based on a particular base type and for many purposes
     is interchangeable with its base type.  However, a domain can
     have constraints that restrict its valid values to a subset of
     what the underlying base type would allow.
    </para>

    <para>
     Domains can be created using the <acronym>SQL</> command
     <xref linkend="sql-createdomain">.
     Their creation and use are not discussed in this chapter.
    </para>
   </sect2>

   <sect2>
    <title>Pseudo-Types</title>

    <para>
     There are a few <quote>pseudo-types</> for special purposes.
     Pseudo-types cannot appear as columns of tables or attributes of
     composite types, but they can be used to declare the argument and
     result types of functions.  This provides a mechanism within the
     type system to identify special classes of functions.  <xref
     linkend="datatype-pseudotypes-table"> lists the existing
     pseudo-types.
    </para>
   </sect2>

   <sect2 id="extend-types-polymorphic">
    <title>Polymorphic Types</title>

   <indexterm zone="extend-types-polymorphic">
    <primary>polymorphic type</primary>
   </indexterm>

   <indexterm zone="extend-types-polymorphic">
    <primary>polymorphic function</primary>
   </indexterm>

   <indexterm zone="extend-types-polymorphic">
    <primary>type</primary>
    <secondary>polymorphic</secondary>
   </indexterm>

   <indexterm zone="extend-types-polymorphic">
    <primary>function</primary>
    <secondary>polymorphic</secondary>
   </indexterm>

    <para>
     Four pseudo-types of special interest are <type>anyelement</>,
     <type>anyarray</>, <type>anynonarray</>, and <type>anyenum</>,
     which are collectively called <firstterm>polymorphic types</>.
     Any function declared using these types is said to be
     a <firstterm>polymorphic function</>.  A polymorphic function can
     operate on many different data types, with the specific data type(s)
     being determined by the data types actually passed to it in a particular
     call.
    </para>

    <para>
     Polymorphic arguments and results are tied to each other and are resolved
     to a specific data type when a query calling a polymorphic function is
     parsed.  Each position (either argument or return value) declared as
     <type>anyelement</type> is allowed to have any specific actual
     data type, but in any given call they must all be the
     <emphasis>same</emphasis> actual type. Each 
     position declared as <type>anyarray</type> can have any array data type,
     but similarly they must all be the same type. If there are
     positions declared <type>anyarray</type> and others declared
     <type>anyelement</type>, the actual array type in the
     <type>anyarray</type> positions must be an array whose elements are
     the same type appearing in the <type>anyelement</type> positions.
     <type>anynonarray</> is treated exactly the same as <type>anyelement</>,
     but adds the constraint that the actual type must not be
     an array type.
     <type>anyenum</> is treated exactly the same as <type>anyelement</>,
     but adds the constraint that the actual type must
     be an enum type.
    </para>

    <para>
     Thus, when more than one argument position is declared with a polymorphic
     type, the net effect is that only certain combinations of actual argument
     types are allowed.  For example, a function declared as
     <literal>equal(anyelement, anyelement)</> will take any two input values,
     so long as they are of the same data type.
    </para>
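
    <para>
     As a sketch of this rule, the <function>equal</function> declaration
     could be given an <acronym>SQL</> body as follows.  The body is an
     assumption made for illustration; it requires the actual argument
     type to have an <literal>=</literal> operator.
<programlisting>
CREATE FUNCTION equal(anyelement, anyelement) RETURNS boolean AS $$
    SELECT $1 = $2;
$$ LANGUAGE SQL;

-- Accepted: both arguments are integer.
SELECT equal(1, 2);

-- Accepted: both arguments are text.
SELECT equal('a'::text, 'b'::text);

-- Rejected: integer and text cannot both be bound to anyelement.
SELECT equal(1, 'b'::text);
</programlisting>
    </para>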

    <para>
     When the return value of a function is declared as a polymorphic type,
     there must be at least one argument position that is also polymorphic,
     and the actual data type supplied as the argument determines the actual
     result type for that call.  For example, if there were not already
     an array subscripting mechanism, one could define a function that
     implements subscripting as <literal>subscript(anyarray, integer)
     returns anyelement</>.  This declaration constrains the actual first
     argument to be an array type, and allows the parser to infer the correct
     result type from the actual first argument's type.  Another example
     is that a function declared as <literal>f(anyarray) returns anyenum</>
     will only accept arrays of enum types.
    </para>
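
    <para>
     The hypothetical <function>subscript</function> function could be
     sketched in <acronym>SQL</> like this.  The body is an assumption
     that simply reuses the built-in subscripting syntax.
<programlisting>
CREATE FUNCTION subscript(anyarray, integer) RETURNS anyelement AS $$
    SELECT $1[$2];
$$ LANGUAGE SQL;

-- The argument is integer[], so the result type is resolved as integer.
SELECT subscript(ARRAY[10, 20, 30], 2);

-- The argument is text[], so the result type is resolved as text.
SELECT subscript(ARRAY['a', 'b']::text[], 1);
</programlisting>
    </para>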

    <para>
     Note that <type>anynonarray</> and <type>anyenum</> do not represent
     separate type variables; they are the same type as
     <type>anyelement</type>, just with an additional constraint.  For
     example, declaring a function as <literal>f(anyelement, anyenum)</>
     is equivalent to declaring it as <literal>f(anyenum, anyenum)</>:
     both actual arguments have to be the same enum type.
    </para>

    <para>
     A variadic function (one taking a variable number of arguments, as in
     <xref linkend="xfunc-sql-variadic-functions">) can be
     polymorphic: this is accomplished by declaring its last parameter as
     <literal>VARIADIC</> <type>anyarray</>.  For purposes of argument
     matching and determining the actual result type, such a function behaves
     the same as if you had written the appropriate number of
     <type>anynonarray</> parameters.
    </para>
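
    <para>
     A minimal sketch of a polymorphic variadic function follows.  The
     name <function>anyleast</function> and the choice of
     <function>min</function> as the operation are illustrative only.
<programlisting>
CREATE FUNCTION anyleast(VARIADIC anyarray) RETURNS anyelement AS $$
    SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
$$ LANGUAGE SQL;

-- Matched as if four parameters of one common type (integer) were declared.
SELECT anyleast(10, -1, 5, 4);

-- Here all arguments resolve to text.
SELECT anyleast('abc'::text, 'def');
</programlisting>
    </para>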
   </sect2>
  </sect1>

  &xfunc;
  &xaggr;
  &xtypes;
  &xoper;
  &xindex;

  <sect1 id="extend-Cpp">
   <title>Using C++ for Extensibility</title>

   <indexterm zone="extend-Cpp">
    <primary>C++</primary>
   </indexterm>

   <para>
    It is possible to use a compiler in C++ mode to build
    <productname>PostgreSQL</productname> extensions;  you must simply
    follow the standard methods for dynamically linking to C executables:

    <itemizedlist>
     <listitem>
      <para>
       Use <literal>extern "C"</> linkage for all functions that must
       be accessible by <function>dlopen()</>.  This is also necessary
       for any functions that might be passed as pointers between
       the backend and C++ code.
      </para>
     </listitem>
     <listitem>
      <para>
       Use <function>malloc()</> to allocate any memory that might be
       freed by the backend C code (do not pass memory allocated with
       <literal>new</>).
      </para>
     </listitem>
     <listitem>
      <para>
       Use <function>free()</> to free memory allocated by the backend
       C code (do not use <literal>delete</> for such cases).
      </para>
     </listitem>
     <listitem>
      <para>
       Prevent exceptions from propagating into the C code (use a
       catch-all block at the top level of all <literal>extern "C"</>
       functions).
      </para>
     </listitem>
    </itemizedlist>
   </para>

  </sect1>

 </chapter>