PL/pgSQL
             A procedural language for the PostgreSQL RDBMS

                      Jan Wieck <jwieck@debis.com>



    Preface

        PL/pgSQL  is  a procedural language based on SQL designed for
        the PostgreSQL database system.

        The extensibility features of PostgreSQL are mostly based  on
        the  ability  to  define  functions  for  various operations.
        Functions could have been written in PostgreSQL's SQL dialect
        or  in the C programming language. Functions written in C are
        compiled into a shared object  and  loaded  by  the  database
        backend  process  on  demand.   Also  the trigger features of
        PostgreSQL are based on functions but required the use of the
        C language.

        Since  version  6.3  PostgreSQL  supports  the  definition of
        procedural languages. In the case of a  function  or  trigger
        procedure  defined in a procedural language, the database has
        no builtin knowlege how to  interpret  the  functions  source
        text. Instead, the function and trigger calls are passed into
        a handler that  knows  the  details  of  the  language.   The
        handler  itself  is  a function compiled into a shared object
        and loaded on demand.


    Overview

        The PL/pgSQL language is case insensitive. All  keywords  and
        identifiers can be used in upper-/lowercase mixed.

        PL/pgSQL is a block oriented language. A block is defined as

            [<<label>>]
            [DECLARE
                -- declarations]
            BEGIN
                -- statements
            END;

        There  can  be  any  number  of  subblocks  in the statements
        section of a block. Subblocks can be used to  hide  variables
        from  outside a block of statements (see Scope and visability
        below). The variables declared in  the  declarations  section
        preceding  a  block  are  initialized to their default values
        every time the block is entered, not only once  per  function
        call.

        It is important not to misunderstand the meaning of BEGIN/END
        for grouping statements in PL/pgSQL and the database commands
        for  transaction  control.  Functions  or  trigger procedures
        cannot  start   or   commit   transactions   and   PostgreSQL
        transactions cannot have subtransactions.


    Comments

        There  are  two  types of comments in PL/pgSQL. A double dash
        '--' starts a comment that extends to the end of the line.  A
        '/*'  starts  a  block comment that extends to the next '*/'.
        Block comments cannot be nested, but double dash comments can
        be  enclosed  into a block comment and double dashes can hide
        '/*' and '*/'.


    Declarations

        All variables, rows and records  used  in  a  block  or  it's
        subblocks must be declared in the declarations section of the
        block except for the loop variable of a  FOR  loop  iterating
        over  a range of integer values.  The parameters given to the
        function  are   automatically   declared   with   the   usual
        identifiers $n.  The declarations have the following syntax:

            <name> [CONSTANT] <type> [NOT NULL]
                                     [DEFAULT | := <value>];

                Declares  a  variable  of  the specified type. If the
                variable is declared as CONSTANT, the value cannot be
                changed. If NOT NULL is specified, an assignment of a
                NULL value results in  a  runtime  error.  Since  the
                default  value  of  a variable is the SQL NULL value,
                all variables declared as NOT NULL must also  have  a
                default value.

                The default value is evaluated at the actual function
                call. So  assigning  'now'  to  an  abstime  variable
                causes  the  variable  to have the time of the actual
                function call, not when  the  function  was  compiled
                (during  it's  first  call  since the lifetime of the
                database connection).

            <name> <class>%ROWTYPE;

                Declares a row with the structure of the given class.
                Class  must  be an existing table- or viewname of the
                database. The fields of the row are accessed  in  the
                dot  notation.  Parameters  to  a  procedure could be
                tuple  types.  In   that   case   the   corresponding
                identifier  $n  will  be  a  rowtype.  Only  the user
                attributes of a tuple  are  accessible  in  the  row.
                There  must  be no whitespaces between the classname,
                the percent and the ROWTYPE keyword.

                The  fields  of  the  rowtype  inherit   the   tables
                fieldsizes  for  char()  etc.   data types (atttypmod
                from pg_attribute).

            <name> RECORD;

                Records are similar to rowtypes,  but  they  have  no
                predefined  structure They are used in selections and
                FOR loops to hold one actual database  tuple  from  a
                select operation. One and the same record can be used
                in different selections. Accessing  a  record  or  an
                attempt  to  assign  a  value  to a record field when
                there's no actual tuple in it results  in  a  runtime
                error.

                The  new  and old tuples in triggers are given to the
                trigger procedure  as  records.  This  is  necessary,
                because  under  PostgreSQL  one  and the same trigger
                procedure can handle  trigger  events  for  different
                tables.

            <name> ALIAS FOR $n;

                For  better  readability of the code it's possible to
                define an alias for a  positional  parameter  to  the
                function.

            RENAME <oldname> TO <newname>;

                Change  the  name  of  a variable, record or rowtype.
                This is useful if new or old should be referenced  by
                another name inside a trigger procedure.

    Datatypes

        The  type of a variable can be any of the existing data types
        of the database. <type> above is defined as:

                postgesql-basetype
            or  variable%TYPE
            or  rowtype.field%TYPE
            or  class.field%TYPE

        As for the rowtype declaration, there must be no  whitespaces
        between the classname, the percent and the TYPE keyword.

    Expressions

        All  expressions  used  in  PL/pgSQL statements are processed
        using the backends executor. Since even  a  constant  looking
        expression  can  have  a  totally  different  meaning  for  a
        particular data type (as 'now' for abstime), it is impossible
        for  the  PL/pgSQL  parser  to  identify real constant values
        other than the NULL keyword. The expressions are evaluated by
        internally executing a query

            SELECT <expr>

        over  the  SPI  manager.  In  the  expression,  occurences of
        variable identifiers are substituted by  parameters  and  the
        actual  values  from the variables are passed to the executor
        as query parameters. All the expressions used in  a  PL/pgSQL
        function are only prepared and saved once.

        If   record  fields  are  used  in  expressions  or  database
        statements, the data types of the fields  should  not  change
        between  calls  of  one and the same expression. Keep this in
        mind when writing trigger procedures that handle  events  for
        more than one table.

    Statements

        Anything not understood by the parser as specified below will
        be put into a query and sent down to the database  engine  to
        execute.   The  resulting  query  should  not return any data
        (insert, update, delete queries and all utility  statements).

        Assignment

            An  assignment  of a value to a variable or rowtype field
            is written as:

                <identifier> := <expr>;

            If the expressions result data  type  doesn't  match  the
            variables  data type, or the variables atttypmod value is
            known  (as  for  char(20)),  the  result  value  will  be
            implicitly  casted  by  the  PL/pgSQL  executor using the
            result  types  output-  and  the  variables  type  input-
            functions.   Note  that  this could potentially result in
            runtime errors generated by the types input functions.

            An assignment of a complete selection into  a  record  or
            rowtype can be done as:

                SELECT expressions INTO <target> FROM fromlist;

            Target  can  be  a record or rowtype variable, or a comma
            separated list of variables and record/row fields.

            If a rowtype or a variable list is used  as  target,  the
            selected  values  must exactly match the structure of the
            target(s) or a runtime error occurs.  The fromlist can be
            followed  by  any  valid qualification, grouping, sorting
            etc.

            There is a special condition [NOT] FOUND that can be used
            immediately  after a SELECT INTO to check if the data has
            been found.

                SELECT * INTO myrec FROM EMP WHERE empname = myname;
                IF NOT FOUND THEN
                    RAISE EXCEPTION 'employee % not found', myname;
                END IF;

            If the selection returns multiple rows, only the first is
            moved into the target fields. All others are discarded.


        Calling another function

            If  a function should be called, this is normally done by
            a SELECT query. But there are cases where  someone  isn't
            interested in the functions result.

                PERFORM querystring;

            executes  a 'SELECT querystring' over the SPI manager and
            discards the result.


        Returning from the function

                RETURN <expr>;

            The function terminates and the value of <expr>  will  be
            returned  to  the  upper executor.  The return value of a
            function cannot be undefined.  If control reaches the end
            of  the  toplevel block of the function without hitting a
            RETURN statement, a runtime error will occur.


        Aborting and messages

            As indicated above there is an RAISE statement  that  can
            throw messages into the PostgreSQL elog mechanism.

                RAISE level 'format' [, identifier [...]];

            Inside the format, % can be used as a placeholder for the
            following, comma separated identifiers.  The  identifiers
            must specify an existing variable or row/record field.


        Conditionals

                IF <expr> THEN
                    -- statements
                [ELSE
                    -- statements]
                END IF;

            The  expression  <expr> must return a value that at least
            can be casted into a boolean.


        Loops

            There are multiple types of loops.

                [<<label>>]
                LOOP
                    -- statements
                END LOOP;

            An unconditional loop that must be terminated  explicitly
            by  an  EXIT statement. The optional label can be used by
            EXIT statements of nested loops to specify which level of
            nesting should be terminated.

                [<<label>>]
                WHILE <expr> LOOP
                    -- statements
                END LOOP;

            A  conditional  loop  that  is  executed  as  long as the
            evaluation of <expr> returns true.

                [<<label>>]
                FOR <name> IN [REVERSE] <expr>..<expr> LOOP
                    -- statements
                END LOOP.

            A loop that iterates over a range of integer values.  The
            variable  <name> is automatically created as type integer
            and exists only inside  the  loop.  The  two  expressions
            giving  the  lower  and  upper  bound  of  the  range are
            evaluated only when entering the loop. The iteration step
            is 1.

                FOR <recname|rowname> IN <select_clause> LOOP
                    -- statements
                END LOOP;

            The record or row is assigned all the rows resulting from
            the select clause and the statements executed  for  each.
            If  the  loop  is  terminated with an EXIT statement, the
            last accessed row is still accessible in  the  record  or
            rowtype.

                EXIT [label] [WHEN <expr>];

            If  no  label given, the innermost loop is terminated and
            the statement following END LOOP  is  executed  next.  If
            label is given, it must be the label of the current or an
            upper level of nested loops or blocks.   Then  the  named
            loop  or  block  is terminated and control continues with
            the statement after the loops/blocks corresponding END.

    Trigger procedures

        PL/pgSQL can also be used to define trigger procedures.  They
        are  created  using  CREATE  FUNCTION  as  a function with no
        arguments and a return type of opaque.

        There are some PostgreSQL specific details in functions  used
        as trigger procedures.

        First  they  have  some  special  variables created above the
        toplevel statement block. These are:

            new (record)
                The new database tuple on INSERT/UPDATE operations at
                ROW level.

            old (record)
                The old database tuple on UPDATE/DELETE operations at
                ROW level.

            tg_name (type name)
                The triggers name from pg_trigger.

            tg_when (type text)
                A string of either 'BEFORE' or 'AFTER'  depending  on
                the triggers definition.

            tg_level (type text)
                A  string of either 'ROW' or 'STATEMENT' depending on
                the triggers definition.

            tg_op (type text)
                A string of 'INSERT', 'UPDATE'  or  'DELETE'  telling
                for which operation the trigger is actually fired.

            tg_relid (type oid)
                The  Oid  of  the  relation  for which the trigger is
                actually fired.

            tg_relname (type name)
                The relations name for which the trigger is  actually
                fired.

            tg_nargs (type integer)
                The   number   of  arguments  given  to  the  trigger
                procedure in the CREATE TRIGGER statement.

            tg_argv[] (types text)
                The arguments from the CREATE TRIGGER statement.  The
                index  counts  from 0 and can be given as expression.
                Invalid indices (< 0 or >= tg_nargs) result in a NULL
                value.

        Second,  they  must  return  either NULL, or a record/rowtype
        containing exactly the structure of the table the trigger was
        fired  for.  Triggers  fired  AFTER might allways return NULL
        with no effect. Triggers  fired  BEFORE  signal  the  trigger
        manager  to  skip  the  operation  for  this  actual row when
        returning  NULL.  Otherwise,  the   returned   record/rowtype
        replaces  the  inserted/updated tuple in the operation. It is
        possible to replace single values directly in new and  return
        that, or to build a complete new record/rowtype to return.

    Exceptions

        PostgreSQL  doesn't  have  a  very  smart  exception handling
        model. Whenever the  parser,  planner/optimizer  or  executor
        decide  that  a statement cannot be processed any longer, the
        whole transaction gets aborted and the the system jumps  back
        into  the mainloop using longjmp() to get the next query from
        the client application.

        It is possible to hook into the longjmp() mechanism to notice
        that this happens. But currently it's impossible to tell what
        really  caused  the  abort  (input/output  conversion  error,
        floating point error, parse error) And it's possible that the
        backend  is  in  an  inconsistent  state  at  this  point  so
        returning  to  the  upper  executor  or issuing more commands
        might corrupt the whole database.

        Thus,  the  only  thing  PL/pgSQL  currently  does  when   it
        encounters an abort during execution of a function or trigger
        procedure is to write  some  additional  DEBUG  log  messages
        telling  in which function and where (line number and type of
        statement) this happened.

        This might change in the future.