<!--
$PostgreSQL: pgsql/doc/src/sgml/advanced.sgml,v 1.46 2004/11/15 06:32:13 neilc Exp $
-->

 <chapter id="tutorial-advanced">
  <title>Advanced Features</title>

  <sect1 id="tutorial-advanced-intro">
   <title>Introduction</title>

   <para>
    The previous chapter covered the basics of using
    <acronym>SQL</acronym> to store and access your data in
    <productname>PostgreSQL</productname>.  We will now discuss some
    more advanced features of <acronym>SQL</acronym> that simplify
    management and prevent loss or corruption of your data.  Finally,
    we will look at some <productname>PostgreSQL</productname>
    extensions.
   </para>

   <para>
    This chapter occasionally refers to examples found in <xref
    linkend="tutorial-sql"> in order to change or improve them, so it
    helps to have read that chapter first.  Some examples from
    this chapter can also be found in
    <filename>advanced.sql</filename> in the tutorial directory.  This
    file also contains some example data to load, which is not
    repeated here.  (Refer to <xref linkend="tutorial-sql-intro"> for
    how to use the file.)
   </para>
  </sect1>


  <sect1 id="tutorial-views">
   <title>Views</title>

   <indexterm zone="tutorial-views">
    <primary>view</primary>
   </indexterm>

   <para>
    Refer back to the queries in <xref linkend="tutorial-join">.
    Suppose the combined listing of weather records and city location
    is of particular interest to your application, but you do not want
    to type the query each time you need it.  You can create a
    <firstterm>view</firstterm> over the query, which gives a name to
    the query that you can refer to like an ordinary table.

<programlisting>
CREATE VIEW myview AS
    SELECT city, temp_lo, temp_hi, prcp, date, location
        FROM weather, cities
        WHERE city = name;

SELECT * FROM myview;
</programlisting>
   </para>

   <para>
    Making liberal use of views is a key aspect of good SQL database
    design.  Views allow you to encapsulate the details of the
    structure of your tables, which may change as your application
    evolves, behind consistent interfaces.
   </para>

   <para>
    Views can be used in almost any place a real table can be used.
    Building views upon other views is not uncommon.
   </para>
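
   <para>
    As a minimal sketch of a view built on another view, the following
    defines a second view over <classname>myview</classname>.  The name
    <classname>warm_days</classname> and the threshold of 50 are
    assumptions chosen only for illustration:

<programlisting>
-- warm_days is a hypothetical name; it selects from the view myview
CREATE VIEW warm_days AS
    SELECT city, temp_hi, date
        FROM myview
        WHERE temp_hi &gt; 50;

SELECT * FROM warm_days;
</programlisting>
   </para>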
  </sect1>


  <sect1 id="tutorial-fk">
   <title>Foreign Keys</title>

   <indexterm zone="tutorial-fk">
    <primary>foreign key</primary>
   </indexterm>

   <indexterm zone="tutorial-fk">
    <primary>referential integrity</primary>
   </indexterm>

   <para>
    Recall the <classname>weather</classname> and
    <classname>cities</classname> tables from <xref
    linkend="tutorial-sql">.  Consider the following problem:  You
    want to make sure that no one can insert rows in the
    <classname>weather</classname> table that do not have a matching
    entry in the <classname>cities</classname> table.  This is called
    maintaining the <firstterm>referential integrity</firstterm> of
    your data.  In simplistic database systems this would be
    implemented (if at all) by first looking at the
    <classname>cities</classname> table to check if a matching record
    exists, and then inserting or rejecting the new
    <classname>weather</classname> records.  This approach has a
    number of problems and is very inconvenient, so
    <productname>PostgreSQL</productname> can do this for you.
   </para>

   <para>
    The new declaration of the tables would look like this:

<programlisting>
CREATE TABLE cities (
        city     varchar(80) primary key,
        location point
);

CREATE TABLE weather (
        city      varchar(80) references cities(city),
        temp_lo   int,
        temp_hi   int,
        prcp      real,
        date      date
);
</programlisting>

    Now try inserting an invalid record:

<programlisting>
INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28');
</programlisting>

<screen>
ERROR:  insert or update on table "weather" violates foreign key constraint "weather_city_fkey"
DETAIL:  Key (city)=(Berkeley) is not present in table "cities".
</screen>
   </para>
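
   <para>
    One way to make the insert succeed, sketched here under the
    assumption that Berkeley should be a known city, is to add the
    matching row to <classname>cities</classname> first.  The
    coordinates below are placeholders for illustration, not
    surveyed data:

<programlisting>
-- the referenced row must exist before a weather row can point to it
INSERT INTO cities VALUES ('Berkeley', '(-122.3, 37.9)');

-- the same insert as before now satisfies the foreign key constraint
INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28');
</programlisting>
   </para>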

   <para>
    The behavior of foreign keys can be finely tuned to your
    application.  We will not go beyond this simple example in this
    tutorial, but refer you to <xref linkend="ddl">
    for more information.  Making correct use of
    foreign keys will definitely improve the quality of your database
    applications, so you are strongly encouraged to learn about them.
   </para>
  </sect1>


  <sect1 id="tutorial-transactions">
   <title>Transactions</title>

   <indexterm zone="tutorial-transactions">
    <primary>transaction</primary>
   </indexterm>

   <para>
    <firstterm>Transactions</> are a fundamental concept of all database
    systems.  The essential point of a transaction is that it bundles
    multiple steps into a single, all-or-nothing operation.  The intermediate
    states between the steps are not visible to other concurrent transactions,
    and if some failure occurs that prevents the transaction from completing,
    then none of the steps affect the database at all.
   </para>

   <para>
    For example, consider a bank database that contains balances for various
    customer accounts, as well as total deposit balances for branches.
    Suppose that we want to record a payment of $100.00 from Alice's account
    to Bob's account.  Simplifying outrageously, the SQL commands for this
    might look like

<programlisting>
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
UPDATE branches SET balance = balance - 100.00
    WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Alice');
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
UPDATE branches SET balance = balance + 100.00
    WHERE name = (SELECT branch_name FROM accounts WHERE name = 'Bob');
</programlisting>
   </para>

   <para>
    The details of these commands are not important here; the important
    point is that there are several separate updates involved to accomplish
    this rather simple operation.  Our bank's officers will want to be
    assured that either all these updates happen, or none of them happen.
    It would certainly not do for a system failure to result in Bob
    receiving $100.00 that was not debited from Alice.  Nor would Alice long
    remain a happy customer if she were debited without Bob being credited.
    We need a guarantee that if something goes wrong partway through the
    operation, none of the steps executed so far will take effect.  Grouping
    the updates into a <firstterm>transaction</> gives us this guarantee.
    A transaction is said to be <firstterm>atomic</>: from the point of
    view of other transactions, it either happens completely or not at all.
   </para>

   <para>
    We also want a
    guarantee that once a transaction is completed and acknowledged by
    the database system, it has indeed been permanently recorded
    and won't be lost even if a crash ensues shortly thereafter.
    For example, if we are recording a cash withdrawal by Bob,
    we do not want any chance that the debit to his account will
    disappear in a crash just as he walks out the bank door.
    A transactional database guarantees that all the updates made by
    a transaction are logged in permanent storage (i.e., on disk) before
    the transaction is reported complete.
   </para>

   <para>
    Another important property of transactional databases is closely
    related to the notion of atomic updates: when multiple transactions
    are running concurrently, each one should not be able to see the
    incomplete changes made by others.  For example, if one transaction
    is busy totalling all the branch balances, it would not do for it
    to include the debit from Alice's branch but not the credit to
    Bob's branch, nor vice versa.  So transactions must be all-or-nothing
    not only in terms of their permanent effect on the database, but
    also in terms of their visibility as they happen.  The updates made
    so far by an open transaction are invisible to other transactions
    until the transaction completes, whereupon all the updates become
    visible simultaneously.
   </para>

   <para>
    In <productname>PostgreSQL</>, a transaction is set up by surrounding
    the SQL commands of the transaction with
    <command>BEGIN</> and <command>COMMIT</> commands.  So our banking
    transaction would actually look like

<programlisting>
BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
-- etc etc
COMMIT;
</programlisting>
   </para>

   <para>
    If, partway through the transaction, we decide we do not want to
    commit (perhaps we just noticed that Alice's balance went negative),
    we can issue the command <command>ROLLBACK</> instead of
    <command>COMMIT</>, and all our updates so far will be canceled.
   </para>
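
   <para>
    A minimal sketch of that situation follows.  The
    <command>SELECT</command> used to inspect the balance is an
    assumption about how one might notice the problem; it is not part
    of the original example:

<programlisting>
BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
-- inspect the result before deciding whether to commit
SELECT balance FROM accounts WHERE name = 'Alice';
-- suppose the balance shown is negative: cancel all updates so far
ROLLBACK;
</programlisting>
   </para>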

   <para>
    <productname>PostgreSQL</> actually treats every SQL statement as being
    executed within a transaction.  If you do not issue a <command>BEGIN</>
    command,
    then each individual statement has an implicit <command>BEGIN</> and
    (if successful) <command>COMMIT</> wrapped around it.  A group of
    statements surrounded by <command>BEGIN</> and <command>COMMIT</>
    is sometimes called a <firstterm>transaction block</>.
   </para>

   <note>
    <para>
     Some client libraries issue <command>BEGIN</> and <command>COMMIT</>
     commands automatically, so that you may get the effect of transaction
     blocks without asking.  Check the documentation for the interface
     you are using.
    </para>
   </note>

   <para>
    It's possible to control the statements in a transaction in a more
    granular fashion through the use of <firstterm>savepoints</>.  Savepoints
    allow you to selectively discard parts of the transaction, while
    committing the rest.  After defining a savepoint with
    <command>SAVEPOINT</>, you can if needed roll back to the savepoint
    with <command>ROLLBACK TO</>.  All the transaction's database changes
    between defining the savepoint and rolling back to it are discarded, but
    changes earlier than the savepoint are kept.
   </para> 

   <para>
    After rolling back to a savepoint, it continues to be defined, so you can
    roll back to it several times.  Conversely, if you are sure you won't need
    to roll back to a particular savepoint again, it can be released, so the
    system can free some resources.  Keep in mind that either releasing or
    rolling back to a savepoint
    will automatically release all savepoints that were defined after it.
   </para> 

   <para>
    All this is happening within the transaction block, so none of it
    is visible to other database sessions.  When and if you commit the
    transaction block, the committed actions become visible as a unit
    to other sessions, while the rolled-back actions never become visible
    at all.
   </para> 

   <para>
    Remembering the bank database, suppose we debit $100.00 from Alice's
    account, and credit Bob's account, only to find later that we should
    have credited Wally's account.  We could do it using savepoints like
    this:

<programlisting>
BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
SAVEPOINT my_savepoint;
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
-- oops ... forget that and use Wally's account
ROLLBACK TO my_savepoint;
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Wally';
COMMIT;
</programlisting>
   </para>

   <para>
    This example is, of course, oversimplified, but there's a lot of control
    to be had over a transaction block through the use of savepoints.
    Moreover, <command>ROLLBACK TO</> is the only way to regain control of a
    transaction block that was put in aborted state by the
    system due to an error, short of rolling it back completely and starting
    again.
   </para>
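
   <para>
    The following sketch illustrates recovering from an error inside a
    transaction block and then releasing the savepoint.  The savepoint
    name <literal>before_credit</literal> and the misspelled table name
    <literal>acounts</literal> are assumptions made up for this
    illustration:

<programlisting>
BEGIN;
UPDATE accounts SET balance = balance - 100.00
    WHERE name = 'Alice';
SAVEPOINT before_credit;
-- this statement fails because the table name is misspelled
UPDATE acounts SET balance = balance + 100.00
    WHERE name = 'Bob';
-- the block is now in aborted state; roll back to the savepoint
ROLLBACK TO before_credit;
UPDATE accounts SET balance = balance + 100.00
    WHERE name = 'Bob';
-- the savepoint is no longer needed, so release it
RELEASE SAVEPOINT before_credit;
COMMIT;
</programlisting>

    The debit from Alice's account happened before the savepoint, so it
    survives the <command>ROLLBACK TO</command> and is committed together
    with the corrected credit.
   </para>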

  </sect1>


  <sect1 id="tutorial-inheritance">
   <title>Inheritance</title>

   <indexterm zone="tutorial-inheritance">
    <primary>inheritance</primary>
   </indexterm>

   <para>
    Inheritance is a concept from object-oriented databases.  It opens
    up interesting new possibilities of database design.
   </para>

   <para>
    Let's create two tables:  A table <classname>cities</classname>
    and a table <classname>capitals</classname>.  Naturally, capitals
    are also cities, so you want some way to show the capitals
    implicitly when you list all cities.  If you're really clever you
    might invent some scheme like this:

<programlisting>
CREATE TABLE capitals (
  name       text,
  population real,
  altitude   int,    -- (in ft)
  state      char(2)
);

CREATE TABLE non_capitals (
  name       text,
  population real,
  altitude   int     -- (in ft)
);

CREATE VIEW cities AS
  SELECT name, population, altitude FROM capitals
    UNION
  SELECT name, population, altitude FROM non_capitals;
</programlisting>

    This works OK as far as querying goes, but it gets ugly when you
    need to update several rows, for one thing.
   </para>

   <para>
    A better solution is this:

<programlisting>
CREATE TABLE cities (
  name       text,
  population real,
  altitude   int     -- (in ft)
);

CREATE TABLE capitals (
  state      char(2)
) INHERITS (cities);
</programlisting>
   </para>

   <para>
    In this case, a row of <classname>capitals</classname>
    <firstterm>inherits</firstterm> all columns (<structfield>name</>,
    <structfield>population</>, and <structfield>altitude</>) from its
    <firstterm>parent</firstterm>, <classname>cities</classname>.  The
    type of the column <structfield>name</structfield> is
    <type>text</type>, a native <productname>PostgreSQL</productname>
    type for variable-length character strings.  State capitals have
    an extra column, <structfield>state</structfield>, that shows
    their state.  In
    <productname>PostgreSQL</productname>, a table can inherit from
    zero or more other tables.
   </para>
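
   <para>
    As a small sketch, a row inserted into
    <classname>capitals</classname> supplies values for the three
    inherited columns as well as <structfield>state</structfield>.
    The population figure here is only illustrative, and the statement
    assumes the example data has not already been loaded:

<programlisting>
-- name, population, and altitude are inherited from cities
INSERT INTO capitals (name, population, altitude, state)
    VALUES ('Madison', 191300, 845, 'WI');
</programlisting>

    Such a row is then also returned by queries on
    <classname>cities</classname>.
   </para>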

   <para>
    For example, the following query finds the names of all cities,
    including state capitals, that are located at an altitude
    over 500 ft.:

<programlisting>
SELECT name, altitude
  FROM cities
  WHERE altitude &gt; 500;
</programlisting>

    which returns:

<screen>
   name    | altitude
-----------+----------
 Las Vegas |     2174
 Mariposa  |     1953
 Madison   |      845
(3 rows)
</screen>
   </para>

   <para>
    On the other hand, the following query finds
    all the cities that are not state capitals and
    are situated at an altitude over 500 ft.:

<programlisting>
SELECT name, altitude
    FROM ONLY cities
    WHERE altitude &gt; 500;
</programlisting>

<screen>
   name    | altitude
-----------+----------
 Las Vegas |     2174
 Mariposa  |     1953
(2 rows)
</screen>
   </para>

   <para>
    Here the <literal>ONLY</literal> before <literal>cities</literal>
    indicates that the query should be run over only the
    <classname>cities</classname> table, and not tables below
    <classname>cities</classname> in the inheritance hierarchy.  Many
    of the commands that we have already discussed &mdash;
    <command>SELECT</command>, <command>UPDATE</command>, and
    <command>DELETE</command> &mdash; support this <literal>ONLY</literal>
    notation.
   </para>
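
   <para>
    For instance, a sketch of <literal>ONLY</literal> in an
    <command>UPDATE</command> might look like the following.  The
    5 percent adjustment is an arbitrary value chosen for illustration:

<programlisting>
-- changes rows stored in cities itself; rows in capitals are untouched
UPDATE ONLY cities SET population = population * 1.05
    WHERE altitude &gt; 500;
</programlisting>

    Without <literal>ONLY</literal>, the same command would also update
    the matching rows of <classname>capitals</classname>.
   </para>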

   <note>
    <para>
     Although inheritance is frequently useful, it has not been integrated
     with unique constraints or foreign keys, which limits its usefulness.
     See <xref linkend="ddl-inherit"> for more detail.
    </para>
   </note>
  </sect1>


  <sect1 id="tutorial-conclusion">
   <title>Conclusion</title>
 
   <para>
    <productname>PostgreSQL</productname> has many features not
    touched upon in this tutorial introduction, which has been
    oriented toward newer users of <acronym>SQL</acronym>.  These
    features are discussed in more detail in the remainder of this
    book.
   </para>

   <para>
    If you feel you need more introductory material, please visit the
    <ulink url="http://www.postgresql.org">PostgreSQL web
    site</ulink> for links to more resources.
   </para>
  </sect1>
 </chapter>

<!-- Keep this comment at the end of the file
Local variables:
mode:sgml
sgml-omittag:nil
sgml-shorttag:t
sgml-minimize-attributes:nil
sgml-always-quote-attributes:t
sgml-indent-step:1
sgml-indent-data:t
sgml-parent-document:nil
sgml-default-dtd-file:"./reference.ced"
sgml-exposed-tags:nil
sgml-local-catalogs:("/usr/lib/sgml/catalog")
sgml-local-ecat-files:nil
End:
-->