Commit 60ce9e92 authored by Peter Eisentraut

New blood and fresh air for tutorial

parent 5608f130
<!-- <!--
$Header: /cvsroot/pgsql/doc/src/sgml/advanced.sgml,v 1.21 2001/01/13 23:58:55 petere Exp $ $Header: /cvsroot/pgsql/doc/src/sgml/advanced.sgml,v 1.22 2001/09/02 23:27:49 petere Exp $
--> -->
<chapter id="advanced"> <chapter id="tutorial-advanced">
<title>Advanced <productname>Postgres</productname> <acronym>SQL</acronym> Features</title> <title>Advanced Features</title>
<para> <sect1 id="tutorial-advanced-intro">
Having covered the basics of using <title>Introduction</title>
<productname>Postgres</productname> <acronym>SQL</acronym> to
access your data, we will now discuss those features of <para>
<productname>Postgres</productname> that distinguish it from conventional data In the previous chapter we have covered the basics of using
managers. These features include inheritance, time <acronym>SQL</acronym> to store and access your data in
travel and non-atomic data values (array- and <productname>PostgreSQL</productname>. We will now discuss some
set-valued attributes). more advanced features of <acronym>SQL</acronym> that simplify the
Examples in this section can also be found in management and prevent loss or corruption of your data. Finally,
<filename>advance.sql</filename> in the tutorial directory. we will look at some <productname>PostgreSQL</productname>
(Refer to <xref linkend="QUERY"> for how to use it.) extensions.
</para> </para>
<sect1 id="inheritance"> <para>
This chapter will on occasion refer to examples found in <xref
linkend="tutorial-sql"> to change or improve them, so it will be
of advantage if you have read that chapter. Some examples from
this chapter can also be found in
<filename>advanced.sql</filename> in the tutorial directory. This
file also contains some example data to load, which is not
repeated here. (Refer to <xref linkend="tutorial-sql-intro"> for
how to use the file.)
</para>
</sect1>
<sect1 id="tutorial-views">
<title>Views</title>
<indexterm zone="tutorial-views">
<primary>view</primary>
</indexterm>
<para>
Refer back to the queries in <xref linkend="tutorial-join">.
Suppose the combined listing of weather records and city location
is of particular interest to your application, but you don't want
to type the query each time you need it. You can create a
<firstterm>view</firstterm> over the query, which gives a name to
the query that you can refer to like an ordinary table.
<programlisting>
CREATE VIEW myview AS
SELECT city, temp_lo, temp_hi, prcp, date, location
FROM weather, cities
WHERE city = name;
SELECT * FROM myview;
</programlisting>
</para>
<para>
Making liberal use of views is a key aspect of good SQL database
design. Views allow you to encapsulate the details of the
structure of your tables, which may change as your application
evolves, behind consistent interfaces.
</para>
<para>
Views can be used in almost any place a real table can be used.
Building views upon other views is not uncommon.
</para>
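<para>
As a minimal sketch of a view built upon another view, the following
defines a second view over <classname>myview</classname>. The name
<classname>warm_days</classname> and the threshold of 50 are
illustrative assumptions and do not appear in the tutorial files.
<programlisting>
-- warm_days is a hypothetical name chosen for this sketch.
CREATE VIEW warm_days AS
    SELECT city, date, temp_hi
    FROM myview
    WHERE temp_hi &gt; 50;

SELECT * FROM warm_days;
</programlisting>
The second view is queried exactly like the first, and like an
ordinary table.
</para>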
</sect1>
<sect1 id="tutorial-fk">
<title>Foreign Keys</title>
<indexterm zone="tutorial-fk">
<primary>foreign key</primary>
</indexterm>
<indexterm zone="tutorial-fk">
<primary>referential integrity</primary>
</indexterm>
<para>
Recall the <classname>weather</classname> and the
<classname>cities</classname> tables from <xref
linkend="tutorial-sql">. Consider the following problem: You
want to make sure that no one can insert rows in the
<classname>weather</classname> table that do not have a matching
entry in the <classname>cities</classname> table. This is called
maintaining the <firstterm>referential integrity</firstterm> of
your data. In simplistic database systems this would be
implemented (if at all) by first looking at the
<classname>cities</classname> table to check if a matching record
exists, and then inserting or rejecting the new
<classname>weather</classname> records. This approach has a
number of problems and is very inconvenient, so
<productname>PostgreSQL</productname> can do this for you.
</para>
<para>
The new declaration of the tables would look like this:
<programlisting>
CREATE TABLE cities (
name varchar(80) primary key,
location point
);
CREATE TABLE weather (
city varchar(80) references cities,
temp_lo int,
temp_hi int,
prcp real,
date date
);
</programlisting>
Now try inserting an invalid record:
<programlisting>
INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28');
</programlisting>
<screen>
ERROR: &lt;unnamed&gt; referential integrity violation - key referenced from weather not found in cities
</screen>
</para>
<para>
The behavior of foreign keys can be finely tuned to your
application. We will not go beyond this simple example in this
tutorial and refer you to the <citetitle>Reference
Manual</citetitle> for more information. Making correct use of
foreign keys will definitely improve the quality of your database
applications, so you are strongly encouraged to learn about them.
</para>
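<para>
For contrast with the rejected row above, here is a sketch of a
sequence that should be accepted, assuming the
<structfield>city</structfield> column of
<classname>weather</classname> references the
<classname>cities</classname> table: insert the city first, then
the weather record. The coordinates are placeholder values used only
for illustration.
<programlisting>
-- The referenced row must exist before the referencing row.
INSERT INTO cities VALUES ('Berkeley', '(-122.3, 37.9)');  -- placeholder coordinates
INSERT INTO weather VALUES ('Berkeley', 45, 53, 0.0, '1994-11-28');
</programlisting>
</para>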
</sect1>
<sect1 id="tutorial-transactions">
<title>Transactions</title>
<comment>This section needs to be written.</comment>
<para>
</para>
</sect1>
<sect1 id="tutorial-inheritance">
<title>Inheritance</title> <title>Inheritance</title>
<indexterm zone="tutorial-inheritance">
<primary>inheritance</primary>
</indexterm>
<para>
Inheritance is a concept from object-oriented databases. It opens
up interesting new possibilities of database design.
</para>
<para>
Let's create two tables: A table <classname>cities</classname>
and a table <classname>capitals</classname>. Naturally, capitals
are also cities, so you want some way to show the capitals
implicitly when you list all cities. If you're really clever you
might invent some scheme like this:
<programlisting>
CREATE TABLE capitals (
name text,
population real,
altitude int, -- (in ft)
state char(2)
);
CREATE TABLE non_capitals (
name text,
population real,
altitude int -- (in ft)
);
CREATE VIEW cities AS
SELECT name, population, altitude FROM capitals
UNION
SELECT name, population, altitude FROM non_capitals;
</programlisting>
This works OK as far as querying goes, but it gets ugly when you
need to update several rows, to name one thing.
</para>
<para> <para>
Let's create two tables. The capitals table contains A better solution is this:
state capitals that are also cities. Naturally, the
capitals table should inherit from cities.
<programlisting> <programlisting>
CREATE TABLE cities ( CREATE TABLE cities (
name text, name text,
population real, population real,
...@@ -36,245 +207,93 @@ CREATE TABLE cities ( ...@@ -36,245 +207,93 @@ CREATE TABLE cities (
CREATE TABLE capitals ( CREATE TABLE capitals (
state char(2) state char(2)
) INHERITS (cities); ) INHERITS (cities);
</programlisting> </programlisting>
In this case, a row of capitals <firstterm>inherits</firstterm> all In this case, a row of <classname>capitals</classname>
columns (name, population, and altitude) from its <firstterm>inherits</firstterm> all columns (<structfield>name</>,
parent, cities. The type of the column name is <structfield>population</>, and <structfield>altitude</>) from its
<type>text</type>, a native <productname>Postgres</productname> <firstterm>parent</firstterm>, <classname>cities</classname>. The
type for variable length type of the column <structfield>name</structfield> is
ASCII strings. The type of the column population is <type>text</type>, a native <productname>Postgres</productname>
<type>real</type>, a type for single precision type for variable length character strings. State capitals have
floating point numbers. State capitals have an extra an extra column, state, that shows their state. In
column, state, that shows their state. <productname>PostgreSQL</productname>, a table can inherit from
In <productname>Postgres</productname>, zero or more other tables.
a table can inherit from zero or more other tables,
and a query can reference either all rows of a
table or all rows of a tables plus all of its
descendants.
<note>
<para>
The inheritance hierarchy is a directed acyclic graph.
</para>
</note>
</para> </para>
<para> <para>
For example, the following query finds the names of all cities, For example, the following query finds the names of all cities,
including state capitals, that are located at an altitude including state capitals, that are located at an altitude
over 500ft: over 500 ft.:
<programlisting> <programlisting>
SELECT name, altitude SELECT name, altitude
FROM cities FROM cities
WHERE altitude &gt; 500; WHERE altitude &gt; 500;
</programlisting> </programlisting>
which returns: which returns:
<programlisting> <screen>
+----------+----------+ name | altitude
|name | altitude | -----------+----------
+----------+----------+ Las Vegas | 2174
|Las Vegas | 2174 | Mariposa | 1953
+----------+----------+ Madison | 845
|Mariposa | 1953 | (3 rows)
+----------+----------+ </screen>
|Madison | 845 |
+----------+----------+
</programlisting>
</para> </para>
<para> <para>
On the other hand, the following query finds On the other hand, the following query finds
all the cities that are not state capitals and all the cities that are not state capitals and
are situated at an altitude of 500ft or higher: are situated at an altitude over 500 ft.:
<programlisting> <programlisting>
SELECT name, altitude SELECT name, altitude
FROM ONLY cities FROM ONLY cities
WHERE altitude &gt; 500; WHERE altitude &gt; 500;
</programlisting>
+----------+----------+ <screen>
|name | altitude | name | altitude
+----------+----------+ -----------+----------
|Las Vegas | 2174 | Las Vegas | 2174
+----------+----------+ Mariposa | 1953
|Mariposa | 1953 | (2 rows)
+----------+----------+ </screen>
</programlisting>
</para> </para>
<para> <para>
Here the <quote>ONLY</quote> before cities indicates that the query should Here the <literal>ONLY</literal> before <literal>cities</literal>
be run over only the cities table, and not tables below cities in the indicates that the query should be run over only the
inheritance hierarchy. Many of the commands that we <classname>cities</classname> table, and not tables below
have already discussed -- <command>SELECT</command>, <classname>cities</classname> in the inheritance hierarchy. Many
<command>UPDATE</command> and <command>DELETE</command> -- of the commands that we have already discussed --
support this <quote>ONLY</quote> notation. <command>SELECT</command>, <command>UPDATE</command> and
<command>DELETE</command> -- support this <literal>ONLY</literal>
notation.
</para> </para>
<note>
<title>Deprecated</title>
<para>
In previous versions of <productname>Postgres</productname>, the
default was not to get access to child tables. This was found to
be error prone and is also in violation of SQL99. Under the old
syntax, to get the sub-tables you append "*" to the table name.
For example
<programlisting>
SELECT * from cities*;
</programlisting>
You can still explicitly specify scanning child tables by appending
"*", as well as explicitly specify not scanning child tables by
writing <quote>ONLY</quote>. But beginning in version 7.1, the default
behavior for an undecorated table name is to scan its child tables
too, whereas before the default was not to do so. To get the old
default behavior, set the configuration option
<literal>SQL_Inheritance</literal> to off, e.g.,
<programlisting>
SET SQL_Inheritance TO OFF;
</programlisting>
or add a line in your <filename>postgresql.conf</filename> file.
</para>
</note>
</sect1> </sect1>
<sect1 id="non-atomic-values">
<title>Non-Atomic Values</title>
<sect1 id="tutorial-conclusion">
<title>Conclusion</title>
<para> <para>
One of the tenets of the relational model is that the <productname>PostgreSQL</productname> has many features not
columns of a table are atomic. touched upon in this tutorial introduction, which has been
<productname>Postgres</productname> does not oriented toward newer users of <acronym>SQL</acronym>. These
have this restriction; columns can themselves contain features are discussed in more detail in both the
sub-values that can be accessed from the query <citetitle>User's Guide</citetitle> and the
language. For example, you can create columns that <citetitle>Programmer's Guide</citetitle>.
are arrays of base types.
</para> </para>
<sect2>
<title>Arrays</title>
<para>
<productname>Postgres</productname> allows columns of a
row to be defined
as fixed-length or variable-length multi-dimensional
arrays. Arrays of any base type or user-defined type
can be created. To illustrate their use, we first create a
table with arrays of base types.
<programlisting>
CREATE TABLE SAL_EMP (
name text,
pay_by_quarter integer[],
schedule text[][]
);
</programlisting>
</para>
<para>
The above query will create a table named SAL_EMP with
a <firstterm>text</firstterm> string (name), a one-dimensional
array of <firstterm>integer</firstterm>
(pay_by_quarter), which represents the employee's
salary by quarter and a two-dimensional array of
<firstterm>text</firstterm>
(schedule), which represents the employee's weekly
schedule. Now we do some <firstterm>INSERT</firstterm>s;
note that when
appending to an array, we enclose the values within
braces and separate them by commas. If you know
<firstterm>C</firstterm>,
this is not unlike the syntax for initializing structures.
<programlisting>
INSERT INTO SAL_EMP
VALUES ('Bill',
'{10000, 10000, 10000, 10000}',
'{{"meeting", "lunch"}, {}}');
INSERT INTO SAL_EMP
VALUES ('Carol',
'{20000, 25000, 25000, 25000}',
'{{"talk", "consult"}, {"meeting"}}');
</programlisting>
By default, <productname>Postgres</productname> uses the
"one-based" numbering
convention for arrays -- that is, an array of n elements
starts with array[1] and ends with array[n].
Now, we can run some queries on SAL_EMP. First, we
show how to access a single element of an array at a
time. This query retrieves the names of the employees
whose pay changed in the second quarter:
<programlisting>
SELECT name
FROM SAL_EMP
WHERE SAL_EMP.pay_by_quarter[1] &lt;&gt;
SAL_EMP.pay_by_quarter[2];
+------+
|name |
+------+
|Carol |
+------+
</programlisting>
</para>
<para>
This query retrieves the third quarter pay of all
employees:
<programlisting>
SELECT SAL_EMP.pay_by_quarter[3] FROM SAL_EMP;
+---------------+
|pay_by_quarter |
+---------------+
|10000 |
+---------------+
|25000 |
+---------------+
</programlisting>
</para>
<para>
We can also access arbitrary slices of an array (subarrays)
by specifying both lower and upper bounds for
each subscript. This query retrieves the first item on
Bill's schedule for the first two days of the week.
<programlisting>
SELECT SAL_EMP.schedule[1:2][1:1]
FROM SAL_EMP
WHERE SAL_EMP.name = 'Bill';
+-------------------+
|schedule |
+-------------------+
|{{"meeting"},{""}} |
+-------------------+
</programlisting>
</para>
</sect2>
</sect1>
<sect1 id="more-advanced">
<title>More Advanced Features</title>
<para> <para>
<productname>Postgres</productname> has many features not touched If you feel you need more introductory material, please visit the
upon in this <ulink url="http://www.postgresql.org">PostgreSQL web
tutorial introduction, which has been oriented toward newer users of site</ulink> for links to more resources.
<acronym>SQL</acronym>.
These are discussed in more detail in both the User's and
Programmer's Guides.
</para> </para>
</sect1> </sect1>
</chapter> </chapter>
......
<Chapter Id="arch">
<TITLE>Architecture</TITLE>
<Sect1 id="arch-concepts">
<Title><ProductName>Postgres</ProductName> Architectural Concepts</Title>
<Para>
Before we begin, you should understand the basic
<ProductName>Postgres</ProductName> system architecture. Understanding how the
parts of <ProductName>Postgres</ProductName> interact will make the next chapter
somewhat clearer.
In database jargon, <ProductName>Postgres</ProductName> uses a simple "process
per-user" client/server model. A <ProductName>Postgres</ProductName> session
consists of the following cooperating Unix processes (programs):
</Para>
<ItemizedList>
<ListItem>
<Para>
A supervisory daemon process (<Application>postmaster</Application>),
</Para>
</ListItem>
<ListItem>
<Para>
the user's frontend application (e.g., the <Application>psql</Application> program), and
</Para>
</ListItem>
<ListItem>
<Para>
the one or more backend database servers (the <Application>postgres</Application> process itself).
</Para>
</ListItem>
</ItemizedList>
<Para>
A single <Application>postmaster</Application> manages a given collection of
databases on a single host. Such a collection of
databases is called a cluster (of databases). Frontend
applications that wish to access a given database
within a cluster make calls to the library.
The library sends user requests over the network to the
<Application>postmaster</Application> (<XRef LinkEnd="ARCH-CLIENTSERVER">),
which in turn starts a new backend server process
<Figure Id="ARCH-CLIENTSERVER">
<Title>How a connection is established</Title>
<Graphic Align="center" FileRef="clientserver.gif" Format="GIF"></Graphic>
</Figure>
and connects the
frontend process to the new server. From
that point on, the frontend process and the backend
server communicate without intervention by the
<Application>postmaster</Application>. Hence, the <Application>postmaster</Application> is always running, waiting
for requests, whereas frontend and backend processes
come and go.
</Para>
<Para>
The <FileName>libpq</FileName> library allows a single
frontend to make multiple connections to backend processes.
However, the frontend application is still a
single-threaded process. Multithreaded frontend/backend
connections are not currently supported in <FileName>libpq</FileName>.
One implication of this architecture is that the
<Application>postmaster</Application> and the backend always run on the same
machine (the database server), while the frontend
application may run anywhere. You should keep this
in mind,
because the files that can be accessed on a client
machine may not be accessible (or may only be accessed
using a different filename) on the database server
machine.
</Para>
<Para>
You should also be aware that the <Application>postmaster</Application> and
postgres servers run with the user-id of the <ProductName>Postgres</ProductName>
"superuser." Note that the <ProductName>Postgres</ProductName> superuser does not
have to be a special user (e.g., a user named
"postgres"). Furthermore, the <ProductName>Postgres</ProductName> superuser
should
definitely not be the Unix superuser ("root")! In any
case, all files relating to a database should belong to
this <ProductName>Postgres</ProductName> superuser.
</Para>
</sect1>
</Chapter>
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.14 2001/08/26 21:17:12 tgl Exp $ --> <!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.15 2001/09/02 23:27:49 petere Exp $ -->
<!entity history SYSTEM "history.sgml"> <!entity history SYSTEM "history.sgml">
<!entity info SYSTEM "info.sgml"> <!entity info SYSTEM "info.sgml">
<!entity intro SYSTEM "intro.sgml">
<!entity legal SYSTEM "legal.sgml"> <!entity legal SYSTEM "legal.sgml">
<!entity notation SYSTEM "notation.sgml"> <!entity notation SYSTEM "notation.sgml">
<!entity problems SYSTEM "problems.sgml"> <!entity problems SYSTEM "problems.sgml">
...@@ -12,12 +13,12 @@ ...@@ -12,12 +13,12 @@
<!-- tutorial --> <!-- tutorial -->
<!entity advanced SYSTEM "advanced.sgml"> <!entity advanced SYSTEM "advanced.sgml">
<!entity arch SYSTEM "arch.sgml">
<!entity intro SYSTEM "intro.sgml">
<!entity query SYSTEM "query.sgml"> <!entity query SYSTEM "query.sgml">
<!entity sql SYSTEM "sql.sgml">
<!entity start SYSTEM "start.sgml"> <!entity start SYSTEM "start.sgml">
<!-- currently unused, but contains some interesting information -->
<!entity sql SYSTEM "sql.sgml">
<!-- user's guide --> <!-- user's guide -->
<!entity array SYSTEM "array.sgml"> <!entity array SYSTEM "array.sgml">
<!entity datatype SYSTEM "datatype.sgml"> <!entity datatype SYSTEM "datatype.sgml">
......
<!-- <!--
$Header: /cvsroot/pgsql/doc/src/sgml/info.sgml,v 1.11 2001/02/03 19:03:26 petere Exp $ $Header: /cvsroot/pgsql/doc/src/sgml/info.sgml,v 1.12 2001/09/02 23:27:49 petere Exp $
--> -->
<sect1 id="resources"> <sect1 id="resources">
...@@ -13,7 +13,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/info.sgml,v 1.11 2001/02/03 19:03:26 petere ...@@ -13,7 +13,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/info.sgml,v 1.11 2001/02/03 19:03:26 petere
<term>Tutorial</term> <term>Tutorial</term>
<listitem> <listitem>
<para> <para>
An introduction for new users. Does not cover advanced features. An informal introduction for new users
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
......
<!-- <!--
$Header: /cvsroot/pgsql/doc/src/sgml/query.sgml,v 1.17 2001/01/13 23:58:55 petere Exp $ $Header: /cvsroot/pgsql/doc/src/sgml/query.sgml,v 1.18 2001/09/02 23:27:49 petere Exp $
--> -->
<chapter id="query"> <chapter id="tutorial-sql">
<title>The Query Language</title> <title>The <acronym>SQL</acronym> Language</title>
<para> <sect1 id="tutorial-sql-intro">
The <productname>Postgres</productname> query language is a variant of <title>Introduction</title>
the <acronym>SQL</acronym> standard. It
has many extensions to <acronym>SQL</acronym> such as an <para>
extensible type system, This chapter provides an overview of how to use
inheritance, functions and production rules. These are <acronym>SQL</acronym> to perform simple operations. This
features carried over from the original tutorial is only intended to give you an introduction and is in no
<productname>Postgres</productname> query way a complete tutorial on <acronym>SQL</acronym>. Numerous books
language, <productname>PostQuel</productname>. have been written on <acronym>SQL92</acronym>, including <xref
This section provides an overview linkend="MELT93" endterm="MELT93"> and <xref linkend="DATE97"
of how to use <productname>Postgres</productname> endterm="DATE97">. You should be aware that some language
<acronym>SQL</acronym> to perform simple operations. features are extensions to the standard.
This manual is only intended to give you an idea of our </para>
flavor of <acronym>SQL</acronym> and is in no way a complete tutorial on
<acronym>SQL</acronym>. Numerous books have been written on <para>
<acronym>SQL92</acronym>, including In the examples that follow, we assume that you have created a
<xref linkend="MELT93" endterm="MELT93"> and database named <quote>mydb</quote>, as described in the previous
<xref linkend="DATE97" endterm="DATE97">. chapter, and have started <application>psql</application>.
You should be aware that some language features </para>
are extensions to the standard.
</para> <para>
Examples in this manual can also be found in source distribution
<sect1 id="query-psql"> in the directory <filename>src/tutorial/</filename>. Refer to the
<title>Interactive Monitor</title> <filename>README</filename> file in that directory for how to use
them. To start the tutorial, do the following:
<para>
In the examples that follow, we assume that you have
created the mydb database as described in the previous
subsection and have started <application>psql</application>.
Examples in this manual can also be found in source distribution
in the directory <filename>src/tutorial/</filename>. Refer to the
<filename>README</filename> file in that directory for how to use them. To
start the tutorial, do the following:
<screen> <screen>
<prompt>$</prompt> <userinput>cd <replaceable>...</replaceable>/src/tutorial</userinput> <prompt>$</prompt> <userinput>cd <replaceable>....</replaceable>/src/tutorial</userinput>
<prompt>$</prompt> <userinput>psql -s mydb</userinput> <prompt>$</prompt> <userinput>psql -s mydb</userinput>
<computeroutput> <computeroutput>
Welcome to the POSTGRESQL interactive sql monitor: ...
Please read the file COPYRIGHT for copyright terms of POSTGRESQL
type \? for help on slash commands
type \q to quit
type \g or terminate with semicolon to execute query
You are currently connected to the database: postgres
</computeroutput> </computeroutput>
<prompt>mydb=&gt;</prompt> <userinput>\i basics.sql</userinput> <prompt>mydb=&gt;</prompt> <userinput>\i basics.sql</userinput>
</screen> </screen>
The <literal>\i</literal> command reads in commands from the
specified files. The <literal>-s</literal> option puts you in
single step mode which pauses before sending a query to the
server. The commands used in this section are in the file
<filename>basics.sql</filename>.
</para> </para>
</sect1>
<sect1 id="tutorial-concepts">
<title>Concepts</title>
<para> <para>
The <literal>\i</literal> command read in queries from the specified <indexterm><primary>relational database</primary></indexterm>
files. The <literal>-s</literal> option puts you in single step mode which <indexterm><primary>hierarchical database</primary></indexterm>
pauses before sending a query to the backend. Queries <indexterm><primary>object-oriented database</primary></indexterm>
in this section are in the file <filename>basics.sql</filename>. <indexterm><primary>relation</primary></indexterm>
<indexterm><primary>table</primary></indexterm>
<productname>PostgreSQL</productname> is a <firstterm>relational
database management system</firstterm> (<acronym>RDBMS</acronym>).
That means it is a system for managing data stored in
<firstterm>relations</firstterm>. Relation is essentially a
mathematical term for <firstterm>table</firstterm>. The notion of
storing data in tables is so commonplace today that it might
seem inherently obvious, but there are a number of other ways of
organizing databases. Files and directories on Unix-like
operating systems form an example of a hierarchical database. A
more modern development is the object-oriented database.
</para> </para>
<para> <para>
<application>psql</application> <indexterm><primary>row</primary></indexterm>
has a variety of <literal>\d</literal> commands for showing system information. <indexterm><primary>column</primary></indexterm>
Consult these commands for more details;
for a listing, type <literal>\?</literal> at the <application>psql</application> prompt. Each table is a named collection of <firstterm>rows</firstterm>.
Each row has the same set of named <firstterm>columns</firstterm>,
and each column is of a specific data type. Whereas columns have
a fixed order in each row, it is important to remember that SQL
does not guarantee the order of the rows within the table in any
way (unless they are explicitly sorted).
</para> </para>
</sect1>
<sect1 id="query-concepts">
<title>Concepts</title>
<para> <para>
The fundamental notion in <productname>Postgres</productname> is <indexterm><primary>cluster</primary></indexterm>
that of a <firstterm>table</firstterm>, which is a named
collection of <firstterm>rows</firstterm>. Each row has the same Tables are grouped into databases, and a collection of databases
set of named <firstterm>columns</firstterm>, and each column is of managed by a single <productname>PostgreSQL</productname> server
a specific type. Furthermore, each row has a permanent instance constitutes a database <firstterm>cluster</firstterm>.
<firstterm>object identifier</firstterm> (<acronym>OID</acronym>)
that is unique throughout the database cluster. Historically,
tables have been called classes in
<productname>Postgres</productname>, rows are object instances,
and columns are attributes. This makes sense if you consider the
object-relational aspects of the database system, but in this
manual we will use the customary <acronym>SQL</acronym>
terminology. As previously discussed,
tables are grouped into databases, and a collection of databases
managed by a single <application>postmaster</application> process
constitutes a database cluster.
</para> </para>
</sect1> </sect1>
<sect1 id="query-table">
<sect1 id="tutorial-table">
<title>Creating a New Table</title> <title>Creating a New Table</title>
<indexterm zone="tutorial-table">
<primary>CREATE TABLE</primary>
</indexterm>
<para> <para>
You can create a new table by specifying the table You can create a new table by specifying the table
name, along with all column names and their types: name, along with all column names and their types:
...@@ -110,39 +114,82 @@ CREATE TABLE weather ( ...@@ -110,39 +114,82 @@ CREATE TABLE weather (
date date date date
); );
</programlisting> </programlisting>
You can enter this into <command>psql</command> with the line
breaks. <command>psql</command> will recognize that the command
is not terminated until the semicolon.
</para>
<para>
White space (i.e., spaces, tabs, and newlines) may be used freely
in SQL commands. That means you can type the command aligned
differently than above, or even all on one line. Two dashes
(<quote><literal>--</literal></quote>) introduce comments.
Whatever follows them is ignored up to the end of the line. SQL
is also case insensitive about key words and identifiers, except
when identifiers are double-quoted to preserve the case (not done
above).
</para>
<para>
<type>varchar(80)</type> specifies a data type that can store
arbitrary character strings up to 80 characters in length.
<type>int</type> is the normal integer type. <type>real</type> is
a type for storing single precision floating point numbers.
<type>date</type> should be self-explanatory. (Yes, the column of
type <type>date</type> is also named <literal>date</literal>.
This may be convenient or confusing -- you choose.)
</para> </para>
<para> <para>
Note that both keywords and identifiers are case-insensitive; <productname>PostgreSQL</productname> supports the usual
identifiers can preserve case by surrounding them with
double-quotes as allowed
by <acronym>SQL92</acronym>.
<productname>Postgres</productname> <acronym>SQL</acronym>
supports the usual
<acronym>SQL</acronym> types <type>int</type>, <acronym>SQL</acronym> types <type>int</type>,
<type>float</type>, <type>real</type>, <type>smallint</type>, <type>smallint</type>, <type>real</type>, <type>double
<type>char(N)</type>, precision</type>, <type>char(<replaceable>N</>)</type>,
<type>varchar(N)</type>, <type>date</type>, <type>time</type>, <type>varchar(<replaceable>N</>)</type>, <type>date</type>,
and <type>timestamp</type>, as well as other types of general utility and <type>time</type>, <type>timestamp</type>, and
a rich set of geometric types. As we will <type>interval</type> as well as other types of general utility
see later, <productname>Postgres</productname> can be customized and a rich set of geometric types.
with an <productname>PostgreSQL</productname> can be customized with an
arbitrary number of arbitrary number of user-defined data types. Consequently, type
user-defined data types. Consequently, type names are names are not syntactical keywords, except where required to
not syntactical keywords, except where required to support special support special cases in the <acronym>SQL</acronym> standard.
cases in the <acronym>SQL92</acronym> standard. </para>
So far, the <productname>Postgres</productname>
<command>CREATE</command> command <para>
looks exactly like The second example will store cities and their associated
the command used to create a table in a traditional geographical location:
relational system. However, we will presently see that <programlisting>
tables have properties that are extensions of the CREATE TABLE cities (
relational model. name varchar(80),
location point
);
</programlisting>
The <type>point</type> type is such a
<productname>PostgreSQL</productname>-specific data type.
</para>
<para>
<indexterm>
<primary>DROP TABLE</primary>
</indexterm>
Finally, it should be mentioned that if you don't need a table any
longer or want to recreate it differently you can remove it using
the following command:
<synopsis>
DROP TABLE <replaceable>tablename</replaceable>;
</synopsis>
</para> </para>
</sect1> </sect1>
<sect1 id="query-populate">
<title>Populating a Table with Rows</title> <sect1 id="tutorial-populate">
<title>Populating a Table With Rows</title>
<indexterm zone="tutorial-populate">
<primary>INSERT</primary>
</indexterm>
<para> <para>
The <command>INSERT</command> statement is used to populate a table with The <command>INSERT</command> statement is used to populate a table with
...@@ -151,129 +198,184 @@ CREATE TABLE weather ( ...@@ -151,129 +198,184 @@ CREATE TABLE weather (
<programlisting> <programlisting>
INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27'); INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');
</programlisting> </programlisting>
Note that all data types use rather obvious input formats. The
<type>date</type> column is actually quite flexible in what it
accepts, but for this tutorial we will stick to the unambiguous
format shown here.
</para> </para>
<para> <para>
You can also use <command>COPY</command> to load large The <type>point</type> type requires a coordinate pair as input,
amounts of data from flat (<acronym>ASCII</acronym>) files. as shown here:
This is usually faster because the data is read (or written) as a <programlisting>
single atomic INSERT INTO cities VALUES ('San Francisco', '(-194.0, 53.0)');
transaction directly to or from the target table. An example would be: </programlisting>
</para>
<para>
The syntax used so far requires you to remember the order of the
columns. An alternative syntax allows you to list the columns
explicitly:
<programlisting> <programlisting>
COPY weather FROM '/home/user/weather.txt' USING DELIMITERS '|'; INSERT INTO weather (city, temp_lo, temp_hi, prcp, date)
VALUES ('San Francisco', 43, 57, 0.0, '1994-11-29');
</programlisting>
You can also list the columns in a different order if you wish or
even omit some columns, e.g., unknown precipitation:
<programlisting>
INSERT INTO weather (date, city, temp_hi, temp_lo)
VALUES ('1994-11-29', 'Hayward', 54, 37);
</programlisting>
Many developers consider explicitly listing the columns better
style than relying on the order implicitly.
</para>
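<para>
As a small sketch of what omitting a column does: assuming the
<classname>weather</classname> table was created without column
defaults, the omitted <structfield>prcp</structfield> value is
stored as a null value, and the row can be found again with
<literal>IS NULL</literal>:
<programlisting>
-- Rows whose precipitation was left out of the INSERT
SELECT city, date FROM weather WHERE prcp IS NULL;
</programlisting>
With the data entered above, this should match only the Hayward
row.
</para>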
<para>
Please enter all the commands shown above so you have some data to
work with in the following sections.
</para>
<para>
<indexterm>
<primary>COPY</primary>
</indexterm>
You could also have used <command>COPY</command> to load large
amounts of data from flat text files. This is usually faster
because the <command>COPY</command> command is optimized for this
application while allowing less flexibility than
<command>INSERT</command>. An example would be:
<programlisting>
COPY weather FROM '/home/user/weather.txt';
</programlisting> </programlisting>
where the path name for the source file must be available to the where the path name for the source file must be available to the
backend server backend server machine, not the client, since the backend server
machine, not the client, since the backend server reads the file directly. reads the file directly. You can read more about the
<command>COPY</command> command in the <citetitle>Reference
Manual</citetitle>.
</para> </para>
</sect1> </sect1>
<sect1 id="query-query">
<sect1 id="tutorial-select">
<title>Querying a Table</title> <title>Querying a Table</title>
<para> <para>
The <classname>weather</classname> table can be queried with normal relational <indexterm><primary>query</primary></indexterm>
selection and projection queries. A <acronym>SQL</acronym> <indexterm><primary>SELECT</primary></indexterm>
<command>SELECT</command>
statement is used to do this. The statement is divided into To retrieve data from a table it is
a target list (the part that lists the columns to be <firstterm>queried</firstterm>. An <acronym>SQL</acronym>
returned) and a qualification (the part that specifies <command>SELECT</command> statement is used to do this. The
any restrictions). For example, to retrieve all the statement is divided into a select list (the part that lists the
rows of weather, type: columns to be returned), a table list (the part that lists the
tables from which to retrieve the data), and an optional
qualification (the part that specifies any restrictions). For
example, to retrieve all the rows of
<classname>weather</classname>, type:
<programlisting> <programlisting>
SELECT * FROM weather; SELECT * FROM weather;
</programlisting> </programlisting>
(where <literal>*</literal> means <quote>all columns</quote>) and
the output should be:
<screen>
city | temp_lo | temp_hi | prcp | date
---------------+---------+---------+------+------------
San Francisco | 46 | 50 | 0.25 | 1994-11-27
San Francisco | 43 | 57 | 0 | 1994-11-29
Hayward | 37 | 54 | | 1994-11-29
(3 rows)
</screen>
</para>
and the output should be: <para>
<programlisting> You may specify any arbitrary expressions in the target list. For
+--------------+---------+---------+------+------------+
|city | temp_lo | temp_hi | prcp | date |
+--------------+---------+---------+------+------------+
|San Francisco | 46 | 50 | 0.25 | 1994-11-27 |
+--------------+---------+---------+------+------------+
|San Francisco | 43 | 57 | 0 | 1994-11-29 |
+--------------+---------+---------+------+------------+
|Hayward | 37 | 54 | | 1994-11-29 |
+--------------+---------+---------+------+------------+
</programlisting>
You may specify any arbitrary expressions in the target list. For
example, you can do: example, you can do:
<programlisting> <programlisting>
SELECT city, (temp_hi+temp_lo)/2 AS temp_avg, date FROM weather; SELECT city, (temp_hi+temp_lo)/2 AS temp_avg, date FROM weather;
</programlisting> </programlisting>
This should give:
<screen>
city | temp_avg | date
---------------+----------+------------
San Francisco | 48 | 1994-11-27
San Francisco | 50 | 1994-11-29
Hayward | 45 | 1994-11-29
(3 rows)
</screen>
Notice how the <literal>AS</literal> clause is used to relabel the
output column. (It is optional.)
</para> </para>
<para> <para>
Arbitrary Boolean operators Arbitrary Boolean operators (<literal>AND</literal>,
(<command>AND</command>, <command>OR</command> and <literal>OR</literal>, and <literal>NOT</literal>) are allowed in
<command>NOT</command>) are the qualification of a query. For example, the following
allowed in the qualification of any query. For example, retrieves the weather of San Francisco on rainy days:
<programlisting> <programlisting>
SELECT * FROM weather SELECT * FROM weather
WHERE city = 'San Francisco' WHERE city = 'San Francisco'
AND prcp > 0.0; AND prcp > 0.0;
</programlisting> </programlisting>
results in: Result:
<programlisting> <screen>
+--------------+---------+---------+------+------------+ city | temp_lo | temp_hi | prcp | date
|city | temp_lo | temp_hi | prcp | date | ---------------+---------+---------+------+------------
+--------------+---------+---------+------+------------+ San Francisco | 46 | 50 | 0.25 | 1994-11-27
|San Francisco | 46 | 50 | 0.25 | 1994-11-27 | (1 row)
+--------------+---------+---------+------+------------+ </screen>
</programlisting>
</para> </para>
<para> <para>
As a final note, you can specify that the results of a <indexterm><primary>ORDER BY</primary></indexterm>
select can be returned in a <firstterm>sorted order</firstterm> <indexterm><primary>DISTINCT</primary></indexterm>
or with duplicate rows removed. <indexterm><primary>duplicate</primary></indexterm>
As a final note, you can request that the results of a select
be returned in sorted order or with duplicate rows removed. (Just
to make sure the following won't confuse you,
<literal>DISTINCT</literal> and <literal>ORDER BY</literal> can be
used separately.)
<programlisting> <programlisting>
SELECT DISTINCT city SELECT DISTINCT city
FROM weather FROM weather
ORDER BY city; ORDER BY city;
</programlisting> </programlisting>
</para>
</sect1>
<sect1 id="query-selectinto">
<title>Redirecting SELECT Queries</title>
<para>
Any <command>SELECT</command> query can be redirected to a new table
<programlisting>
SELECT * INTO TABLE temp FROM weather;
</programlisting>
</para>
<para> <screen>
This forms an implicit <command>CREATE</command> command, creating a new city
table temp with the column names and types specified ---------------
in the target list of the <command>SELECT INTO</command> command. We can Hayward
then, of course, perform any operations on the resulting San Francisco
table that we can perform on other tables. (2 rows)
</screen>
</para> </para>
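<para>
To illustrate that the two features are independent, here is a
minimal sketch using the same <classname>weather</classname>
table; the choice of sort columns is only an example:
<programlisting>
-- Sorting only: all rows are kept, ordered by city and then by low temperature
SELECT * FROM weather ORDER BY city, temp_lo;

-- Duplicate removal only: the order of the result rows is not guaranteed
SELECT DISTINCT city FROM weather;
</programlisting>
</para>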
</sect1> </sect1>
<sect1 id="query-join">
<sect1 id="tutorial-join">
<title>Joins Between Tables</title> <title>Joins Between Tables</title>
<indexterm zone="tutorial-join">
<primary>join</primary>
</indexterm>
<para> <para>
Thus far, our queries have only accessed one table at a Thus far, our queries have only accessed one table at a time.
time. Queries can access multiple tables at once, or Queries can access multiple tables at once, or access the same
access the same table in such a way that multiple table in such a way that multiple rows of the table are being
rows of the table are being processed at the same processed at the same time. A query that accesses multiple rows
time. A query that accesses multiple rows of the of the same or different tables at one time is called a
same or different tables at one time is called a join <firstterm>join</firstterm> query. As an example, say you wish to
query. list all the weather records together with the location of the
As an example, say we wish to find all the records that associated city. In effect, we need to compare the city column of
are in the temperature range of other records. In each row of the weather table with the name column of all rows in
effect, we need to compare the temp_lo and temp_hi the cities table.
columns of each WEATHER row to the temp_lo and
temp_hi columns of all other WEATHER columns.
<note> <note>
<para> <para>
This is only a conceptual model. The actual join may This is only a conceptual model. The actual join may
...@@ -281,102 +383,189 @@ SELECT * INTO TABLE temp FROM weather; ...@@ -281,102 +383,189 @@ SELECT * INTO TABLE temp FROM weather;
to the user. to the user.
</para> </para>
</note> </note>
This would be accomplished by the following query:
We can do this with the following query:
<programlisting> <programlisting>
SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high, SELECT *
W2.city, W2.temp_lo AS low, W2.temp_hi AS high FROM weather, cities
FROM weather W1, weather W2 WHERE city = name;
WHERE W1.temp_lo < W2.temp_lo </programlisting>
AND W1.temp_hi > W2.temp_hi;
+--------------+-----+------+---------------+-----+------+ <screen>
|city | low | high | city | low | high | city | temp_lo | temp_hi | prcp | date | name | location
+--------------+-----+------+---------------+-----+------+ ---------------+---------+---------+------+------------+---------------+-----------
|San Francisco | 43 | 57 | San Francisco | 46 | 50 | San Francisco | 46 | 50 | 0.25 | 1994-11-27 | San Francisco | (-194,53)
+--------------+-----+------+---------------+-----+------+ San Francisco | 43 | 57 | 0 | 1994-11-29 | San Francisco | (-194,53)
|San Francisco | 37 | 54 | San Francisco | 46 | 50 | (2 rows)
+--------------+-----+------+---------------+-----+------+ </screen>
</programlisting>
<note>
<para>
The semantics of such a join are
that the qualification
is a truth expression defined for the Cartesian product of
the tables indicated in the query. For those rows in
the Cartesian product for which the qualification is true,
<productname>Postgres</productname> computes and returns the
values specified in the target list.
<productname>Postgres</productname> <acronym>SQL</acronym>
does not assign any meaning to
duplicate values in such expressions.
This means that <productname>Postgres</productname>
sometimes recomputes the same target list several times;
this frequently happens when Boolean expressions are connected
with an "or". To remove such duplicates, you must use
the <command>SELECT DISTINCT</command> statement.
</para>
</note>
</para> </para>
<para> <para>
In this case, both <literal>W1</literal> and Observe two things about the result set:
<literal>W2</literal> are surrogates for a <itemizedlist>
row of the table weather, and both range over all <listitem>
rows of the table. (In the terminology of most <para>
database systems, <literal>W1</literal> and <literal>W2</literal> There is no result row for the city of Hayward. This is
are known as <firstterm>range variables</firstterm>.) because there is no matching entry in the
A query can contain an arbitrary number of <classname>cities</classname> table for Hayward, so the join
table names and surrogates. ignores the unmatched rows in the weather table. We will see
shortly how this can be fixed.
</para>
</listitem>
<listitem>
<para>
There are two columns containing the city name. This is
correct because the lists of columns of the
<classname>weather</classname> and the
<classname>cities</classname> tables are concatenated. In
practice this is undesirable, though, so you will probably want
to list the output columns explicitly rather than using
<literal>*</literal>:
<programlisting>
SELECT city, temp_lo, temp_hi, prcp, date, location
FROM weather, cities
WHERE city = name;
</programlisting>
</para>
</listitem>
</itemizedlist>
</para> </para>
</sect1>
<sect1 id="query-update"> <formalpara>
<title>Updates</title> <title>Exercise:</title>
<para>
Attempt to find out the semantics of this query when the
<literal>WHERE</literal> clause is omitted.
</para>
</formalpara>
<para> <para>
You can update existing rows using the Since the columns all had different names, the parser
<command>UPDATE</command> command. automatically found out which table they belong to, but it is good
Suppose you discover the temperature readings are style to fully qualify column names in join queries:
all off by 2 degrees as of Nov 28, you may update the
data as follow:
<programlisting> <programlisting>
UPDATE weather SELECT weather.city, weather.temp_lo, weather.temp_hi, weather.prcp, weather.date, cities.location
SET temp_hi = temp_hi - 2, temp_lo = temp_lo - 2 FROM weather, cities
WHERE date > '1994-11-28'; WHERE cities.name = weather.city;
</programlisting> </programlisting>
</para> </para>
</sect1>
<sect1 id="query-delete">
<title>Deletions</title>
<para> <para>
Deletions are performed using the <command>DELETE</command> command: Join queries of the kind seen thus far can also be written in this
alternative form:
<programlisting> <programlisting>
DELETE FROM weather WHERE city = 'Hayward'; SELECT *
FROM weather INNER JOIN cities ON (weather.city = cities.name);
</programlisting> </programlisting>
All weather recording belonging to Hayward are removed. This syntax is not as commonly used as the one above, but we show
One should be wary of queries of the form it here to help you understand the following topics.
</para>
<para>
<indexterm><primary>join</primary><secondary>outer</secondary></indexterm>
Now we will figure out how we can get the Hayward records back in.
What we want the query to do is to scan the
<classname>weather</classname> table and for each row to find the
matching <classname>cities</classname> row. If no matching row is
found we want some <quote>empty values</quote> to be substituted
for the <classname>cities</classname> table's columns. This kind
of query is called an <firstterm>outer join</firstterm>. (The
joins we have seen so far are inner joins.) The command looks
like this:
<programlisting> <programlisting>
DELETE FROM <replaceable>tablename</replaceable>; SELECT *
FROM weather LEFT OUTER JOIN cities ON (weather.city = cities.name);
city | temp_lo | temp_hi | prcp | date | name | location
---------------+---------+---------+------+------------+---------------+-----------
Hayward | 37 | 54 | | 1994-11-29 | |
San Francisco | 46 | 50 | 0.25 | 1994-11-27 | San Francisco | (-194,53)
San Francisco | 43 | 57 | 0 | 1994-11-29 | San Francisco | (-194,53)
(3 rows)
</programlisting> </programlisting>
Without a qualification, <command>DELETE</command> will simply In particular, this query is a <firstterm>left outer
remove all rows from the given table, leaving it join</firstterm> because the table mentioned on the left of the
empty. The system will not request confirmation before join operator will have each of its rows in the output at least
doing this. once, whereas the table on the right will only have those rows
output that match some row of the left table, and will have empty
values substituted appropriately.
</para>
<formalpara>
<title>Exercise:</title>
<para>
There are also right outer joins and full outer joins. Try to
find out what those do.
</para>
</formalpara>
<para>
<indexterm><primary>join</primary><secondary>self</secondary></indexterm>
<indexterm><primary>alias</primary><secondary>for table name in query</secondary></indexterm>
We can also join a table against itself. This is called a
<firstterm>self join</firstterm>. As an example, suppose we wish
to find all the weather records that are in the temperature range
of other weather records. So we need to compare the
<structfield>temp_lo</> and <structfield>temp_hi</> columns of
each <classname>weather</classname> row to the
<structfield>temp_lo</structfield> and
<structfield>temp_hi</structfield> columns of all other
<classname>weather</classname> rows. We can do this with the
following query:
<programlisting>
SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high,
W2.city, W2.temp_lo AS low, W2.temp_hi AS high
FROM weather W1, weather W2
WHERE W1.temp_lo < W2.temp_lo
AND W1.temp_hi > W2.temp_hi;
city | low | high | city | low | high
---------------+-----+------+---------------+-----+------
San Francisco | 43 | 57 | San Francisco | 46 | 50
Hayward | 37 | 54 | San Francisco | 46 | 50
(2 rows)
</programlisting>
Here we have relabeled the weather table as <literal>W1</> and
<literal>W2</> to be able to distinguish the left and right side
of the join. You can also use these kinds of aliases in other
queries to save some typing, e.g.:
<programlisting>
SELECT *
FROM weather w, cities c
WHERE w.city = c.name;
</programlisting>
You will encounter this style of abbreviating quite frequently.
</para> </para>
</sect1> </sect1>
<sect1 id="query-agg">
<title>Using Aggregate Functions</title> <sect1 id="tutorial-agg">
<title>Aggregate Functions</title>
<indexterm zone="tutorial-agg">
<primary>aggregate</primary>
</indexterm>
<para> <para>
<indexterm><primary>average</primary></indexterm>
<indexterm><primary>count</primary></indexterm>
<indexterm><primary>max</primary></indexterm>
<indexterm><primary>min</primary></indexterm>
<indexterm><primary>sum</primary></indexterm>
Like most other relational database products, Like most other relational database products,
<productname>PostgreSQL</productname> supports <productname>PostgreSQL</productname> supports
aggregate functions. aggregate functions.
...@@ -387,95 +576,215 @@ DELETE FROM <replaceable>tablename</replaceable>; ...@@ -387,95 +576,215 @@ DELETE FROM <replaceable>tablename</replaceable>;
<function>min</function> (minimum) over a set of rows. <function>min</function> (minimum) over a set of rows.
</para> </para>
<para>
It is important to understand the interaction between aggregates and
SQL's <command>WHERE</command> and <command>HAVING</command> clauses.
The fundamental difference between <command>WHERE</command> and
<command>HAVING</command> is this: <command>WHERE</command> selects
input rows before groups and aggregates are computed (thus, it controls
which rows go into the aggregate computation), whereas
<command>HAVING</command> selects group rows after groups and
aggregates are computed. Thus, the
<command>WHERE</command> clause may not contain aggregate functions;
it makes no sense to try to use an aggregate to determine which rows
will be inputs to the aggregates. On the other hand,
<command>HAVING</command> clauses always contain aggregate functions.
(Strictly speaking, you are allowed to write a <command>HAVING</command>
clause that doesn't use aggregates, but it's wasteful; the same condition
could be used more efficiently at the <command>WHERE</command> stage.)
</para>
<para> <para>
As an example, we can find the highest low-temperature reading anywhere As an example, we can find the highest low-temperature reading anywhere
with with
<programlisting> <programlisting>
SELECT max(temp_lo) FROM weather; SELECT max(temp_lo) FROM weather;
</programlisting> </programlisting>
<screen>
max
-----
46
(1 row)
</screen>
</para>
<para>
<indexterm><primary>subquery</primary></indexterm>
If we want to know what city (or cities) that reading occurred in, If we want to know what city (or cities) that reading occurred in,
we might try we might try
<programlisting> <programlisting>
SELECT city FROM weather WHERE temp_lo = max(temp_lo); SELECT city FROM weather WHERE temp_lo = max(temp_lo); <lineannotation>WRONG</lineannotation>
</programlisting> </programlisting>
but this will not work since the aggregate but this will not work since the aggregate
<function>max</function> can't be used in <function>max</function> cannot be used in the
<command>WHERE</command>. However, as is often the case the query can be <literal>WHERE</literal> clause. However, as is often the case
restated to accomplish the intended result; here by using a the query can be restated to accomplish the intended result; here
<firstterm>subselect</firstterm>: by using a <firstterm>subquery</firstterm>:
<programlisting> <programlisting>
SELECT city FROM weather SELECT city FROM weather
WHERE temp_lo = (SELECT max(temp_lo) FROM weather); WHERE temp_lo = (SELECT max(temp_lo) FROM weather);
</programlisting> </programlisting>
<screen>
city
---------------
San Francisco
(1 row)
</screen>
This is OK because the sub-select is an independent computation that This is OK because the sub-select is an independent computation
computes its own aggregate separately from what's happening in the outer that computes its own aggregate separately from what is happening
select. in the outer select.
</para> </para>
<para> <para>
Aggregates are also very useful in combination with <indexterm><primary>GROUP BY</primary></indexterm>
<command>GROUP BY</command> clauses. For example, we can get the <indexterm><primary>HAVING</primary></indexterm>
maximum low temperature observed in each city with
Aggregates are also very useful in combination with <literal>GROUP
BY</literal> clauses. For example, we can get the maximum low
temperature observed in each city with
<programlisting> <programlisting>
SELECT city, max(temp_lo) SELECT city, max(temp_lo)
FROM weather FROM weather
GROUP BY city; GROUP BY city;
</programlisting> </programlisting>
<screen>
city | max
---------------+-----
Hayward | 37
San Francisco | 46
(2 rows)
</screen>
which gives us one output row per city. We can filter these grouped which gives us one output row per city. We can filter these grouped
rows using <command>HAVING</command>: rows using <literal>HAVING</literal>:
<programlisting> <programlisting>
SELECT city, max(temp_lo) SELECT city, max(temp_lo)
FROM weather FROM weather
GROUP BY city GROUP BY city
HAVING min(temp_lo) < 0; HAVING max(temp_lo) < 40;
</programlisting> </programlisting>
<screen>
city | max
---------+-----
Hayward | 37
(1 row)
</screen>
which gives us the same results for only the cities that have some which gives us the same results for only the cities that have all
below-zero readings. Finally, if we only care about cities whose <literal>temp_lo</literal> values below 40. Finally, if we only care about cities whose
names begin with "<literal>P</literal>", we might do names begin with <quote><literal>S</literal></quote>, we might do
<programlisting> <programlisting>
SELECT city, max(temp_lo) SELECT city, max(temp_lo)
FROM weather FROM weather
WHERE city like 'P%' WHERE city LIKE 'S%'
GROUP BY city GROUP BY city
HAVING min(temp_lo) < 0; HAVING max(temp_lo) < 40;
</programlisting> </programlisting>
</para>
Note that we can apply the city-name restriction in <para>
<command>WHERE</command>, since it needs no aggregate. This is It is important to understand the interaction between aggregates and
more efficient than adding the restriction to <command>HAVING</command>, SQL's <literal>WHERE</literal> and <literal>HAVING</literal> clauses.
The fundamental difference between <literal>WHERE</literal> and
<literal>HAVING</literal> is this: <literal>WHERE</literal> selects
input rows before groups and aggregates are computed (thus, it controls
which rows go into the aggregate computation), whereas
<literal>HAVING</literal> selects group rows after groups and
aggregates are computed. Thus, the
<literal>WHERE</literal> clause must not contain aggregate functions;
it makes no sense to try to use an aggregate to determine which rows
will be inputs to the aggregates. On the other hand,
<literal>HAVING</literal> clauses almost always contain aggregate functions.
(Strictly speaking, you are allowed to write a <literal>HAVING</literal>
clause that doesn't use aggregates, but it's wasteful; the same condition
could be used more efficiently at the <literal>WHERE</literal> stage.)
</para>
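<para>
The parenthetical remark can be sketched with two queries that
should return the same rows for the example data. The first
filters the groups after aggregation, the second filters the
input rows beforehand:
<programlisting>
-- Allowed, but every city is grouped and aggregated before filtering
SELECT city, max(temp_lo)
    FROM weather
    GROUP BY city
    HAVING city LIKE 'S%';

-- Preferred: rows for other cities never reach the aggregate
SELECT city, max(temp_lo)
    FROM weather
    WHERE city LIKE 'S%'
    GROUP BY city;
</programlisting>
</para>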
<para>
Note that we can apply the city name restriction in
<literal>WHERE</literal>, since it needs no aggregate. This is
more efficient than adding the restriction to <literal>HAVING</literal>,
because we avoid doing the grouping and aggregate calculations because we avoid doing the grouping and aggregate calculations
for all rows that fail the <command>WHERE</command> check. for all rows that fail the <literal>WHERE</literal> check.
</para>
</sect1>
<sect1 id="tutorial-update">
<title>Updates</title>
<indexterm zone="tutorial-update">
<primary>UPDATE</primary>
</indexterm>
<para>
You can update existing rows using the
<command>UPDATE</command> command.
Suppose you discover that the temperature readings are
all off by 2 degrees after November 28. You can correct the
data as follows:
<programlisting>
UPDATE weather
SET temp_hi = temp_hi - 2, temp_lo = temp_lo - 2
WHERE date > '1994-11-28';
</programlisting>
</para>
<para>
Look at the new state of the data:
<programlisting>
SELECT * FROM weather;
city | temp_lo | temp_hi | prcp | date
---------------+---------+---------+------+------------
San Francisco | 46 | 50 | 0.25 | 1994-11-27
San Francisco | 41 | 55 | 0 | 1994-11-29
Hayward | 35 | 52 | | 1994-11-29
(3 rows)
</programlisting>
</para> </para>
</sect1> </sect1>
<sect1 id="tutorial-delete">
<title>Deletions</title>
<indexterm zone="tutorial-delete">
<primary>DELETE</primary>
</indexterm>
<para>
Suppose you are no longer interested in the weather of Hayward.
Then you can delete those rows from the table.
Deletions are performed using the <command>DELETE</command>
command:
<programlisting>
DELETE FROM weather WHERE city = 'Hayward';
</programlisting>
All weather records belonging to Hayward are removed.
<programlisting>
SELECT * FROM weather;
</programlisting>
<screen>
city | temp_lo | temp_hi | prcp | date
---------------+---------+---------+------+------------
San Francisco | 46 | 50 | 0.25 | 1994-11-27
San Francisco | 41 | 55 | 0 | 1994-11-29
(2 rows)
</screen>
</para>
<para>
One should be wary of queries of the form
<synopsis>
DELETE FROM <replaceable>tablename</replaceable>;
</synopsis>
Without a qualification, <command>DELETE</command> will simply
remove all rows from the given table, leaving it
empty. The system will not request confirmation before
doing this.
</para>
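<para>
One cautious habit, shown here only as a suggestion, is to run a
<command>SELECT</command> with the same qualification first and
check which rows it returns before deleting them:
<programlisting>
-- Preview the rows that the DELETE would remove
SELECT * FROM weather WHERE city = 'Hayward';

-- If the preview looks right, delete with the identical WHERE clause
DELETE FROM weather WHERE city = 'Hayward';
</programlisting>
</para>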
</sect1>
</chapter> </chapter>
<!-- Keep this comment at the end of the file <!-- Keep this comment at the end of the file
......
<!-- <!--
$Header: /cvsroot/pgsql/doc/src/sgml/start.sgml,v 1.16 2001/07/15 13:45:03 petere Exp $ $Header: /cvsroot/pgsql/doc/src/sgml/start.sgml,v 1.17 2001/09/02 23:27:49 petere Exp $
--> -->
<chapter id="start"> <chapter id="tutorial-start">
<title>Getting Started</title> <title>Getting Started</title>
<abstract> <sect1 id="tutorial-install">
<title>Installation</title>
<para>
Before you can use <productname>PostgreSQL</productname> you need
to install it, of course. It is possible that
<productname>PostgreSQL</productname> is already installed at your
site, either because it was included in your operating system
distribution or because the system administrator already installed
it. If that is the case, you should obtain information from the
operating system documentation or your system administrator about
how to access <productname>PostgreSQL</productname>.
</para>
<para> <para>
How to begin work with <productname>Postgres</productname> for a new user. If you are not sure whether <productname>PostgreSQL</productname>
is already available or whether you can use it for your
experimentation then you can install it yourself. Doing so is not
hard and it can be a good exercise.
<productname>PostgreSQL</productname> can be installed by any
unprivileged user; no superuser (<systemitem>root</systemitem>)
access is required.
</para> </para>
</abstract>
<para>
Some of the steps required to use <productname>Postgres</productname>
can be performed by any Postgres user, and some must be done by
the site database administrator. This site administrator
is the person who installed the software, created
the database directories and started the
<application>postmaster</application>
process. This person does not have to be the Unix
superuser ("root")
or the computer system administrator; a person can install and use
<productname>Postgres</productname> without any special accounts or
privileges.
</para>
<para>
If you are installing <productname>Postgres</productname> yourself, then
refer to the Administrator's Guide for instructions on
installation, and return
to this guide when the installation is complete.
</para>
<para>
Throughout this manual, any examples that begin with
the character "<literal>%</literal>" are commands that should be typed
at the Unix shell prompt. Examples that begin with the
character "<literal>*</literal>" are commands in the Postgres query
language, Postgres <acronym>SQL</acronym>.
</para>
<sect1 id="start-env">
<title>Setting Up Your Environment</title>
<para> <para>
This section discusses how to set up If you are installing <productname>PostgreSQL</productname>
your own environment so that you can use frontend yourself, then refer to the <citetitle>Administrator's
applications. We assume <productname>Postgres</productname> has Guide</citetitle> for instructions on installation, and return to
already been this guide when the installation is complete. Be sure to follow
successfully installed and started; refer to the Administrator's Guide closely the section about setting up the appropriate environment
and the installation notes variables.
for how to install Postgres.
</para> </para>
<para> <para>
<productname>Postgres</productname> is a client/server If your site administrator has not set things up in the default
application. As a user, way, you may have some more work to do. For example, if the
you only need access to the client portions of the installation database server machine is a remote machine, you will need to set
(an example the <envar>PGHOST</envar> environment variable to the name of the
of a client application is the interactive monitor database server machine. The environment variable
<application>psql</application>). <envar>PGPORT</envar> may also have to be set. The bottom line is
For simplicity, this: if you try to start an application program and it complains
we will assume that <productname>Postgres</productname> has been that it cannot connect to the database, you should consult your
installed in the site administrator or, if that is you, the documentation to make
directory <filename>/usr/local/pgsql</filename>. Therefore, wherever sure that your environment is properly set up. If you did not
you see the directory <filename>/usr/local/pgsql</filename> you should understand the preceding paragraph then read the next section.
substitute the name of the directory where
<productname>Postgres</productname> is
actually installed.
All <productname>Postgres</productname> commands are installed in
the directory
<filename>/usr/local/pgsql/bin</filename>. Therefore, you should add
this directory to your shell command path. If you use
a variant of the Berkeley C shell, such as csh or tcsh,
you would add
<programlisting>
% set path = ( /usr/local/pgsql/bin $path )
</programlisting>
in the <filename>.login</filename> file in your home directory.
If you use
a variant of the Bourne shell, such as sh, ksh, or
bash, then you would add
<programlisting>
% PATH=/usr/local/pgsql/bin:$PATH
% export PATH
</programlisting>
to the .profile file in your home directory.
From now on, we will assume that you have added the
<productname>Postgres</productname> bin directory to your path.
In addition, we
will make frequent reference to <quote>setting a shell
variable</quote> or <quote>setting an environment
variable</quote> throughout
this document. If you did not fully understand the
last paragraph on modifying your search path, you
should consult the Unix manual pages that describe your
shell before going any further.
</para> </para>
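<para>
As a rough illustration of the environment setup described above, the
following Bourne-shell sketch sets both connection variables for the
current session. The host name db.example.com and the port 5432 are
placeholder assumptions, not values taken from this guide; substitute
whatever your site administrator gives you.
</para>

```shell
# Placeholder values (assumptions): use the ones for your own site.
PGHOST=db.example.com
PGPORT=5432
export PGHOST PGPORT

# Client programs started from this shell inherit both settings.
echo "PGHOST=$PGHOST PGPORT=$PGPORT"
```

<para>
Putting the same lines into your shell start-up file would make the
settings persistent across sessions.
</para>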
</sect1>
<sect1 id="tutorial-arch">
<title>Architectural Fundamentals</title>
<para> <para>
If your site administrator has not set things up in the Before we proceed, you should understand the basic
default way, you may have some more work to do. For example, if <productname>PostgreSQL</productname> system architecture.
the database Understanding how the parts of
server machine is a remote machine, you <productname>PostgreSQL</productname> interact will make the next
will need to set the <acronym>PGHOST</acronym> environment chapter somewhat clearer.
variable to the name
of the database server machine. The environment variable
<acronym>PGPORT</acronym> may also have to be set.
The bottom line is this: if
you try to start an application program and it complains
that it cannot connect to the <application>postmaster</application>,
you should immediately consult your site administrator to make
sure that your
environment is properly set up.
</para> </para>
<para>
In database jargon, <productname>PostgreSQL</productname> uses a
client/server model. A <productname>PostgreSQL</productname>
session consists of the following cooperating processes
(programs):
<itemizedlist>
<listitem>
<para>
A server process, which manages the database files, accepts
connections to the database from client applications, and
performs actions on the database on behalf of the clients. The
database server program is called
<filename>postmaster</filename>.
<indexterm><primary>postmaster</primary></indexterm>
</para>
</listitem>
<listitem>
<para>
The user's client (frontend) application that wants to perform
database operations. Client applications can be very diverse
in nature: a client could be a text-oriented tool, a graphical
application, a web server that accesses the database to
display web pages, or a specialized database maintenance tool.
Some client applications are supplied with the
<productname>PostgreSQL</productname> distribution; most are
developed by users.
</para>
</listitem>
</itemizedlist>
</para>
<para>
As is typical of client/server applications, the client and the
server can be on different hosts. In that case they communicate
over a TCP/IP network connection. You should keep this in mind,
because the files that can be accessed on a client machine might
not be accessible (or might only be accessible under a different
file name) on the database server machine.
</para>
<para>
The <productname>PostgreSQL</productname> server can handle
multiple concurrent connections from clients. For that purpose it
starts (<quote>forks</quote>) a new process for each connection.
From that point on, the client and the new server process
communicate without intervention by the original
<filename>postmaster</filename> process. Thus, the
<filename>postmaster</filename> is always running, waiting for
client connections, whereas client and associated server processes
come and go. (All of this is of course invisible to the user. We
only mention it here for completeness.)
</para>
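<para>
The process-per-connection idea can be imitated with a toy shell
sketch. This is only an analogy, not how the server is implemented:
the function name handle_clients and the client names are invented
for illustration, the parent loop stands in for the
<filename>postmaster</filename>, and each background subshell stands
in for the server process started for one client.
</para>

```shell
# Toy model (assumption-laden analogy, not PostgreSQL code).
handle_clients() {
    for client in "$@"; do
        (
            # Child: serves exactly one client, then exits.
            echo "server process for $client done"
        ) &
    done
    # Parent: outlives the children and collects them here.
    # The real postmaster would instead keep waiting for new clients.
    wait
}

handle_clients alice bob carol
```

<para>
The three lines may appear in any order, because the children run
concurrently and independently of the parent.
</para>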
</sect1> </sect1>
<sect1 id="start-psql">
<title>Starting the Interactive Monitor (psql)</title> <sect1 id="tutorial-createdb">
<title>Creating a Database</title>
<indexterm zone="tutorial-createdb">
<primary>database</primary>
<secondary>creating</secondary>
</indexterm>
<indexterm zone="tutorial-createdb">
<primary>createdb</primary>
</indexterm>
<para> <para>
Assuming that your site administrator has properly The first test to see whether you can access the database server
started the <application>postmaster</application> process and is to try to create a database. A running
authorized you to <productname>PostgreSQL</productname> server can manage many
use the database, you (as a user) may begin to start up databases. Typically, a separate database is used for each
applications. As previously mentioned, you should add project or for each user.
<filename>/usr/local/pgsql/bin</filename> to your shell search path.
In most cases, this is all you should have to do in
terms of preparation.
</para> </para>
<para> <para>
Two different styles of connections Possibly, your site administrator has already created a database
are supported. The site administrator will have chosen to allow for your use. He should have told you what the name of your
TCP/IP network connections database is. In this case you can omit this step and skip ahead
or will have restricted database access to local (same-machine) to the next section.
socket connections only.
These choices become significant if you encounter problems in
connecting to a database, since you will want to confirm that you
are choosing an allowed connection option.
</para> </para>
<para> <para>
If you get the following error message from a To create a new database, in this example named
<productname>Postgres</productname> <quote>mydb</quote>, you use the following command:
command (such as <application>psql</application> or <screen>
<application>createdb</application>): <prompt>$</prompt> <userinput>createdb mydb</userinput>
</screen>
This should produce as response:
<screen>
CREATE DATABASE
</screen>
If so, this step was successful and you can skip over the
remainder of this section.
</para>
<programlisting> <para>
% psql template1 If you see a message similar to
<screen>
createdb: command not found
</screen>
then <productname>PostgreSQL</productname> was not installed properly. Either it was not
installed at all or the search path was not set correctly. Try
calling the command with an absolute path instead:
<screen>
<prompt>$</prompt> <userinput>/usr/local/pgsql/bin/createdb</userinput>
</screen>
The path at your site might be different. Contact your site
administrator or check back in the installation instructions to
correct the situation.
</para>
<para>
Another response could be this:
<screen>
psql: could not connect to server: Connection refused psql: could not connect to server: Connection refused
Is the server running locally and accepting Is the server running locally and accepting
connections on Unix domain socket "/tmp/.s.PGSQL.5432"? connections on Unix domain socket "/tmp/.s.PGSQL.5432"?
</programlisting> createdb: database creation failed
</screen>
This means that the server was not started, or it was not started
where <command>createdb</command> expected it. Again, check the
installation instructions or consult the administrator.
</para>
or <para>
If you do not have the privileges required to create a database,
you will see the following:
<screen>
ERROR: CREATE DATABASE: permission denied
createdb: database creation failed
</screen>
Not every user has authorization to create new databases. If
<productname>PostgreSQL</productname> refuses to create databases
for you then the site administrator needs to grant you permission
to create databases. Consult your site administrator if this
occurs. If you installed <productname>PostgreSQL</productname>
yourself then you should log in for the purposes of this tutorial
under the user account that you started the server as.
<footnote>
<para>
As an explanation for why this works:
<productname>PostgreSQL</productname> user names are separate
from operating system user accounts. If you connect to a
database, you can choose what
<productname>PostgreSQL</productname> user name to connect as;
if you don't, it will default to the same name as your current
operating system account. As it happens, there will always be a
<productname>PostgreSQL</productname> user account that has the
same name as the operating system user that started the server,
and it also happens that that user always has permission to
create databases. Instead of logging in as that user you can
also specify the <option>-U</option> option everywhere to select
a <productname>PostgreSQL</productname> user name to connect as.
</para>
</footnote>
</para>
<para>
You can also create databases with other names.
<productname>PostgreSQL</productname> allows you to create any
number of databases at a given site. Database names must have an
alphabetic first character and are limited to 32 characters in
length. A convenient choice is to create a database with the same
name as your current user name. Many tools assume that database
name as the default, so it can save you some typing. To create
that database, simply type
<screen>
<prompt>$</prompt> <userinput>createdb</userinput>
</screen>
</para>
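<para>
The naming rule stated above can be checked before calling
<command>createdb</command>. The following is a minimal sketch that
encodes only the two constraints quoted in the text (alphabetic first
character, at most 32 characters); the function name valid_dbname is
hypothetical, and the server may apply further rules that are not
modelled here.
</para>

```shell
# valid_dbname NAME: succeed if NAME satisfies the rule quoted in the text.
valid_dbname() {
    name=$1
    # Reject names longer than the 32-character limit given in the text.
    [ "${#name}" -le 32 ] || return 1
    # Require an alphabetic first character (this also rejects "").
    case $name in
        [A-Za-z]*) return 0 ;;
        *) return 1 ;;
    esac
}

if valid_dbname mydb; then
    echo "mydb looks acceptable"
fi
```

<para>
A script could run this check first and only then invoke
<command>createdb</command> with the same name.
</para>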
<para>
If you don't want to use your database anymore you can remove it.
For example, if you are the owner (creator) of the database
<quote>mydb</quote>, you can destroy it using the following
command:
<screen>
<prompt>$</prompt> <userinput>dropdb mydb</userinput>
</screen>
(In this case, the database name does not default to the user
account name. You always need to specify it.) This action
physically removes all files associated with the database and
cannot be undone, so this should only be done with a great deal of
forethought.
</para>
</sect1>
<programlisting>
% psql -h localhost template1
psql: could not connect to server: Connection refused
Is the server running on host localhost and accepting
TCP/IP connections on port 5432?
</programlisting>
it is usually because <sect1 id="tutorial-accessdb">
<title>Accessing a Database</title>
<indexterm zone="tutorial-accessdb">
<primary>psql</primary>
</indexterm>
<para>
Once you have created a database, you can access it by:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>
Running the <productname>PostgreSQL</productname> interactive
terminal program, called <quote>psql</quote>, which allows you
to interactively enter, edit, and execute
<acronym>SQL</acronym> commands.
</para>
</listitem>
<itemizedlist mark="bullet" spacing="compact">
<listitem> <listitem>
<para> <para>
the <application>postmaster</application> is not running, Using an existing graphical frontend tool like
or <application>PgAccess</application> or
<application>ApplixWare</application> (via
<acronym>ODBC</acronym>) to create and manipulate a database.
These possibilities are not covered in this tutorial.
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
<para> <para>
you are attempting to connect to the wrong server host. Writing a custom application, using one of the several
available language bindings. These possibilities are discussed
further in <citetitle>The PostgreSQL Programmer's
Guide</citetitle>.
</para> </para>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
You probably want to start up <command>psql</command> to try out
the examples in this tutorial. It can be activated for the
<quote>mydb</quote> database by typing the command:
<screen>
<prompt>$</prompt> <userinput>psql mydb</userinput>
</screen>
If you leave off the database name then it will default to your
user account name. You already discovered this scheme in the
previous section.
</para> </para>
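<para>
The defaulting rule just described can be sketched in shell as
follows. The function name resolve_dbname is hypothetical, and the
sketch covers only the behaviour stated in the text: the real lookup
happens inside the client programs and may be influenced by other
settings.
</para>

```shell
# resolve_dbname [NAME]: print NAME if given, else the OS user name.
resolve_dbname() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    else
        # No database name given: fall back to the current account name.
        id -un
    fi
}

resolve_dbname mydb
resolve_dbname
```

<para>
The first call prints the explicit name; the second prints the name
of the account you are logged in as.
</para>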
<para> <para>
If you get the following error message: In <command>psql</command>, you will be greeted with the following
message:
<programlisting> <screen>
FATAL 1:Feb 17 23:19:55:process userid (2360) != database owner (268) Welcome to psql, the PostgreSQL interactive terminal.
</programlisting>
Type: \copyright for distribution terms
it means that the site administrator started the \h for help with SQL commands
<application>postmaster</application> \? for help on internal slash commands
as the wrong user. Tell him to restart it as \g or terminate with semicolon to execute query
the <productname>Postgres</productname> superuser. \q to quit
mydb=&gt;
</screen>
<indexterm><primary>superuser</primary></indexterm>
The last line could also be
<screen>
mydb=#
</screen>
That would mean you are a database superuser, which is most likely
the case if you installed <productname>PostgreSQL</productname>
yourself. Being a superuser means that you are not subject to
access controls. For the purpose of this tutorial this is not of
importance.
</para> </para>
</sect1>
<sect1 id="start-manage-db"> <para>
<title>Managing a Database</title> If you have encountered problems starting <command>psql</command>
then go back to the previous section. The diagnostics of
<command>psql</command> and <command>createdb</command> are
similar, and if the latter worked the former should work as well.
</para>
<para> <para>
Now that <productname>Postgres</productname> is up and running we The last line printed out by <command>psql</command> is the
can create some prompt, and it indicates that <command>psql</command> is listening
databases to experiment with. Here, we describe the to you and that you can type <acronym>SQL</acronym> queries into a
basic commands for managing a database. workspace maintained by <command>psql</command>. Try out these
commands:
<indexterm><primary>version</primary></indexterm>
<screen>
<prompt>mydb=&gt;</prompt> <userinput>SELECT version();</userinput>
version
----------------------------------------------------------------
PostgreSQL 7.2devel on i586-pc-linux-gnu, compiled by GCC 2.96
(1 row)
<prompt>mydb=&gt;</prompt> <userinput>SELECT current_date;</userinput>
date
------------
2001-08-31
(1 row)
<prompt>mydb=&gt;</prompt> <userinput>SELECT 2 + 2;</userinput>
?column?
----------
4
(1 row)
</screen>
</para> </para>
<para> <para>
Most <productname>Postgres</productname> The <command>psql</command> program has a number of internal
applications assume that the database name, if not specified, is commands that are not SQL commands. They begin with the backslash
the same as the name on your computer character, <quote><literal>\</literal></quote>. Some of these
account. commands were already listed in the welcome message. For example,
you can get help on the syntax of various
<productname>PostgreSQL</productname> <acronym>SQL</acronym>
commands by typing:
<screen>
<prompt>mydb=&gt;</prompt> <userinput>\h</userinput>
</screen>
</para> </para>
<para> <para>
If your database administrator has set up your account without To get out of <command>psql</command>, type
database creation privileges, <screen>
then she should have told you what the name of your database is. If <prompt>mydb=&gt;</prompt> <userinput>\q</userinput>
this is the case, then you </screen>
can skip the sections on creating and destroying databases. and <command>psql</command> will quit and return you to your
command shell. (For more internal commands, type
<literal>\?</literal> at the <command>psql</command> prompt.) The
full capabilities of <command>psql</command> are documented in the
<citetitle>Reference Manual</citetitle>. If PostgreSQL is
installed correctly you can also type <literal>man psql</literal>
at the operating system shell prompt to see the documentation. In
this tutorial we will not use these features explicitly, but you
can use them yourself when you see fit.
</para> </para>
<sect2>
<title>Creating a Database</title>
<para>
Let's say you want to create a database named
<database>mydb</database>.
You can do this with the following command:
<programlisting>
% createdb mydb
</programlisting>
</para>
<para>
If you do not have the privileges required to create a database,
you will see
the following:
<programlisting>
% createdb mydb
NOTICE:user "your username" is not allowed to create/destroy databases
createdb: database creation failed on mydb.
</programlisting>
</para>
<para>
<productname>Postgres</productname> allows you to create any
number of databases
at a given site and you automatically become the
database administrator of the database you just created.
Database names must have an alphabetic first
character and are limited to 32 characters in length.
Not every user has authorization to become a database
administrator. If <productname>Postgres</productname> refuses to
create databases
for you, then the site administrator needs to grant you
permission to create databases. Consult your site
administrator if this occurs.
</para>
</sect2>
<sect2>
<title>Accessing a Database</title>
<para>
Once you have constructed a database, you can access it
by:
<itemizedlist spacing="compact" mark="bullet">
<listitem>
<para>
Running the <productname>Postgres</productname> terminal
monitor programs
(e.g. <application>psql</application>) which allows you to
interactively
enter, edit, and execute <acronym>SQL</acronym> commands.
</para>
</listitem>
<listitem>
<para>
Using an existing native frontend tool like
<application>pgaccess</application> or
<application>ApplixWare</application> (via
<acronym>ODBC</acronym>) to create and manipulate a
database.
</para>
</listitem>
<listitem>
<para>
Using a language like perl or tcl which has a supported
interface for <productname>Postgres</productname>. Some of
these languages also have convenient and powerful GUI toolkits
which can help you construct custom
applications. <application>pgaccess</application>, mentioned
above, is one such application written in tk/tcl and can be
used as an example.
</para>
</listitem>
<listitem>
<para>
Writing a <acronym>C</acronym> program using
the LIBPQ subroutine
library. This allows you to submit
<acronym>SQL</acronym> commands
from <acronym>C</acronym> and get answers and
status messages back to
your program. This interface is discussed further
in <citetitle>The PostgreSQL Programmer's Guide</citetitle>.
</para>
</listitem>
</itemizedlist>
You might want to start up <application>psql</application>,
to try out the examples in this manual.
It can be activated for the <database>mydb</database>
database by typing the command:
<programlisting>
% psql mydb
</programlisting>
You will be greeted with the following message:
<programlisting>
Welcome to the POSTGRESQL interactive sql monitor:
Please read the file COPYRIGHT for copyright terms of POSTGRESQL
type \? for help on slash commands
type \q to quit
type \g or terminate with semicolon to execute query
You are currently connected to the database: mydb
mydb=>
</programlisting>
</para>
<para>
This prompt indicates that the terminal monitor is listening
to you and that you can type <acronym>SQL</acronym> queries into a
workspace maintained by the terminal monitor.
The <application>psql</application> program responds to escape
codes that begin
with the backslash character, "<literal>\</literal>". For example, you
can get help on the syntax of various
<productname>Postgres</productname> <acronym>SQL</acronym>
commands by typing:
<programlisting>
mydb=> \h
</programlisting>
</para>
<para>
Once you have finished entering your queries into the
workspace, you can pass the contents of the workspace
to the <productname>Postgres</productname> server by typing:
<programlisting>
mydb=> \g
</programlisting>
This tells the server to process the query. If you
terminate your query with a semicolon, the "<literal>\g</literal>" is not
necessary.
<application>psql</application> will automatically process
semicolon terminated queries.
To read queries from a file, say myFile, instead of
entering them interactively, type:
<programlisting>
mydb=> \i myFile
</programlisting>
To get out of <application>psql</application> and return to Unix, type
<programlisting>
mydb=> \q
</programlisting>
and <application>psql</application> will quit and return
you to your command
shell. (For more escape codes, type <command>\?</command> at the
monitor prompt.)
White space (i.e., spaces, tabs and newlines) may be
used freely in <acronym>SQL</acronym> queries. Single-line
comments are denoted by
"<literal>--</literal>". Everything after the dashes up to the end of the
line is ignored. Multiple-line comments, and comments within a line,
are denoted by "<literal>/* ... */</literal>".
</para>
</sect2>
<sect2>
<title>Destroying a Database</title>
<para>
If you are the database administrator for the database
<database>mydb</database>, you can destroy it using the
following Unix command:
<programlisting>
% dropdb mydb
</programlisting>
This action physically removes all of the Unix files
associated with the database and cannot be undone, so
this should only be done with a great deal of forethought.
</para>
</sect2>
</sect1> </sect1>
</chapter> </chapter>
<!-- Keep this comment at the end of the file <!-- Keep this comment at the end of the file
......
<!-- <!--
$Header: /cvsroot/pgsql/doc/src/sgml/Attic/tutorial.sgml,v 1.14 2001/02/03 19:03:27 petere Exp $ $Header: /cvsroot/pgsql/doc/src/sgml/Attic/tutorial.sgml,v 1.15 2001/09/02 23:27:49 petere Exp $
--> -->
<book id="tutorial"> <book id="tutorial">
...@@ -10,16 +10,46 @@ $Header: /cvsroot/pgsql/doc/src/sgml/Attic/tutorial.sgml,v 1.14 2001/02/03 19:03 ...@@ -10,16 +10,46 @@ $Header: /cvsroot/pgsql/doc/src/sgml/Attic/tutorial.sgml,v 1.14 2001/02/03 19:03
&legal; &legal;
</bookinfo> </bookinfo>
<preface id="tutorial-welcome">
<title>Welcome</title>
<para>
Welcome to <productname>PostgreSQL</productname> and the
<citetitle>PostgreSQL Tutorial</citetitle>. The following few
chapters are intended to give a simple introduction to
<productname>PostgreSQL</productname>, relational database
concepts, and the SQL language to those who are new to any one of
these aspects. We only assume some general knowledge about how to
use computers. No particular Unix or programming experience is
required.
</para>
<para>
After you have worked through this tutorial you might want to move on
to reading the <![%single-book;[<citetitle>User's
Guide</citetitle>]]><![%set-of-books;[<xref linkend="user">]]> to
gain a more formal knowledge of the SQL language, or the
<![%single-book;[<citetitle>Programmer's
Guide</citetitle>]]><![%set-of-books;[<xref linkend="programmer">]]>
for information about developing applications for
<productname>PostgreSQL</productname>.
</para>
<para>
We hope you have a pleasant experience with
<productname>PostgreSQL</productname>.
</para>
</preface>
<![%single-book;[ <![%single-book;[
&intro; &intro;
]]> ]]>
&sql;
&arch;
&start; &start;
&query; &query;
&advanced; &advanced;
<![%single-book;[ <![%single-book;[
&biblio; &biblio;
&bookindex;
]]> ]]>
</book> </book>
......
--------------------------------------------------------------------------- ---------------------------------------------------------------------------
-- --
-- advanced.sql- -- advanced.sql-
-- more POSTGRES SQL features. (These are not part of the SQL-92 -- Tutorial on more advanced PostgreSQL features
-- standard.)
-- --
-- --
-- Copyright (c) 1994, Regents of the University of California -- Copyright (c) 1994, Regents of the University of California
-- --
-- $Id: advanced.source,v 1.3 1999/07/08 15:28:51 momjian Exp $ -- $Id: advanced.source,v 1.4 2001/09/02 23:27:50 petere Exp $
-- --
--------------------------------------------------------------------------- ---------------------------------------------------------------------------
----------------------------- -----------------------------
-- Inheritance: -- Inheritance:
-- a table can inherit from zero or more tables. A query can reference -- A table can inherit from zero or more tables. A query can reference
-- either all rows of a table or all rows of a table plus all of its -- either all rows of a table or all rows of a table plus all of its
-- descendants. -- descendants.
----------------------------- -----------------------------
...@@ -31,7 +30,7 @@ CREATE TABLE capitals ( ...@@ -31,7 +30,7 @@ CREATE TABLE capitals (
state char(2) state char(2)
) INHERITS (cities); ) INHERITS (cities);
-- now, let's populate the tables -- Now, let's populate the tables.
INSERT INTO cities VALUES ('San Francisco', 7.24E+5, 63); INSERT INTO cities VALUES ('San Francisco', 7.24E+5, 63);
INSERT INTO cities VALUES ('Las Vegas', 2.583E+5, 2174); INSERT INTO cities VALUES ('Las Vegas', 2.583E+5, 2174);
INSERT INTO cities VALUES ('Mariposa', 1200, 1953); INSERT INTO cities VALUES ('Mariposa', 1200, 1953);
...@@ -56,72 +55,6 @@ FROM cities* c ...@@ -56,72 +55,6 @@ FROM cities* c
WHERE c.altitude > 500; WHERE c.altitude > 500;
-----------------------------
-- Time Travel:
-- this feature allows you to run historical queries.
-- removed for v6.3, but possible using triggers.
-- see contrib/spi/README for more information.
-----------------------------
-- first, let's make some changes to the cities table (suppose Mariposa's
-- population grows 10% this year)
-- UPDATE cities
-- SET population = population * 1.1
-- WHERE name = 'Mariposa';
-- the default time is the current time ('now'):
-- SELECT * FROM cities WHERE name = 'Mariposa';
-- we can also retrieve the population of Mariposa ever has. ('epoch' is the
-- earliest time representable by the system)
-- SELECT name, population
-- FROM cities['epoch', 'now'] -- can be abbreviated to cities[,]
-- WHERE name = 'Mariposa';
----------------------
-- Arrays:
-- attributes can be arrays of base types or user-defined types
----------------------
CREATE TABLE sal_emp (
name text,
pay_by_quarter int4[],
schedule text[][]
);
-- insert instances with array attributes. Note the use of braces
INSERT INTO sal_emp VALUES (
'Bill',
'{10000,10000,10000,10000}',
'{{"meeting", "lunch"}, {}}');
INSERT INTO sal_emp VALUES (
'Carol',
'{20000,25000,25000,25000}',
'{{"talk", "consult"}, {"meeting"}}');
----------------------
-- queries on array attributes
----------------------
SELECT name FROM sal_emp WHERE
sal_emp.pay_by_quarter[1] <> sal_emp.pay_by_quarter[2];
-- retrieve third quarter pay of all employees
SELECT sal_emp.pay_by_quarter[3] FROM sal_emp;
-- select subarrays
SELECT sal_emp.schedule[1:2][1:1] FROM sal_emp WHERE
sal_emp.name = 'Bill';
-- clean up (you must remove the children first) -- clean up (you must remove the children first)
DROP TABLE sal_emp;
DROP TABLE capitals; DROP TABLE capitals;
DROP TABLE cities; DROP TABLE cities;
...@@ -6,22 +6,22 @@ ...@@ -6,22 +6,22 @@
-- --
-- Copyright (c) 1994, Andrew Yu, University of California -- Copyright (c) 1994, Andrew Yu, University of California
-- --
-- $Id: basics.source,v 1.3 1999/07/08 15:27:01 momjian Exp $ -- $Id: basics.source,v 1.4 2001/09/02 23:27:50 petere Exp $
-- --
--------------------------------------------------------------------------- ---------------------------------------------------------------------------
----------------------------- -----------------------------
-- Creating a table: -- Creating a New Table:
-- a CREATE TABLE is used to create base tables. POSTGRES SQL has -- A CREATE TABLE is used to create base tables. PostgreSQL has
-- its own set of built-in types. (Note that keywords are case- -- its own set of built-in types. (Note that SQL is case-
-- insensitive but identifiers are case-sensitive.) -- insensitive.)
----------------------------- -----------------------------
CREATE TABLE weather ( CREATE TABLE weather (
city varchar(80), city varchar(80),
temp_lo int, -- low temperature temp_lo int, -- low temperature
temp_hi int, -- high temperature temp_hi int, -- high temperature
prcp float8, -- precipitation prcp real, -- precipitation
date date date date
); );
...@@ -30,98 +30,105 @@ CREATE TABLE cities ( ...@@ -30,98 +30,105 @@ CREATE TABLE cities (
location point location point
); );
----------------------------- -----------------------------
-- Inserting data: -- Populating a Table With Rows:
-- an INSERT statement is used to insert a new row into a table. There -- An INSERT statement is used to insert a new row into a table. There
-- are several ways you can specify what columns the data should go to. -- are several ways you can specify what columns the data should go to.
----------------------------- -----------------------------
-- 1. the simplest case is when the list of value correspond to the order of -- 1. The simplest case is when the list of value correspond to the order of
-- the columns specified in CREATE TABLE. -- the columns specified in CREATE TABLE.
INSERT INTO weather INSERT INTO weather
VALUES ('San Francisco', 46, 50, 0.25, '11/27/1994'); VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');
INSERT INTO cities INSERT INTO cities
VALUES ('San Francisco', '(-194.0, 53.0)'); VALUES ('San Francisco', '(-194.0, 53.0)');
-- 2. you can also specify what column the values correspond to. (The columns -- 2. You can also specify what column the values correspond to. (The columns
-- can be specified in any order. You may also omit any number of columns. -- can be specified in any order. You may also omit any number of columns,
-- eg. unknown precipitation below) -- e.g., unknown precipitation below.)
INSERT INTO weather (city, temp_lo, temp_hi, prcp, date) INSERT INTO weather (city, temp_lo, temp_hi, prcp, date)
VALUES ('San Francisco', 43, 57, 0.0, '11/29/1994'); VALUES ('San Francisco', 43, 57, 0.0, '1994-11-29');
INSERT INTO weather (date, city, temp_hi, temp_lo) INSERT INTO weather (date, city, temp_hi, temp_lo)
VALUES ('11/29/1994', 'Hayward', 54, 37); VALUES ('1994-11-29', 'Hayward', 54, 37);
----------------------------- -----------------------------
-- Retrieving data: -- Querying a Table:
-- a SELECT statement is used for retrieving data. The basic syntax is -- A SELECT statement is used for retrieving data. The basic syntax is
-- SELECT columns FROM tables WHERE predicates -- SELECT columns FROM tables WHERE predicates.
----------------------------- -----------------------------
-- A simple one would be:
SELECT * FROM weather; SELECT * FROM weather;
-- You may also specify expressions in the target list. (The 'AS column'
-- you may also specify expressions in the target list (the 'AS column' -- specifies the column name of the result. It is optional.)
SELECT city, (temp_hi+temp_lo)/2 AS temp_avg, date FROM weather; SELECT city, (temp_hi+temp_lo)/2 AS temp_avg, date FROM weather;
-- If you want to retrieve rows that satisfy certain condition (i.e., a
-- if you want to retrieve rows that satisfy certain condition (ie. a -- restriction), specify the condition in WHERE. The following retrieves
-- the weather of San Francisco on rainy days. -- the weather of San Francisco on rainy days.
SELECT * SELECT *
SELECT * FROM weather
FROM weather WHERE city = 'San Francisco'
WHERE city = 'San Francisco' AND prcp > 0.0;
-- Here is a more complicated one. Duplicates are removed when DISTINCT is
-- here is a more complicated one. Duplicates are removed when DISTINCT is -- specified. ORDER BY specifies the column to sort on. (Just to make sure the
-- following won't confuse you, DISTINCT and ORDER BY can be used separately.) -- following won't confuse you, DISTINCT and ORDER BY can be used separately.)
SELECT DISTINCT city SELECT DISTINCT city
SELECT DISTINCT city FROM weather
FROM weather ORDER BY city;
----------------------------- -----------------------------
----------------------------- -- Joins Between Tables:
-- Retrieving data into other classes: -- queries can access multiple tables at once or access the same table
-- a SELECT ... INTO statement can be used to retrieve data into -- in such a way that multiple instances of the table are being processed
-- at the same time.
----------------------------- -----------------------------
-- The following joins the weather table and the cities table.
SELECT * INTO TABLE mytemp
FROM weather
WHERE city = 'San Francisco'
SELECT *
FROM weather, cities
WHERE city = name;
-- This prevents a duplicate city name column:
-----------------------------
-- Aggregates
SELECT city, temp_lo, temp_hi, prcp, date, location
SELECT max(temp_lo) FROM weather, cities
WHERE city = name;
-- since the column names are all different, we don't have to specify the
-- Aggregate with GROUP BY -- table name. If you want to be clear, you can do the following. They give
SELECT city, max(temp_lo) -- identical results, of course.
FROM weather
SELECT weather.city, weather.temp_lo, weather.temp_hi, weather.prcp, weather.date, cities.location
----------------------------- FROM weather, cities
-- Joining tables: WHERE cities.name = weather.city;
-- queries can access multiple tables at once or access the same table
-- in such a way that multiple instances of the table are being processed
-- at the same time.
-- JOIN syntax
-- suppose we want to find all the records that are in the temperature range
SELECT *
FROM weather JOIN cities ON (weather.city = cities.name);
-- Outer join
SELECT *
FROM weather LEFT OUTER JOIN cities ON (weather.city = cities.name);
-- Suppose we want to find all the records that are in the temperature range
-- of other records. W1 and W2 are aliases for weather.
SELECT W1.city, W1.temp_lo, W1.temp_hi, SELECT W1.city, W1.temp_lo, W1.temp_hi,
W2.city, W2.temp_lo, W2.temp_hi W2.city, W2.temp_lo, W2.temp_hi
@@ -129,60 +136,69 @@ FROM weather W1, weather W2 @@ -129,60 +136,69 @@ FROM weather W1, weather W2
WHERE W1.temp_lo < W2.temp_lo WHERE W1.temp_lo < W2.temp_lo
and W1.temp_hi > W2.temp_hi; and W1.temp_hi > W2.temp_hi;
-- let's join two tables. The following joins the weather table
-- and the cities table.
SELECT city, location, prcp, date -----------------------------
FROM weather, cities -- Aggregate Functions
WHERE name = city; -----------------------------
-- since the column names are all different, we don't have to specify the SELECT max(temp_lo)
-- table name. If you want to be clear, you can do the following. They give FROM weather;
-- identical results, of course.
SELECT city FROM weather
WHERE temp_lo = (SELECT max(temp_lo) FROM weather);
-- Aggregate with GROUP BY
SELECT city, max(temp_lo)
FROM weather
GROUP BY city;
-- ... and HAVING
SELECT city, max(temp_lo)
FROM weather
GROUP BY city
HAVING max(temp_lo) < 40;
SELECT w.city, c.location, w.prcp, w.date
FROM weather w, cities c
WHERE c.name = w.city;
----------------------------- -----------------------------
-- Updating data: -- Updates:
-- an UPDATE statement is used for updating data. -- An UPDATE statement is used for updating data.
----------------------------- -----------------------------
-- suppose you discover the temperature readings are all off by 2 degrees as -- Suppose you discover the temperature readings are all off by 2 degrees as
-- of Nov 28, you may update the data as follow: -- of Nov 28, you may update the data as follow:
UPDATE weather UPDATE weather
SET temp_hi = temp_hi - 2, temp_lo = temp_lo - 2 SET temp_hi = temp_hi - 2, temp_lo = temp_lo - 2
WHERE date > '11/28/1994'; WHERE date > '1994-11-28';
SELECT * from weather; SELECT * FROM weather;
----------------------------- -----------------------------
-- Deleting data: -- Deletions:
-- a DELETE statement is used for deleting rows from a table. -- A DELETE statement is used for deleting rows from a table.
----------------------------- -----------------------------
-- suppose you are no longer interested in the weather of Hayward, you can -- Suppose you are no longer interested in the weather of Hayward, then you can
-- do the following to delete those rows from the table -- do the following to delete those rows from the table.
DELETE FROM weather WHERE city = 'Hayward'; DELETE FROM weather WHERE city = 'Hayward';
SELECT * from weather; SELECT * FROM weather;
-- you can also delete all the rows in a table by doing the following. (This -- You can also delete all the rows in a table by doing the following. (This
-- is different from DROP TABLE which removes the table in addition to the -- is different from DROP TABLE which removes the table in addition to the
-- removing the rows.) -- removing the rows.)
DELETE FROM weather; DELETE FROM weather;
SELECT * from weather; SELECT * FROM weather;
----------------------------- -----------------------------
-- Removing the tables: -- Removing the tables:
-- DROP TABLE is used to remove tables. After you have done this, you -- DROP TABLE is used to remove tables. After you have done this, you
-- can no longer use those tables. -- can no longer use those tables.
----------------------------- -----------------------------
DROP TABLE weather, cities, mytemp; DROP TABLE weather, cities;