Commit 30be6c23 authored by Bruce Momjian

Handle mixed-case names in reindex script.

Document need for reindex in SGML docs.
parent a8a1f158
#!/bin/sh #!/bin/sh
# -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- # # -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- #
# Package : reindexdb Version : $Revision: 1.2 $ # Package : reindexdb Version : $Revision: 1.3 $
# Date : 05/08/2002 Author : Shaun Thomas # Date : 05/08/2002 Author : Shaun Thomas
# Req : psql, sh, perl, sed Type : Utility # Req : psql, sh, perl, sed Type : Utility
# #
...@@ -188,7 +188,7 @@ if [ "$index" ]; then ...@@ -188,7 +188,7 @@ if [ "$index" ]; then
# Ok, no index. Is there a specific table to reindex? # Ok, no index. Is there a specific table to reindex?
elif [ "$table" ]; then elif [ "$table" ]; then
$PSQL $PSQLOPT $ECHOOPT -c "REINDEX TABLE $table" -d $dbname $PSQL $PSQLOPT $ECHOOPT -c "REINDEX TABLE \"$table\"" -d $dbname
# No specific table, no specific index, either we have a specific database, # No specific table, no specific index, either we have a specific database,
# or were told to do all databases. Do it! # or were told to do all databases. Do it!
...@@ -206,7 +206,7 @@ else ...@@ -206,7 +206,7 @@ else
# database that we may reindex. # database that we may reindex.
tables=`$PSQL $PSQLOPT -q -t -A -d $db -c "$sql"` tables=`$PSQL $PSQLOPT -q -t -A -d $db -c "$sql"`
for tab in $tables; do for tab in $tables; do
$PSQL $PSQLOPT $ECHOOPT -c "REINDEX TABLE $tab" -d $db $PSQL $PSQLOPT $ECHOOPT -c "REINDEX TABLE \"$tab\"" -d $db
done done
done done
......
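
Note on the quoting change above: PostgreSQL folds unquoted identifiers to lower case, so a table created as "MyTable" can only be addressed if the name is wrapped in double quotes, which is what the two modified lines do. The following is a minimal sketch of that idea, not code from the commit. The helper names quote_ident and build_reindex_sql are hypothetical, and the doubling of embedded double quotes is an extra precaution that the committed script does not perform.

```shell
#!/bin/sh
# Hypothetical helpers, not part of contrib/reindex: they only build the SQL
# text, so the sketch runs without a database connection.

# Wrap an identifier in double quotes so its case is preserved.
# Assumption beyond the commit: embedded double quotes are doubled as well.
quote_ident() {
    escaped=$(printf '%s\n' "$1" | sed 's/"/""/g')
    printf '"%s"' "$escaped"
}

# Build the statement that the script hands to psql via -c.
build_reindex_sql() {
    printf 'REINDEX TABLE %s' "$(quote_ident "$1")"
}

build_reindex_sql "MyTable"
echo
# prints: REINDEX TABLE "MyTable"
```

In the real script the resulting string would be passed as the argument of -c, in the same way as the literal "REINDEX TABLE \"$table\"" in the diff.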
<!-- <!--
$Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.15 2002/06/22 04:08:07 momjian Exp $ $Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.16 2002/06/23 03:37:12 momjian Exp $
--> -->
<chapter id="maintenance"> <chapter id="maintenance">
...@@ -55,8 +55,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.15 2002/06/22 04:08:07 ...@@ -55,8 +55,8 @@ $Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.15 2002/06/22 04:08:07
</indexterm> </indexterm>
<para> <para>
<productname>PostgreSQL</productname>'s <command>VACUUM</> command must be <productname>PostgreSQL</productname>'s <command>VACUUM</> command
run on a regular basis for several reasons: must be run on a regular basis for several reasons:
<orderedlist> <orderedlist>
<listitem> <listitem>
...@@ -100,26 +100,27 @@ $Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.15 2002/06/22 04:08:07 ...@@ -100,26 +100,27 @@ $Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.15 2002/06/22 04:08:07
</indexterm> </indexterm>
<para> <para>
In normal <productname>PostgreSQL</productname> operation, an <command>UPDATE</> or In normal <productname>PostgreSQL</productname> operation, an
<command>DELETE</> of a row does not immediately remove the old <firstterm>tuple</> <command>UPDATE</> or <command>DELETE</> of a row does not
(version of the row). This approach is necessary to gain the benefits immediately remove the old <firstterm>tuple</> (version of the row).
of multiversion concurrency control (see the <citetitle>User's Guide</>): This approach is necessary to gain the benefits of multiversion
the tuple must not be deleted while concurrency control (see the <citetitle>User's Guide</>): the tuple
it is still potentially visible to other transactions. But eventually, must not be deleted while it is still potentially visible to other
an outdated or deleted tuple is no longer of interest to any transaction. transactions. But eventually, an outdated or deleted tuple is no
The space it occupies must be reclaimed for reuse by new tuples, to avoid longer of interest to any transaction. The space it occupies must be
infinite growth of disk space requirements. This is done by running reclaimed for reuse by new tuples, to avoid infinite growth of disk
<command>VACUUM</>. space requirements. This is done by running <command>VACUUM</>.
</para> </para>
<para> <para>
Clearly, a table that receives frequent updates or deletes will need Clearly, a table that receives frequent updates or deletes will need
to be vacuumed more often than tables that are seldom updated. It may to be vacuumed more often than tables that are seldom updated. It
be useful to set up periodic <application>cron</> tasks that vacuum only selected tables, may be useful to set up periodic <application>cron</> tasks that
skipping tables that are known not to change often. This is only likely vacuum only selected tables, skipping tables that are known not to
to be helpful if you have both large heavily-updated tables and large change often. This is only likely to be helpful if you have both
seldom-updated tables --- the extra cost of vacuuming a small table large heavily-updated tables and large seldom-updated tables --- the
isn't enough to be worth worrying about. extra cost of vacuuming a small table isn't enough to be worth
worrying about.
</para> </para>
<para> <para>
...@@ -174,18 +175,18 @@ $Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.15 2002/06/22 04:08:07 ...@@ -174,18 +175,18 @@ $Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.15 2002/06/22 04:08:07
<para> <para>
As with vacuuming for space recovery, frequent updates of statistics As with vacuuming for space recovery, frequent updates of statistics
are more useful for heavily-updated tables than for seldom-updated ones. are more useful for heavily-updated tables than for seldom-updated
But even for a heavily-updated table, there may be no need for ones. But even for a heavily-updated table, there may be no need for
statistics updates if the statistical distribution of the data is not statistics updates if the statistical distribution of the data is
changing much. A simple rule of thumb is to think about how much not changing much. A simple rule of thumb is to think about how much
the minimum and maximum values of the columns in the table change. the minimum and maximum values of the columns in the table change.
For example, a <type>timestamp</type> column that contains the time of row update For example, a <type>timestamp</type> column that contains the time
will have a constantly-increasing maximum value as rows are added and of row update will have a constantly-increasing maximum value as
updated; such a column will probably need more frequent statistics rows are added and updated; such a column will probably need more
updates than, say, a column containing URLs for pages accessed on a frequent statistics updates than, say, a column containing URLs for
website. The URL column may receive changes just as often, but the pages accessed on a website. The URL column may receive changes just
statistical distribution of its values probably changes relatively as often, but the statistical distribution of its values probably
slowly. changes relatively slowly.
</para> </para>
<para> <para>
...@@ -247,42 +248,45 @@ $Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.15 2002/06/22 04:08:07 ...@@ -247,42 +248,45 @@ $Header: /cvsroot/pgsql/doc/src/sgml/maintenance.sgml,v 1.15 2002/06/22 04:08:07
<para> <para>
Prior to <productname>PostgreSQL</productname> 7.2, the only defense Prior to <productname>PostgreSQL</productname> 7.2, the only defense
against XID wraparound was to re-<command>initdb</> at least every 4 billion against XID wraparound was to re-<command>initdb</> at least every 4
transactions. This of course was not very satisfactory for high-traffic billion transactions. This of course was not very satisfactory for
sites, so a better solution has been devised. The new approach allows an high-traffic sites, so a better solution has been devised. The new
installation to remain up indefinitely, without <command>initdb</> or any sort of approach allows an installation to remain up indefinitely, without
restart. The price is this maintenance requirement: <command>initdb</> or any sort of restart. The price is this
<emphasis>every table in the database must be vacuumed at least once every maintenance requirement: <emphasis>every table in the database must
billion transactions</emphasis>. be vacuumed at least once every billion transactions</emphasis>.
</para> </para>
<para> <para>
In practice this isn't an onerous requirement, but since the consequences In practice this isn't an onerous requirement, but since the
of failing to meet it can be complete data loss (not just wasted disk consequences of failing to meet it can be complete data loss (not
space or slow performance), some special provisions have been made to help just wasted disk space or slow performance), some special provisions
database administrators keep track of the time since the last have been made to help database administrators keep track of the
<command>VACUUM</>. The remainder of this section gives the details. time since the last <command>VACUUM</>. The remainder of this
section gives the details.
</para> </para>
<para> <para>
The new approach to XID comparison distinguishes two special XIDs, numbers The new approach to XID comparison distinguishes two special XIDs,
1 and 2 (<literal>BootstrapXID</> and <literal>FrozenXID</>). These two numbers 1 and 2 (<literal>BootstrapXID</> and
XIDs are always considered older than every normal XID. Normal XIDs (those <literal>FrozenXID</>). These two XIDs are always considered older
greater than 2) are compared using modulo-2<superscript>31</> arithmetic. This means than every normal XID. Normal XIDs (those greater than 2) are
compared using modulo-2<superscript>31</> arithmetic. This means
that for every normal XID, there are two billion XIDs that are that for every normal XID, there are two billion XIDs that are
<quote>older</> and two billion that are <quote>newer</>; another way to <quote>older</> and two billion that are <quote>newer</>; another
say it is that the normal XID space is circular with no endpoint. way to say it is that the normal XID space is circular with no
Therefore, once a tuple has been created with a particular normal XID, the endpoint. Therefore, once a tuple has been created with a particular
tuple will appear to be <quote>in the past</> for the next two billion normal XID, the tuple will appear to be <quote>in the past</> for
transactions, no matter which normal XID we are talking about. If the the next two billion transactions, no matter which normal XID we are
tuple still exists after more than two billion transactions, it will talking about. If the tuple still exists after more than two billion
suddenly appear to be in the future. To prevent data loss, old tuples transactions, it will suddenly appear to be in the future. To
must be reassigned the XID <literal>FrozenXID</> sometime before they reach prevent data loss, old tuples must be reassigned the XID
the two-billion-transactions-old mark. Once they are assigned this <literal>FrozenXID</> sometime before they reach the
special XID, they will appear to be <quote>in the past</> to all normal two-billion-transactions-old mark. Once they are assigned this
transactions regardless of wraparound issues, and so such tuples will be special XID, they will appear to be <quote>in the past</> to all
good until deleted, no matter how long that is. This reassignment of normal transactions regardless of wraparound issues, and so such
XID is handled by <command>VACUUM</>. tuples will be good until deleted, no matter how long that is. This
reassignment of XID is handled by <command>VACUUM</>.
</para> </para>
<para> <para>
...@@ -346,21 +350,22 @@ VACUUM ...@@ -346,21 +350,22 @@ VACUUM
<para> <para>
<command>VACUUM</> with the <command>FREEZE</> option uses a more <command>VACUUM</> with the <command>FREEZE</> option uses a more
aggressive freezing policy: tuples are frozen if they are old enough aggressive freezing policy: tuples are frozen if they are old enough
to be considered good by all open transactions. In particular, if to be considered good by all open transactions. In particular, if a
a <command>VACUUM FREEZE</> is performed in an otherwise-idle database, <command>VACUUM FREEZE</> is performed in an otherwise-idle
it is guaranteed that <emphasis>all</> tuples in that database will be database, it is guaranteed that <emphasis>all</> tuples in that
frozen. Hence, as long as the database is not modified in any way, it database will be frozen. Hence, as long as the database is not
will not need subsequent vacuuming to avoid transaction ID wraparound modified in any way, it will not need subsequent vacuuming to avoid
problems. This technique is used by <filename>initdb</> to prepare the transaction ID wraparound problems. This technique is used by
<filename>template0</> database. It should also be used to prepare any <filename>initdb</> to prepare the <filename>template0</> database.
user-created databases that are to be marked <literal>datallowconn</> = It should also be used to prepare any user-created databases that
<literal>false</> in <filename>pg_database</>, since there isn't any are to be marked <literal>datallowconn</> = <literal>false</> in
convenient way to vacuum a database that you can't connect to. Note <filename>pg_database</>, since there isn't any convenient way to
that <command>VACUUM</command>'s automatic warning message about unvacuumed databases will vacuum a database that you can't connect to. Note that
ignore <filename>pg_database</> entries with <literal>datallowconn</> = <command>VACUUM</command>'s automatic warning message about
<literal>false</>, so as to avoid giving false warnings about these unvacuumed databases will ignore <filename>pg_database</> entries
databases; therefore it's up to you to ensure that such databases are with <literal>datallowconn</> = <literal>false</>, so as to avoid
frozen correctly. giving false warnings about these databases; therefore it's up to
you to ensure that such databases are frozen correctly.
</para> </para>
</sect2> </sect2>
...@@ -375,13 +380,20 @@ VACUUM ...@@ -375,13 +380,20 @@ VACUUM
</indexterm> </indexterm>
<para> <para>
<productname>PostgreSQL</productname> is unable to reuse index pages <productname>PostgreSQL</productname> is unable to reuse btree index
in some cases. The problem is that if indexed rows are deleted, those pages in certain cases. The problem is that if indexed rows are
indexes pages can only be reused by rows with similar values. In deleted, those index pages can only be reused by rows with similar
cases where low indexed rows are deleted and newly inserted rows have values. For example, if indexed rows are deleted and newly
high values, disk space used by the index will grow indefinately, even inserted/updated rows have much higher values, the new rows can't use
if <command>VACUUM</> is run frequently. the index space made available by the deleted rows. Instead, such
TO BE COMPLETED 2002-06-22 bjm new rows must be placed on new index pages. In such cases, disk
space used by the index will grow indefinately, even if
<command>VACUUM</> is run frequently.
</para>
<para>
As a solution, you can use the <command>REINDEX</> command
periodically to discard pages used by deleted rows. There is also
<filename>contrib/reindex</> which can reindex an entire database.
</para> </para>
</sect1> </sect1>
...@@ -404,31 +416,32 @@ VACUUM ...@@ -404,31 +416,32 @@ VACUUM
</para> </para>
<para> <para>
If you simply direct the postmaster's <systemitem>stderr</> into a file, the only way If you simply direct the postmaster's <systemitem>stderr</> into a
to truncate the log file is to stop and restart the postmaster. This file, the only way to truncate the log file is to stop and restart
may be OK for development setups but you won't want to run a production the postmaster. This may be OK for development setups but you won't
server that way. want to run a production server that way.
</para> </para>
<para> <para>
The simplest production-grade approach to managing log output is to send it The simplest production-grade approach to managing log output is to
all to <application>syslog</> and let <application>syslog</> deal with file send it all to <application>syslog</> and let <application>syslog</>
rotation. To do this, make sure <productname>PostgreSQL</> was built with deal with file rotation. To do this, make sure
the <option>--enable-syslog</> configure option, and set <productname>PostgreSQL</> was built with the
<literal>syslog</> to 2 <option>--enable-syslog</> configure option, and set
(log to syslog only) in <filename>postgresql.conf</>. <literal>syslog</> to 2 (log to syslog only) in
Then you can send a <literal>SIGHUP</literal> signal to the <filename>postgresql.conf</>. Then you can send a
<application>syslog</> daemon whenever you want to force it to start <literal>SIGHUP</literal> signal to the <application>syslog</> daemon
writing a new log file. whenever you want to force it to start writing a new log file.
</para> </para>
<para> <para>
On many systems, however, syslog is not very reliable, particularly On many systems, however, syslog is not very reliable, particularly
with large log messages; it may truncate or drop messages just when with large log messages; it may truncate or drop messages just when
you need them the most. You may find it more useful to pipe the you need them the most. You may find it more useful to pipe the
<application>postmaster</>'s <systemitem>stderr</> to some type of log rotation script. <application>postmaster</>'s <systemitem>stderr</> to some type of
If you start the postmaster with <application>pg_ctl</>, then the log rotation script. If you start the postmaster with
postmaster's <systemitem>stderr</> is already redirected to <systemitem>stdout</>, so you just need a <application>pg_ctl</>, then the postmaster's <systemitem>stderr</>
is already redirected to <systemitem>stdout</>, so you just need a
pipe command: pipe command:
<screen> <screen>
......