Commit bfc6e9c9, authored Oct 12, 2006 by Neil Conway
Make some incremental improvements and fixes to the documentation on
Continuous Archiving. Plenty of editorial work remains...
Parent: 0c998388
Showing 1 changed file with 106 additions and 95 deletions.

doc/src/sgml/backup.sgml (+106, -95)
-<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.89 2006/10/02 22:33:02 momjian Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.90 2006/10/12 19:38:08 neilc Exp $ -->
 <chapter id="backup">
 <title>Backup and Restore</title>
@@ -27,7 +27,7 @@
 <title><acronym>SQL</> Dump</title>
 <para>
-The idea behind the SQL-dump method is to generate a text file with SQL
+The idea behind this dump method is to generate a text file with SQL
 commands that, when fed back to the server, will recreate the
 database in the same state as it was at the time of the dump.
 <productname>PostgreSQL</> provides the utility program
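[Illustrative aside, not part of this commit] The dump method described in that paragraph boils down to something like the following, where mydb stands in for your database name:

    pg_dump mydb > mydb.sql
    psql -d mydb -f mydb.sql    # feeding the commands back into a (freshly created) database recreates it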
@@ -471,7 +471,7 @@ tar -cf backup.tar /usr/local/pgsql/data
 To recover successfully using continuous archiving (also called "online
 backup" by many database vendors), you need a continuous
 sequence of archived WAL files that extends back at least as far as the
-start time of your backup. So to get started, you should set up and test
+start time of your backup. So to get started, you should setup and test
 your procedure for archiving WAL files <emphasis>before</> you take your
 first base backup. Accordingly, we first discuss the mechanics of
 archiving WAL files.
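[Illustrative aside, not part of this commit] The archiving setup this paragraph asks you to test first might, at its simplest, look like the postgresql.conf sketch below. The archive directory is the one used by the restore_command examples elsewhere in this file; the values are assumptions, not recommendations.

    # archive each completed WAL segment, refusing to overwrite existing files
    archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f'
    # force a segment switch periodically so the archive never lags far behind
    archive_timeout = 60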
@@ -861,8 +861,8 @@ SELECT pg_stop_backup();
 <para>
 Remove any files present in <filename>pg_xlog/</>; these came from the
 backup dump and are therefore probably obsolete rather than current.
-If you didn't archive <filename>pg_xlog/</> at all, then re-create it,
-and be sure to re-create the subdirectory
+If you didn't archive <filename>pg_xlog/</> at all, then recreate it,
+and be sure to recreate the subdirectory
 <filename>pg_xlog/archive_status/</> as well.
 </para>
 </listitem>
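[Illustrative aside, not part of this commit] Using the example data directory from earlier in this file, that step amounts to roughly:

    rm -f /usr/local/pgsql/data/pg_xlog/*
    mkdir -p /usr/local/pgsql/data/pg_xlog/archive_status

Paths will differ on your installation; the point is simply an empty pg_xlog/ with an empty archive_status/ beneath it.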
@@ -905,7 +905,7 @@ SELECT pg_stop_backup();
 </para>
 <para>
-The key part of all this is to set up a recovery command file that
+The key part of all this is to setup a recovery command file that
 describes how you want to recover and how far the recovery should
 run. You can use <filename>recovery.conf.sample</> (normally
 installed in the installation <filename>share/</> directory) as a
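[Illustrative aside, not part of this commit] The recovery command file referred to here is recovery.conf, placed in the data directory before starting the server. A minimal sketch, reusing the archive location from the other examples in this file:

    # recovery.conf -- minimal sketch, values are illustrative
    restore_command = 'cp /mnt/server/archivedir/%f "%p"'

With nothing else specified, recovery simply replays all archived WAL that is available; recovery.conf.sample documents the optional stop-point settings.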
@@ -1196,7 +1196,7 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
 </para>
 <para>
-To make use of this capability you will need to set up a Standby database
+To make use of this capability you will need to setup a Standby database
 on a second system, as described in <xref linkend="warm-standby">. By
 taking a backup of the Standby server while it is running you will
 have produced an incrementally updated backup. Once this configuration
@@ -1219,35 +1219,38 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
 <itemizedlist>
 <listitem>
 <para>
 Operations on hash indexes are
-not presently WAL-logged, so
-replay will not update these indexes. The recommended workaround
-is to manually <xref linkend="sql-reindex" endterm="sql-reindex-title">
-each
-such index after completing a recovery operation.
+not presently WAL-logged, so replay will not update these indexes.
+The recommended workaround is to manually <command>REINDEX</> each
+such index after completing a recovery operation.
 </para>
 </listitem>
 <listitem>
 <para>
-If a <command>CREATE DATABASE</> command is executed while a base
-backup is being taken, and then the template database that the
-<command>CREATE DATABASE</> copied is modified while the base backup
-is still in progress, it is possible that recovery will cause those
-modifications to be propagated into the created database as well.
-This is of course undesirable. To avoid this risk, it is best not to
-modify any template databases while taking a base backup.
+If a <xref linkend="sql-createdatabase" endterm="sql-createdatabase-title">
+command is executed while a base backup is being taken, and then
+the template database that the <command>CREATE DATABASE</> copied
+is modified while the base backup is still in progress, it is
+possible that recovery will cause those modifications to be
+propagated into the created database as well. This is of course
+undesirable. To avoid this risk, it is best not to modify any
+template databases while taking a base backup.
 </para>
 </listitem>
 <listitem>
 <para>
-<command>CREATE TABLESPACE</> commands are WAL-logged with the literal
-absolute path, and will therefore be replayed as tablespace creations
-with the same absolute path. This might be undesirable if the log is
-being replayed on a different machine. It can be dangerous even if
-the log is being replayed on the same machine, but into a new data
-directory: the replay will still overwrite the contents of the original
-tablespace. To avoid potential gotchas of this sort, the best practice
-is to take a new base backup after creating or dropping tablespaces.
+<xref linkend="sql-createtablespace" endterm="sql-createtablespace-title">
+commands are WAL-logged with the literal absolute path, and will
+therefore be replayed as tablespace creations with the same
+absolute path. This might be undesirable if the log is being
+replayed on a different machine. It can be dangerous even if the
+log is being replayed on the same machine, but into a new data
+directory: the replay will still overwrite the contents of the
+original tablespace. To avoid potential gotchas of this sort,
+the best practice is to take a new base backup after creating or
+dropping tablespaces.
 </para>
 </listitem>
 </itemizedlist>
@@ -1256,21 +1259,20 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
 <para>
 It should also be noted that the default <acronym>WAL</acronym>
 format is fairly bulky since it includes many disk page snapshots.
-These page snapshots are designed to support crash recovery,
-since we may need to fix partially-written disk pages. Depending
-on your system hardware and software, the risk of partial writes may
-be small enough to ignore, in which case you can significantly reduce
-the total volume of archived logs by turning off page snapshots
-using the <xref linkend="guc-full-page-writes"> parameter.
-(Read the notes and warnings in
-<xref linkend="wal"> before you do so.)
-Turning off page snapshots does not prevent use of the logs for PITR
-operations.
-An area for future development is to compress archived WAL data by
-removing unnecessary page copies even when <varname>full_page_writes</>
-is on. In the meantime, administrators
-may wish to reduce the number of page snapshots included in WAL by
-increasing the checkpoint interval parameters as much as feasible.
+These page snapshots are designed to support crash recovery, since
+we may need to fix partially-written disk pages. Depending on
+your system hardware and software, the risk of partial writes may
+be small enough to ignore, in which case you can significantly
+reduce the total volume of archived logs by turning off page
+snapshots using the <xref linkend="guc-full-page-writes">
+parameter. (Read the notes and warnings in <xref linkend="wal">
+before you do so.) Turning off page snapshots does not prevent
+use of the logs for PITR operations. An area for future
+development is to compress archived WAL data by removing
+unnecessary page copies even when <varname>full_page_writes</> is
+on. In the meantime, administrators may wish to reduce the number
+of page snapshots included in WAL by increasing the checkpoint
+interval parameters as much as feasible.
 </para>
 </sect2>
 </sect1>
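[Illustrative aside, not part of this commit] The trade-off described above is driven by ordinary configuration parameters; a sketch with purely illustrative values:

    # postgresql.conf -- illustrative sketch of the trade-off discussed above
    full_page_writes = off        # drop page snapshots from WAL; heed the warnings in the WAL chapter first
    checkpoint_timeout = 30min    # a longer checkpoint interval means fewer fresh page snapshots
    checkpoint_segments = 64      # likewise, allow more WAL between checkpoints

Longer checkpoint intervals also lengthen crash recovery, so treat these numbers as a starting point for measurement, not a recommendation.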
@@ -1326,8 +1328,8 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
 <para>
 Directly moving WAL or "log" records from one database server to another
-is typically described as Log Shipping. PostgreSQL implements file-based
-Log Shipping, meaning
-WAL records are batched one file at a time. WAL
+is typically described as Log Shipping. <productname>PostgreSQL</>
+implements file-based log shipping, which means that
+WAL records are batched one file at a time. WAL
 files can be shipped easily and cheaply over any distance, whether it be
 to an adjacent system, another system on the same site or another system
 on the far side of the globe. The bandwidth required for this technique
@@ -1339,13 +1341,13 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
 </para>
 <para>
-It should be noted that the log shipping is asynchronous, i.e. the WAL
-records are shipped after transaction commit. As a result there can be a
-small window of data loss, should the Primary Server suffer a
-catastrophic failure. The window of data loss is minimised by the use of
-the archive_timeout parameter, which can be set as low as a few seconds
-if required. A very low setting can increase the bandwidth requirements
-for file shipping.
+It should be noted that the log shipping is asynchronous, i.e. the
+WAL records are shipped after transaction commit. As a result there
+can be a small window of data loss, should the Primary Server
+suffer a catastrophic failure. The window of data loss is minimised
+by the use of the <varname>archive_timeout</varname> parameter,
+which can be set as low as a few seconds if required. A very low
+setting can increase the bandwidth requirements
+for file shipping.
 </para>
 <para>
@@ -1374,7 +1376,7 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
 <para>
 In general, log shipping between servers running different release
-levels will not be possible. It is the policy of the PostgreSQL Worldwide
+levels will not be possible. It is the policy of the PostgreSQL Global
 Development Group not to make changes to disk formats during minor release
 upgrades, so it is likely that running different minor release levels
 on Primary and Standby servers will work successfully. However, no
@@ -1389,7 +1391,8 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
 On the Standby server all tablespaces and paths will refer to similarly
 named mount points, so it is important to create the Primary and Standby
 servers so that they are as similar as possible, at least from the
-perspective of the database server. Furthermore, any CREATE TABLESPACE
+perspective of the database server. Furthermore, any <xref
+linkend="sql-createtablespace" endterm="sql-createtablespace-title">
 commands will be passed across as-is, so any new mount points must be
 created on both servers before they are used on the Primary. Hardware
 need not be the same, but experience shows that maintaining two
@@ -1408,28 +1411,31 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
 </para>
 <para>
-The magic that makes the two loosely coupled servers work together is
-simply a restore_command that waits for the next WAL file to be archived
-from the Primary. The restore_command is specified in the recovery.conf
-file on the Standby Server. Normal recovery processing would request a
-file from the WAL archive, causing an error if the file was unavailable.
-For Standby processing it is normal for the next file to be unavailable,
-so we must be patient and wait for it to appear. A waiting
-restore_command can be written as a custom script that loops after
-polling for the existence of the next WAL file. There must also be some
-way to trigger failover, which should interrupt the restore_command,
-break the loop and return a file not found error to the Standby Server.
-This then ends recovery and the Standby will then come up as a normal
+The magic that makes the two loosely coupled servers work together
+is simply a <varname>restore_command</> that waits for the next
+WAL file to be archived from the Primary. The <varname>restore_command</>
+is specified in the <filename>recovery.conf</> file on the Standby
+Server. Normal recovery processing would request a file from the
+WAL archive, causing an error if the file was unavailable. For
+Standby processing it is normal for the next file to be
+unavailable, so we must be patient and wait for it to appear. A
+waiting <varname>restore_command</> can be written as a custom
+script that loops after polling for the existence of the next WAL
+file. There must also be some way to trigger failover, which
+should interrupt the <varname>restore_command</>, break the loop
+and return a file not found error to the Standby Server. This then
+ends recovery and the Standby will then come up as a normal
 server.
 </para>
 <para>
-Sample code for the C version of the restore_command would be be:
+Sample code for the C version of the <varname>restore_command</>
+would be be:
 <programlisting>
 triggered = false;
 while (!NextWALFileReady() && !triggered)
 {
-    sleep(100000L); // wait for ~0.1 sec
+    sleep(100000L); /* wait for ~0.1 sec */
     if (CheckForExternalTrigger())
         triggered = true;
 }
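[Illustrative aside, not part of this commit] The "custom script that loops" mentioned above is perhaps easier to see in shell form than in the C fragment. Everything below is an assumption for illustration: the script name, trigger-file path and polling interval are not defined by this commit.

    #!/bin/sh
    # hypothetical waiting restore_command for a warm standby
    # used from recovery.conf as:  restore_command = '/usr/local/bin/standby_restore.sh %f "%p"'
    ARCHIVE=/mnt/server/archivedir     # archive directory, as in the other examples
    TRIGGER=/tmp/pgsql.trigger         # assumed failover trigger file
    WALFILE="$ARCHIVE/$1"              # %f: file name the server asked for
    DEST="$2"                          # %p: where the server wants it copied

    while true
    do
        if [ -f "$TRIGGER" ]; then
            exit 1                     # "file not found": recovery ends, the Standby comes up
        fi
        if [ -f "$WALFILE" ]; then
            exec cp "$WALFILE" "$DEST"
        fi
        sleep 1
    done

Returning a nonzero status once the trigger file appears is what breaks the loop and lets the Standby finish recovery, exactly as described above.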
@@ -1439,24 +1445,27 @@ if (!triggered)
 </para>
 <para>
-PostgreSQL does not provide the system software required to identify a
-failure on the Primary and notify the Standby system and then the
-Standby database server. Many such tools exist and are well integrated
-with other aspects of a system failover, such as ip address migration.
+<productname>PostgreSQL</productname> does not provide the system
+software required to identify a failure on the Primary and notify
+the Standby system and then the Standby database server. Many such
+tools exist and are well integrated with other aspects of a system
+failover, such as IP address migration.
 </para>
 <para>
-Triggering failover is an important part of planning and design. The
-restore_command is executed in full once for each WAL file. The process
-running the restore_command is therefore created and dies for each file,
-so there is no daemon or server process and so we cannot use signals and
-a signal handler. A more permanent notification is required to trigger
-the failover. It is possible to use a simple timeout facility,
-especially if used in conjunction with a known archive_timeout setting
-on the Primary. This is somewhat error prone since a network or busy
-Primary server might be sufficient to initiate failover. A notification
-mechanism such as the explicit creation of a trigger file is less error
-prone, if this can be arranged.
+Triggering failover is an important part of planning and
+design. The <varname>restore_command</> is executed in full once
+for each WAL file. The process running the <varname>restore_command</>
+is therefore created and dies for each file, so there is no daemon
+or server process and so we cannot use signals and a signal
+handler. A more permanent notification is required to trigger the
+failover. It is possible to use a simple timeout facility,
+especially if used in conjunction with a known
+<varname>archive_timeout</> setting on the Primary. This is
+somewhat error prone since a network or busy Primary server might
+be sufficient to initiate failover. A notification mechanism such
+as the explicit creation of a trigger file is less error prone, if
+this can be arranged.
 </para>
 </sect2>
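[Illustrative aside, not part of this commit] With a waiting script like the sketch shown earlier, "explicit creation of a trigger file" is nothing more exotic than, say,

    touch /tmp/pgsql.trigger     # assumed trigger path from the earlier sketch

run on the Standby; the next poll then returns failure, recovery ends, and the server comes up as the new Primary.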
@@ -1469,13 +1478,14 @@ if (!triggered)
 <orderedlist>
 <listitem>
 <para>
-Set up Primary and Standby systems as near identically as possible,
-including two identical copies of PostgreSQL at same release level.
+Setup Primary and Standby systems as near identically as
+possible, including two identical copies of
+<productname>PostgreSQL</> at the same release level.
 </para>
 </listitem>
 <listitem>
 <para>
-Set up Continuous Archiving from the Primary to a WAL archive located
+Setup Continuous Archiving from the Primary to a WAL archive located
 in a directory on the Standby Server. Ensure that both <xref
 linkend="guc-archive-command"> and <xref linkend="guc-archive-timeout">
 are set. (See <xref linkend="backup-archiving-wal">)
@@ -1489,9 +1499,10 @@ if (!triggered)
 </listitem>
 <listitem>
 <para>
-Begin recovery on the Standby Server from the local WAL archive,
-using a recovery.conf that specifies a restore_command that waits as
-described previously. (See <xref linkend="backup-pitr-recovery">)
+Begin recovery on the Standby Server from the local WAL
+archive, using a <filename>recovery.conf</> that specifies a
+<varname>restore_command</> that waits as described
+previously. (See <xref linkend="backup-pitr-recovery">)
 </para>
 </listitem>
 </orderedlist>
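[Illustrative aside, not part of this commit] Tying the steps of this list together, the Standby's recovery.conf can be as small as a pointer at the waiting script sketched earlier:

    # recovery.conf on the Standby Server -- illustrative sketch
    restore_command = '/usr/local/bin/standby_restore.sh %f "%p"'

The script name is the hypothetical one from the earlier sketch, not something this commit provides.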
@@ -1551,7 +1562,7 @@ if (!triggered)
 At the instant that failover takes place to the Standby, we have only a
 single server in operation. This is known as a degenerate state.
 The former Standby is now the Primary, but the former Primary is down
-and may stay down. We must now fully re-create a Standby server,
+and may stay down. We must now fully recreate a Standby server,
 either on the former Primary system when it comes up, or on a third,
 possibly new, system. Once complete the Primary and Standby can be
 considered to have switched roles. Some people choose to use a third
@@ -1577,18 +1588,18 @@ if (!triggered)
 The main features for Log Shipping in this release are based
 around the file-based Log Shipping described above. It is also
 possible to implement record-based Log Shipping using the
-<function>pg_xlogfile_name_offset</function> function (see <xref
+<function>pg_xlogfile_name_offset()</function> function (see <xref
 linkend="functions-admin">), though this requires custom
 development.
 </para>
 <para>
-An external program can call pg_xlogfile_name_offset() to find out the
-filename and the exact byte offset within it of the latest WAL pointer.
-If the external program regularly polls the server it can find out how
-far forward the pointer has moved. It can then access the WAL file
-directly and copy those bytes across to a less up-to-date copy on a
-Standby Server.
+An external program can call <function>pg_xlogfile_name_offset()</>
+to find out the filename and the exact byte offset within it of
+the latest WAL pointer. If the external program regularly polls
+the server it can find out how far forward the pointer has
+moved. It can then access the WAL file directly and copy those
+bytes across to a less up-to-date copy on a
+Standby Server.
 </para>
 </sect2>
 </sect1>
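[Illustrative aside, not part of this commit] The polling described above could start from a query along these lines, using functions available in this release; the surrounding monitoring logic is left to the external program:

    -- current WAL write position, expressed as a segment file name and byte offset
    SELECT file_name, file_offset
    FROM pg_xlogfile_name_offset(pg_current_xlog_location());

Running this periodically and comparing successive offsets tells the program how many new WAL bytes it could copy across to the Standby.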