Abuhujair Javed / Postgres FD Implementation / Commits / 2ff4e440

Commit 2ff4e440, authored Apr 22, 2004 by Neil Conway
Parent: e3391133

Improvements to the backup & restore documentation.

Showing 2 changed files with 101 additions and 57 deletions (+101, -57)
doc/src/sgml/backup.sgml   +24 -25
doc/src/sgml/perform.sgml  +77 -32
doc/src/sgml/backup.sgml (view file @ 2ff4e440)

 <!--
-$PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.38 2004/03/09 16:57:46 neilc Exp $
+$PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.39 2004/04/22 07:02:35 neilc Exp $
 -->
 <chapter id="backup">
  <title>Backup and Restore</title>
...
@@ -30,7 +30,7 @@ $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.38 2004/03/09 16:57:46 neilc Exp
    commands that, when fed back to the server, will recreate the
    database in the same state as it was at the time of the dump.
    <productname>PostgreSQL</> provides the utility program
-   <application>pg_dump</> for this purpose. The basic usage of this
+   <xref linkend="app-pgdump"> for this purpose. The basic usage of this
    command is:
 <synopsis>
 pg_dump <replaceable class="parameter">dbname</replaceable> > <replaceable class="parameter">outfile</replaceable>
...
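Note (illustration, not part of the patch): the dump-and-restore cycle this section documents looks roughly like the following; the database and file names are placeholders.

    # dump the database "mydb" into a SQL script
    pg_dump mydb > outfile.sql
    # recreate it elsewhere: create an empty database, then replay the script
    createdb mydb
    psql mydb < outfile.sql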
@@ -126,10 +126,11 @@ psql <replaceable class="parameter">dbname</replaceable> < <replaceable class
  </para>

  <para>
-   Once restored, it is wise to run <command>ANALYZE</> on each
-   database so the optimizer has useful statistics. You
-   can also run <command>vacuumdb -a -z</> to <command>ANALYZE</> all
-   databases.
+   Once restored, it is wise to run <xref linkend="sql-analyze"
+   endterm="sql-analyze-title"> on each database so the optimizer has
+   useful statistics. You can also run <command>vacuumdb -a -z</> to
+   <command>VACUUM ANALYZE</> all databases; this is equivalent to
+   running <command>VACUUM ANALYZE</command> manually.
  </para>

  <para>
...
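Note (illustration, not part of the patch): the vacuumdb invocation mentioned above, as run from the shell after a restore (connection options omitted).

    # VACUUM ANALYZE every database in the cluster after a restore;
    # -a = all databases, -z = also gather optimizer statistics
    vacuumdb -a -z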
@@ -153,13 +154,11 @@ pg_dump -h <replaceable>host1</> <replaceable>dbname</> | psql -h <replaceable>h
   </para>
  </important>

- <tip>
-  <para>
-   Restore performance can be improved by increasing the
-   configuration parameter <xref
-   linkend="guc-maintenance-work-mem">.
-  </para>
- </tip>
+ <para>
+  For advice on how to load large amounts of data into
+  <productname>PostgreSQL</productname> efficiently, refer to <xref
+  linkend="populate">.
+ </para>
 </sect2>

 <sect2 id="backup-dump-all">
...
@@ -167,12 +166,11 @@ pg_dump -h <replaceable>host1</> <replaceable>dbname</> | psql -h <replaceable>h
  <para>
   The above mechanism is cumbersome and inappropriate when backing
   up an entire database cluster. For this reason the
-  <application>pg_dumpall</> program is provided.
+  <xref linkend="app-pg-dumpall"> program is provided.
   <application>pg_dumpall</> backs up each database in a given
-  cluster, and also preserves cluster-wide data such as
-  users and groups. The call sequence for
-  <application>pg_dumpall</> is simply
+  cluster, and also preserves cluster-wide data such as users and
+  groups. The basic usage of this command is:
 <synopsis>
 pg_dumpall > <replaceable>outfile</>
 </synopsis>
...
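Note (illustration, not part of the patch): a whole-cluster dump and restore with pg_dumpall, as sketched in the patched text; the output file name is a placeholder and the restore target follows the hunk shown below ("psql template1 < infile").

    # dump every database plus cluster-wide objects (users, groups)
    pg_dumpall > outfile.sql
    # restore by feeding the script to psql, connected to any existing database
    psql template1 < outfile.sql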
@@ -195,7 +193,7 @@ psql template1 < <replaceable class="parameter">infile</replaceable>
   Since <productname>PostgreSQL</productname> allows tables larger
   than the maximum file size on your system, it can be problematic
   to dump such a table to a file, since the resulting file will likely
-  be larger than the maximum size allowed by your system. As
+  be larger than the maximum size allowed by your system. Since
   <application>pg_dump</> can write to the standard output, you can
   just use standard Unix tools to work around this possible problem.
  </para>
...
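Note (illustration, not part of the patch): the "standard Unix tools" workaround alluded to above usually means compressing or splitting pg_dump's standard output; database and file names are placeholders.

    # compressed dump, and restoring from it
    pg_dump mydb | gzip > mydb.sql.gz
    gunzip -c mydb.sql.gz | psql mydb
    # or split the dump into 1-gigabyte pieces
    pg_dump mydb | split -b 1000m - mydb.sql.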
@@ -274,7 +272,7 @@ pg_dump -Fc <replaceable class="parameter">dbname</replaceable> > <replaceable c
   For reasons of backward compatibility, <application>pg_dump</>
   does not dump large objects by default.<indexterm><primary>large
   object</primary><secondary>backup</secondary></indexterm> To dump
-  large objects you must use either the custom or the TAR output
+  large objects you must use either the custom or the tar output
   format, and use the <option>-b</> option in
   <application>pg_dump</>. See the reference pages for details. The
   directory <filename>contrib/pg_dumplo</> of the
...
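Note (illustration, not part of the patch): dumping large objects with the custom format and the -b option, per the patched text; names are placeholders.

    # custom-format dump including large objects
    pg_dump -Fc -b mydb > mydb.dump
    # restore the custom-format archive with pg_restore
    pg_restore -d mydb mydb.dump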
@@ -315,11 +313,12 @@ tar -cf backup.tar /usr/local/pgsql/data
    <para>
     The database server <emphasis>must</> be shut down in order to
     get a usable backup. Half-way measures such as disallowing all
-    connections will not work as there is always some buffering
-    going on. Information about stopping the server can be
-    found in <xref linkend="postmaster-shutdown">. Needless to say
-    that you also need to shut down the server before restoring the
-    data.
+    connections will <emphasis>not</emphasis> work
+    (<command>tar</command> and similar tools do not take an atomic
+    snapshot of the state of the filesystem at a point in
+    time). Information about stopping the server can be found in
+    <xref linkend="postmaster-shutdown">. Needless to say that you
+    also need to shut down the server before restoring the data.
    </para>
   </listitem>
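Note (illustration, not part of the patch): the filesystem-level backup this list item describes, with the server stopped first; the data directory path is the placeholder used in the hunk header above.

    # stop the server, archive the data directory, then restart
    pg_ctl stop -D /usr/local/pgsql/data
    tar -cf backup.tar /usr/local/pgsql/data
    pg_ctl start -D /usr/local/pgsql/data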
...
...
doc/src/sgml/perform.sgml (view file @ 2ff4e440)

 <!--
-$PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.43 2004/03/25 18:57:57 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.44 2004/04/22 07:02:36 neilc Exp $
 -->
 <chapter id="performance-tips">
...
@@ -28,8 +28,8 @@ $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.43 2004/03/25 18:57:57 tgl Exp
   plan</firstterm> for each query it is given. Choosing the right
   plan to match the query structure and the properties of the data
   is absolutely critical for good performance. You can use the
-  <command>EXPLAIN</command> command to see what query plan the system
-  creates for any query.
+  <xref linkend="sql-explain" endterm="sql-explain-title"> command
+  to see what query plan the system creates for any query.
   Plan-reading is an art that deserves an extensive tutorial, which
   this is not; but here is some basic information.
  </para>
...
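Note (illustration, not part of the patch): viewing a plan with EXPLAIN, as the patched sentence describes; database, table, and column names are placeholders.

    # show the plan the optimizer chooses for a query
    psql mydb -c 'EXPLAIN SELECT * FROM foo WHERE bar = 42;'
    # EXPLAIN ANALYZE additionally executes the query and reports actual run times
    psql mydb -c 'EXPLAIN ANALYZE SELECT * FROM foo WHERE bar = 42;'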
@@ -638,30 +638,51 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
   </indexterm>

   <para>
-   Turn off autocommit and just do one commit at
-   the end. (In plain SQL, this means issuing <command>BEGIN</command>
-   at the start and <command>COMMIT</command> at the end. Some client
-   libraries may do this behind your back, in which case you need to
-   make sure the library does it when you want it done.)
-   If you allow each insertion to be committed separately,
-   <productname>PostgreSQL</productname> is doing a lot of work for each
-   row that is added.
-   An additional benefit of doing all insertions in one transaction
-   is that if the insertion of one row were to fail then the
-   insertion of all rows inserted up to that point would be rolled
-   back, so you won't be stuck with partially loaded data.
+   Turn off autocommit and just do one commit at the end. (In plain
+   SQL, this means issuing <command>BEGIN</command> at the start and
+   <command>COMMIT</command> at the end. Some client libraries may
+   do this behind your back, in which case you need to make sure the
+   library does it when you want it done.) If you allow each
+   insertion to be committed separately,
+   <productname>PostgreSQL</productname> is doing a lot of work for
+   each row that is added. An additional benefit of doing all
+   insertions in one transaction is that if the insertion of one row
+   were to fail then the insertion of all rows inserted up to that
+   point would be rolled back, so you won't be stuck with partially
+   loaded data.
   </para>
+
+  <para>
+   If you are issuing a large sequence of <command>INSERT</command>
+   commands to bulk load some data, also consider using <xref
+   linkend="sql-prepare" endterm="sql-prepare-title"> to create a
+   prepared <command>INSERT</command> statement. Since you are
+   executing the same command multiple times, it is more efficient to
+   prepare the command once and then use <command>EXECUTE</command>
+   as many times as required.
+  </para>
  </sect2>
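Note (illustration, not part of the patch): what the two paragraphs above describe, expressed as a psql session; the table, prepared-statement name, column types, and values are placeholders.

    # load many rows in a single transaction, using a prepared INSERT
    psql mydb <<'SQL'
    BEGIN;
    PREPARE bulk_insert (integer, text) AS
        INSERT INTO mytable VALUES ($1, $2);
    EXECUTE bulk_insert(1, 'one');
    EXECUTE bulk_insert(2, 'two');
    -- ... one EXECUTE per row ...
    COMMIT;
    SQL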
<sect2 id="populate-copy-from">
<sect2 id="populate-copy-from">
<title>Use <command>COPY FROM</command></title>
<title>Use <command>COPY</command></title>
<para>
Use <xref linkend="sql-copy" endterm="sql-copy-title"> to load
all the rows in one command, instead of using a series of
<command>INSERT</command> commands. The <command>COPY</command>
command is optimized for loading large numbers of rows; it is less
flexible than <command>INSERT</command>, but incurs significantly
less overhead for large data loads. Since <command>COPY</command>
is a single command, there is no need to disable autocommit if you
use this method to populate a table.
</para>
<para>
<para>
Use <command>COPY FROM STDIN</command> to load all the rows in one
Note that loading a large number of rows using
command, instead of using a series of <command>INSERT</command>
<command>COPY</command> is almost always faster than using
commands. This reduces parsing, planning, etc. overhead a great
<command>INSERT</command>, even if multiple
deal. If you do this then it is not necessary to turn off
<command>INSERT</command> commands are batched into a single
autocommit, since it is only one command anyway
.
transaction
.
</para>
</para>
</sect2>
</sect2>
...
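Note (illustration, not part of the patch): loading a table in a single COPY command through psql, as the rewritten section recommends; database, table, and data-file names are placeholders, and \copy is psql's client-side variant of COPY.

    # load a whole data file into the table with one COPY command
    psql mydb -c "\copy mytable from 'data.txt'"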
@@ -678,11 +699,12 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
   <para>
    If you are augmenting an existing table, you can drop the index,
-   load the table, then recreate the index. Of
-   course, the database performance for other users may be adversely
-   affected during the time that the index is missing. One should also
-   think twice before dropping unique indexes, since the error checking
-   afforded by the unique constraint will be lost while the index is missing.
+   load the table, and then recreate the index. Of course, the
+   database performance for other users may be adversely affected
+   during the time that the index is missing. One should also think
+   twice before dropping unique indexes, since the error checking
+   afforded by the unique constraint will be lost while the index is
+   missing.
   </para>
  </sect2>
...
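Note (illustration, not part of the patch): the drop-load-recreate pattern described above; index, table, and column names are placeholders.

    psql mydb <<'SQL'
    DROP INDEX mytable_col_idx;
    -- ... bulk load the additional rows here, e.g. with COPY ...
    CREATE INDEX mytable_col_idx ON mytable (col);
    SQL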
@@ -701,16 +723,39 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
   </para>
  </sect2>

+ <sect2 id="populate-checkpoint-segments">
+  <title>Increase <varname>checkpoint_segments</varname></title>
+
+  <para>
+   Temporarily increasing the <xref
+   linkend="guc-checkpoint-segments"> configuration variable can also
+   make large data loads faster. This is because loading a large
+   amount of data into <productname>PostgreSQL</productname> can
+   cause checkpoints to occur more often than the normal checkpoint
+   frequency (specified by the <varname>checkpoint_timeout</varname>
+   configuration variable). Whenever a checkpoint occurs, all dirty
+   pages must be flushed to disk. By increasing
+   <varname>checkpoint_segments</varname> temporarily during bulk
+   data loads, the number of checkpoints that are required can be
+   reduced.
+  </para>
+ </sect2>
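Note (illustration, not part of the patch): one blunt way to apply the new tip is to raise checkpoint_segments in postgresql.conf for the duration of the load; the value and paths are placeholders, and the shipped default at the time was 3.

    # illustrative only: raise checkpoint_segments before a bulk load
    echo 'checkpoint_segments = 30' >> /usr/local/pgsql/data/postgresql.conf
    pg_ctl restart -D /usr/local/pgsql/data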
<sect2 id="populate-analyze">
<sect2 id="populate-analyze">
<title>Run <command>ANALYZE</command> Afterwards</title>
<title>Run <command>ANALYZE</command> Afterwards</title>
<para>
<para>
It's a good idea to run <command>ANALYZE</command> or <command>VACUUM
Whenever you have significantly altered the distribution of data
ANALYZE</command> anytime you've added or updated a lot of data,
within a table, running <xref linkend="sql-analyze"
including just after initially populating a table. This ensures that
endterm="sql-analyze-title"> is strongly recommended. This
the planner has up-to-date statistics about the table. With no statistics
includes when bulk loading large amounts of data into
or obsolete statistics, the planner may make poor choices of query plans,
<productname>PostgreSQL</productname>. Running
leading to bad performance on queries that use your table.
<command>ANALYZE</command> (or <command>VACUUM ANALYZE</command>)
ensures that the planner has up-to-date statistics about the
table. With no statistics or obsolete statistics, the planner may
make poor decisions during query planning, leading to poor
performance on any tables with inaccurate or nonexistent
statistics.
</para>
</para>
</sect2>
</sect2>
</sect1>
</sect1>
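Note (illustration, not part of the patch): refreshing planner statistics after a bulk load; the database and table names are placeholders, and omitting the table name analyzes the whole database.

    psql mydb -c 'ANALYZE mytable;'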
...
...