Commit fab13dc5 authored by Fujii Masao

Make pg_basebackup ask the server to estimate the total backup size, by default.

This commit changes pg_basebackup so that it specifies the PROGRESS option in
the BASE_BACKUP replication command whether or not --progress is specified.
This causes the server to estimate the total backup size and report it in
pg_stat_progress_basebackup.backup_total by default. This is a reasonable
default because the estimation does not take long in most cases.

Also, this commit adds a new option, --no-estimate-size, to pg_basebackup.
This option prevents the server from performing the estimation, and so is
useful for avoiding the estimation time when it would be too long.

Author: Fujii Masao
Reviewed-by: Magnus Hagander, Amit Langote
Discussion: https://postgr.es/m/CABUevEyDPPSjP7KRvfTXPdqOdY5aWNkqsB5aAXs3bco5ZwtGHg@mail.gmail.com
parent c314c147
@@ -4392,10 +4392,7 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
      <entry><structfield>backup_total</structfield></entry>
      <entry><type>bigint</type></entry>
      <entry>
-      Total amount of data that will be streamed. If progress reporting
-      is not enabled in <application>pg_basebackup</application>
-      (i.e., <literal>--progress</literal> option is not specified),
-      this is <literal>0</literal>. Otherwise, this is estimated and
+      Total amount of data that will be streamed. This is estimated and
       reported as of the beginning of
       <literal>streaming database files</literal> phase. Note that
       this is only an approximation since the database
@@ -4403,7 +4400,10 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
       and WAL log may be included in the backup later. This is always
       the same value as <structfield>backup_streamed</structfield>
       once the amount of data streamed exceeds the estimated
-      total size.
+      total size. If the estimation is disabled in
+      <application>pg_basebackup</application>
+      (i.e., <literal>--no-estimate-size</literal> option is specified),
+      this is <literal>0</literal>.
      </entry>
     </row>
     <row>
......
@@ -460,21 +460,6 @@ PostgreSQL documentation
         in this case the estimated target size will increase once it passes the
         total estimate without WAL.
        </para>
-       <para>
-        When this is enabled, the backup will start by enumerating the size of
-        the entire database, and then go back and send the actual contents.
-        This may make the backup take slightly longer, and in particular it
-        will take longer before the first data is sent.
-       </para>
-       <para>
-        Whether this is enabled or not, the
-        <structname>pg_stat_progress_basebackup</structname> view
-        report the progress of the backup in the server side. But note
-        that the total amount of data that will be streamed is estimated
-        and reported only when this option is enabled. In other words,
-        <literal>backup_total</literal> column in the view always
-        indicates <literal>0</literal> if this option is disabled.
-       </para>
       </listitem>
      </varlistentry>
@@ -552,6 +537,30 @@ PostgreSQL documentation
        </para>
       </listitem>
      </varlistentry>
+     <varlistentry>
+      <term><option>--no-estimate-size</option></term>
+      <listitem>
+       <para>
+        This option prevents the server from estimating the total
+        amount of backup data that will be streamed, resulting in the
+        <literal>backup_total</literal> column in the
+        <structname>pg_stat_progress_basebackup</structname>
+        to be <literal>0</literal>.
+       </para>
+       <para>
+        Without this option, the backup will start by enumerating
+        the size of the entire database, and then go back and send
+        the actual contents. This may make the backup take slightly
+        longer, and in particular it will take longer before the first
+        data is sent. This option is useful to avoid such estimation
+        time if it's too long.
+       </para>
+       <para>
+        This option is not allowed when using <option>--progress</option>.
+       </para>
+      </listitem>
+     </varlistentry>
     </variablelist>
    </para>
......
@@ -121,6 +121,7 @@ static char *label = "pg_basebackup base backup";
 static bool noclean = false;
 static bool checksum_failure = false;
 static bool showprogress = false;
+static bool estimatesize = true;
 static int verbose = 0;
 static int compresslevel = 0;
 static IncludeWal includewal = STREAM_WAL;
@@ -386,6 +387,7 @@ usage(void)
     printf(_(" --no-slot prevent creation of temporary replication slot\n"));
     printf(_(" --no-verify-checksums\n"
              " do not verify checksums\n"));
+    printf(_(" --no-estimate-size do not estimate backup size in server side\n"));
     printf(_(" -?, --help show this help, then exit\n"));
     printf(_("\nConnection options:\n"));
     printf(_(" -d, --dbname=CONNSTR connection string\n"));
@@ -1741,7 +1743,7 @@ BaseBackup(void)
     basebkp =
         psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s %s",
                  escaped_label,
-                 showprogress ? "PROGRESS" : "",
+                 estimatesize ? "PROGRESS" : "",
                  includewal == FETCH_WAL ? "WAL" : "",
                  fastcheckpoint ? "FAST" : "",
                  includewal == NO_WAL ? "" : "NOWAIT",
@@ -2066,6 +2068,7 @@ main(int argc, char **argv)
         {"waldir", required_argument, NULL, 1},
         {"no-slot", no_argument, NULL, 2},
         {"no-verify-checksums", no_argument, NULL, 3},
+        {"no-estimate-size", no_argument, NULL, 4},
         {NULL, 0, NULL, 0}
     };
     int c;
@@ -2234,6 +2237,9 @@ main(int argc, char **argv)
             case 3:
                 verify_checksums = false;
                 break;
+            case 4:
+                estimatesize = false;
+                break;
             default:
                 /*
@@ -2356,6 +2362,14 @@ main(int argc, char **argv)
     }
 #endif
+    if (showprogress && !estimatesize)
+    {
+        pg_log_error("--progress and --no-estimate-size are incompatible options");
+        fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
+                progname);
+        exit(1);
+    }
     /* connection in replication mode to server */
     conn = GetConnection();
     if (!conn)
......
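
The behavior change in the diff can be summarized in a small standalone C sketch. It is an illustration only, not code from pg_basebackup.c: the helper names `options_are_compatible` and `format_base_backup` are hypothetical, the command string is reduced to the label and the PROGRESS keyword, and the label is not escaped as it is in the real code.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in pg_basebackup.c): returns false for the one
 * combination the commit rejects, mirroring the check added to main().
 */
static bool
options_are_compatible(bool showprogress, bool estimatesize)
{
    /*
     * --progress relies on the server's size estimate, so it cannot be
     * combined with --no-estimate-size.
     */
    return !(showprogress && !estimatesize);
}

/*
 * Hypothetical helper: formats a reduced BASE_BACKUP command. After the
 * commit, PROGRESS is driven by estimatesize rather than showprogress.
 * The remaining options of the real command are omitted here, and the
 * label is assumed to need no escaping.
 */
static int
format_base_backup(char *buf, size_t buflen, const char *label,
                   bool estimatesize)
{
    return snprintf(buf, buflen, "BASE_BACKUP LABEL '%s' %s",
                    label, estimatesize ? "PROGRESS" : "");
}
```

With `estimatesize` left at its default of `true`, the sketch emits PROGRESS even when `showprogress` is `false`, which is the new default described in the commit message.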