Commit 72d422a5 authored by Andrew Dunstan's avatar Andrew Dunstan

Map basebackup tablespaces using a tablespace_map file

Windows can't reliably restore symbolic links from a tar format, so
instead during backup start we create a tablespace_map file, which is
used by the restoring postgres to create the correct links in pg_tblspc.
The backup protocol also now has an option to request this file to be
included in the backup stream, and this is used by pg_basebackup when
operating in tar mode.

This is done on all platforms, not just Windows.

This means that pg_basebackup will not not work in tar mode against 9.4
and older servers, as this protocol option isn't implemented there.

Amit Kapila, reviewed by Dilip Kumar, with a little editing from me.
parent d02f1647
......@@ -836,8 +836,11 @@ SELECT pg_start_backup('label');
<function>pg_start_backup</> creates a <firstterm>backup label</> file,
called <filename>backup_label</>, in the cluster directory with
information about your backup, including the start time and label
string. The file is critical to the integrity of the backup, should
you need to restore from it.
string. The function also creates a <firstterm>tablespace map</> file,
called <filename>tablespace_map</>, in the cluster directory with
information about tablespace symbolic links in <filename>pg_tblspc/</>
if one or more such link is present. Both files are critical to the
integrity of the backup, should you need to restore from it.
</para>
<para>
......@@ -965,17 +968,20 @@ SELECT pg_stop_backup();
<para>
It's also worth noting that the <function>pg_start_backup</> function
makes a file named <filename>backup_label</> in the database cluster
directory, which is removed by <function>pg_stop_backup</>.
This file will of course be archived as a part of your backup dump file.
The backup label file includes the label string you gave to
<function>pg_start_backup</>, as well as the time at which
<function>pg_start_backup</> was run, and the name of the starting WAL
file. In case of confusion it is therefore possible to look inside a
backup dump file and determine exactly which backup session the dump file
came from. However, this file is not merely for your information; its
presence and contents are critical to the proper operation of the system's
recovery process.
makes files named <filename>backup_label</> and
<filename>tablesapce_map</> in the database cluster directory,
which are removed by <function>pg_stop_backup</>. These files will of
course be archived as a part of your backup dump file. The backup label
file includes the label string you gave to <function>pg_start_backup</>,
as well as the time at which <function>pg_start_backup</> was run, and
the name of the starting WAL file. In case of confusion it is therefore
possible to look inside a backup dump file and determine exactly which
backup session the dump file came from. The tablespace map file includes
the symbolic link names as they exist in the directory
<filename>pg_tblspc/</> and the full path of each symbolic link.
These files are not merely for your information; their presence and
contents are critical to the proper operation of the system's recovery
process.
</para>
<para>
......
......@@ -16591,11 +16591,12 @@ SELECT set_config('log_statement_stats', 'off', false);
<function>pg_start_backup</> accepts an
arbitrary user-defined label for the backup. (Typically this would be
the name under which the backup dump file will be stored.) The function
writes a backup label file (<filename>backup_label</>) into the
database cluster's data directory, performs a checkpoint,
and then returns the backup's starting transaction log location as text.
The user can ignore this result value, but it is
provided in case it is useful.
writes a backup label file (<filename>backup_label</>) and, if there
are any links in the <filename>pg_tblspc/</> directory, a tablespace map
file (<filename>tablespace_map</>) into the database cluster's data
directory, performs a checkpoint, and then returns the backup's starting
transaction log location as text. The user can ignore this result value,
but it is provided in case it is useful.
<programlisting>
postgres=# select pg_start_backup('label_goes_here');
pg_start_backup
......@@ -16610,7 +16611,8 @@ postgres=# select pg_start_backup('label_goes_here');
</para>
<para>
<function>pg_stop_backup</> removes the label file created by
<function>pg_stop_backup</> removes the label file and, if it exists,
the <filename>tablespace_map</> file created by
<function>pg_start_backup</>, and creates a backup history file in
the transaction log archive area. The history file includes the label given to
<function>pg_start_backup</>, the starting and ending transaction log locations for
......
......@@ -1882,7 +1882,7 @@ The commands accepted in walsender mode are:
</varlistentry>
<varlistentry>
<term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>]
<term>BASE_BACKUP [<literal>LABEL</literal> <replaceable>'label'</replaceable>] [<literal>PROGRESS</literal>] [<literal>FAST</literal>] [<literal>WAL</literal>] [<literal>NOWAIT</literal>] [<literal>MAX_RATE</literal> <replaceable>rate</replaceable>] [<literal>TABLESPACE_MAP</literal>]
<indexterm><primary>BASE_BACKUP</primary></indexterm>
</term>
<listitem>
......@@ -1968,6 +1968,19 @@ The commands accepted in walsender mode are:
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>TABLESPACE_MAP</literal></term>
<listitem>
<para>
Include information about symbolic links present in the directory
<filename>pg_tblspc</filename> in a file named
<filename>tablespace_map</filename>. The tablespace map file includes
each symbolic link name as it exists in the directory
<filename>pg_tblspc/</> and the full path of that symbolic link.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<para>
......
......@@ -587,11 +587,23 @@ PostgreSQL documentation
tablespaces.
</para>
<para>
When tar format mode is used, it is the user's responsibility to unpack each
tar file before starting postgres. If there are additional tablespaces, the
tar files for them need to be unpacked in the correct locations. In this
case the symbolic links for those tablespaces will be created by Postgres
according to the contents of the <filename>tablespace_map</> file that is
included in the <filename>base.tar</> file.
</para>
<para>
<application>pg_basebackup</application> works with servers of the same
or an older major version, down to 9.1. However, WAL streaming mode (-X
stream) only works with server version 9.3 and later.
stream) only works with server version 9.3 and later, and tar format mode
(--format=tar) of the current version only works with server version 9.5
or later.
</para>
</refsect1>
<refsect1>
......
This diff is collapsed.
......@@ -51,6 +51,7 @@ pg_start_backup(PG_FUNCTION_ARGS)
bool fast = PG_GETARG_BOOL(1);
char *backupidstr;
XLogRecPtr startpoint;
DIR *dir;
backupidstr = text_to_cstring(backupid);
......@@ -59,7 +60,16 @@ pg_start_backup(PG_FUNCTION_ARGS)
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser or replication role to run a backup")));
startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL);
/* Make sure we can open the directory with tablespaces in it */
dir = AllocateDir("pg_tblspc");
if (!dir)
ereport(ERROR,
(errmsg("could not open directory \"%s\": %m", "pg_tblspc")));
startpoint = do_pg_start_backup(backupidstr, fast, NULL, NULL,
dir, NULL, NULL, false, true);
FreeDir(dir);
PG_RETURN_LSN(startpoint);
}
......
......@@ -46,11 +46,12 @@ typedef struct
bool nowait;
bool includewal;
uint32 maxrate;
bool sendtblspcmapfile;
} basebackup_options;
static int64 sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces);
static int64 sendTablespace(char *path, bool sizeonly);
static int64 sendDir(char *path, int basepathlen, bool sizeonly,
List *tablespaces, bool sendtblspclinks);
static bool sendFile(char *readfilename, char *tarfilename,
struct stat * statbuf, bool missing_ok);
static void sendFileWithContent(const char *filename, const char *content);
......@@ -93,15 +94,6 @@ static int64 elapsed_min_unit;
/* The last check of the transfer rate. */
static int64 throttled_last;
typedef struct
{
char *oid;
char *path;
char *rpath; /* relative path within PGDATA, or NULL */
int64 size;
} tablespaceinfo;
/*
* Called when ERROR or FATAL happens in perform_base_backup() after
* we have started the backup - make sure we end it!
......@@ -126,14 +118,18 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir)
XLogRecPtr endptr;
TimeLineID endtli;
char *labelfile;
char *tblspc_map_file = NULL;
int datadirpathlen;
List *tablespaces = NIL;
datadirpathlen = strlen(DataDir);
backup_started_in_recovery = RecoveryInProgress();
startptr = do_pg_start_backup(opt->label, opt->fastcheckpoint, &starttli,
&labelfile);
&labelfile, tblspcdir, &tablespaces,
&tblspc_map_file,
opt->progress, opt->sendtblspcmapfile);
/*
* Once do_pg_start_backup has been called, ensure that any failure causes
* us to abort the backup so we don't "leak" a backup counter. For this reason,
......@@ -143,9 +139,7 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir)
PG_ENSURE_ERROR_CLEANUP(base_backup_cleanup, (Datum) 0);
{
List *tablespaces = NIL;
ListCell *lc;
struct dirent *de;
tablespaceinfo *ti;
SendXlogRecPtrResult(startptr, starttli);
......@@ -162,70 +156,9 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir)
else
statrelpath = pgstat_stat_directory;
/* Collect information about all tablespaces */
while ((de = ReadDir(tblspcdir, "pg_tblspc")) != NULL)
{
char fullpath[MAXPGPATH];
char linkpath[MAXPGPATH];
char *relpath = NULL;
int rllen;
/* Skip special stuff */
if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
continue;
snprintf(fullpath, sizeof(fullpath), "pg_tblspc/%s", de->d_name);
#if defined(HAVE_READLINK) || defined(WIN32)
rllen = readlink(fullpath, linkpath, sizeof(linkpath));
if (rllen < 0)
{
ereport(WARNING,
(errmsg("could not read symbolic link \"%s\": %m",
fullpath)));
continue;
}
else if (rllen >= sizeof(linkpath))
{
ereport(WARNING,
(errmsg("symbolic link \"%s\" target is too long",
fullpath)));
continue;
}
linkpath[rllen] = '\0';
/*
* Relpath holds the relative path of the tablespace directory
* when it's located within PGDATA, or NULL if it's located
* elsewhere.
*/
if (rllen > datadirpathlen &&
strncmp(linkpath, DataDir, datadirpathlen) == 0 &&
IS_DIR_SEP(linkpath[datadirpathlen]))
relpath = linkpath + datadirpathlen + 1;
ti = palloc(sizeof(tablespaceinfo));
ti->oid = pstrdup(de->d_name);
ti->path = pstrdup(linkpath);
ti->rpath = relpath ? pstrdup(relpath) : NULL;
ti->size = opt->progress ? sendTablespace(fullpath, true) : -1;
tablespaces = lappend(tablespaces, ti);
#else
/*
* If the platform does not have symbolic links, it should not be
* possible to have tablespaces - clearly somebody else created
* them. Warn about it and ignore.
*/
ereport(WARNING,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("tablespaces are not supported on this platform")));
#endif
}
/* Add a node for the base directory at the end */
ti = palloc0(sizeof(tablespaceinfo));
ti->size = opt->progress ? sendDir(".", 1, true, tablespaces) : -1;
ti->size = opt->progress ? sendDir(".", 1, true, tablespaces, true) : -1;
tablespaces = lappend(tablespaces, ti);
/* Send tablespace header */
......@@ -274,8 +207,17 @@ perform_base_backup(basebackup_options *opt, DIR *tblspcdir)
/* In the main tar, include the backup_label first... */
sendFileWithContent(BACKUP_LABEL_FILE, labelfile);
/* ... then the bulk of the files ... */
sendDir(".", 1, false, tablespaces);
/*
* Send tablespace_map file if required and then the bulk of
* the files.
*/
if (tblspc_map_file && opt->sendtblspcmapfile)
{
sendFileWithContent(TABLESPACE_MAP, tblspc_map_file);
sendDir(".", 1, false, tablespaces, false);
}
else
sendDir(".", 1, false, tablespaces, true);
/* ... and pg_control after everything else. */
if (lstat(XLOG_CONTROL_FILE, &statbuf) != 0)
......@@ -567,6 +509,7 @@ parse_basebackup_options(List *options, basebackup_options *opt)
bool o_nowait = false;
bool o_wal = false;
bool o_maxrate = false;
bool o_tablespace_map = false;
MemSet(opt, 0, sizeof(*opt));
foreach(lopt, options)
......@@ -637,6 +580,15 @@ parse_basebackup_options(List *options, basebackup_options *opt)
opt->maxrate = (uint32) maxrate;
o_maxrate = true;
}
else if (strcmp(defel->defname, "tablespace_map") == 0)
{
if (o_tablespace_map)
ereport(ERROR,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("duplicate option \"%s\"", defel->defname)));
opt->sendtblspcmapfile = true;
o_tablespace_map = true;
}
else
elog(ERROR, "option \"%s\" not recognized",
defel->defname);
......@@ -865,7 +817,7 @@ sendFileWithContent(const char *filename, const char *content)
*
* Only used to send auxiliary tablespaces, not PGDATA.
*/
static int64
int64
sendTablespace(char *path, bool sizeonly)
{
int64 size;
......@@ -899,7 +851,7 @@ sendTablespace(char *path, bool sizeonly)
size = 512; /* Size of the header just added */
/* Send all the files in the tablespace version directory */
size += sendDir(pathbuf, strlen(path), sizeonly, NIL);
size += sendDir(pathbuf, strlen(path), sizeonly, NIL, true);
return size;
}
......@@ -911,9 +863,14 @@ sendTablespace(char *path, bool sizeonly)
*
* Omit any directory in the tablespaces list, to avoid backing up
* tablespaces twice when they were created inside PGDATA.
*
* If sendtblspclinks is true, we need to include symlink
* information in the tar file. If not, we can skip that
* as it will be sent separately in the tablespace_map file.
*/
static int64
sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces,
bool sendtblspclinks)
{
DIR *dir;
struct dirent *de;
......@@ -941,13 +898,17 @@ sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
continue;
/*
* If there's a backup_label file, it belongs to a backup started by
* the user with pg_start_backup(). It is *not* correct for this
* backup, our backup_label is injected into the tar separately.
* If there's a backup_label or tablespace_map file, it belongs to a
* backup started by the user with pg_start_backup(). It is *not*
* correct for this backup, our backup_label/tablespace_map is injected
* into the tar separately.
*/
if (strcmp(de->d_name, BACKUP_LABEL_FILE) == 0)
continue;
if (strcmp(de->d_name, TABLESPACE_MAP) == 0)
continue;
/*
* Check if the postmaster has signaled us to exit, and abort with an
* error in that case. The error handler further up will call
......@@ -1120,8 +1081,15 @@ sendDir(char *path, int basepathlen, bool sizeonly, List *tablespaces)
break;
}
}
/*
* skip sending directories inside pg_tblspc, if not required.
*/
if (strcmp(pathbuf, "./pg_tblspc") == 0 && !sendtblspclinks)
skip_this_dir = true;
if (!skip_this_dir)
size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces);
size += sendDir(pathbuf, basepathlen, sizeonly, tablespaces, sendtblspclinks);
}
else if (S_ISREG(statbuf.st_mode))
{
......
......@@ -71,13 +71,16 @@ Node *replication_parse_result;
%token K_NOWAIT
%token K_MAX_RATE
%token K_WAL
%token K_TABLESPACE_MAP
%token K_TIMELINE
%token K_PHYSICAL
%token K_LOGICAL
%token K_SLOT
%type <node> command
%type <node> base_backup start_replication start_logical_replication create_replication_slot drop_replication_slot identify_system timeline_history
%type <node> base_backup start_replication start_logical_replication
create_replication_slot drop_replication_slot identify_system
timeline_history
%type <list> base_backup_opt_list
%type <defelt> base_backup_opt
%type <uintval> opt_timeline
......@@ -119,12 +122,14 @@ identify_system:
;
/*
* BASE_BACKUP [LABEL '<label>'] [PROGRESS] [FAST] [WAL] [NOWAIT] [MAX_RATE %d]
* BASE_BACKUP [LABEL '<label>'] [PROGRESS] [FAST] [WAL] [NOWAIT]
* [MAX_RATE %d] [TABLESPACE_MAP]
*/
base_backup:
K_BASE_BACKUP base_backup_opt_list
{
BaseBackupCmd *cmd = (BaseBackupCmd *) makeNode(BaseBackupCmd);
BaseBackupCmd *cmd =
(BaseBackupCmd *) makeNode(BaseBackupCmd);
cmd->options = $2;
$$ = (Node *) cmd;
}
......@@ -168,6 +173,11 @@ base_backup_opt:
$$ = makeDefElem("max_rate",
(Node *)makeInteger($2));
}
| K_TABLESPACE_MAP
{
$$ = makeDefElem("tablespace_map",
(Node *)makeInteger(TRUE));
}
;
create_replication_slot:
......
......@@ -88,6 +88,7 @@ NOWAIT { return K_NOWAIT; }
PROGRESS { return K_PROGRESS; }
MAX_RATE { return K_MAX_RATE; }
WAL { return K_WAL; }
TABLESPACE_MAP { return K_TABLESPACE_MAP; }
TIMELINE { return K_TIMELINE; }
START_REPLICATION { return K_START_REPLICATION; }
CREATE_REPLICATION_SLOT { return K_CREATE_REPLICATION_SLOT; }
......
......@@ -1652,13 +1652,14 @@ BaseBackup(void)
maxrate_clause = psprintf("MAX_RATE %u", maxrate);
basebkp =
psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s",
psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s",
escaped_label,
showprogress ? "PROGRESS" : "",
includewal && !streamwal ? "WAL" : "",
fastcheckpoint ? "FAST" : "",
includewal ? "NOWAIT" : "",
maxrate_clause ? maxrate_clause : "");
maxrate_clause ? maxrate_clause : "",
format == 't' ? "TABLESPACE_MAP": "");
if (PQsendQuery(conn, basebkp) == 0)
{
......
......@@ -17,6 +17,8 @@
#include "access/xlogreader.h"
#include "datatype/timestamp.h"
#include "lib/stringinfo.h"
#include "nodes/pg_list.h"
#include "storage/fd.h"
/* Sync methods */
......@@ -258,7 +260,9 @@ extern void assign_checkpoint_completion_target(double newval, void *extra);
* Starting/stopping a base backup
*/
extern XLogRecPtr do_pg_start_backup(const char *backupidstr, bool fast,
TimeLineID *starttli_p, char **labelfile);
TimeLineID *starttli_p, char **labelfile, DIR *tblspcdir,
List **tablespaces, char **tblspcmapfile, bool infotbssize,
bool needtblspcmapfile);
extern XLogRecPtr do_pg_stop_backup(char *labelfile, bool waitforarchive,
TimeLineID *stoptli_p);
extern void do_pg_abort_backup(void);
......@@ -267,4 +271,7 @@ extern void do_pg_abort_backup(void);
#define BACKUP_LABEL_FILE "backup_label"
#define BACKUP_LABEL_OLD "backup_label.old"
#define TABLESPACE_MAP "tablespace_map"
#define TABLESPACE_MAP_OLD "tablespace_map.old"
#endif /* XLOG_H */
......@@ -21,6 +21,16 @@
#define MAX_RATE_UPPER 1048576
typedef struct
{
char *oid;
char *path;
char *rpath; /* relative path within PGDATA, or NULL */
int64 size;
} tablespaceinfo;
extern void SendBaseBackup(BaseBackupCmd *cmd);
extern int64 sendTablespace(char *path, bool sizeonly);
#endif /* _BASEBACKUP_H */
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment