Commit 61081e75 authored by Heikki Linnakangas's avatar Heikki Linnakangas

Add pg_rewind, for re-synchronizing a master server after failback.

Earlier versions of this tool were available (and still are) on GitHub.

Thanks to Michael Paquier, Alvaro Herrera, Peter Eisentraut, Amit Kapila,
and Satoshi Nagayasu for review.
parent 87cec51d
...@@ -1272,7 +1272,9 @@ primary_slot_name = 'node_a_slot'
and might stay down. To return to normal operation, a standby server
must be recreated,
either on the former primary system when it comes up, or on a third,
possibly new, system. The <xref linkend="app-pgrewind"> utility can be
used to speed up this process on large clusters.
Once complete, the primary and standby can be
considered to have switched roles. Some people choose to use a third
server to provide backup for the new primary until the new standby
server is recreated,
......
...@@ -190,6 +190,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY pgRecvlogical SYSTEM "pg_recvlogical.sgml">
<!ENTITY pgResetxlog SYSTEM "pg_resetxlog.sgml">
<!ENTITY pgRestore SYSTEM "pg_restore.sgml">
<!ENTITY pgRewind SYSTEM "pg_rewind.sgml">
<!ENTITY postgres SYSTEM "postgres-ref.sgml">
<!ENTITY postmaster SYSTEM "postmaster.sgml">
<!ENTITY psqlRef SYSTEM "psql-ref.sgml">
......
<!--
doc/src/sgml/ref/pg_rewind.sgml
PostgreSQL documentation
-->
<refentry id="app-pgrewind">
<indexterm zone="app-pgrewind">
<primary>pg_rewind</primary>
</indexterm>
<refmeta>
<refentrytitle><application>pg_rewind</application></refentrytitle>
<manvolnum>1</manvolnum>
<refmiscinfo>Application</refmiscinfo>
</refmeta>
<refnamediv>
<refname>pg_rewind</refname>
<refpurpose>synchronize a <productname>PostgreSQL</productname> data directory with another data directory that was forked from the first one</refpurpose>
</refnamediv>
<refsynopsisdiv>
<cmdsynopsis>
<command>pg_rewind</command>
<arg rep="repeat"><replaceable>option</replaceable></arg>
<group choice="plain">
<group choice="req">
<arg choice="plain"><option>-D </option></arg>
<arg choice="plain"><option>--target-pgdata</option></arg>
</group>
<replaceable> directory</replaceable>
<group choice="req">
<arg choice="plain"><option>--source-pgdata=<replaceable>directory</replaceable></option></arg>
<arg choice="plain"><option>--source-server=<replaceable>connstr</replaceable></option></arg>
</group>
</group>
</cmdsynopsis>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
<para>
<application>pg_rewind</> is a tool for synchronizing a PostgreSQL cluster
with another copy of the same cluster, after the clusters' timelines have
diverged. A typical scenario is to bring an old master server back online
after failover, as a standby that follows the new master.
</para>
<para>
The result is equivalent to replacing the target data directory with the
source one. All files are copied, including configuration files. The
advantage of <application>pg_rewind</> over taking a new base backup, or
tools like <application>rsync</>, is that <application>pg_rewind</> does
not require reading through all unchanged files in the cluster. That makes
it a lot faster when the database is large and only a small portion of it
differs between the clusters.
</para>
<para>
<application>pg_rewind</> examines the timeline histories of the source
and target clusters to determine the point where they diverged, and
expects to find WAL in the target cluster's <filename>pg_xlog</> directory
reaching all the way back to the point of divergence. In the typical
failover scenario where the target cluster was shut down soon after the
divergence, that is not a problem, but if the target cluster had run for a
long time after the divergence, the old WAL files might not be present
anymore. In that case, they can be manually copied from the WAL archive to
the <filename>pg_xlog</> directory. Fetching missing files from a WAL
archive automatically is currently not supported.
</para>
<para>
When the target server is started up for the first time after running
<application>pg_rewind</>, it will go into recovery mode and replay all
WAL generated in the source server after the point of divergence.
If some of the WAL was no longer available in the source server when
<application>pg_rewind</> was run, and therefore could not be copied by
<application>pg_rewind</>, it needs to be made available when the
target server is started up. That can be done by creating a
<filename>recovery.conf</> file in the target data directory with a
suitable <varname>restore_command</>.
</para>
</refsect1>
<refsect1>
<title>Options</title>
<para>
<application>pg_rewind</application> accepts the following command-line
arguments:
<variablelist>
<varlistentry>
<term><option>-D</option></term>
<term><option>--target-pgdata</option></term>
<listitem>
<para>
This option specifies the target data directory that is synchronized
with the source. The target server must be shut down cleanly before
running <application>pg_rewind</application>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--source-pgdata</option></term>
<listitem>
<para>
Specifies the path to the data directory of the source server, to
synchronize the target with. When <option>--source-pgdata</> is
used, the source server must be cleanly shut down.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--source-server</option></term>
<listitem>
<para>
Specifies a libpq connection string to connect to the source
<productname>PostgreSQL</> server to synchronize the target with.
The server must be up and running, and must not be in recovery mode.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-n</option></term>
<term><option>--dry-run</option></term>
<listitem>
<para>
Do everything except actually modify the target directory.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-P</option></term>
<term><option>--progress</option></term>
<listitem>
<para>
Enables progress reporting. Turning this on will deliver an approximate
progress report while copying data over from the source cluster.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--debug</option></term>
<listitem>
<para>
Print verbose debugging output that is mostly useful for developers
debugging <application>pg_rewind</>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-V</option></term>
<term><option>--version</option></term>
<listitem><para>Display version information, then exit</para></listitem>
</varlistentry>
<varlistentry>
<term><option>-?</option></term>
<term><option>--help</option></term>
<listitem><para>Show help, then exit</para></listitem>
</varlistentry>
</variablelist>
</para>
</refsect1>
<refsect1>
<title>Environment</title>
<para>
When the <option>--source-server</> option is used,
<application>pg_rewind</application> also uses the environment variables
supported by <application>libpq</> (see <xref linkend="libpq-envars">).
</para>
</refsect1>
<refsect1>
<title>Notes</title>
<para>
<application>pg_rewind</> requires that the <varname>wal_log_hints</>
option is enabled in <filename>postgresql.conf</>, or that data checksums
were enabled when the cluster was initialized with <application>initdb</>.
<varname>full_page_writes</> must also be enabled.
</para>
<refsect2>
<title>How it works</title>
<para>
The basic idea is to copy everything from the new cluster to the old
cluster, except for the blocks that we know to be the same.
</para>
<procedure>
<step>
<para>
Scan the WAL log of the old cluster, starting from the last checkpoint
before the point where the new cluster's timeline history forked off
from the old cluster. For each WAL record, make a note of the data
blocks that were touched. This yields a list of all the data blocks
that were changed in the old cluster, after the new cluster forked off.
</para>
</step>
<step>
<para>
Copy all those changed blocks from the new cluster to the old cluster.
</para>
</step>
<step>
<para>
Copy all other files, such as clog and configuration files, from the
new cluster to the old cluster: everything except the relation files.
</para>
</step>
<step>
<para>
Apply the WAL from the new cluster, starting from the checkpoint
created at failover. (Strictly speaking, <application>pg_rewind</>
doesn't apply the WAL, it just creates a backup label file indicating
that when <productname>PostgreSQL</> is started, it will start replay
from that checkpoint and apply all the required WAL.)
</para>
</step>
</procedure>
</refsect2>
</refsect1>
</refentry>
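As an illustration of how the pieces described in this reference page fit together, a failback might involve a configuration fragment like the one below. The host name, archive path, and data directory here are hypothetical, chosen only for the example; adapt them to the actual cluster.

```
# After failover, rewind the old master (shell command, hypothetical paths):
#   pg_rewind --target-pgdata=/srv/pg/old_master \
#             --source-server='host=new-master port=5432 dbname=postgres'
#
# Then create recovery.conf in the rewound data directory, so the first
# startup can fetch any WAL that pg_rewind could not copy and then follow
# the new master:
restore_command = 'cp /mnt/server/archive/%f %p'
standby_mode = 'on'
primary_conninfo = 'host=new-master port=5432 user=replication'
recovery_target_timeline = 'latest'
```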
...@@ -260,6 +260,7 @@
&pgControldata;
&pgCtl;
&pgResetxlog;
&pgRewind;
&postgres;
&postmaster;
......
...@@ -21,6 +21,7 @@ SUBDIRS = \
pg_ctl \
pg_dump \
pg_resetxlog \
pg_rewind \
psql \
scripts
......
# Files generated during build
/xlogreader.c
/pg_rewind
# Generated by test suite
/tmp_check/
/regress_log/
#-------------------------------------------------------------------------
#
# Makefile for src/bin/pg_rewind
#
# Portions Copyright (c) 2013-2015, PostgreSQL Global Development Group
#
# src/bin/pg_rewind/Makefile
#
#-------------------------------------------------------------------------
PGFILEDESC = "pg_rewind - repurpose an old master server as standby"
PGAPPICON = win32
subdir = src/bin/pg_rewind
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
PG_CPPFLAGS = -I$(libpq_srcdir)
PG_LIBS = $(libpq_pgport)
override CPPFLAGS := -I$(libpq_srcdir) -DFRONTEND $(CPPFLAGS)
OBJS = pg_rewind.o parsexlog.o xlogreader.o datapagemap.o timeline.o \
fetch.o file_ops.o copy_fetch.o libpq_fetch.o filemap.o logging.o \
$(WIN32RES)
EXTRA_CLEAN = $(RMGRDESCSOURCES) xlogreader.c
all: pg_rewind
pg_rewind: $(OBJS) | submake-libpq submake-libpgport
$(CC) $(CFLAGS) $^ $(libpq_pgport) $(LDFLAGS) $(LDFLAGS_EX) $(LIBS) -o $@$(X)
xlogreader.c: % : $(top_srcdir)/src/backend/access/transam/%
rm -f $@ && $(LN_S) $< .
install: all installdirs
$(INSTALL_PROGRAM) pg_rewind$(X) '$(DESTDIR)$(bindir)/pg_rewind$(X)'
installdirs:
$(MKDIR_P) '$(DESTDIR)$(bindir)'
uninstall:
rm -f '$(DESTDIR)$(bindir)/pg_rewind$(X)'
clean distclean maintainer-clean:
rm -f pg_rewind$(X) $(OBJS) xlogreader.c
rm -rf tmp_check
check: all
$(prove_check) :: local
$(prove_check) :: remote
package RewindTest;
# Test driver for pg_rewind. Each test consists of a cycle where a new cluster
# is first created with initdb, and a streaming replication standby is set up
# to follow the master. Then the master is shut down and the standby is
# promoted, and finally pg_rewind is used to rewind the old master, using the
# standby as the source.
#
# To run a test, the test script (in t/ subdirectory) calls the functions
# in this module. These functions should be called in this sequence:
#
# 1. init_rewind_test - sets up log file etc.
#
# 2. setup_cluster - creates a PostgreSQL cluster that runs as the master
#
# 3. create_standby - runs pg_basebackup to initialize a standby server, and
# sets it up to follow the master.
#
# 4. promote_standby - runs "pg_ctl promote" to promote the standby server.
# The old master keeps running.
#
# 5. run_pg_rewind - stops the old master (if it's still running) and runs
# pg_rewind to synchronize it with the now-promoted standby server.
#
# The test script can use the helper functions master_psql and standby_psql
# to run psql against the master and standby servers, respectively. The
# test script can also use the $connstr_master and $connstr_standby global
# variables, which contain libpq connection strings for connecting to the
# master and standby servers. The data directories are also available
# in paths $test_master_datadir and $test_standby_datadir.
use TestLib;
use Test::More;
use File::Copy;
use File::Path qw(remove_tree);
use IPC::Run qw(run start);
use Exporter 'import';
our @EXPORT = qw(
$connstr_master
$connstr_standby
$test_master_datadir
$test_standby_datadir
append_to_file
master_psql
standby_psql
check_query
init_rewind_test
setup_cluster
create_standby
promote_standby
run_pg_rewind
);
# Adjust these paths for your environment
my $testroot = "./tmp_check";
$test_master_datadir="$testroot/data_master";
$test_standby_datadir="$testroot/data_standby";
mkdir $testroot;
# Log files are created here
mkdir "regress_log";
# Define non-conflicting ports for both nodes.
my $port_master=$ENV{PGPORT};
my $port_standby=$port_master + 1;
my $log_path;
my $tempdir_short;
$connstr_master="port=$port_master";
$connstr_standby="port=$port_standby";
$ENV{PGDATABASE} = "postgres";
sub master_psql
{
my $cmd = shift;
system_or_bail("psql -q --no-psqlrc -d $connstr_master -c \"$cmd\"");
}
sub standby_psql
{
my $cmd = shift;
system_or_bail("psql -q --no-psqlrc -d $connstr_standby -c \"$cmd\"");
}
# Run a query against the master, and check that the output matches what's
# expected
sub check_query
{
my ($query, $expected_stdout, $test_name) = @_;
my ($stdout, $stderr);
# we want just the output, no formatting
my $result = run ['psql', '-q', '-A', '-t', '--no-psqlrc',
'-d', $connstr_master,
'-c' , $query],
'>', \$stdout, '2>', \$stderr;
# We don't use ok() for the exit code and stderr, because we want this
# check to be just a single test.
if (!$result) {
fail ("$test_name: psql exit code");
} elsif ($stderr ne '') {
diag $stderr;
fail ("$test_name: psql no stderr");
} else {
is ($stdout, $expected_stdout, "$test_name: query result matches");
}
}
sub append_to_file
{
my($filename, $str) = @_;
open my $fh, ">>", $filename or die "could not open file $filename";
print $fh $str;
close $fh;
}
sub init_rewind_test
{
($testname, $test_mode) = @_;
$log_path="regress_log/pg_rewind_log_${testname}_${test_mode}";
remove_tree $log_path;
}
sub setup_cluster
{
$tempdir_short = tempdir_short;
# Initialize master, data checksums are mandatory
remove_tree($test_master_datadir);
standard_initdb($test_master_datadir);
# Custom parameters for master's postgresql.conf
append_to_file("$test_master_datadir/postgresql.conf", qq(
wal_level = hot_standby
max_wal_senders = 2
wal_keep_segments = 20
max_wal_size = 200MB
shared_buffers = 1MB
wal_log_hints = on
hot_standby = on
autovacuum = off
max_connections = 10
));
# Accept replication connections on master
append_to_file("$test_master_datadir/pg_hba.conf", qq(
local replication all trust
));
system_or_bail("pg_ctl -w -D $test_master_datadir -o \"-k $tempdir_short --listen-addresses='' -p $port_master\" start >>$log_path 2>&1");
#### Now run the test-specific parts to initialize the master before setting
# up standby
$ENV{PGHOST} = $tempdir_short;
}
sub create_standby
{
# Set up standby with necessary parameter
remove_tree $test_standby_datadir;
# Base backup is taken with xlog files included
system_or_bail("pg_basebackup -D $test_standby_datadir -p $port_master -x >>$log_path 2>&1");
append_to_file("$test_standby_datadir/recovery.conf", qq(
primary_conninfo='$connstr_master'
standby_mode=on
recovery_target_timeline='latest'
));
# Start standby
system_or_bail("pg_ctl -w -D $test_standby_datadir -o \"-k $tempdir_short --listen-addresses='' -p $port_standby\" start >>$log_path 2>&1");
# sleep a bit to make sure the standby has caught up.
sleep 1;
}
sub promote_standby
{
#### Now run the test-specific parts after the standby has been started
# Now promote the standby and insert some new data on the master; this
# will put the master out-of-sync with the standby.
system_or_bail("pg_ctl -w -D $test_standby_datadir promote >>$log_path 2>&1");
sleep 1;
}
sub run_pg_rewind
{
# Stop the master and be ready to perform the rewind
system_or_bail("pg_ctl -w -D $test_master_datadir stop -m fast >>$log_path 2>&1");
# At this point, the rewind processing is ready to run.
# We now have a very simple scenario with a few diverged WAL records.
# The real testing begins now, branching into the source modes that
# pg_rewind supports.
# Keep a temporary postgresql.conf for master node or it would be
# overwritten during the rewind.
copy("$test_master_datadir/postgresql.conf", "$testroot/master-postgresql.conf.tmp");
# Now run pg_rewind
if ($test_mode eq "local")
{
# Do rewind using a local pgdata as source
# Stop the master and be ready to perform the rewind
system_or_bail("pg_ctl -w -D $test_standby_datadir stop -m fast >>$log_path 2>&1");
my $result =
run(['./pg_rewind',
"--debug",
"--source-pgdata=$test_standby_datadir",
"--target-pgdata=$test_master_datadir"],
'>>', $log_path, '2>&1');
ok ($result, 'pg_rewind local');
}
elsif ($test_mode eq "remote")
{
# Do rewind using a remote connection as source
my $result =
run(['./pg_rewind',
"--source-server=\"port=$port_standby dbname=postgres\"",
"--target-pgdata=$test_master_datadir"],
'>>', $log_path, '2>&1');
ok ($result, 'pg_rewind remote');
} else {
# Cannot come here normally
die("Incorrect test mode specified");
}
# Now move back postgresql.conf with old settings
move("$testroot/master-postgresql.conf.tmp", "$test_master_datadir/postgresql.conf");
# Plug-in rewound node to the now-promoted standby node
append_to_file("$test_master_datadir/recovery.conf", qq(
primary_conninfo='port=$port_standby'
standby_mode=on
recovery_target_timeline='latest'
));
# Restart the master to check that rewind went correctly
system_or_bail("pg_ctl -w -D $test_master_datadir -o \"-k $tempdir_short --listen-addresses='' -p $port_master\" start >>$log_path 2>&1");
#### Now run the test-specific parts to check the result
}
# Clean up after the test. Stop both servers, if they're still running.
END
{
my $save_rc = $?;
if ($test_master_datadir)
{
system "pg_ctl -D $test_master_datadir -s -m immediate stop 2> /dev/null";
}
if ($test_standby_datadir)
{
system "pg_ctl -D $test_standby_datadir -s -m immediate stop 2> /dev/null";
}
$? = $save_rc;
}
/*-------------------------------------------------------------------------
*
* copy_fetch.c
* Functions for using a data directory as the source.
*
* Portions Copyright (c) 2013-2015, PostgreSQL Global Development Group
*
*-------------------------------------------------------------------------
*/
#include "postgres_fe.h"
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include "datapagemap.h"
#include "fetch.h"
#include "file_ops.h"
#include "filemap.h"
#include "logging.h"
#include "pg_rewind.h"
#include "catalog/catalog.h"
static void recurse_dir(const char *datadir, const char *path,
process_file_callback_t callback);
static void execute_pagemap(datapagemap_t *pagemap, const char *path);
/*
* Traverse through all files in a data directory, calling 'callback'
* for each file.
*/
void
traverse_datadir(const char *datadir, process_file_callback_t callback)
{
recurse_dir(datadir, NULL, callback);
}
/*
* recursive part of traverse_datadir
*/
static void
recurse_dir(const char *datadir, const char *parentpath,
process_file_callback_t callback)
{
DIR *xldir;
struct dirent *xlde;
char fullparentpath[MAXPGPATH];
if (parentpath)
snprintf(fullparentpath, MAXPGPATH, "%s/%s", datadir, parentpath);
else
snprintf(fullparentpath, MAXPGPATH, "%s", datadir);
xldir = opendir(fullparentpath);
if (xldir == NULL)
pg_fatal("could not open directory \"%s\": %s\n",
fullparentpath, strerror(errno));
while (errno = 0, (xlde = readdir(xldir)) != NULL)
{
struct stat fst;
char fullpath[MAXPGPATH];
char path[MAXPGPATH];
if (strcmp(xlde->d_name, ".") == 0 ||
strcmp(xlde->d_name, "..") == 0)
continue;
snprintf(fullpath, MAXPGPATH, "%s/%s", fullparentpath, xlde->d_name);
if (lstat(fullpath, &fst) < 0)
{
pg_log(PG_WARNING, "could not stat file \"%s\": %s",
fullpath, strerror(errno));
/*
* This is ok, if the new master is running and the file was just
* removed. If it was a data file, there should be a WAL record of
* the removal. If it was something else, it couldn't have been
* critical anyway.
*
* TODO: But complain if we're processing the target dir!
*/
}
if (parentpath)
snprintf(path, MAXPGPATH, "%s/%s", parentpath, xlde->d_name);
else
snprintf(path, MAXPGPATH, "%s", xlde->d_name);
if (S_ISREG(fst.st_mode))
callback(path, FILE_TYPE_REGULAR, fst.st_size, NULL);
else if (S_ISDIR(fst.st_mode))
{
callback(path, FILE_TYPE_DIRECTORY, 0, NULL);
/* recurse to handle subdirectories */
recurse_dir(datadir, path, callback);
}
#ifndef WIN32
else if (S_ISLNK(fst.st_mode))
#else
else if (pgwin32_is_junction(fullpath))
#endif
{
#if defined(HAVE_READLINK) || defined(WIN32)
char link_target[MAXPGPATH];
ssize_t len;
len = readlink(fullpath, link_target, sizeof(link_target) - 1);
if (len == -1)
pg_fatal("readlink() failed on \"%s\": %s\n",
fullpath, strerror(errno));
if (len == sizeof(link_target) - 1)
{
/* path was truncated */
pg_fatal("symbolic link \"%s\" target path too long\n",
fullpath);
}
callback(path, FILE_TYPE_SYMLINK, 0, link_target);
/*
* If it's a symlink within pg_tblspc, we need to recurse into it,
* to process all the tablespaces.
*/
if (strcmp(parentpath, "pg_tblspc") == 0)
recurse_dir(datadir, path, callback);
#else
pg_fatal("\"%s\" is a symbolic link, but symbolic links are not supported on this platform\n",
fullpath);
#endif /* HAVE_READLINK */
}
}
if (errno)
pg_fatal("could not read directory \"%s\": %s\n",
fullparentpath, strerror(errno));
if (closedir(xldir))
pg_fatal("could not close directory \"%s\": %s\n",
fullparentpath, strerror(errno));
}
/*
* Copy a file from source to target, between 'begin' and 'end' offsets.
*
* If 'trunc' is true, any existing file with the same name is truncated.
*/
static void
copy_file_range(const char *path, off_t begin, off_t end, bool trunc)
{
char buf[BLCKSZ];
char srcpath[MAXPGPATH];
int srcfd;
snprintf(srcpath, sizeof(srcpath), "%s/%s", datadir_source, path);
srcfd = open(srcpath, O_RDONLY | PG_BINARY, 0);
if (srcfd < 0)
pg_fatal("could not open source file \"%s\": %s\n",
srcpath, strerror(errno));
if (lseek(srcfd, begin, SEEK_SET) == -1)
pg_fatal("could not seek in source file: %s\n", strerror(errno));
open_target_file(path, trunc);
while (end - begin > 0)
{
int readlen;
int len;
if (end - begin > sizeof(buf))
len = sizeof(buf);
else
len = end - begin;
readlen = read(srcfd, buf, len);
if (readlen < 0)
pg_fatal("could not read file \"%s\": %s\n",
srcpath, strerror(errno));
else if (readlen == 0)
pg_fatal("unexpected EOF while reading file \"%s\"\n", srcpath);
write_target_range(buf, begin, readlen);
begin += readlen;
}
if (close(srcfd) != 0)
pg_fatal("error closing file \"%s\": %s\n", srcpath, strerror(errno));
}
/*
* Copy all relation data files from datadir_source to datadir_target, which
* are marked in the given data page map.
*/
void
copy_executeFileMap(filemap_t *map)
{
file_entry_t *entry;
int i;
for (i = 0; i < map->narray; i++)
{
entry = map->array[i];
execute_pagemap(&entry->pagemap, entry->path);
switch (entry->action)
{
case FILE_ACTION_NONE:
/* ok, do nothing.. */
break;
case FILE_ACTION_COPY:
copy_file_range(entry->path, 0, entry->newsize, true);
break;
case FILE_ACTION_TRUNCATE:
truncate_target_file(entry->path, entry->newsize);
break;
case FILE_ACTION_COPY_TAIL:
copy_file_range(entry->path, entry->oldsize, entry->newsize, false);
break;
case FILE_ACTION_CREATE:
create_target(entry);
break;
case FILE_ACTION_REMOVE:
remove_target(entry);
break;
}
}
close_target_file();
}
static void
execute_pagemap(datapagemap_t *pagemap, const char *path)
{
datapagemap_iterator_t *iter;
BlockNumber blkno;
off_t offset;
iter = datapagemap_iterate(pagemap);
while (datapagemap_next(iter, &blkno))
{
offset = (off_t) blkno * BLCKSZ;
copy_file_range(path, offset, offset + BLCKSZ, false);
/* Ok, this block has now been copied from new data dir to old */
}
free(iter);
}
/*-------------------------------------------------------------------------
*
* datapagemap.c
* A data structure for keeping track of data pages that have changed.
*
* This is a fairly simple bitmap.
*
* Copyright (c) 2013-2015, PostgreSQL Global Development Group
*
*-------------------------------------------------------------------------
*/
#include "postgres_fe.h"
#include "datapagemap.h"
struct datapagemap_iterator
{
datapagemap_t *map;
BlockNumber nextblkno;
};
/*****
* Public functions
*/
/*
* Add a block to the bitmap.
*/
void
datapagemap_add(datapagemap_t *map, BlockNumber blkno)
{
int offset;
int bitno;
offset = blkno / 8;
bitno = blkno % 8;
/* enlarge or create bitmap if needed */
if (map->bitmapsize <= offset)
{
int oldsize = map->bitmapsize;
int newsize;
/*
* The minimum to hold the new bit is offset + 1. But add some
* headroom, so that we don't need to repeatedly enlarge the bitmap in
* the common case that blocks are modified in order, from beginning
* of a relation to the end.
*/
newsize = offset + 1;
newsize += 10;
map->bitmap = pg_realloc(map->bitmap, newsize);
/* zero out the newly allocated region */
memset(&map->bitmap[oldsize], 0, newsize - oldsize);
map->bitmapsize = newsize;
}
/* Set the bit */
map->bitmap[offset] |= (1 << bitno);
}
/*
* Start iterating through all entries in the page map.
*
* After datapagemap_iterate, call datapagemap_next to return the entries,
* until it returns NULL. After you're done, use free() to destroy the
* iterator.
*/
datapagemap_iterator_t *
datapagemap_iterate(datapagemap_t *map)
{
datapagemap_iterator_t *iter;
iter = pg_malloc(sizeof(datapagemap_iterator_t));
iter->map = map;
iter->nextblkno = 0;
return iter;
}
bool
datapagemap_next(datapagemap_iterator_t *iter, BlockNumber *blkno)
{
datapagemap_t *map = iter->map;
for (;;)
{
BlockNumber blk = iter->nextblkno;
int nextoff = blk / 8;
int bitno = blk % 8;
if (nextoff >= map->bitmapsize)
break;
iter->nextblkno++;
if (map->bitmap[nextoff] & (1 << bitno))
{
*blkno = blk;
return true;
}
}
/* no more set bits in this bitmap. */
return false;
}
/*
* A debugging aid. Prints out the contents of the page map.
*/
void
datapagemap_print(datapagemap_t *map)
{
datapagemap_iterator_t *iter;
BlockNumber blocknum;
iter = datapagemap_iterate(map);
while (datapagemap_next(iter, &blocknum))
printf(" blk %u\n", blocknum);
free(iter);
}
/*-------------------------------------------------------------------------
*
* datapagemap.h
*
* Copyright (c) 2013-2015, PostgreSQL Global Development Group
*
*-------------------------------------------------------------------------
*/
#ifndef DATAPAGEMAP_H
#define DATAPAGEMAP_H
#include "storage/relfilenode.h"
#include "storage/block.h"
struct datapagemap
{
char *bitmap;
int bitmapsize;
};
typedef struct datapagemap datapagemap_t;
typedef struct datapagemap_iterator datapagemap_iterator_t;
extern datapagemap_t *datapagemap_create(void);
extern void datapagemap_destroy(datapagemap_t *map);
extern void datapagemap_add(datapagemap_t *map, BlockNumber blkno);
extern datapagemap_iterator_t *datapagemap_iterate(datapagemap_t *map);
extern bool datapagemap_next(datapagemap_iterator_t *iter, BlockNumber *blkno);
extern void datapagemap_print(datapagemap_t *map);
#endif /* DATAPAGEMAP_H */
/*-------------------------------------------------------------------------
*
* fetch.c
* Functions for fetching files from a local or remote data dir
*
* This file forms an abstraction of getting files from the "source".
* There are two implementations of this interface: one for copying files
* from a data directory via normal filesystem operations (copy_fetch.c),
* and another for fetching files from a remote server via a libpq
* connection (libpq_fetch.c)
*
*
* Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
*
*-------------------------------------------------------------------------
*/
#include "postgres_fe.h"
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include "pg_rewind.h"
#include "fetch.h"
#include "file_ops.h"
#include "filemap.h"
void
fetchRemoteFileList(void)
{
if (datadir_source)
traverse_datadir(datadir_source, &process_remote_file);
else
libpqProcessFileList();
}
/*
* Fetch all relation data files that are marked in the given data page map.
*/
void
executeFileMap(void)
{
if (datadir_source)
copy_executeFileMap(filemap);
else
libpq_executeFileMap(filemap);
}
/*
* Fetch a single file into a malloc'd buffer. The file size is returned
* in *filesize. The returned buffer is always zero-terminated, which is
* handy for text files.
*/
char *
fetchFile(char *filename, size_t *filesize)
{
if (datadir_source)
return slurpFile(datadir_source, filename, filesize);
else
return libpqGetFile(filename, filesize);
}
/*-------------------------------------------------------------------------
*
* fetch.h
* Fetching data from a local or remote data directory.
*
* This file includes the prototypes for functions used to copy files from
* one data directory to another. The source to copy from can be a local
* directory (copy method), or a remote PostgreSQL server (libpq fetch
* method).
*
* Copyright (c) 2013-2015, PostgreSQL Global Development Group
*
*-------------------------------------------------------------------------
*/
#ifndef FETCH_H
#define FETCH_H
#include "c.h"
#include "access/xlogdefs.h"
#include "filemap.h"
/*
* Common interface. Calls the copy or libpq method depending on global
* config options.
*/
extern void fetchRemoteFileList(void);
extern char *fetchFile(char *filename, size_t *filesize);
extern void executeFileMap(void);
/* in libpq_fetch.c */
extern void libpqProcessFileList(void);
extern char *libpqGetFile(const char *filename, size_t *filesize);
extern void libpq_executeFileMap(filemap_t *map);
extern void libpqConnect(const char *connstr);
extern XLogRecPtr libpqGetCurrentXlogInsertLocation(void);
/* in copy_fetch.c */
extern void copy_executeFileMap(filemap_t *map);
typedef void (*process_file_callback_t) (const char *path, file_type_t type, size_t size, const char *link_target);
extern void traverse_datadir(const char *datadir, process_file_callback_t callback);
#endif /* FETCH_H */
/*-------------------------------------------------------------------------
*
* file_ops.c
* Helper functions for operating on files.
*
* Most of the functions in this file are helper functions for writing to
* the target data directory. The functions check the --dry-run flag, and
* do nothing if it's enabled. You should avoid accessing the target files
* directly but if you do, make sure you honor the --dry-run mode!
*
* Portions Copyright (c) 2013-2015, PostgreSQL Global Development Group
*
*-------------------------------------------------------------------------
*/
#include "postgres_fe.h"
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include "file_ops.h"
#include "filemap.h"
#include "logging.h"
#include "pg_rewind.h"
/*
* Currently open destination file.
*/
static int dstfd = -1;
static char dstpath[MAXPGPATH] = "";
static void remove_target_file(const char *path);
static void create_target_dir(const char *path);
static void remove_target_dir(const char *path);
static void create_target_symlink(const char *path, const char *link);
static void remove_target_symlink(const char *path);
/*
* Open a target file for writing. If 'trunc' is true and the file already
* exists, it will be truncated.
*/
void
open_target_file(const char *path, bool trunc)
{
int mode;
if (dry_run)
return;
if (dstfd != -1 && !trunc &&
strcmp(path, &dstpath[strlen(datadir_target) + 1]) == 0)
return; /* already open */
close_target_file();
snprintf(dstpath, sizeof(dstpath), "%s/%s", datadir_target, path);
mode = O_WRONLY | O_CREAT | PG_BINARY;
if (trunc)
mode |= O_TRUNC;
dstfd = open(dstpath, mode, 0600);
if (dstfd < 0)
pg_fatal("could not open destination file \"%s\": %s\n",
dstpath, strerror(errno));
}
/*
* Close target file, if it's open.
*/
void
close_target_file(void)
{
if (dstfd == -1)
return;
if (close(dstfd) != 0)
pg_fatal("error closing destination file \"%s\": %s\n",
dstpath, strerror(errno));
dstfd = -1;
/* fsync? */
}
void
write_target_range(char *buf, off_t begin, size_t size)
{
int writeleft;
char *p;
/* update progress report */
fetch_done += size;
progress_report(false);
if (dry_run)
return;
if (lseek(dstfd, begin, SEEK_SET) == -1)
pg_fatal("could not seek in destination file \"%s\": %s\n",
dstpath, strerror(errno));
writeleft = size;
p = buf;
while (writeleft > 0)
{
int writelen;
writelen = write(dstfd, p, writeleft);
if (writelen < 0)
pg_fatal("could not write file \"%s\": %s\n",
dstpath, strerror(errno));
p += writelen;
writeleft -= writelen;
}
/* keep the file open, in case we need to copy more blocks in it */
}
void
remove_target(file_entry_t *entry)
{
Assert(entry->action == FILE_ACTION_REMOVE);
switch (entry->type)
{
case FILE_TYPE_DIRECTORY:
remove_target_dir(entry->path);
break;
case FILE_TYPE_REGULAR:
remove_target_file(entry->path);
break;
case FILE_TYPE_SYMLINK:
remove_target_symlink(entry->path);
break;
}
}
void
create_target(file_entry_t *entry)
{
Assert(entry->action == FILE_ACTION_CREATE);
switch (entry->type)
{
case FILE_TYPE_DIRECTORY:
create_target_dir(entry->path);
break;
case FILE_TYPE_SYMLINK:
create_target_symlink(entry->path, entry->link_target);
break;
case FILE_TYPE_REGULAR:
/* can't happen. Regular files are created with open_target_file. */
pg_fatal("invalid action (CREATE) for regular file\n");
break;
}
}
static void
remove_target_file(const char *path)
{
char dstpath[MAXPGPATH];
if (dry_run)
return;
snprintf(dstpath, sizeof(dstpath), "%s/%s", datadir_target, path);
if (unlink(dstpath) != 0)
pg_fatal("could not remove file \"%s\": %s\n",
dstpath, strerror(errno));
}
void
truncate_target_file(const char *path, off_t newsize)
{
char dstpath[MAXPGPATH];
int fd;
if (dry_run)
return;
snprintf(dstpath, sizeof(dstpath), "%s/%s", datadir_target, path);
fd = open(dstpath, O_WRONLY, 0);
if (fd < 0)
pg_fatal("could not open file \"%s\" for truncation: %s\n",
dstpath, strerror(errno));
if (ftruncate(fd, newsize) != 0)
pg_fatal("could not truncate file \"%s\" to %u bytes: %s\n",
dstpath, (unsigned int) newsize, strerror(errno));
close(fd);
}
static void
create_target_dir(const char *path)
{
char dstpath[MAXPGPATH];
if (dry_run)
return;
snprintf(dstpath, sizeof(dstpath), "%s/%s", datadir_target, path);
if (mkdir(dstpath, S_IRWXU) != 0)
pg_fatal("could not create directory \"%s\": %s\n",
dstpath, strerror(errno));
}
static void
remove_target_dir(const char *path)
{
char dstpath[MAXPGPATH];
if (dry_run)
return;
snprintf(dstpath, sizeof(dstpath), "%s/%s", datadir_target, path);
if (rmdir(dstpath) != 0)
pg_fatal("could not remove directory \"%s\": %s\n",
dstpath, strerror(errno));
}
static void
create_target_symlink(const char *path, const char *link)
{
char dstpath[MAXPGPATH];
if (dry_run)
return;
snprintf(dstpath, sizeof(dstpath), "%s/%s", datadir_target, path);
if (symlink(link, dstpath) != 0)
pg_fatal("could not create symbolic link at \"%s\": %s\n",
dstpath, strerror(errno));
}
static void
remove_target_symlink(const char *path)
{
char dstpath[MAXPGPATH];
if (dry_run)
return;
snprintf(dstpath, sizeof(dstpath), "%s/%s", datadir_target, path);
if (unlink(dstpath) != 0)
pg_fatal("could not remove symbolic link \"%s\": %s\n",
dstpath, strerror(errno));
}
/*
* Read a file into memory. The file to be read is <datadir>/<path>.
* The file contents are returned in a malloc'd buffer, and *filesize
* is set to the length of the file.
*
* The returned buffer is always zero-terminated; the size of the returned
* buffer is actually *filesize + 1. That's handy when reading a text file.
 * This function can be used to read binary files as well; in that case,
 * simply ignore the zero-terminator.
*
* This function is used to implement the fetchFile function in the "fetch"
* interface (see fetch.c), but is also called directly.
*/
char *
slurpFile(const char *datadir, const char *path, size_t *filesize)
{
int fd;
char *buffer;
struct stat statbuf;
char fullpath[MAXPGPATH];
int len;
snprintf(fullpath, sizeof(fullpath), "%s/%s", datadir, path);
if ((fd = open(fullpath, O_RDONLY | PG_BINARY, 0)) == -1)
pg_fatal("could not open file \"%s\" for reading: %s\n",
fullpath, strerror(errno));
if (fstat(fd, &statbuf) < 0)
pg_fatal("could not stat file \"%s\": %s\n",
fullpath, strerror(errno));
len = statbuf.st_size;
buffer = pg_malloc(len + 1);
if (read(fd, buffer, len) != len)
pg_fatal("could not read file \"%s\": %s\n",
fullpath, strerror(errno));
close(fd);
/* Zero-terminate the buffer. */
buffer[len] = '\0';
if (filesize)
*filesize = len;
return buffer;
}
/*-------------------------------------------------------------------------
*
* file_ops.h
* Helper functions for operating on files
*
* Copyright (c) 2013-2015, PostgreSQL Global Development Group
*
*-------------------------------------------------------------------------
*/
#ifndef FILE_OPS_H
#define FILE_OPS_H
#include "filemap.h"
extern void open_target_file(const char *path, bool trunc);
extern void write_target_range(char *buf, off_t begin, size_t size);
extern void close_target_file(void);
extern void truncate_target_file(const char *path, off_t newsize);
extern void create_target(file_entry_t *t);
extern void remove_target(file_entry_t *t);
extern char *slurpFile(const char *datadir, const char *path, size_t *filesize);
#endif /* FILE_OPS_H */
/*-------------------------------------------------------------------------
*
* filemap.h
*
* Copyright (c) 2013-2015, PostgreSQL Global Development Group
*-------------------------------------------------------------------------
*/
#ifndef FILEMAP_H
#define FILEMAP_H
#include "storage/relfilenode.h"
#include "storage/block.h"
#include "datapagemap.h"
/*
* For every file found in the local or remote system, we have a file entry
* which says what we are going to do with the file. For relation files,
* there is also a page map, marking pages in the file that were changed
* locally.
*
* The enum values are sorted in the order we want actions to be processed.
*/
typedef enum
{
FILE_ACTION_CREATE, /* create local directory or symbolic link */
FILE_ACTION_COPY, /* copy whole file, overwriting if exists */
FILE_ACTION_COPY_TAIL, /* copy tail from 'oldsize' to 'newsize' */
FILE_ACTION_NONE, /* no action (we might still copy modified blocks
* based on the parsed WAL) */
FILE_ACTION_TRUNCATE, /* truncate local file to 'newsize' bytes */
FILE_ACTION_REMOVE, /* remove local file / directory / symlink */
} file_action_t;
typedef enum
{
FILE_TYPE_REGULAR,
FILE_TYPE_DIRECTORY,
FILE_TYPE_SYMLINK
} file_type_t;
struct file_entry_t
{
char *path;
file_type_t type;
file_action_t action;
/* for a regular file */
size_t oldsize;
size_t newsize;
bool isrelfile; /* is it a relation data file? */
datapagemap_t pagemap;
/* for a symlink */
char *link_target;
struct file_entry_t *next;
};
typedef struct file_entry_t file_entry_t;
struct filemap_t
{
/*
 * New entries are accumulated in a linked list, in process_remote_file
 * and process_local_file.
*/
file_entry_t *first;
file_entry_t *last;
int nlist;
/*
* After processing all the remote files, the entries in the linked list
* are moved to this array. After processing local files, too, all the
* local entries are added to the array by filemap_finalize, and sorted
* in the final order. After filemap_finalize, all the entries are in
* the array, and the linked list is empty.
*/
file_entry_t **array;
int narray;
/*
* Summary information. total_size is the total size of the source cluster,
 * and fetch_size is the number of bytes that need to be copied.
*/
uint64 total_size;
uint64 fetch_size;
};
typedef struct filemap_t filemap_t;
extern filemap_t *filemap;
extern filemap_t *filemap_create(void);
extern void calculate_totals(void);
extern void print_filemap(void);
/* Functions for populating the filemap */
extern void process_remote_file(const char *path, file_type_t type, size_t newsize, const char *link_target);
extern void process_local_file(const char *path, file_type_t type, size_t newsize, const char *link_target);
extern void process_block_change(ForkNumber forknum, RelFileNode rnode, BlockNumber blkno);
extern void filemap_finalize(void);
#endif /* FILEMAP_H */
/*-------------------------------------------------------------------------
*
* logging.c
* logging functions
*
* Copyright (c) 2010-2015, PostgreSQL Global Development Group
*
*-------------------------------------------------------------------------
*/
#include "postgres_fe.h"
#include <unistd.h>
#include <time.h>
#include "pg_rewind.h"
#include "logging.h"
#include "pgtime.h"
/* Progress counters */
uint64 fetch_size;
uint64 fetch_done;
static pg_time_t last_progress_report = 0;
#define QUERY_ALLOC 8192
#define MESSAGE_WIDTH 60
static
pg_attribute_printf(2, 0)
void
pg_log_v(eLogType type, const char *fmt, va_list ap)
{
char message[QUERY_ALLOC];
vsnprintf(message, sizeof(message), fmt, ap);
switch (type)
{
case PG_DEBUG:
if (debug)
printf("%s", _(message));
break;
case PG_PROGRESS:
if (showprogress)
printf("%s", _(message));
break;
case PG_WARNING:
printf("%s", _(message));
break;
case PG_FATAL:
printf("\n%s", _(message));
printf("%s", _("Failure, exiting\n"));
exit(1);
break;
default:
break;
}
fflush(stdout);
}
void
pg_log(eLogType type, const char *fmt,...)
{
va_list args;
va_start(args, fmt);
pg_log_v(type, fmt, args);
va_end(args);
}
void
pg_fatal(const char *fmt,...)
{
va_list args;
va_start(args, fmt);
pg_log_v(PG_FATAL, fmt, args);
va_end(args);
/* should not get here, pg_log_v() exited already */
exit(1);
}
/*
* Print a progress report based on the global variables.
*
* Progress report is written at maximum once per second, unless the
* force parameter is set to true.
*/
void
progress_report(bool force)
{
int percent;
char fetch_done_str[32];
char fetch_size_str[32];
pg_time_t now;
if (!showprogress)
return;
now = time(NULL);
if (now == last_progress_report && !force)
return; /* Max once per second */
last_progress_report = now;
percent = fetch_size ? (int) ((fetch_done) * 100 / fetch_size) : 0;
/*
 * Avoid overflowing past 100% or the full size. This may make the total
 * size number change as we approach the end of the copy (the estimate
 * is approximate to begin with), but that's better than having the done
 * column be bigger than the total.
*/
if (percent > 100)
percent = 100;
if (fetch_done > fetch_size)
fetch_size = fetch_done;
/*
* Separate step to keep platform-dependent format code out of
* translatable strings. And we only test for INT64_FORMAT availability
* in snprintf, not fprintf.
*/
snprintf(fetch_done_str, sizeof(fetch_done_str), INT64_FORMAT,
fetch_done / 1024);
snprintf(fetch_size_str, sizeof(fetch_size_str), INT64_FORMAT,
fetch_size / 1024);
pg_log(PG_PROGRESS, "%*s/%s kB (%d%%) copied\r",
(int) strlen(fetch_size_str), fetch_done_str, fetch_size_str,
percent);
}
/*-------------------------------------------------------------------------
*
* logging.h
* prototypes for logging functions
*
*
* Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
*-------------------------------------------------------------------------
*/
#ifndef PG_REWIND_LOGGING_H
#define PG_REWIND_LOGGING_H
/* progress counters */
extern uint64 fetch_size;
extern uint64 fetch_done;
/*
* Enumeration to denote pg_log modes
*/
typedef enum
{
PG_DEBUG,
PG_PROGRESS,
PG_WARNING,
PG_FATAL
} eLogType;
extern void pg_log(eLogType type, const char *fmt,...)
pg_attribute_printf(2, 3);
extern void pg_fatal(const char *fmt,...)
pg_attribute_printf(1, 2) pg_attribute_noreturn;
extern void progress_report(bool force);
#endif
# src/bin/pg_rewind/nls.mk
CATALOG_NAME = pg_rewind
AVAIL_LANGUAGES =
GETTEXT_FILES = copy_fetch.c datapagemap.c fetch.c filemap.c libpq_fetch.c logging.c parsexlog.c pg_rewind.c timeline.c ../../common/fe_memutils.c ../../../src/backend/access/transam/xlogreader.c
GETTEXT_TRIGGERS = pg_log pg_fatal report_invalid_record:2
GETTEXT_FLAGS = pg_log:2:c-format \
pg_fatal:1:c-format \
report_invalid_record:2:c-format
/*-------------------------------------------------------------------------
*
* pg_rewind.h
*
*
* Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
*-------------------------------------------------------------------------
*/
#ifndef PG_REWIND_H
#define PG_REWIND_H
#include "c.h"
#include "datapagemap.h"
#include "access/timeline.h"
#include "storage/block.h"
#include "storage/relfilenode.h"
/* Configuration options */
extern char *datadir_target;
extern char *datadir_source;
extern char *connstr_source;
extern bool debug;
extern bool showprogress;
extern bool dry_run;
/* in parsexlog.c */
extern void extractPageMap(const char *datadir, XLogRecPtr startpoint,
TimeLineID tli, XLogRecPtr endpoint);
extern void findLastCheckpoint(const char *datadir, XLogRecPtr searchptr,
TimeLineID tli,
XLogRecPtr *lastchkptrec, TimeLineID *lastchkpttli,
XLogRecPtr *lastchkptredo);
extern XLogRecPtr readOneRecord(const char *datadir, XLogRecPtr ptr,
TimeLineID tli);
/* in timeline.c */
extern TimeLineHistoryEntry *rewind_parseTimeLineHistory(char *buffer,
TimeLineID targetTLI, int *nentries);
#endif /* PG_REWIND_H */
use strict;
use warnings;
use TestLib;
use Test::More tests => 4;
use RewindTest;
my $testmode = shift;
RewindTest::init_rewind_test('basic', $testmode);
RewindTest::setup_cluster();
# Create a test table and insert a row in master.
master_psql("CREATE TABLE tbl1 (d text)");
master_psql("INSERT INTO tbl1 VALUES ('in master')");
# This test table will be used to test truncation, i.e. the table
# is extended in the old master after promotion
master_psql("CREATE TABLE trunc_tbl (d text)");
master_psql("INSERT INTO trunc_tbl VALUES ('in master')");
# This test table will be used to test the "copy-tail" case, i.e. the
# table is truncated in the old master after promotion
master_psql("CREATE TABLE tail_tbl (id integer, d text)");
master_psql("INSERT INTO tail_tbl VALUES (0, 'in master')");
master_psql("CHECKPOINT");
RewindTest::create_standby();
# Insert additional data on master that will be replicated to standby
master_psql("INSERT INTO tbl1 values ('in master, before promotion')");
master_psql("INSERT INTO trunc_tbl values ('in master, before promotion')");
master_psql("INSERT INTO tail_tbl SELECT g, 'in master, before promotion: ' || g FROM generate_series(1, 10000) g");
master_psql('CHECKPOINT');
RewindTest::promote_standby();
# Insert a row in the old master. This causes the master and standby
# to have "diverged": it is no longer possible to just apply the
# standby's logs over the master directory - you need to rewind.
master_psql("INSERT INTO tbl1 VALUES ('in master, after promotion')");
# Also insert a new row in the standby, which won't be present in the
# old master.
standby_psql("INSERT INTO tbl1 VALUES ('in standby, after promotion')");
# Insert enough rows to trunc_tbl to extend the file. pg_rewind should
# truncate it back to the old size.
master_psql("INSERT INTO trunc_tbl SELECT 'in master, after promotion: ' || g FROM generate_series(1, 10000) g");
# Truncate tail_tbl. pg_rewind should copy back the truncated part
# (We cannot use an actual TRUNCATE command here, as that creates a
# whole new relfilenode)
master_psql("DELETE FROM tail_tbl WHERE id > 10");
master_psql("VACUUM tail_tbl");
RewindTest::run_pg_rewind();
check_query('SELECT * FROM tbl1',
qq(in master
in master, before promotion
in standby, after promotion
),
'table content');
check_query('SELECT * FROM trunc_tbl',
qq(in master
in master, before promotion
),
'truncation');
check_query('SELECT count(*) FROM tail_tbl',
qq(10001
),
'tail-copy');
exit(0);
use strict;
use warnings;
use TestLib;
use Test::More tests => 2;
use RewindTest;
my $testmode = shift;
RewindTest::init_rewind_test('databases', $testmode);
RewindTest::setup_cluster();
# Create a database in master.
master_psql('CREATE DATABASE inmaster');
RewindTest::create_standby();
# Create another database, the creation is replicated to the standby
master_psql('CREATE DATABASE beforepromotion');
RewindTest::promote_standby();
# Create databases in the old master and the new promoted standby.
master_psql('CREATE DATABASE master_afterpromotion');
standby_psql('CREATE DATABASE standby_afterpromotion');
# The clusters are now diverged.
RewindTest::run_pg_rewind();
# Check that the correct databases are present after pg_rewind.
check_query('SELECT datname FROM pg_database',
qq(template1
template0
postgres
inmaster
beforepromotion
standby_afterpromotion
),
'database names');
exit(0);
# Test how pg_rewind reacts to extra files and directories in the data dirs.
use strict;
use warnings;
use TestLib;
use Test::More tests => 2;
use File::Find;
use RewindTest;
my $testmode = shift;
RewindTest::init_rewind_test('extrafiles', $testmode);
RewindTest::setup_cluster();
# Create a subdir and files that will be present in both
mkdir "$test_master_datadir/tst_both_dir";
append_to_file "$test_master_datadir/tst_both_dir/both_file1", "in both1";
append_to_file "$test_master_datadir/tst_both_dir/both_file2", "in both2";
mkdir "$test_master_datadir/tst_both_dir/both_subdir/";
append_to_file "$test_master_datadir/tst_both_dir/both_subdir/both_file3", "in both3";
RewindTest::create_standby();
# Create different subdirs and files in master and standby
mkdir "$test_standby_datadir/tst_standby_dir";
append_to_file "$test_standby_datadir/tst_standby_dir/standby_file1", "in standby1";
append_to_file "$test_standby_datadir/tst_standby_dir/standby_file2", "in standby2";
mkdir "$test_standby_datadir/tst_standby_dir/standby_subdir/";
append_to_file "$test_standby_datadir/tst_standby_dir/standby_subdir/standby_file3", "in standby3";
mkdir "$test_master_datadir/tst_master_dir";
append_to_file "$test_master_datadir/tst_master_dir/master_file1", "in master1";
append_to_file "$test_master_datadir/tst_master_dir/master_file2", "in master2";
mkdir "$test_master_datadir/tst_master_dir/master_subdir/";
append_to_file "$test_master_datadir/tst_master_dir/master_subdir/master_file3", "in master3";
RewindTest::promote_standby();
RewindTest::run_pg_rewind();
# List files in the data directory after rewind.
my @paths;
find(sub {push @paths, $File::Find::name if $File::Find::name =~ m/.*tst_.*/},
$test_master_datadir);
@paths = sort @paths;
is_deeply(\@paths,
["$test_master_datadir/tst_both_dir",
"$test_master_datadir/tst_both_dir/both_file1",
"$test_master_datadir/tst_both_dir/both_file2",
"$test_master_datadir/tst_both_dir/both_subdir",
"$test_master_datadir/tst_both_dir/both_subdir/both_file3",
"$test_master_datadir/tst_standby_dir",
"$test_master_datadir/tst_standby_dir/standby_file1",
"$test_master_datadir/tst_standby_dir/standby_file2",
"$test_master_datadir/tst_standby_dir/standby_subdir",
"$test_master_datadir/tst_standby_dir/standby_subdir/standby_file3"],
"file lists match");
exit(0);
/*-------------------------------------------------------------------------
*
* timeline.c
* timeline-related functions.
*
* Portions Copyright (c) 1996-2015, PostgreSQL Global Development Group
*
*-------------------------------------------------------------------------
*/
#include "postgres_fe.h"
#include "pg_rewind.h"
#include "access/timeline.h"
#include "access/xlog_internal.h"
/*
* This is copy-pasted from the backend readTimeLineHistory, modified to
* return a malloc'd array and to work without backend functions.
*/
/*
* Try to read a timeline's history file.
*
* If successful, return the list of component TLIs (the given TLI followed by
* its ancestor TLIs). If we can't find the history file, assume that the
* timeline has no parents, and return a list of just the specified timeline
* ID.
*/
TimeLineHistoryEntry *
rewind_parseTimeLineHistory(char *buffer, TimeLineID targetTLI, int *nentries)
{
char *fline;
TimeLineHistoryEntry *entry;
TimeLineHistoryEntry *entries = NULL;
int nlines = 0;
TimeLineID lasttli = 0;
XLogRecPtr prevend;
char *bufptr;
bool lastline = false;
/*
* Parse the file...
*/
prevend = InvalidXLogRecPtr;
bufptr = buffer;
while (!lastline)
{
char *ptr;
TimeLineID tli;
uint32 switchpoint_hi;
uint32 switchpoint_lo;
int nfields;
fline = bufptr;
while (*bufptr && *bufptr != '\n')
bufptr++;
if (!(*bufptr))
lastline = true;
else
*bufptr++ = '\0';
/* skip leading whitespace and check for # comment */
for (ptr = fline; *ptr; ptr++)
{
if (!isspace((unsigned char) *ptr))
break;
}
if (*ptr == '\0' || *ptr == '#')
continue;
nfields = sscanf(fline, "%u\t%X/%X", &tli, &switchpoint_hi, &switchpoint_lo);
if (nfields < 1)
{
/* expect a numeric timeline ID as first field of line */
printf(_("syntax error in history file: %s\n"), fline);
printf(_("Expected a numeric timeline ID.\n"));
exit(1);
}
if (nfields != 3)
{
printf(_("syntax error in history file: %s\n"), fline);
printf(_("Expected an XLOG switchpoint location.\n"));
exit(1);
}
if (entries && tli <= lasttli)
{
printf(_("invalid data in history file: %s\n"), fline);
printf(_("Timeline IDs must be in increasing sequence.\n"));
exit(1);
}
lasttli = tli;
nlines++;
entries = pg_realloc(entries, nlines * sizeof(TimeLineHistoryEntry));
entry = &entries[nlines - 1];
entry->tli = tli;
entry->begin = prevend;
entry->end = ((uint64) (switchpoint_hi)) << 32 | (uint64) switchpoint_lo;
prevend = entry->end;
/* we ignore the remainder of each line */
}
if (entries && targetTLI <= lasttli)
{
printf(_("invalid data in history file\n"));
printf(_("Timeline IDs must be less than child timeline's ID.\n"));
exit(1);
}
/*
* Create one more entry for the "tip" of the timeline, which has no entry
* in the history file.
*/
nlines++;
if (entries)
entries = pg_realloc(entries, nlines * sizeof(TimeLineHistoryEntry));
else
entries = pg_malloc(1 * sizeof(TimeLineHistoryEntry));
entry = &entries[nlines - 1];
entry->tli = targetTLI;
entry->begin = prevend;
entry->end = InvalidXLogRecPtr;
*nentries = nlines;
return entries;
}
...@@ -65,7 +65,8 @@ my $frontend_extraincludes = {
'initdb' => ['src\timezone'],
'psql' => [ 'src\bin\pg_dump', 'src\backend' ] };
my $frontend_extrasource = { 'psql' => ['src\bin\psql\psqlscan.l'] };
my @frontend_excludes =
('pgevent', 'pg_basebackup', 'pg_rewind', 'pg_dump', 'scripts');
sub mkvcbuild
{
...@@ -422,6 +423,11 @@ sub mkvcbuild
$pgrecvlogical->AddFile('src\bin\pg_basebackup\pg_recvlogical.c');
$pgrecvlogical->AddLibrary('ws2_32.lib');
my $pgrewind = AddSimpleFrontend('pg_rewind', 1);
$pgrewind->{name} = 'pg_rewind';
$pgrewind->AddFile('src\backend\access\transam\xlogreader.c');
$pgrewind->AddLibrary('ws2_32.lib');
my $pgevent = $solution->AddProject('pgevent', 'dll', 'bin');
$pgevent->AddFiles('src\bin\pgevent', 'pgevent.c', 'pgmsgevent.rc');
$pgevent->AddResourceFile('src\bin\pgevent', 'Eventlog message formatter',
......