Commit 223431cb authored by Heikki Linnakangas's avatar Heikki Linnakangas

Request XLOG switch before writing checkpoint in pg_start_backup(). Otherwise

you can end up with an unrecoverable backup if you start a new base backup
right after finishing archive recovery. In that scenario, the redo pointer of
the checkpoint that pg_start_backup() writes points to the XLOG segment where
the timeline-changing end-of-archive-recovery checkpoint is. The beginning
of that segment contains pages with the old timeline ID, and we don't accept
that in recovery unless we find a history file covering the old timeline ID.
If you omit pg_xlog from the base backup and clear the archive directory
before starting the backup, there will be no such history file available.

The bug is present in all versions since PITR was introduced in 8.0, but I'm
back-patching only back to 8.2. Earlier versions didn't have XLOG switch
records, making this fix unfeasible. Given the lack of reports until now,
it doesn't seem worthwhile to spend more effort to fix 8.0 and 8.1.

Per report and suggestion by Mikael Krantz
parent 1f36fece
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2009, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.336 2009/04/22 19:51:12 heikki Exp $
* $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.337 2009/05/07 11:25:25 heikki Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -6987,6 +6987,19 @@ pg_start_backup(PG_FUNCTION_ARGS)
XLogCtl->Insert.forcePageWrites = true;
LWLockRelease(WALInsertLock);
/*
* Force an XLOG file switch before the checkpoint, to ensure that the WAL
* segment the checkpoint is written to doesn't contain pages with old
* timeline IDs. That would otherwise happen if you called
* pg_start_backup() right after restoring from a PITR archive: the first
* WAL segment containing the startup checkpoint has pages in the
* beginning with the old timeline ID. That can cause trouble at recovery:
* we won't have a history file covering the old timeline if pg_xlog
* directory was not included in the base backup and the WAL archive was
* cleared too before starting the backup.
*/
RequestXLogSwitch();
/* Ensure we release forcePageWrites if fail below */
PG_ENSURE_ERROR_CLEANUP(pg_start_backup_callback, (Datum) 0);
{
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment