    Fix (re-)starting from a basebackup taken off a standby after a failure. · c303e9e7
    Andres Freund authored
    When starting up from a basebackup taken off a standby, extra logic has
    to be applied to compute the point where the data directory is
    consistent. Normal base backups use a WAL record for that purpose, but
    that isn't possible on a standby.
    
    That logic had an error check ensuring that the cluster's control file
    indicates being in recovery. Unfortunately that check was too strict,
    disregarding the fact that the control file could also indicate that
    the cluster was shut down while in recovery.
    
    That's possible when a cluster starting from a basebackup is shut
    down before the backup label has been removed. When everything goes
    well, that's a short window, but when either restore_command or
    primary_conninfo isn't configured correctly the window can get much
    wider. That's because, in between reading and unlinking the label, we
    restore the last checkpoint from WAL, which can need additional WAL.
    
    To fix, simply also allow starting when the control file indicates
    "shutdown in recovery". There are nicer fixes imaginable, but they'd
    be more invasive.
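
    As a rough illustration of the relaxed check (not the actual patch),
    the sketch below accepts both control file states. The enum is a
    simplified stand-in for the DBState values in pg_control.h, and
    backup_from_standby_state_ok() is a hypothetical helper name that
    does not exist in xlog.c.

```c
#include <stdbool.h>

/* Simplified stand-in for the control file states in pg_control.h. */
typedef enum DBState
{
    DB_STARTUP = 0,
    DB_SHUTDOWNED,
    DB_SHUTDOWNED_IN_RECOVERY,
    DB_SHUTDOWNING,
    DB_IN_CRASH_RECOVERY,
    DB_IN_ARCHIVE_RECOVERY,
    DB_IN_PRODUCTION
} DBState;

/*
 * Hypothetical helper: returns true if the control file state is
 * acceptable when starting from a basebackup taken off a standby.
 *
 * Before the fix only DB_IN_ARCHIVE_RECOVERY would pass. The fix also
 * accepts DB_SHUTDOWNED_IN_RECOVERY, which covers a cluster that was
 * shut down before the backup label had been removed.
 */
static bool
backup_from_standby_state_ok(DBState state_at_startup)
{
    return state_at_startup == DB_IN_ARCHIVE_RECOVERY ||
           state_at_startup == DB_SHUTDOWNED_IN_RECOVERY;
}
```

    Any other state, such as a clean shutdown outside of recovery, would
    still be treated as inconsistent with the backup label.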
    
    Backpatch to 9.2 where support for taking basebackups from standbys
    was added.
xlog.c 332 KB