    Fix more issues with cascading replication and timeline switches. · 990fe3c4
    Heikki Linnakangas authored
    When a standby server follows the master using WAL archive, and it chooses
    a new timeline (recovery_target_timeline='latest'), it only fetches the
    timeline history file for the chosen target timeline, not any other history
    files that might be missing from pg_xlog. For example, if the current
    timeline is 2, and we choose 4 as the new recovery target timeline, the
    history file for timeline 3 is not fetched, even if it's part of this
    server's history. That's enough for the standby itself - the history file
    for timeline 4 includes timeline 3 as well - but if a cascading standby
    server wants to recover to timeline 3, it needs the history file. To fix,
    when a new recovery target timeline is chosen, try to copy from the archive
    to pg_xlog any history files that are missing for the timelines between the
    old and new target timelines.
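
A minimal sketch of the history-file fix described above, not the code from this commit. It assumes the new target's own history file has already been fetched, so only the timelines strictly between the old and new targets are considered. The names collect_missing_history_files, file_exists_fn and nothing_local are hypothetical; the "%08X.history" file name pattern is the one PostgreSQL uses for timeline history files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t TimeLineID;

#define HISTORY_NAME_LEN 32

/* Hypothetical callback: true if the named file is already in pg_xlog. */
typedef bool (*file_exists_fn) (const char *name);

/* Format a history file name, e.g. timeline 3 -> "00000003.history". */
static void
history_file_name(char *buf, size_t len, TimeLineID tli)
{
    snprintf(buf, len, "%08X.history", (unsigned int) tli);
}

/*
 * Collect the names of history files for timelines strictly between
 * old_tli and new_tli that are not present locally. A real server would
 * then try to restore each of them from the archive. Returns the number
 * of names written to 'names' (at most max_names).
 */
static int
collect_missing_history_files(TimeLineID old_tli, TimeLineID new_tli,
                              file_exists_fn exists,
                              char names[][HISTORY_NAME_LEN], int max_names)
{
    int         n = 0;
    TimeLineID  tli;

    assert(old_tli >= 1 && old_tli <= new_tli);

    for (tli = old_tli + 1; tli < new_tli && n < max_names; tli++)
    {
        char        name[HISTORY_NAME_LEN];

        history_file_name(name, sizeof(name), tli);
        if (!exists(name))
            strcpy(names[n++], name);
    }
    return n;
}

/* Sample callback for illustration: pretends pg_xlog has no history files. */
static bool
nothing_local(const char *name)
{
    (void) name;
    return false;
}
```

With the example from the text (current timeline 2, new target 4) and nothing_local as the callback, the only name collected is the history file for timeline 3.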
    
    A second, similar issue was with the WAL files. When a standby recovers from
    archive, and it reaches a segment that contains a switch to a new timeline,
    recovery fetches only the WAL file labelled with the new timeline's ID. The
    file from the new timeline contains a copy of the WAL from the old timeline
    up to the point where the switch happened, and recovery replays it from the
    new file. But in streaming replication, walsender only tries to read that
    WAL from the old timeline's file. To fix, change walsender to read it from
    the new file, so that it behaves the same as recovery in that sense, and
    doesn't try to open the possibly nonexistent file with the old timeline's ID.
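
A minimal sketch of the file-selection rule described above, not the walsender code itself. It assumes the default 16 MB segment size, which gives 256 segments per 4 GB "xlog id". The names wal_file_tli and wal_file_name are hypothetical; the "%08X%08X%08X" pattern is how WAL segment files are named.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t TimeLineID;
typedef uint64_t XLogSegNo;

/* Assumption: 16 MB segments, so 0x100000000 / 16 MB = 256 per xlog id. */
#define SEGMENTS_PER_XLOGID 256

/*
 * Decide which timeline ID to use in the name of the file to open.
 * When sending from a historic timeline and the requested segment is the
 * one containing the switch point, the WAL is read from the file labelled
 * with the next timeline's ID, which holds a copy of the old timeline's
 * WAL up to the switch.
 */
static TimeLineID
wal_file_tli(TimeLineID send_tli, bool send_tli_is_historic,
             XLogSegNo segno, XLogSegNo switch_segno, TimeLineID next_tli)
{
    if (send_tli_is_historic && segno == switch_segno)
        return next_tli;
    return send_tli;
}

/* Format a WAL segment file name: timeline ID, xlog id, segment within it. */
static void
wal_file_name(char *buf, size_t len, TimeLineID tli, XLogSegNo segno)
{
    assert(len >= 25);          /* 24 hex digits plus terminating NUL */
    snprintf(buf, len, "%08X%08X%08X",
             (unsigned int) tli,
             (unsigned int) (segno / SEGMENTS_PER_XLOGID),
             (unsigned int) (segno % SEGMENTS_PER_XLOGID));
}
```

For example, if timeline 2 switched to timeline 3 inside segment 5, a request for segment 5 on timeline 2 resolves to the file labelled with timeline 3, while segment 4 is still read from the timeline 2 file.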
timeline.c 14.8 KB