    Heikki Linnakangas authored · 1bb25580

    Make standby server continuously retry restoring the next WAL segment with
    restore_command, if the connection to the primary server is lost. This
    ensures that the standby can recover automatically if the connection is
    lost for so long that the standby falls behind and the required WAL
    segments have already been archived and deleted on the master.
    
    This also makes standby_mode useful without streaming replication; the
    server will keep retrying restore_command every few seconds until the
    trigger file is found. That's the same basic functionality pg_standby
    offers, but without the bells and whistles.
    
    To implement that, refactor the ReadRecord/FetchRecord functions. The
    FetchRecord() function introduced in the original streaming replication
    patch is removed, and all the retry logic is now in a new function called
    XLogReadPage(). XLogReadPage() is now responsible for executing
    restore_command, launching walreceiver, and waiting for new WAL to arrive
    from the primary, as required.
    
    This also changes the life cycle of walreceiver. When launched, it now only
    tries to connect to the master once, and exits if the connection fails, or
    is lost during streaming for any reason. The startup process detects the
    death, and re-launches walreceiver if necessary.
xlog.c 264 KB