    Prevent panic during shutdown checkpoint · 086221cf
    Peter Eisentraut authored
    When the checkpointer writes the shutdown checkpoint, it checks
    afterwards whether any WAL has been written since it started and throws
    a PANIC if so.  At that point, only walsenders are still active, so one
    might think this could not happen, but walsenders can also generate WAL,
    for instance in BASE_BACKUP and certain variants of
    CREATE_REPLICATION_SLOT.  So they can trigger this panic if such a
    command is run while the shutdown checkpoint is being written.
    
    To fix this, divide the walsender shutdown into two phases.  First, the
    postmaster sends a SIGUSR2 signal to all walsenders.  The walsenders
    then put themselves into the "stopping" state.  In this state, they
    reject any new commands.  (For simplicity, we reject all new commands,
    so that in the future we do not have to track meticulously which
    commands might generate WAL.)  The checkpointer waits for all walsenders
    to reach this state before proceeding with the shutdown checkpoint.
    After the shutdown checkpoint is done, the postmaster sends
    SIGINT (previously unused) to the walsenders.  This triggers the
    existing shutdown behavior of sending out the shutdown checkpoint record
    and then terminating.
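    
    The two-phase sequence described above can be sketched as a small
    state machine.  This is only a toy model under stated assumptions,
    not the PostgreSQL source: every name here (SketchWalSender,
    sketch_on_sigusr2, and so on) is hypothetical, and signal delivery is
    replaced by plain function calls so the ordering is easy to follow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of the two-phase walsender shutdown.
 * None of these names come from the PostgreSQL source tree. */
typedef enum {
    WS_RUNNING,   /* normal operation, commands accepted */
    WS_STOPPING,  /* phase 1 reached: all new commands rejected */
    WS_EXITED     /* phase 2 reached: walsender has terminated */
} SketchWalSndState;

typedef struct {
    SketchWalSndState state;
    int wal_records_written;  /* stands in for WAL generated by commands */
} SketchWalSender;

/* Phase 1: reaction to SIGUSR2 from the postmaster. */
void sketch_on_sigusr2(SketchWalSender *ws)
{
    assert(ws != NULL);
    if (ws->state == WS_RUNNING)
        ws->state = WS_STOPPING;
}

/* A replication command.  Returns true if it was accepted.
 * Once stopping, every command is rejected, whether or not it
 * would have generated WAL. */
bool sketch_run_command(SketchWalSender *ws, bool generates_wal)
{
    assert(ws != NULL);
    if (ws->state != WS_RUNNING)
        return false;
    if (generates_wal)
        ws->wal_records_written++;
    return true;
}

/* What the checkpointer waits for before the shutdown checkpoint:
 * no walsender may still be in the running state. */
bool sketch_all_stopping(const SketchWalSender *ws, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (ws[i].state == WS_RUNNING)
            return false;
    return true;
}

/* Phase 2: reaction to SIGINT, sent after the shutdown checkpoint.
 * Simplification: the real walsender first sends out the shutdown
 * checkpoint record; here it just moves to the exited state. */
void sketch_on_sigint(SketchWalSender *ws)
{
    assert(ws != NULL);
    if (ws->state == WS_STOPPING)
        ws->state = WS_EXITED;
}
```

    In this model the checkpointer would write the shutdown checkpoint
    only once sketch_all_stopping() returns true, which is what rules out
    WAL being written behind its back.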
    
    Author: Michael Paquier <michael.paquier@gmail.com>
    Reported-by: Fujii Masao <masao.fujii@gmail.com>
monitoring.sgml 152 KB