    Fix postmaster's handling of a startup-process crash. · 45811be9
    Tom Lane authored
    Ordinarily, a failure (unexpected exit status) of the startup subprocess
    should be considered fatal, so the postmaster should just close up shop
    and quit.  However, if we sent the startup process a SIGQUIT or SIGKILL
    signal, the failure is hardly "unexpected", and we should attempt restart;
    this is necessary for recovery from ordinary backend crashes in hot-standby
    scenarios.  I attempted to implement the latter rule with a two-line patch
    in commit 442231d7, but it now emerges that
    that patch was a few bricks shy of a load: it failed to distinguish the
    case of a signaled startup process from the case where the new startup
    process crashes before reaching database consistency.  That resulted in
    infinitely respawning a new startup process only to have it crash again.
    
    To handle this properly, we really must track whether we have sent the
    *current* startup process a kill signal.  Rather than add yet another
    ad-hoc boolean to the postmaster's state, I chose to unify this with the
    existing RecoveryError flag into an enum tracking the startup process's
    state.  That seems more consistent with the postmaster's general state
    machine design.
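
    As a rough illustration of the idea, the sketch below models a
    startup-status enum and the decision taken when the startup process
    exits.  It is a simplified, self-contained model and not the code in
    postmaster.c: every identifier (StartupStatus, ExitAction,
    startup_launched, startup_signaled, startup_exited) is an assumption
    made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the startup process's state (names are assumptions). */
typedef enum
{
    STARTUP_NOT_RUNNING,        /* no startup process exists */
    STARTUP_RUNNING,            /* launched; we have not signaled it */
    STARTUP_SIGNALED,           /* we sent it SIGQUIT or SIGKILL */
    STARTUP_CRASHED             /* it failed without being signaled by us */
} StartupStatus;

/* What the parent should do after the startup process exits. */
typedef enum
{
    ACTION_CONTINUE,            /* clean exit: carry on normally */
    ACTION_RESTART,             /* expected failure: attempt restart */
    ACTION_SHUTDOWN             /* unexpected failure: close up shop */
} ExitAction;

/* A new startup process was launched; any earlier "signaled" mark is gone. */
static void
startup_launched(StartupStatus *status)
{
    *status = STARTUP_RUNNING;
}

/* We sent the *current* startup process a kill signal. */
static void
startup_signaled(StartupStatus *status)
{
    if (*status == STARTUP_RUNNING)
        *status = STARTUP_SIGNALED;
}

/* The startup process exited; decide how to react. */
static ExitAction
startup_exited(StartupStatus *status, bool exited_cleanly)
{
    /* An exit can only be reported for a process that was launched. */
    assert(*status == STARTUP_RUNNING || *status == STARTUP_SIGNALED);

    if (exited_cleanly)
    {
        *status = STARTUP_NOT_RUNNING;
        return ACTION_CONTINUE;
    }
    if (*status == STARTUP_SIGNALED)
    {
        /* We caused the failure ourselves, so it is not "unexpected". */
        *status = STARTUP_NOT_RUNNING;
        return ACTION_RESTART;
    }
    /* Failure of a process we never signaled is treated as fatal. */
    *status = STARTUP_CRASHED;
    return ACTION_SHUTDOWN;
}
```

    Because launching a replacement resets the state to "running", a
    replacement that crashes on its own is no longer mistaken for a
    signaled one, which is what avoids the respawn loop in this model.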
    
    Back-patch to 9.0, like the previous patch.
postmaster.c 168 KB