    Fix race condition between shutdown and unstarted background workers. · 7519bd16
    Tom Lane authored
    If a database shutdown (smart or fast) is commanded between the time
    some process decides to request a new background worker and the time
    that the postmaster can launch that worker, then nothing happens
    because the postmaster won't launch any bgworkers once it's exited
    PM_RUN state.  This is fine ... unless the requesting process is
    waiting for that worker to finish (or even for it to start); in that
    case the requestor is stuck, and only manual intervention will get us
    to the point of being able to shut down.
    
    To fix, cancel pending requests for workers when the postmaster sends
    shutdown (SIGTERM) signals, and similarly cancel any new requests that
    arrive after that point.  (We can optimize things slightly by only
    doing the cancellation for workers that have waiters.)  To fit within
    the existing bgworker APIs, the "cancel" is made to look like the
    worker was started and immediately stopped, causing deregistration of
    the bgworker entry.  Waiting processes would have to deal with
    premature worker exit anyway, so this should introduce no bugs that
    weren't there before.  We do have a side effect that registration
    records for restartable bgworkers might disappear when theoretically
    they should have remained in place; but since we're shutting down,
    that shouldn't matter.
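
The cancellation rule described above can be sketched with a toy model. This is not the code in bgworker.c: `ToyWorker`, its fields, and `toy_cancel_unstarted` are hypothetical names invented for illustration, and a flat array stands in for whatever list the postmaster actually keeps. The sketch assumes that an entry with `pid == 0` has been requested but not yet launched, and that a nonzero `notify_pid` means some process is waiting on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a bgworker registration; every name here is hypothetical. */
typedef struct ToyWorker
{
    int  pid;              /* 0 = requested but not yet launched */
    int  notify_pid;       /* 0 = no process is waiting on this worker */
    bool registered;       /* false once the entry has been deregistered */
    bool reported_stopped; /* waiter was told "started, then stopped" */
} ToyWorker;

/*
 * Model of the step taken when shutdown signals are sent: every worker
 * that has not been launched and has a waiter is made to look as if it
 * started and immediately stopped, and its entry is deregistered.
 * Running workers and unstarted workers without waiters are left alone.
 * Returns the number of requests cancelled.
 */
static size_t
toy_cancel_unstarted(ToyWorker *workers, size_t n)
{
    size_t cancelled = 0;

    assert(workers != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        ToyWorker *w = &workers[i];

        if (!w->registered)
            continue;           /* already gone */
        if (w->pid != 0)
            continue;           /* already launched; normal exit path applies */
        if (w->notify_pid == 0)
            continue;           /* nobody waiting, so nobody can get stuck */

        w->reported_stopped = true; /* waiter sees a premature worker exit */
        w->registered = false;      /* entry is dropped, even if restartable */
        cancelled++;
    }
    return cancelled;
}
```

Given three entries (unstarted with a waiter, unstarted without a waiter, and already running), only the first is cancelled. Under the same assumptions, a request that arrives after shutdown has begun would get the identical treatment at registration time.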
    
    Back-patch to v10.  There might be value in putting this into 9.6
    as well, but the management of bgworkers is a bit different there
    (notably see 8ff51869) and I'm not convinced it's worth the effort
    to validate the patch for that branch.
    
    Discussion: https://postgr.es/m/661570.1608673226@sss.pgh.pa.us