    Handle elog(FATAL) during ROLLBACK more robustly. · 5b6289c1
    Tom Lane authored
    Stress testing by Andreas Seltenreich disclosed longstanding problems that
    occur if a FATAL exit (e.g. due to receipt of SIGTERM) occurs while we are
    trying to execute a ROLLBACK of an already-failed transaction.  In such a
    case, xact.c is in TBLOCK_ABORT state, so that AbortOutOfAnyTransaction
    would skip AbortTransaction and go straight to CleanupTransaction.  This
    led to an assert failure in an assert-enabled build (due to the ROLLBACK's
    portal still having a cleanup hook) or, without assertions, to a FATAL exit
    complaining about "cannot drop active portal".  The latter is not
    disastrous, perhaps, but it's messy enough to be worth improving.
    
    We don't really want to run all of AbortTransaction in this code path.
    The minimum required to clean up the open portal safely is to do
    AtAbort_Memory and AtAbort_Portals.  It seems like a good idea to
    do AtAbort_Memory unconditionally, to be entirely sure that we are
    starting with a safe CurrentMemoryContext.  That means that if the
    main loop in AbortOutOfAnyTransaction does nothing, we need an extra
    step at the bottom to restore CurrentMemoryContext = TopMemoryContext,
    which I chose to do by invoking AtCleanup_Memory.  This'll result in
    calling AtCleanup_Memory twice in many of the paths through this function,
    but that seems harmless and reasonably inexpensive.
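
The control flow described above can be sketched as a toy model. This is not the code in xact.c: the function names mirror those in the commit message, but every body is a hypothetical stand-in that only records a trace, the state enum is reduced to three cases, and AtAbort_Memory is simplified to always select a dedicated abort context.

```c
#include <assert.h>
#include <string.h>

/* Reduced, hypothetical versions of the xact.c block states and contexts. */
typedef enum { TBLOCK_DEFAULT, TBLOCK_INPROGRESS, TBLOCK_ABORT } TBlockState;
typedef enum { CTX_TOP, CTX_ABORT, CTX_DOOMED } MemCtx;

TBlockState blockState = TBLOCK_DEFAULT;
MemCtx CurrentMemoryContext = CTX_TOP;
char trace[256];

/* Append one step name to the trace so the call order can be inspected. */
static void note(const char *step) { strcat(trace, step); strcat(trace, ";"); }

/* Stubs: each one only records that it ran and adjusts the toy context. */
static void AtAbort_Memory(void)   { CurrentMemoryContext = CTX_ABORT; note("AtAbort_Memory"); }
static void AtAbort_Portals(void)  { note("AtAbort_Portals"); }
static void AtCleanup_Memory(void) { CurrentMemoryContext = CTX_TOP; note("AtCleanup_Memory"); }
static void AbortTransaction(void) { note("AbortTransaction"); }
static void CleanupTransaction(void) { note("CleanupTransaction"); AtCleanup_Memory(); }

void AbortOutOfAnyTransaction(void)
{
    /* Unconditional: make sure we are not running in a doomed context. */
    AtAbort_Memory();

    switch (blockState)
    {
        case TBLOCK_DEFAULT:
            /* Not in a transaction: the main step does nothing. */
            break;
        case TBLOCK_INPROGRESS:
            AbortTransaction();
            CleanupTransaction();
            break;
        case TBLOCK_ABORT:
            /* Abort already ran, but a failed ROLLBACK may have left its
             * portal open; shut it down before cleaning up. */
            AtAbort_Portals();
            CleanupTransaction();
            break;
    }
    blockState = TBLOCK_DEFAULT;

    /* Restore the top-level context even if the main step did nothing. */
    AtCleanup_Memory();
}

/* Test helper (hypothetical): run from a given state and return the trace. */
const char *run_from(TBlockState start)
{
    trace[0] = '\0';
    blockState = start;
    CurrentMemoryContext = CTX_DOOMED;
    AbortOutOfAnyTransaction();
    return trace;
}
```

In this model, starting from TBLOCK_ABORT calls AtCleanup_Memory twice (once inside CleanupTransaction, once at the bottom), which corresponds to the harmless double call mentioned above; starting from TBLOCK_DEFAULT, the final call is the only thing that restores the top-level context.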
    
    The original motivation for the assertion in AtCleanup_Portals was that
    we wanted to be sure that any user-defined code executed as a consequence
    of the cleanup hook runs during AbortTransaction, not CleanupTransaction.
    That still seems like a valid concern, and now that we've seen one case
    of the assertion firing --- which means that exactly that would have
    happened in a production build --- let's replace the Assert with a runtime
    check.  If we see the cleanup hook still set, we'll emit a WARNING and
    just drop the hook unexecuted.
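
A minimal sketch of the Assert-to-runtime-check change, under stated assumptions: ToyPortal, toy_cleanup_portal, and user_cleanup_hook are hypothetical names, the warning text is illustrative, and fprintf to stderr stands in for emitting a WARNING. It is not the code in portalmem.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, reduced portal: just a name and an optional cleanup hook. */
typedef struct ToyPortal ToyPortal;
struct ToyPortal
{
    const char *name;
    void (*cleanup)(ToyPortal *portal);   /* may run user-defined code */
};

/* Counts hook invocations so a caller can verify the hook never ran. */
int hook_calls = 0;

void user_cleanup_hook(ToyPortal *portal)
{
    (void) portal;
    hook_calls++;
}

/*
 * Cleanup-time check.  Previously this condition would have been an
 * assertion that the hook is already clear; here a leftover hook is
 * reported and dropped without being executed.
 * Returns 1 if a stale hook was dropped, 0 otherwise.
 */
int toy_cleanup_portal(ToyPortal *portal)
{
    if (portal->cleanup != NULL)
    {
        fprintf(stderr, "WARNING: skipping cleanup for portal \"%s\"\n",
                portal->name);
        portal->cleanup = NULL;           /* drop the hook unexecuted */
        return 1;
    }
    return 0;
}
```

The point of dropping rather than calling the hook is that no user-defined code runs during the cleanup phase, while a production build still gets a visible diagnostic instead of silently misbehaving.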
    
    The code has been like this for a long time, so back-patch to all supported
    branches.
    
    Discussion: https://postgr.es/m/877ey7bmun.fsf@ansel.ydns.eu
portalmem.c 32.1 KB