Commit 8e19a826 authored by Heikki Linnakangas

Don't run atexit callbacks in quickdie signal handlers.

exit() is not async-signal safe. Even if the libc implementation is, 3rd
party libraries might have installed unsafe atexit() callbacks. After
receiving SIGQUIT, we really just want to exit as quickly as possible, so
we don't really want to run the atexit() callbacks anyway.

The original report by Jimmy Yih was a self-deadlock in startup_die().
However, this patch doesn't address that scenario; the signal handling
while waiting for the startup packet is more complicated. But at least this
alleviates similar problems in the SIGQUIT handlers, like that reported
by Asim R P later in the same thread.

Backpatch to 9.3 (all supported versions).

Discussion: https://www.postgresql.org/message-id/CAOMx_OAuRUHiAuCg2YgicZLzPVv5d9_H4KrL_OFsFP%3DVPekigA%40mail.gmail.com
parent 11e22e48
@@ -644,28 +644,21 @@ SanityCheckBackgroundWorker(BackgroundWorker *worker, int elevel)
 static void
 bgworker_quickdie(SIGNAL_ARGS)
 {
-	sigaddset(&BlockSig, SIGQUIT);	/* prevent nested calls */
-	PG_SETMASK(&BlockSig);
-
-	/*
-	 * We DO NOT want to run proc_exit() callbacks -- we're here because
-	 * shared memory may be corrupted, so we don't want to try to clean up our
-	 * transaction. Just nail the windows shut and get out of town. Now that
-	 * there's an atexit callback to prevent third-party code from breaking
-	 * things by calling exit() directly, we have to reset the callbacks
-	 * explicitly to make this work as intended.
-	 */
-	on_exit_reset();
-
 	/*
-	 * Note we do exit(2) not exit(0). This is to force the postmaster into a
-	 * system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
+	 * We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
+	 * because shared memory may be corrupted, so we don't want to try to
+	 * clean up our transaction. Just nail the windows shut and get out of
+	 * town. The callbacks wouldn't be safe to run from a signal handler,
+	 * anyway.
+	 *
+	 * Note we do _exit(2) not _exit(0). This is to force the postmaster into
+	 * a system reset cycle if someone sends a manual SIGQUIT to a random
 	 * backend. This is necessary precisely because we don't clean up our
 	 * shared memory state. (The "dead man switch" mechanism in pmsignal.c
 	 * should ensure the postmaster sees this as a crash, too, but no harm in
 	 * being doubly sure.)
 	 */
-	exit(2);
+	_exit(2);
 }

 /*
......
@@ -399,27 +399,21 @@ BackgroundWriterMain(void)
 static void
 bg_quickdie(SIGNAL_ARGS)
 {
-	PG_SETMASK(&BlockSig);
-
 	/*
-	 * We DO NOT want to run proc_exit() callbacks -- we're here because
-	 * shared memory may be corrupted, so we don't want to try to clean up our
-	 * transaction. Just nail the windows shut and get out of town. Now that
-	 * there's an atexit callback to prevent third-party code from breaking
-	 * things by calling exit() directly, we have to reset the callbacks
-	 * explicitly to make this work as intended.
-	 */
-	on_exit_reset();
-
-	/*
-	 * Note we do exit(2) not exit(0). This is to force the postmaster into a
-	 * system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
+	 * We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
+	 * because shared memory may be corrupted, so we don't want to try to
+	 * clean up our transaction. Just nail the windows shut and get out of
+	 * town. The callbacks wouldn't be safe to run from a signal handler,
+	 * anyway.
+	 *
+	 * Note we do _exit(2) not _exit(0). This is to force the postmaster into
+	 * a system reset cycle if someone sends a manual SIGQUIT to a random
 	 * backend. This is necessary precisely because we don't clean up our
 	 * shared memory state. (The "dead man switch" mechanism in pmsignal.c
 	 * should ensure the postmaster sees this as a crash, too, but no harm in
 	 * being doubly sure.)
 	 */
-	exit(2);
+	_exit(2);
 }

 /* SIGHUP: set flag to re-read config file at next convenient time */
......
@@ -813,27 +813,21 @@ IsCheckpointOnSchedule(double progress)
 static void
 chkpt_quickdie(SIGNAL_ARGS)
 {
-	PG_SETMASK(&BlockSig);
-
 	/*
-	 * We DO NOT want to run proc_exit() callbacks -- we're here because
-	 * shared memory may be corrupted, so we don't want to try to clean up our
-	 * transaction. Just nail the windows shut and get out of town. Now that
-	 * there's an atexit callback to prevent third-party code from breaking
-	 * things by calling exit() directly, we have to reset the callbacks
-	 * explicitly to make this work as intended.
-	 */
-	on_exit_reset();
-
-	/*
-	 * Note we do exit(2) not exit(0). This is to force the postmaster into a
-	 * system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
+	 * We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
+	 * because shared memory may be corrupted, so we don't want to try to
+	 * clean up our transaction. Just nail the windows shut and get out of
+	 * town. The callbacks wouldn't be safe to run from a signal handler,
+	 * anyway.
+	 *
+	 * Note we do _exit(2) not _exit(0). This is to force the postmaster into
+	 * a system reset cycle if someone sends a manual SIGQUIT to a random
 	 * backend. This is necessary precisely because we don't clean up our
 	 * shared memory state. (The "dead man switch" mechanism in pmsignal.c
 	 * should ensure the postmaster sees this as a crash, too, but no harm in
 	 * being doubly sure.)
 	 */
-	exit(2);
+	_exit(2);
 }

 /* SIGHUP: set flag to re-read config file at next convenient time */
......
@@ -69,27 +69,21 @@ static void StartupProcSigHupHandler(SIGNAL_ARGS);
 static void
 startupproc_quickdie(SIGNAL_ARGS)
 {
-	PG_SETMASK(&BlockSig);
-
-	/*
-	 * We DO NOT want to run proc_exit() callbacks -- we're here because
-	 * shared memory may be corrupted, so we don't want to try to clean up our
-	 * transaction. Just nail the windows shut and get out of town. Now that
-	 * there's an atexit callback to prevent third-party code from breaking
-	 * things by calling exit() directly, we have to reset the callbacks
-	 * explicitly to make this work as intended.
-	 */
-	on_exit_reset();
-
 	/*
-	 * Note we do exit(2) not exit(0). This is to force the postmaster into a
-	 * system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
+	 * We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
+	 * because shared memory may be corrupted, so we don't want to try to
+	 * clean up our transaction. Just nail the windows shut and get out of
+	 * town. The callbacks wouldn't be safe to run from a signal handler,
+	 * anyway.
+	 *
+	 * Note we do _exit(2) not _exit(0). This is to force the postmaster into
+	 * a system reset cycle if someone sends a manual SIGQUIT to a random
 	 * backend. This is necessary precisely because we don't clean up our
 	 * shared memory state. (The "dead man switch" mechanism in pmsignal.c
 	 * should ensure the postmaster sees this as a crash, too, but no harm in
 	 * being doubly sure.)
 	 */
-	exit(2);
+	_exit(2);
 }
......
@@ -309,27 +309,21 @@ WalWriterMain(void)
 static void
 wal_quickdie(SIGNAL_ARGS)
 {
-	PG_SETMASK(&BlockSig);
-
 	/*
-	 * We DO NOT want to run proc_exit() callbacks -- we're here because
-	 * shared memory may be corrupted, so we don't want to try to clean up our
-	 * transaction. Just nail the windows shut and get out of town. Now that
-	 * there's an atexit callback to prevent third-party code from breaking
-	 * things by calling exit() directly, we have to reset the callbacks
-	 * explicitly to make this work as intended.
-	 */
-	on_exit_reset();
-
-	/*
-	 * Note we do exit(2) not exit(0). This is to force the postmaster into a
-	 * system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
+	 * We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
+	 * because shared memory may be corrupted, so we don't want to try to
+	 * clean up our transaction. Just nail the windows shut and get out of
+	 * town. The callbacks wouldn't be safe to run from a signal handler,
+	 * anyway.
+	 *
+	 * Note we do _exit(2) not _exit(0). This is to force the postmaster into
+	 * a system reset cycle if someone sends a manual SIGQUIT to a random
 	 * backend. This is necessary precisely because we don't clean up our
 	 * shared memory state. (The "dead man switch" mechanism in pmsignal.c
 	 * should ensure the postmaster sees this as a crash, too, but no harm in
 	 * being doubly sure.)
 	 */
-	exit(2);
+	_exit(2);
 }

 /* SIGHUP: set flag to re-read config file at next convenient time */
......
@@ -854,27 +854,21 @@ WalRcvShutdownHandler(SIGNAL_ARGS)
 static void
 WalRcvQuickDieHandler(SIGNAL_ARGS)
 {
-	PG_SETMASK(&BlockSig);
-
 	/*
-	 * We DO NOT want to run proc_exit() callbacks -- we're here because
-	 * shared memory may be corrupted, so we don't want to try to clean up our
-	 * transaction. Just nail the windows shut and get out of town. Now that
-	 * there's an atexit callback to prevent third-party code from breaking
-	 * things by calling exit() directly, we have to reset the callbacks
-	 * explicitly to make this work as intended.
-	 */
-	on_exit_reset();
-
-	/*
-	 * Note we do exit(2) not exit(0). This is to force the postmaster into a
-	 * system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
-	 * backend. This is necessary precisely because we don't clean up our
-	 * shared memory state. (The "dead man switch" mechanism in pmsignal.c
-	 * should ensure the postmaster sees this as a crash, too, but no harm in
-	 * being doubly sure.)
+	 * We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
+	 * because shared memory may be corrupted, so we don't want to try to
+	 * clean up our transaction. Just nail the windows shut and get out of
+	 * town. The callbacks wouldn't be safe to run from a signal handler,
+	 * anyway.
+	 *
+	 * Note we use _exit(2) not _exit(0). This is to force the postmaster
+	 * into a system reset cycle if someone sends a manual SIGQUIT to a
+	 * random backend. This is necessary precisely because we don't clean up
+	 * our shared memory state. (The "dead man switch" mechanism in
+	 * pmsignal.c should ensure the postmaster sees this as a crash, too, but
+	 * no harm in being doubly sure.)
 	 */
-	exit(2);
+	_exit(2);
 }

 /*
......
@@ -2616,6 +2616,16 @@ quickdie(SIGNAL_ARGS)
 	whereToSendOutput = DestNone;

 	/*
+	 * Notify the client before exiting, to give a clue on what happened.
+	 *
+	 * It's dubious to call ereport() from a signal handler. It is certainly
+	 * not async-signal safe. But it seems better to try, than to disconnect
+	 * abruptly and leave the client wondering what happened. It's remotely
+	 * possible that we crash or hang while trying to send the message, but
+	 * receiving a SIGQUIT is a sign that something has already gone badly
+	 * wrong, so there's not much to lose. Assuming the postmaster is still
+	 * running, it will SIGKILL us soon if we get stuck for some reason.
+	 *
 	 * Ideally this should be ereport(FATAL), but then we'd not get control
 	 * back...
 	 */
@@ -2630,24 +2640,20 @@ quickdie(SIGNAL_ARGS)
 					 " database and repeat your command.")));

 	/*
-	 * We DO NOT want to run proc_exit() callbacks -- we're here because
-	 * shared memory may be corrupted, so we don't want to try to clean up our
-	 * transaction. Just nail the windows shut and get out of town. Now that
-	 * there's an atexit callback to prevent third-party code from breaking
-	 * things by calling exit() directly, we have to reset the callbacks
-	 * explicitly to make this work as intended.
-	 */
-	on_exit_reset();
-
-	/*
-	 * Note we do exit(2) not exit(0). This is to force the postmaster into a
-	 * system reset cycle if some idiot DBA sends a manual SIGQUIT to a random
+	 * We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
+	 * because shared memory may be corrupted, so we don't want to try to
+	 * clean up our transaction. Just nail the windows shut and get out of
+	 * town. The callbacks wouldn't be safe to run from a signal handler,
+	 * anyway.
+	 *
+	 * Note we do _exit(2) not _exit(0). This is to force the postmaster into
+	 * a system reset cycle if someone sends a manual SIGQUIT to a random
 	 * backend. This is necessary precisely because we don't clean up our
 	 * shared memory state. (The "dead man switch" mechanism in pmsignal.c
 	 * should ensure the postmaster sees this as a crash, too, but no harm in
 	 * being doubly sure.)
 	 */
-	exit(2);
+	_exit(2);
 }

 /*
......