Make timeout.c more robust against missed timer interrupts.
Commit 09cf1d52 taught schedule_alarm() to not do anything if the next requested event is after when we expect the next interrupt to fire. However, if somehow an interrupt gets lost, we'll continue to not do anything indefinitely, even after the "next interrupt" time is obviously in the past. Thus, one missed interrupt can break timeout scheduling for the life of the session. Michael Harris reported a scenario where a bug in a user-defined function caused this to happen, so you don't even need to assume kernel bugs exist to think this is worth fixing. We can make things more robust at little cost by detecting the case where signal_due_at is before "now" and forcing a new setitimer call to occur. This isn't a completely bulletproof fix of course; but in our typical usage pattern where we frequently set timeouts and clear them before they are reached, the interrupt will get re-enabled after at most one timeout interval, which with a little luck will be before we really need it. While here, let's mark signal_due_at as volatile, since the signal handler can both examine and set it. I'm not sure there's any actual risk given that signal_pending is already volatile, but it's surely questionable. Backpatch to v14 where this logic came in. Michael Harris and Tom Lane Discussion: https://postgr.es/m/CADofcAWbMrvgwSMqO4iG_iD3E2v8ZUrC-_crB41my=VMM02-CA@mail.gmail.com
Showing
Please register or sign in to comment