• Tom Lane's avatar
    Create a function to reliably identify which sessions block which others. · 52f5d578
    Tom Lane authored
    This patch introduces "pg_blocking_pids(int) returns int[]", which returns
    the PIDs of any sessions that are blocking the session with the given PID.
    Historically people have obtained such information using a self-join on
    the pg_locks view, but it's unreasonably tedious to do it that way with any
    modicum of correctness, and the addition of parallel queries has pretty
    much broken that approach altogether.  (Given some more columns in the view
    than there are today, you could imagine handling parallel-query cases with
    a 4-way join; but ugh.)
    
    The new function has the following behaviors that are painful or impossible
    to get right via pg_locks:
    
    1. Correctly understands which lock modes block which other ones.
    
    2. In soft-block situations (two processes both waiting for conflicting lock
    modes), only the one that's in front in the wait queue is reported to
    block the other.
    
    3. In parallel-query cases, reports all sessions blocking any member of
    the given PID's lock group, and reports a session by naming its leader
    process's PID, which will be the pg_backend_pid() value visible to
    clients.
    
    The motivation for doing this right now is mostly to fix the isolation
    tests.  Commit 38f8bdca lobotomized
    isolationtester's is-it-waiting query by removing its ability to recognize
    nonconflicting lock modes, as a crude workaround for the inability to
    handle soft-block situations properly.  But even without the lock mode
    tests, the old query was excessively slow, particularly in
    CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new
    deadlock-hard test because the deadlock timeout elapses before they can
    probe the waiting status of all eight sessions.  Replacing the pg_locks
    self-join with use of pg_blocking_pids() is not only much more correct, but
    a lot faster: I measure it at about 9X faster in a typical dev build with
    Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds.  That should provide
    enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the
    test, without having to lengthen deadlock_timeout yet more and thus slow
    down the test for everyone else.
    52f5d578
func.sgml 663 KB