Commit 52f5d578 authored by Tom Lane

Create a function to reliably identify which sessions block which others.

This patch introduces "pg_blocking_pids(int) returns int[]", which returns
the PIDs of any sessions that are blocking the session with the given PID.
Historically people have obtained such information using a self-join on
the pg_locks view, but it's unreasonably tedious to do it that way with any
modicum of correctness, and the addition of parallel queries has pretty
much broken that approach altogether.  (Given some more columns in the view
than there are today, you could imagine handling parallel-query cases with
a 4-way join; but ugh.)

The new function has the following behaviors that are painful or impossible
to get right via pg_locks:

1. Correctly understands which lock modes block which other ones.

2. In soft-block situations (two processes both waiting for conflicting lock
modes), only the one that's in front in the wait queue is reported to
block the other.

3. In parallel-query cases, reports all sessions blocking any member of
the given PID's lock group, and reports a session by naming its leader
process's PID, which will be the pg_backend_pid() value visible to
clients.
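
The three behaviors above can be sketched in self-contained C. This is a
minimal illustration under stated assumptions, not the code in this patch:
the Entry struct, the two-mode conflict table, and the blocking_pids()
helper are all hypothetical simplifications of the lock manager's data.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock modes; an assumption for illustration, not the real mode set. */
enum { NO_LOCK = 0, MODE_SHARE = 1, MODE_EXCL = 2 };
#define MODE_BIT(m) (1u << (m))

/* Hypothetical conflict table: SHARE conflicts with EXCL, EXCL with both. */
static const unsigned conflict_tab[] = {
    0,
    MODE_BIT(MODE_EXCL),
    MODE_BIT(MODE_SHARE) | MODE_BIT(MODE_EXCL)
};

/* Simplified stand-in for one process's entry on a single lock. */
typedef struct {
    int      pid;
    int      leader_pid;  /* equals pid when not in a lock group */
    unsigned hold_mask;   /* modes already granted on the lock */
    int      wait_mode;   /* mode being awaited, or NO_LOCK */
} Entry;

/* Is pid among the waiters queued ahead of the blocked process? */
static int
is_ahead(const int *ahead, int nahead, int pid)
{
    for (int k = 0; k < nahead; k++)
        if (ahead[k] == pid)
            return 1;
    return 0;
}

/*
 * Write the leader PIDs of every entry blocking blocked_pid into out[]
 * (capacity at least n) and return the count.  Duplicates are kept.
 */
static int
blocking_pids(const Entry *e, int n, const int *ahead, int nahead,
              int blocked_pid, int *out)
{
    const Entry *blocked = NULL;
    unsigned     conflicts;
    int          nout = 0;

    for (int i = 0; i < n; i++)
        if (e[i].pid == blocked_pid)
            blocked = &e[i];
    if (blocked == NULL || blocked->wait_mode == NO_LOCK)
        return 0;               /* unknown PID, or not waiting */

    conflicts = conflict_tab[blocked->wait_mode];

    for (int i = 0; i < n; i++)
    {
        /* A process and its lock-group siblings never block each other */
        if (e[i].leader_pid == blocked->leader_pid)
            continue;

        if (conflicts & e[i].hold_mask)
        {
            /* hard block: a conflicting mode is already held */
        }
        else if (e[i].wait_mode != NO_LOCK &&
                 (conflicts & MODE_BIT(e[i].wait_mode)) &&
                 is_ahead(ahead, nahead, e[i].pid))
        {
            /* soft block: conflicting request queued ahead of us */
        }
        else
            continue;           /* not blocked by this entry */

        out[nout++] = e[i].leader_pid;  /* report the group leader */
    }
    assert(nout <= n);
    return nout;
}
```

For example, suppose PID 100 and its worker 101 both hold the share mode,
PID 200 is first in the wait queue for the exclusive mode, and PID 300 waits
behind it for the share mode.  In this sketch, 300 is reported as blocked
only by 200 (a soft block), while 200 is reported as blocked by 100 twice,
once for each member of that lock group.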

The motivation for doing this right now is mostly to fix the isolation
tests.  Commit 38f8bdca lobotomized
isolationtester's is-it-waiting query by removing its ability to recognize
nonconflicting lock modes, as a crude workaround for the inability to
handle soft-block situations properly.  But even without the lock mode
tests, the old query was excessively slow, particularly in
CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new
deadlock-hard test because the deadlock timeout elapses before they can
probe the waiting status of all eight sessions.  Replacing the pg_locks
self-join with use of pg_blocking_pids() is not only much more correct, but
a lot faster: I measure it at about 9X faster in a typical dev build with
Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds.  That should provide
enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the
test, without having to lengthen deadlock_timeout yet more and thus slow
down the test for everyone else.
parent 73bf8715
@@ -7376,7 +7376,7 @@
 <row>
 <entry><link linkend="view-pg-locks"><structname>pg_locks</structname></link></entry>
-<entry>currently held locks</entry>
+<entry>locks currently held or awaited</entry>
 </row>
 <row>
@@ -8015,16 +8015,16 @@
 <para>
 The view <structname>pg_locks</structname> provides access to
-information about the locks held by open transactions within the
+information about the locks held by active processes within the
 database server. See <xref linkend="mvcc"> for more discussion
 of locking.
 </para>
 <para>
 <structname>pg_locks</structname> contains one row per active lockable
-object, requested lock mode, and relevant transaction. Thus, the same
+object, requested lock mode, and relevant process. Thus, the same
 lockable object might
-appear many times, if multiple transactions are holding or waiting
+appear many times, if multiple processes are holding or waiting
 for locks on it. However, an object that currently has no locks on it
 will not appear at all.
 </para>
@@ -8200,31 +8200,31 @@
 <para>
 <structfield>granted</structfield> is true in a row representing a lock
-held by the indicated transaction. False indicates that this transaction is
-currently waiting to acquire this lock, which implies that some other
-transaction is holding a conflicting lock mode on the same lockable object.
-The waiting transaction will sleep until the other lock is released (or a
-deadlock situation is detected). A single transaction can be waiting to
-acquire at most one lock at a time.
+held by the indicated process. False indicates that this process is
+currently waiting to acquire this lock, which implies that at least one
+other process is holding or waiting for a conflicting lock mode on the same
+lockable object. The waiting process will sleep until the other lock is
+released (or a deadlock situation is detected). A single process can be
+waiting to acquire at most one lock at a time.
 </para>
 <para>
-Every transaction holds an exclusive lock on its virtual transaction ID for
-its entire duration. If a permanent ID is assigned to the transaction
-(which normally happens only if the transaction changes the state of the
-database), it also holds an exclusive lock on its permanent transaction ID
-until it ends. When one transaction finds it necessary to wait specifically
-for another transaction, it does so by attempting to acquire share lock on
-the other transaction ID (either virtual or permanent ID depending on the
-situation). That will succeed only when the other transaction
-terminates and releases its locks.
+Throughout running a transaction, a server process holds an exclusive lock
+on the transaction's virtual transaction ID. If a permanent ID is assigned
+to the transaction (which normally happens only if the transaction changes
+the state of the database), it also holds an exclusive lock on the
+transaction's permanent transaction ID until it ends. When a process finds
+it necessary to wait specifically for another transaction to end, it does
+so by attempting to acquire share lock on the other transaction's ID
+(either virtual or permanent ID depending on the situation). That will
+succeed only when the other transaction terminates and releases its locks.
 </para>
 <para>
 Although tuples are a lockable type of object,
 information about row-level locks is stored on disk, not in memory,
 and therefore row-level locks normally do not appear in this view.
-If a transaction is waiting for a
+If a process is waiting for a
 row-level lock, it will usually appear in the view as waiting for the
 permanent transaction ID of the current holder of that row lock.
 </para>
@@ -8260,7 +8260,7 @@
 <structfield>pid</structfield> column of the <link
 linkend="pg-stat-activity-view"><structname>pg_stat_activity</structname></link>
 view to get more
-information on the session holding or waiting to hold each lock,
+information on the session holding or awaiting each lock,
 for example
 <programlisting>
 SELECT * FROM pg_locks pl LEFT JOIN pg_stat_activity psa
@@ -8280,6 +8280,20 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 </programlisting>
 </para>
+<para>
+While it is possible to obtain information about which processes block
+which other processes by joining <structname>pg_locks</structname> against
+itself, this is very difficult to get right in detail. Such a query would
+have to encode knowledge about which lock modes conflict with which
+others. Worse, the <structname>pg_locks</structname> view does not expose
+information about which processes are ahead of which others in lock wait
+queues, nor information about which processes are parallel workers running
+on behalf of which other client sessions. It is better to use
+the <function>pg_blocking_pids()</> function
+(see <xref linkend="functions-info-session-table">) to identify which
+process(es) a waiting process is blocked behind.
+</para>
 <para>
 The <structname>pg_locks</structname> view displays data from both the
 regular lock manager and the predicate lock manager, which are
......
@@ -14996,6 +14996,12 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
 </entry>
 </row>
+<row>
+<entry><literal><function>pg_blocking_pids(<type>int</type>)</function></literal></entry>
+<entry><type>int[]</type></entry>
+<entry>Process ID(s) that are blocking specified server process ID</entry>
+</row>
 <row>
 <entry><literal><function>pg_conf_load_time()</function></literal></entry>
 <entry><type>timestamp with time zone</type></entry>
@@ -15183,6 +15189,29 @@ SET search_path TO <replaceable>schema</> <optional>, <replaceable>schema</>, ..
 Unix-domain socket.
 </para>
+<indexterm>
+<primary>pg_blocking_pids</primary>
+</indexterm>
+<para>
+<function>pg_blocking_pids</function> returns an array of the process IDs
+of the sessions that are blocking the server process with the specified
+process ID, or an empty array if there is no such server process or it is
+not blocked. One server process blocks another if it either holds a lock
+that conflicts with the blocked process's lock request (hard block), or is
+waiting for a lock that would conflict with the blocked process's lock
+request and is ahead of it in the wait queue (soft block). When using
+parallel queries the result always lists client-visible process IDs (that
+is, <function>pg_backend_pid</> results) even if the actual lock is held
+or awaited by a child worker process. As a result of that, there may be
+duplicated PIDs in the result. Also note that when a prepared transaction
+holds a conflicting lock, it will be represented by a zero process ID in
+the result of this function.
+Frequent calls to this function could have some impact on database
+performance, because it needs exclusive access to the lock manager's
+shared state for a short time.
+</para>
 <indexterm>
 <primary>pg_conf_load_time</primary>
 </indexterm>
......
@@ -2312,6 +2312,29 @@ HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids)
 */
 PGPROC *
 BackendPidGetProc(int pid)
+{
+PGPROC *result;
+if (pid == 0) /* never match dummy PGPROCs */
+return NULL;
+LWLockAcquire(ProcArrayLock, LW_SHARED);
+result = BackendPidGetProcWithLock(pid);
+LWLockRelease(ProcArrayLock);
+return result;
+}
+/*
+* BackendPidGetProcWithLock -- get a backend's PGPROC given its PID
+*
+* Same as above, except caller must be holding ProcArrayLock. The found
+* entry, if any, can be assumed to be valid as long as the lock remains held.
+*/
+PGPROC *
+BackendPidGetProcWithLock(int pid)
 {
 PGPROC *result = NULL;
 ProcArrayStruct *arrayP = procArray;
@@ -2320,8 +2343,6 @@ BackendPidGetProc(int pid)
 if (pid == 0) /* never match dummy PGPROCs */
 return NULL;
-LWLockAcquire(ProcArrayLock, LW_SHARED);
 for (index = 0; index < arrayP->numProcs; index++)
 {
 PGPROC *proc = &allProcs[arrayP->pgprocnos[index]];
@@ -2333,8 +2354,6 @@ BackendPidGetProc(int pid)
 }
 }
-LWLockRelease(ProcArrayLock);
 return result;
 }
......
@@ -21,7 +21,7 @@
 *
 * Interface:
 *
-* InitLocks(), GetLocksMethodTable(),
+* InitLocks(), GetLocksMethodTable(), GetLockTagsMethodTable(),
 * LockAcquire(), LockRelease(), LockReleaseAll(),
 * LockCheckConflicts(), GrantLock()
 *
@@ -41,6 +41,7 @@
 #include "pg_trace.h"
 #include "pgstat.h"
 #include "storage/proc.h"
+#include "storage/procarray.h"
 #include "storage/sinvaladt.h"
 #include "storage/spin.h"
 #include "storage/standby.h"
@@ -356,6 +357,8 @@ static void CleanUpLock(LOCK *lock, PROCLOCK *proclock,
 static void LockRefindAndRelease(LockMethod lockMethodTable, PGPROC *proc,
 LOCKTAG *locktag, LOCKMODE lockmode,
 bool decrement_strong_lock_count);
+static void GetSingleProcBlockerStatusData(PGPROC *blocked_proc,
+BlockedProcsData *data);
 /*
@@ -462,6 +465,18 @@ GetLocksMethodTable(const LOCK *lock)
 return LockMethods[lockmethodid];
 }
+/*
+* Fetch the lock method table associated with a given locktag
+*/
+LockMethod
+GetLockTagsMethodTable(const LOCKTAG *locktag)
+{
+LOCKMETHODID lockmethodid = (LOCKMETHODID) locktag->locktag_lockmethodid;
+Assert(0 < lockmethodid && lockmethodid < lengthof(LockMethods));
+return LockMethods[lockmethodid];
+}
 /*
 * Compute the hash code associated with a LOCKTAG.
@@ -3406,7 +3421,10 @@ GetLockStatusData(void)
 * impractical (in particular, note MAX_SIMUL_LWLOCKS). It shouldn't
 * matter too much, because none of these locks can be involved in lock
 * conflicts anyway - anything that might must be present in the main lock
-* table.
+* table. (For the same reason, we don't sweat about making leaderPid
+* completely valid. We cannot safely dereference another backend's
+* lockGroupLeader field without holding all lock partition locks, and
+* it's not worth that.)
 */
 for (i = 0; i < ProcGlobal->allProcCount; ++i)
 {
@@ -3439,6 +3457,7 @@ GetLockStatusData(void)
 instance->backend = proc->backendId;
 instance->lxid = proc->lxid;
 instance->pid = proc->pid;
+instance->leaderPid = proc->pid;
 instance->fastpath = true;
 el++;
@@ -3466,6 +3485,7 @@ GetLockStatusData(void)
 instance->backend = proc->backendId;
 instance->lxid = proc->lxid;
 instance->pid = proc->pid;
+instance->leaderPid = proc->pid;
 instance->fastpath = true;
 el++;
@@ -3517,6 +3537,7 @@ GetLockStatusData(void)
 instance->backend = proc->backendId;
 instance->lxid = proc->lxid;
 instance->pid = proc->pid;
+instance->leaderPid = proclock->groupLeader->pid;
 instance->fastpath = false;
 el++;
@@ -3537,6 +3558,197 @@ GetLockStatusData(void)
 return data;
 }
/*
* GetBlockerStatusData - Return a summary of the lock manager's state
* concerning locks that are blocking the specified PID or any member of
* the PID's lock group, for use in a user-level reporting function.
*
* For each PID within the lock group that is awaiting some heavyweight lock,
* the return data includes an array of LockInstanceData objects, which are
* the same data structure used by GetLockStatusData; but unlike that function,
* this one reports only the PROCLOCKs associated with the lock that that PID
* is blocked on. (Hence, all the locktags should be the same for any one
* blocked PID.) In addition, we return an array of the PIDs of those backends
* that are ahead of the blocked PID in the lock's wait queue. These can be
* compared with the PIDs in the LockInstanceData objects to determine which
* waiters are ahead of or behind the blocked PID in the queue.
*
* If blocked_pid isn't a valid backend PID or nothing in its lock group is
* waiting on any heavyweight lock, return empty arrays.
*
* The design goal is to hold the LWLocks for as short a time as possible;
* thus, this function simply makes a copy of the necessary data and releases
* the locks, allowing the caller to contemplate and format the data for as
* long as it pleases.
*/
BlockedProcsData *
GetBlockerStatusData(int blocked_pid)
{
BlockedProcsData *data;
PGPROC *proc;
int i;
data = (BlockedProcsData *) palloc(sizeof(BlockedProcsData));
/*
* Guess how much space we'll need, and preallocate. Most of the time
* this will avoid needing to do repalloc while holding the LWLocks. (We
* assume, but check with an Assert, that MaxBackends is enough entries
* for the procs[] array; the other two could need enlargement, though.)
*/
data->nprocs = data->nlocks = data->npids = 0;
data->maxprocs = data->maxlocks = data->maxpids = MaxBackends;
data->procs = (BlockedProcData *) palloc(sizeof(BlockedProcData) * data->maxprocs);
data->locks = (LockInstanceData *) palloc(sizeof(LockInstanceData) * data->maxlocks);
data->waiter_pids = (int *) palloc(sizeof(int) * data->maxpids);
/*
* In order to search the ProcArray for blocked_pid and assume that that
* entry won't immediately disappear under us, we must hold ProcArrayLock.
* In addition, to examine the lock grouping fields of any other backend,
* we must hold all the hash partition locks. (Only one of those locks is
* actually relevant for any one lock group, but we can't know which one
* ahead of time.) It's fairly annoying to hold all those locks
* throughout this, but it's no worse than GetLockStatusData(), and it
* does have the advantage that we're guaranteed to return a
* self-consistent instantaneous state.
*/
LWLockAcquire(ProcArrayLock, LW_SHARED);
proc = BackendPidGetProcWithLock(blocked_pid);
/* Nothing to do if it's gone */
if (proc != NULL)
{
/*
* Acquire lock on the entire shared lock data structure. See notes
* in GetLockStatusData().
*/
for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
LWLockAcquire(LockHashPartitionLockByIndex(i), LW_SHARED);
if (proc->lockGroupLeader == NULL)
{
/* Easy case, proc is not a lock group member */
GetSingleProcBlockerStatusData(proc, data);
}
else
{
/* Examine all procs in proc's lock group */
dlist_iter iter;
dlist_foreach(iter, &proc->lockGroupLeader->lockGroupMembers)
{
PGPROC *memberProc;
memberProc = dlist_container(PGPROC, lockGroupLink, iter.cur);
GetSingleProcBlockerStatusData(memberProc, data);
}
}
/*
* And release locks. See notes in GetLockStatusData().
*/
for (i = NUM_LOCK_PARTITIONS; --i >= 0;)
LWLockRelease(LockHashPartitionLockByIndex(i));
Assert(data->nprocs <= data->maxprocs);
}
LWLockRelease(ProcArrayLock);
return data;
}
/* Accumulate data about one possibly-blocked proc for GetBlockerStatusData */
static void
GetSingleProcBlockerStatusData(PGPROC *blocked_proc, BlockedProcsData *data)
{
LOCK *theLock = blocked_proc->waitLock;
BlockedProcData *bproc;
SHM_QUEUE *procLocks;
PROCLOCK *proclock;
PROC_QUEUE *waitQueue;
PGPROC *proc;
int queue_size;
int i;
/* Nothing to do if this proc is not blocked */
if (theLock == NULL)
return;
/* Set up a procs[] element */
bproc = &data->procs[data->nprocs++];
bproc->pid = blocked_proc->pid;
bproc->first_lock = data->nlocks;
bproc->first_waiter = data->npids;
/*
* We may ignore the proc's fast-path arrays, since nothing in those could
* be related to a contended lock.
*/
/* Collect all PROCLOCKs associated with theLock */
procLocks = &(theLock->procLocks);
proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
offsetof(PROCLOCK, lockLink));
while (proclock)
{
PGPROC *proc = proclock->tag.myProc;
LOCK *lock = proclock->tag.myLock;
LockInstanceData *instance;
if (data->nlocks >= data->maxlocks)
{
data->maxlocks += MaxBackends;
data->locks = (LockInstanceData *)
repalloc(data->locks, sizeof(LockInstanceData) * data->maxlocks);
}
instance = &data->locks[data->nlocks];
memcpy(&instance->locktag, &lock->tag, sizeof(LOCKTAG));
instance->holdMask = proclock->holdMask;
if (proc->waitLock == lock)
instance->waitLockMode = proc->waitLockMode;
else
instance->waitLockMode = NoLock;
instance->backend = proc->backendId;
instance->lxid = proc->lxid;
instance->pid = proc->pid;
instance->leaderPid = proclock->groupLeader->pid;
instance->fastpath = false;
data->nlocks++;
proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
offsetof(PROCLOCK, lockLink));
}
/* Enlarge waiter_pids[] if it's too small to hold all wait queue PIDs */
waitQueue = &(theLock->waitProcs);
queue_size = waitQueue->size;
if (queue_size > data->maxpids - data->npids)
{
data->maxpids = Max(data->maxpids + MaxBackends,
data->npids + queue_size);
data->waiter_pids = (int *) repalloc(data->waiter_pids,
sizeof(int) * data->maxpids);
}
/* Collect PIDs from the lock's wait queue, stopping at blocked_proc */
proc = (PGPROC *) waitQueue->links.next;
for (i = 0; i < queue_size; i++)
{
if (proc == blocked_proc)
break;
data->waiter_pids[data->npids++] = proc->pid;
proc = (PGPROC *) proc->links.next;
}
bproc->num_locks = data->nlocks - bproc->first_lock;
bproc->num_waiters = data->npids - bproc->first_waiter;
}
 /*
 * Returns a list of currently held AccessExclusiveLocks, for use by
 * LogStandbySnapshot(). The result is a palloc'd array,
......
@@ -18,6 +18,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "storage/predicate_internals.h"
+#include "utils/array.h"
 #include "utils/builtins.h"
@@ -99,7 +100,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
 oldcontext = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
 /* build tupdesc for result tuples */
-/* this had better match pg_locks view in system_views.sql */
+/* this had better match function's declaration in pg_proc.h */
 tupdesc = CreateTemplateTupleDesc(NUM_LOCK_STATUS_COLUMNS, false);
 TupleDescInitEntry(tupdesc, (AttrNumber) 1, "locktype",
 TEXTOID, -1, 0);
@@ -394,6 +395,128 @@ pg_lock_status(PG_FUNCTION_ARGS)
 }
/*
* pg_blocking_pids - produce an array of the PIDs blocking given PID
*
* The reported PIDs are those that hold a lock conflicting with blocked_pid's
* current request (hard block), or are requesting such a lock and are ahead
* of blocked_pid in the lock's wait queue (soft block).
*
* In parallel-query cases, we report all PIDs blocking any member of the
* given PID's lock group, and the reported PIDs are those of the blocking
* PIDs' lock group leaders. This allows callers to compare the result to
* lists of clients' pg_backend_pid() results even during a parallel query.
*
* Parallel query makes it possible for there to be duplicate PIDs in the
* result (either because multiple waiters are blocked by same PID, or
* because multiple blockers have same group leader PID). We do not bother
* to eliminate such duplicates from the result.
*
* We need not consider predicate locks here, since those don't block anything.
*/
Datum
pg_blocking_pids(PG_FUNCTION_ARGS)
{
int blocked_pid = PG_GETARG_INT32(0);
Datum *arrayelems;
int narrayelems;
BlockedProcsData *lockData; /* state data from lmgr */
int i,
j;
/* Collect a snapshot of lock manager state */
lockData = GetBlockerStatusData(blocked_pid);
/* We can't need more output entries than there are reported PROCLOCKs */
arrayelems = (Datum *) palloc(lockData->nlocks * sizeof(Datum));
narrayelems = 0;
/* For each blocked proc in the lock group ... */
for (i = 0; i < lockData->nprocs; i++)
{
BlockedProcData *bproc = &lockData->procs[i];
LockInstanceData *instances = &lockData->locks[bproc->first_lock];
int *preceding_waiters = &lockData->waiter_pids[bproc->first_waiter];
LockInstanceData *blocked_instance;
LockMethod lockMethodTable;
int conflictMask;
/*
* Locate the blocked proc's own entry in the LockInstanceData array.
* There should be exactly one matching entry.
*/
blocked_instance = NULL;
for (j = 0; j < bproc->num_locks; j++)
{
LockInstanceData *instance = &(instances[j]);
if (instance->pid == bproc->pid)
{
Assert(blocked_instance == NULL);
blocked_instance = instance;
}
}
Assert(blocked_instance != NULL);
lockMethodTable = GetLockTagsMethodTable(&(blocked_instance->locktag));
conflictMask = lockMethodTable->conflictTab[blocked_instance->waitLockMode];
/* Now scan the PROCLOCK data for conflicting procs */
for (j = 0; j < bproc->num_locks; j++)
{
LockInstanceData *instance = &(instances[j]);
/* A proc never blocks itself, so ignore that entry */
if (instance == blocked_instance)
continue;
/* Members of same lock group never block each other, either */
if (instance->leaderPid == blocked_instance->leaderPid)
continue;
if (conflictMask & instance->holdMask)
{
/* hard block: blocked by lock already held by this entry */
}
else if (instance->waitLockMode != NoLock &&
(conflictMask & LOCKBIT_ON(instance->waitLockMode)))
{
/* conflict in lock requests; who's in front in wait queue? */
bool ahead = false;
int k;
for (k = 0; k < bproc->num_waiters; k++)
{
if (preceding_waiters[k] == instance->pid)
{
/* soft block: this entry is ahead of blocked proc */
ahead = true;
break;
}
}
if (!ahead)
continue; /* not blocked by this entry */
}
else
{
/* not blocked by this entry */
continue;
}
/* blocked by this entry, so emit a record */
arrayelems[narrayelems++] = Int32GetDatum(instance->leaderPid);
}
}
/* Assert we didn't overrun arrayelems[] */
Assert(narrayelems <= lockData->nlocks);
/* Construct array, using hardwired knowledge about int4 type */
PG_RETURN_ARRAYTYPE_P(construct_array(arrayelems, narrayelems,
INT4OID,
sizeof(int32), true, 'i'));
}
 /*
 * Functions for manipulating advisory locks
 *
......
@@ -53,6 +53,6 @@
 */
 /* yyyymmddN */
-#define CATALOG_VERSION_NO 201602201
+#define CATALOG_VERSION_NO 201602221
 #endif
@@ -3012,6 +3012,8 @@ DATA(insert OID = 3329 ( pg_show_all_file_settings PGNSP PGUID 12 1 1000 0 0 f
 DESCR("show config file settings");
 DATA(insert OID = 1371 ( pg_lock_status PGNSP PGUID 12 1 1000 0 0 f f f f t t v s 0 0 2249 "" "{25,26,26,23,21,25,28,26,26,21,25,23,25,16,16}" "{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{locktype,database,relation,page,tuple,virtualxid,transactionid,classid,objid,objsubid,virtualtransaction,pid,mode,granted,fastpath}" _null_ _null_ pg_lock_status _null_ _null_ _null_ ));
 DESCR("view system lock information");
+DATA(insert OID = 2561 ( pg_blocking_pids PGNSP PGUID 12 1 0 0 0 f f f f t f v s 1 0 1007 "23" _null_ _null_ _null_ _null_ _null_ pg_blocking_pids _null_ _null_ _null_ ));
+DESCR("get array of PIDs of sessions blocking specified backend PID");
 DATA(insert OID = 1065 ( pg_prepared_xact PGNSP PGUID 12 1 1000 0 0 f f f f t t v s 0 0 2249 "" "{28,25,1184,26,26}" "{o,o,o,o,o}" "{transaction,gid,prepared,ownerid,dbid}" _null_ _null_ pg_prepared_xact _null_ _null_ _null_ ));
 DESCR("view two-phase transactions");
 DATA(insert OID = 3819 ( pg_get_multixact_members PGNSP PGUID 12 1 1000 0 0 f f f f t t v s 1 0 2249 "28" "{28,28,25}" "{i,o,o}" "{multixid,xid,mode}" _null_ _null_ pg_get_multixact_members _null_ _null_ _null_ ));
......
@@ -346,7 +346,7 @@ typedef struct PROCLOCK
 PROCLOCKTAG tag; /* unique identifier of proclock object */
 /* data */
-PGPROC *groupLeader; /* group leader, or NULL if no lock group */
+PGPROC *groupLeader; /* proc's lock group leader, or proc itself */
 LOCKMASK holdMask; /* bitmask for lock types currently held */
 LOCKMASK releaseMask; /* bitmask for lock types to be released */
 SHM_QUEUE lockLink; /* list link in LOCK's list of proclocks */
@@ -423,21 +423,48 @@ typedef struct LOCALLOCK
 typedef struct LockInstanceData
 {
-LOCKTAG locktag; /* locked object */
+LOCKTAG locktag; /* tag for locked object */
 LOCKMASK holdMask; /* locks held by this PGPROC */
 LOCKMODE waitLockMode; /* lock awaited by this PGPROC, if any */
 BackendId backend; /* backend ID of this PGPROC */
 LocalTransactionId lxid; /* local transaction ID of this PGPROC */
 int pid; /* pid of this PGPROC */
+int leaderPid; /* pid of group leader; = pid if no group */
 bool fastpath; /* taken via fastpath? */
 } LockInstanceData;
 typedef struct LockData
 {
 int nelements; /* The length of the array */
-LockInstanceData *locks;
+LockInstanceData *locks; /* Array of per-PROCLOCK information */
 } LockData;
+typedef struct BlockedProcData
+{
+int pid; /* pid of a blocked PGPROC */
+/* Per-PROCLOCK information about PROCLOCKs of the lock the pid awaits */
+/* (these fields refer to indexes in BlockedProcsData.locks[]) */
+int first_lock; /* index of first relevant LockInstanceData */
+int num_locks; /* number of relevant LockInstanceDatas */
+/* PIDs of PGPROCs that are ahead of "pid" in the lock's wait queue */
+/* (these fields refer to indexes in BlockedProcsData.waiter_pids[]) */
+int first_waiter; /* index of first preceding waiter */
+int num_waiters; /* number of preceding waiters */
+} BlockedProcData;
+typedef struct BlockedProcsData
+{
+BlockedProcData *procs; /* Array of per-blocked-proc information */
+LockInstanceData *locks; /* Array of per-PROCLOCK information */
+int *waiter_pids; /* Array of PIDs of other blocked PGPROCs */
+int nprocs; /* # of valid entries in procs[] array */
+int maxprocs; /* Allocated length of procs[] array */
+int nlocks; /* # of valid entries in locks[] array */
+int maxlocks; /* Allocated length of locks[] array */
+int npids; /* # of valid entries in waiter_pids[] array */
+int maxpids; /* Allocated length of waiter_pids[] array */
+} BlockedProcsData;
 /* Result codes for LockAcquire() */
 typedef enum
@@ -489,6 +516,7 @@ typedef enum
  */
 extern void InitLocks(void);
 extern LockMethod GetLocksMethodTable(const LOCK *lock);
+extern LockMethod GetLockTagsMethodTable(const LOCKTAG *locktag);
 extern uint32 LockTagHashCode(const LOCKTAG *locktag);
 extern bool DoLockModesConflict(LOCKMODE mode1, LOCKMODE mode2);
 extern LockAcquireResult LockAcquire(const LOCKTAG *locktag,
@@ -521,6 +549,7 @@ extern void GrantAwaitedLock(void);
 extern void RemoveFromWaitQueue(PGPROC *proc, uint32 hashcode);
 extern Size LockShmemSize(void);
 extern LockData *GetLockStatusData(void);
+extern BlockedProcsData *GetBlockerStatusData(int blocked_pid);
 extern xl_standby_lock *GetRunningTransactionLocks(int *nlocks);
 extern const char *GetLockmodeName(LOCKMETHODID lockmethodid, LOCKMODE mode);
......
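The two new structs in lock.h describe a flattened layout: each BlockedProcData entry names a slice of the shared locks[] array (first_lock, num_locks) and a slice of waiter_pids[] (first_waiter, num_waiters). The following is only a sketch of how a consumer might walk those slices. The Demo* types are trimmed, hypothetical stand-ins (the real LockInstanceData carries more fields), and both helper functions are invented for illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, trimmed stand-in for LockInstanceData. */
typedef struct DemoLockInstance
{
    int pid;        /* pid of the PGPROC holding or awaiting the lock */
    int leaderPid;  /* pid of its lock group leader; = pid if no group */
} DemoLockInstance;

/* Same slice fields as BlockedProcData in the diff above. */
typedef struct DemoBlockedProc
{
    int pid;
    int first_lock;     /* index into locks[] */
    int num_locks;
    int first_waiter;   /* index into waiter_pids[] */
    int num_waiters;
} DemoBlockedProc;

/* Stand-in for BlockedProcsData, without the max* allocation fields. */
typedef struct DemoBlockedProcs
{
    DemoBlockedProc *procs;
    DemoLockInstance *locks;
    int *waiter_pids;
    int nprocs;
    int nlocks;
    int npids;
} DemoBlockedProcs;

/* Count the locks[] entries in procs[i]'s slice that belong to leader_pid. */
int
count_instances_for_leader(const DemoBlockedProcs *data, int i, int leader_pid)
{
    const DemoBlockedProc *bproc;
    int count = 0;

    assert(i >= 0 && i < data->nprocs);
    bproc = &data->procs[i];
    assert(bproc->first_lock + bproc->num_locks <= data->nlocks);

    for (int j = 0; j < bproc->num_locks; j++)
    {
        if (data->locks[bproc->first_lock + j].leaderPid == leader_pid)
            count++;
    }
    return count;
}

/* Return 1 if pid is among the waiters queued ahead of procs[i], else 0. */
int
is_ahead_in_queue(const DemoBlockedProcs *data, int i, int pid)
{
    const DemoBlockedProc *bproc;

    assert(i >= 0 && i < data->nprocs);
    bproc = &data->procs[i];
    assert(bproc->first_waiter + bproc->num_waiters <= data->npids);

    for (int k = 0; k < bproc->num_waiters; k++)
    {
        if (data->waiter_pids[bproc->first_waiter + k] == pid)
            return 1;
    }
    return 0;
}
```

Storing index ranges rather than per-process pointers lets all blocked processes share two flat arrays that can be filled in a single pass and grown as needed, which is presumably why the real struct tracks maxprocs, maxlocks, and maxpids separately.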
@@ -61,6 +61,7 @@ extern VirtualTransactionId *GetVirtualXIDsDelayingChkpt(int *nvxids);
 extern bool HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids);
 extern PGPROC *BackendPidGetProc(int pid);
+extern PGPROC *BackendPidGetProcWithLock(int pid);
 extern int  BackendXidGetPid(TransactionId xid);
 extern bool IsBackendPid(int pid);
......
@@ -1157,6 +1157,7 @@ extern Datum row_security_active_name(PG_FUNCTION_ARGS);
 /* lockfuncs.c */
 extern Datum pg_lock_status(PG_FUNCTION_ARGS);
+extern Datum pg_blocking_pids(PG_FUNCTION_ARGS);
 extern Datum pg_advisory_lock_int8(PG_FUNCTION_ARGS);
 extern Datum pg_advisory_xact_lock_int8(PG_FUNCTION_ARGS);
 extern Datum pg_advisory_lock_shared_int8(PG_FUNCTION_ARGS);
......
@@ -227,27 +227,12 @@ main(int argc, char **argv)
      */
     initPQExpBuffer(&wait_query);
     appendPQExpBufferStr(&wait_query,
-                         "SELECT 1 FROM pg_locks holder, pg_locks waiter "
-                         "WHERE NOT waiter.granted AND waiter.pid = $1 "
-                         "AND holder.granted "
-                         "AND holder.pid <> $1 AND holder.pid IN (");
+                         "SELECT pg_catalog.pg_blocking_pids($1) && '{");
     /* The spec syntax requires at least one session; assume that here. */
     appendPQExpBufferStr(&wait_query, backend_pids[1]);
     for (i = 2; i < nconns; i++)
-        appendPQExpBuffer(&wait_query, ", %s", backend_pids[i]);
+        appendPQExpBuffer(&wait_query, ",%s", backend_pids[i]);
-    appendPQExpBufferStr(&wait_query,
-                         ") "
-                         "AND holder.locktype IS NOT DISTINCT FROM waiter.locktype "
-                         "AND holder.database IS NOT DISTINCT FROM waiter.database "
-                         "AND holder.relation IS NOT DISTINCT FROM waiter.relation "
-                         "AND holder.page IS NOT DISTINCT FROM waiter.page "
-                         "AND holder.tuple IS NOT DISTINCT FROM waiter.tuple "
-                         "AND holder.virtualxid IS NOT DISTINCT FROM waiter.virtualxid "
-                         "AND holder.transactionid IS NOT DISTINCT FROM waiter.transactionid "
-                         "AND holder.classid IS NOT DISTINCT FROM waiter.classid "
-                         "AND holder.objid IS NOT DISTINCT FROM waiter.objid "
-                         "AND holder.objsubid IS NOT DISTINCT FROM waiter.objsubid ");
+    appendPQExpBufferStr(&wait_query, "}'::integer[]");
     res = PQprepare(conns[0], PREP_WAITING, wait_query.data, 0, NULL);
     if (PQresultStatus(res) != PGRES_COMMAND_OK)
@@ -745,21 +730,22 @@ try_complete_step(Step *step, int flags)
         /* If it's OK for the step to block, check whether it has. */
         if (flags & STEP_NONBLOCK)
         {
-            int         ntuples;
+            bool        waiting;
             res = PQexecPrepared(conns[0], PREP_WAITING, 1,
                                  &backend_pids[step->session + 1],
                                  NULL, NULL, 0);
-            if (PQresultStatus(res) != PGRES_TUPLES_OK)
+            if (PQresultStatus(res) != PGRES_TUPLES_OK ||
+                PQntuples(res) != 1)
             {
                 fprintf(stderr, "lock wait query failed: %s",
                         PQerrorMessage(conn));
                 exit_nicely();
             }
-            ntuples = PQntuples(res);
+            waiting = ((PQgetvalue(res, 0, 0))[0] == 't');
             PQclear(res);
-            if (ntuples >= 1)   /* waiting to acquire a lock */
+            if (waiting)        /* waiting to acquire a lock */
             {
                 if (!(flags & STEP_RETRY))
                     printf("step %s: %s <waiting ...>\n",
......
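The isolationtester change replaces the pg_locks self-join with a one-row boolean query: pg_blocking_pids($1) is tested with the array overlap operator && against an integer[] literal of the test's own session PIDs, and the text-format result is read by checking whether it starts with 't'. The sketch below reproduces only the string assembly and the result check, using plain snprintf in place of the PQExpBuffer calls so that it compiles without libpq. The names build_wait_query and value_means_waiting are hypothetical, and unlike the patched code (which skips backend_pids[0], the control connection) this helper expects an array holding only the session PIDs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the is-it-waiting query for the given session PIDs into buf.
 * Returns 0 on success, -1 if buf is too small.  (Hypothetical helper.)
 */
int
build_wait_query(char *buf, size_t buflen, const char *const *pids, int npids)
{
    size_t used;
    int n;

    /* As in the patch, assume there is at least one session. */
    assert(npids >= 1);

    n = snprintf(buf, buflen,
                 "SELECT pg_catalog.pg_blocking_pids($1) && '{%s", pids[0]);
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    /* Remaining PIDs are comma-separated with no space, matching ",%s". */
    for (int i = 1; i < npids; i++)
    {
        n = snprintf(buf + used, buflen - used, ",%s", pids[i]);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    n = snprintf(buf + used, buflen - used, "}'::integer[]");
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    return 0;
}

/* A text-format boolean result is "t" or "f"; 't' means the step waits. */
int
value_means_waiting(const char *value)
{
    return value != NULL && value[0] == 't';
}
```

For PIDs 101, 102 and 103 the helper yields `SELECT pg_catalog.pg_blocking_pids($1) && '{101,102,103}'::integer[]`. Because the query always returns exactly one row, the patched code can treat any other row count as a failure instead of inferring the waiting state from the number of rows.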