Commit a794fb06 authored by Tom Lane

Convert the lock manager to use the new dynahash.c support for partitioned
hash tables, instead of the previous kluge involving multiple hash tables.
This partially undoes my patch of last December.
parent b25dc481
-$PostgreSQL: pgsql/src/backend/storage/lmgr/README,v 1.19 2005/12/11 21:02:18 tgl Exp $
+$PostgreSQL: pgsql/src/backend/storage/lmgr/README,v 1.20 2006/07/23 23:08:46 tgl Exp $
 LOCKING OVERVIEW
@@ -148,13 +148,21 @@ The lock manager's PROCLOCK objects contain:
 tag -
 The key fields that are used for hashing entries in the shared memory
 PROCLOCK hash table. This is declared as a separate struct to ensure that
-we always zero out the correct number of bytes.
+we always zero out the correct number of bytes. It is critical that any
+alignment-padding bytes the compiler might insert in the struct be zeroed
+out, else the hash computation will be random. (Currently, we are careful
+to define struct PROCLOCKTAG so that there are no padding bytes.)
-tag.lock
-SHMEM offset of the LOCK object this PROCLOCK is for.
+tag.myLock
+Pointer to the shared LOCK object this PROCLOCK is for.
-tag.proc
-SHMEM offset of PGPROC of backend process that owns this PROCLOCK.
+tag.myProc
+Pointer to the PGPROC of backend process that owns this PROCLOCK.
+Note: it's OK to use pointers here because a PROCLOCK never outlives
+either its lock or its proc. The tag is therefore unique for as long
+as it needs to be, even though the same tag values might mean something
+else at other times.
 holdMask -
 A bitmask for the lock modes successfully acquired by this PROCLOCK.
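The padding remark in the README text above can be illustrated with a small sketch. It is not PostgreSQL code: `DemoTag`, `DemoPtrTag`, `demo_hash_bytes` (a plain FNV-1a loop standing in for whatever hash the table uses) and `demo_tag_hash` are hypothetical names, and `DemoTag` is deliberately laid out so that most compilers insert padding. Zeroing the whole struct before filling in the fields keeps the hashed bytes deterministic in practice.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key struct: most ABIs insert padding bytes after 'kind'. */
typedef struct DemoTag
{
    char     kind;
    uint32_t id;
} DemoTag;

/* Hypothetical two-pointer key, shaped like the new PROCLOCKTAG. */
typedef struct DemoPtrTag
{
    void *myLock;
    void *myProc;
} DemoPtrTag;

/* Stand-in byte hash (FNV-1a); an assumption, not dynahash's function. */
static uint32_t
demo_hash_bytes(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 2166136261u;
    size_t i;

    for (i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Zero every byte first so padding cannot leak stack garbage into the hash. */
static uint32_t
demo_tag_hash(char kind, uint32_t id)
{
    DemoTag tag;

    memset(&tag, 0, sizeof(tag));
    tag.kind = kind;
    tag.id = id;
    return demo_hash_bytes(&tag, sizeof(tag));
}
```

A struct of two same-sized pointers, like `DemoPtrTag`, normally has no padding at all, which is the property the README says PROCLOCKTAG is defined to have.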
@@ -191,12 +199,18 @@ Most operations only need to lock the single partition they are working in.
 Here are the details:
 * Each possible lock is assigned to one partition according to a hash of
-its LOCKTAG value (see LockTagToPartition()). The partition's LWLock is
-considered to protect all the LOCK objects of that partition as well as
-their subsidiary PROCLOCKs. The shared-memory hash tables for LOCKs and
-PROCLOCKs are divided into separate hash tables for each partition, and
-operations on each hash table are likewise protected by the partition
-lock.
+its LOCKTAG value. The partition's LWLock is considered to protect all the
+LOCK objects of that partition as well as their subsidiary PROCLOCKs.
+
+* The shared-memory hash tables for LOCKs and PROCLOCKs are organized
+so that different partitions use different hash chains, and thus there
+is no conflict in working with objects in different partitions. This
+is supported directly by dynahash.c's "partitioned table" mechanism
+for the LOCK table: we need only ensure that the partition number is
+taken from the low-order bits of the dynahash hash value for the LOCKTAG.
+To make it work for PROCLOCKs, we have to ensure that a PROCLOCK's hash
+value has the same low-order bits as its associated LOCK. This requires
+a specialized hash function (see proclock_hash).
 * Formerly, each PGPROC had a single list of PROCLOCKs belonging to it.
 This has now been split into per-partition lists, so that access to a
@@ -226,9 +240,10 @@ deadlock checking should not occur often enough to be performance-critical,
 trying to make this work does not seem a productive use of effort.
 A backend's internal LOCALLOCK hash table is not partitioned. We do store
-the partition number in LOCALLOCK table entries, but this is a straight
-speed-for-space tradeoff: we could instead recalculate the partition
-number from the LOCKTAG when needed.
+a copy of the locktag hash code in LOCALLOCK table entries, from which the
+partition number can be computed, but this is a straight speed-for-space
+tradeoff: we could instead recalculate the partition number from the LOCKTAG
+when needed.
 THE DEADLOCK DETECTION ALGORITHM
......
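The README's requirement that a PROCLOCK's hash value share its low-order bits with the associated LOCK's hash can be sketched as follows. This is an illustration under stated assumptions, not the actual proclock_hash function, which is not shown on this page: `demo_partition` and `demo_proclock_hash` are hypothetical names, and the owning proc is reduced to a plain integer identifier. The idea is to fold the proc's identity only into bits above the partition bits, so both hashes select the same partition.

```c
#include <stdint.h>

#define LOG2_NUM_LOCK_PARTITIONS 4
#define NUM_LOCK_PARTITIONS (1 << LOG2_NUM_LOCK_PARTITIONS)

/* Partition number = low-order bits of the hash code. */
static uint32_t
demo_partition(uint32_t hashcode)
{
    return hashcode % NUM_LOCK_PARTITIONS;
}

/*
 * Hypothetical PROCLOCK hash: start from the LOCK's hash code and XOR in
 * the proc identifier shifted above the partition bits.  The shifted value
 * has zeros in its low LOG2_NUM_LOCK_PARTITIONS bits, so those bits of the
 * result are the same as the LOCK's.
 */
static uint32_t
demo_proclock_hash(uint32_t lock_hashcode, uint32_t proc_id)
{
    return lock_hashcode ^ (proc_id << LOG2_NUM_LOCK_PARTITIONS);
}
```

With this shape, two PROCLOCKs on the same lock held by different procs still hash differently in the upper bits, while a single partition LWLock continues to cover the LOCK and all of its PROCLOCKs.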
@@ -12,7 +12,7 @@
  *
  *
  * IDENTIFICATION
- *    $PostgreSQL: pgsql/src/backend/storage/lmgr/deadlock.c,v 1.40 2006/07/14 14:52:23 momjian Exp $
+ *    $PostgreSQL: pgsql/src/backend/storage/lmgr/deadlock.c,v 1.41 2006/07/23 23:08:46 tgl Exp $
  *
  * Interface:
  *
@@ -480,7 +480,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
         while (proclock)
         {
-            proc = (PGPROC *) MAKE_PTR(proclock->tag.proc);
+            proc = proclock->tag.myProc;
             /* A proc never blocks itself */
             if (proc != checkProc)
......
This diff is collapsed.
@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- *    $PostgreSQL: pgsql/src/backend/storage/lmgr/proc.c,v 1.177 2006/07/14 14:52:23 momjian Exp $
+ *    $PostgreSQL: pgsql/src/backend/storage/lmgr/proc.c,v 1.178 2006/07/23 23:08:46 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -461,13 +461,13 @@ LockWaitCancel(void)
     disable_sig_alarm(false);
     /* Unlink myself from the wait queue, if on it (might not be anymore!) */
-    partitionLock = FirstLockMgrLock + lockAwaited->partition;
+    partitionLock = LockHashPartitionLock(lockAwaited->hashcode);
     LWLockAcquire(partitionLock, LW_EXCLUSIVE);
     if (MyProc->links.next != INVALID_OFFSET)
     {
         /* We could not have been granted the lock yet */
-        RemoveFromWaitQueue(MyProc, lockAwaited->partition);
+        RemoveFromWaitQueue(MyProc, lockAwaited->hashcode);
     }
     else
     {
@@ -673,8 +673,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
     LOCKMODE lockmode = locallock->tag.mode;
     LOCK *lock = locallock->lock;
     PROCLOCK *proclock = locallock->proclock;
-    int partition = locallock->partition;
-    LWLockId partitionLock = FirstLockMgrLock + partition;
+    uint32 hashcode = locallock->hashcode;
+    LWLockId partitionLock = LockHashPartitionLock(hashcode);
     PROC_QUEUE *waitQueue = &(lock->waitProcs);
     LOCKMASK myHeldLocks = MyProc->heldLocks;
     bool early_deadlock = false;
@@ -776,7 +776,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
      */
     if (early_deadlock)
     {
-        RemoveFromWaitQueue(MyProc, partition);
+        RemoveFromWaitQueue(MyProc, hashcode);
         return STATUS_ERROR;
     }
@@ -1025,7 +1025,7 @@ CheckDeadLock(void)
      * ProcSleep will report an error after we return from the signal handler.
      */
     Assert(MyProc->waitLock != NULL);
-    RemoveFromWaitQueue(MyProc, LockTagToPartition(&(MyProc->waitLock->tag)));
+    RemoveFromWaitQueue(MyProc, LockTagHashCode(&(MyProc->waitLock->tag)));
     /*
      * Unlock my semaphore so that the interrupted ProcSleep() call can
......
@@ -6,7 +6,7 @@
  * Copyright (c) 2002-2006, PostgreSQL Global Development Group
  *
  * IDENTIFICATION
- *    $PostgreSQL: pgsql/src/backend/utils/adt/lockfuncs.c,v 1.23 2006/07/14 14:52:24 momjian Exp $
+ *    $PostgreSQL: pgsql/src/backend/utils/adt/lockfuncs.c,v 1.24 2006/07/23 23:08:46 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -152,7 +152,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
          */
         if (!granted)
         {
-            if (proc->waitLock == (LOCK *) MAKE_PTR(proclock->tag.lock))
+            if (proc->waitLock == proclock->tag.myLock)
             {
                 /* Yes, so report it with proper mode */
                 mode = proc->waitLockMode;
......
@@ -7,7 +7,7 @@
  * Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $PostgreSQL: pgsql/src/include/storage/lock.h,v 1.95 2006/07/23 03:07:58 tgl Exp $
+ * $PostgreSQL: pgsql/src/include/storage/lock.h,v 1.96 2006/07/23 23:08:46 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -266,7 +266,9 @@ typedef struct LOCK
  *
  * PROCLOCKTAG is the key information needed to look up a PROCLOCK item in the
  * proclock hashtable. A PROCLOCKTAG value uniquely identifies the combination
- * of a lockable object and a holder/waiter for that object.
+ * of a lockable object and a holder/waiter for that object. (We can use
+ * pointers here because the PROCLOCKTAG need only be unique for the lifespan
+ * of the PROCLOCK, and it will never outlive the lock or the proc.)
  *
  * Internally to a backend, it is possible for the same lock to be held
  * for different purposes: the backend tracks transaction locks separately
@@ -292,8 +294,9 @@ typedef struct LOCK
  */
 typedef struct PROCLOCKTAG
 {
-    SHMEM_OFFSET lock; /* link to per-lockable-object information */
-    SHMEM_OFFSET proc; /* link to PGPROC of owning backend */
+    /* NB: we assume this struct contains no padding! */
+    LOCK *myLock; /* link to per-lockable-object information */
+    PGPROC *myProc; /* link to PGPROC of owning backend */
 } PROCLOCKTAG;
 typedef struct PROCLOCK
@@ -309,7 +312,7 @@ typedef struct PROCLOCK
 } PROCLOCK;
 #define PROCLOCK_LOCKMETHOD(proclock) \
-    LOCK_LOCKMETHOD(*((LOCK *) MAKE_PTR((proclock).tag.lock)))
+    LOCK_LOCKMETHOD(*((proclock).tag.myLock))
 /*
  * Each backend also maintains a local hash table with information about each
@@ -347,7 +350,7 @@ typedef struct LOCALLOCK
     LOCK *lock; /* associated LOCK object in shared mem */
     PROCLOCK *proclock; /* associated PROCLOCK object in shmem */
     bool isTempObject; /* true if lock is on a temporary object */
-    int partition; /* ID of partition containing this lock */
+    uint32 hashcode; /* copy of LOCKTAG's hash value */
     int nLocks; /* total number of times lock is held */
     int numLockOwners; /* # of relevant ResourceOwners */
     int maxLockOwners; /* allocated size of array */
@@ -360,15 +363,14 @@ typedef struct LOCALLOCK
 /*
  * This struct holds information passed from lmgr internals to the lock
  * listing user-level functions (lockfuncs.c). For each PROCLOCK in the
- * system, the SHMEM_OFFSET, PROCLOCK itself, and associated PGPROC and
- * LOCK objects are stored. (Note there will often be multiple copies
- * of the same PGPROC or LOCK.) We do not store the SHMEM_OFFSET of the
- * PGPROC or LOCK separately, since they're in the PROCLOCK's tag fields.
+ * system, copies of the PROCLOCK object and associated PGPROC and
+ * LOCK objects are stored. Note there will often be multiple copies
+ * of the same PGPROC or LOCK --- to detect whether two are the same,
+ * compare the PROCLOCK tag fields.
  */
-typedef struct
+typedef struct LockData
 {
     int nelements; /* The length of each of the arrays */
-    SHMEM_OFFSET *proclockaddrs;
     PROCLOCK *proclocks;
     PGPROC *procs;
     LOCK *locks;
@@ -384,12 +386,24 @@ typedef enum
 } LockAcquireResult;
+/*
+ * The lockmgr's shared hash tables are partitioned to reduce contention.
+ * To determine which partition a given locktag belongs to, compute the tag's
+ * hash code with LockTagHashCode(), then apply one of these macros.
+ * NB: NUM_LOCK_PARTITIONS must be a power of 2!
+ */
+#define LockHashPartition(hashcode) \
+    ((hashcode) % NUM_LOCK_PARTITIONS)
+#define LockHashPartitionLock(hashcode) \
+    ((LWLockId) (FirstLockMgrLock + LockHashPartition(hashcode)))
 /*
  * function prototypes
  */
 extern void InitLocks(void);
 extern LockMethod GetLocksMethodTable(const LOCK *lock);
-extern int LockTagToPartition(const LOCKTAG *locktag);
+extern uint32 LockTagHashCode(const LOCKTAG *locktag);
 extern LockAcquireResult LockAcquire(const LOCKTAG *locktag,
             bool isTempObject,
             LOCKMODE lockmode,
@@ -407,7 +421,7 @@ extern int LockCheckConflicts(LockMethod lockMethodTable,
             LOCK *lock, PROCLOCK *proclock, PGPROC *proc);
 extern void GrantLock(LOCK *lock, PROCLOCK *proclock, LOCKMODE lockmode);
 extern void GrantAwaitedLock(void);
-extern void RemoveFromWaitQueue(PGPROC *proc, int partition);
+extern void RemoveFromWaitQueue(PGPROC *proc, uint32 hashcode);
 extern Size LockShmemSize(void);
 extern bool DeadLockCheck(PGPROC *proc);
 extern void DeadLockReport(void);
......
@@ -7,7 +7,7 @@
  * Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
- * $PostgreSQL: pgsql/src/include/storage/lwlock.h,v 1.29 2006/07/23 03:07:58 tgl Exp $
+ * $PostgreSQL: pgsql/src/include/storage/lwlock.h,v 1.30 2006/07/23 23:08:46 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -24,7 +24,8 @@
 #define NUM_BUFFER_PARTITIONS 16
 /* Number of partitions the shared lock tables are divided into */
-#define NUM_LOCK_PARTITIONS 16
+#define LOG2_NUM_LOCK_PARTITIONS 4
+#define NUM_LOCK_PARTITIONS (1 << LOG2_NUM_LOCK_PARTITIONS)
 /*
  * We have a number of predefined LWLocks, plus a bunch of LWLocks that are
......
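Defining NUM_LOCK_PARTITIONS in terms of LOG2_NUM_LOCK_PARTITIONS guarantees a power of 2, which is what lets the `%` in LockHashPartition pick out exactly the low-order bits of the hash code. The sketch below checks that equivalence. `DEMO_FIRST_LOCK_MGR_LOCK` is a made-up base value standing in for FirstLockMgrLock, and the `demo_` functions are hypothetical names, not part of the patch.

```c
#include <stdint.h>

#define LOG2_NUM_LOCK_PARTITIONS 4
#define NUM_LOCK_PARTITIONS (1 << LOG2_NUM_LOCK_PARTITIONS)

/* Made-up base index; the real FirstLockMgrLock is defined elsewhere. */
#define DEMO_FIRST_LOCK_MGR_LOCK 100

/* Same arithmetic as the LockHashPartition macro. */
static uint32_t
demo_partition_mod(uint32_t hashcode)
{
    return hashcode % NUM_LOCK_PARTITIONS;
}

/* Equivalent bit mask, valid only because the count is a power of 2. */
static uint32_t
demo_partition_mask(uint32_t hashcode)
{
    return hashcode & (NUM_LOCK_PARTITIONS - 1);
}

/* Same shape as LockHashPartitionLock: base lock index plus partition. */
static int
demo_partition_lock(uint32_t hashcode)
{
    return DEMO_FIRST_LOCK_MGR_LOCK + (int) demo_partition_mod(hashcode);
}
```

Because callers now carry the full hash code rather than a partition number, either the partition or its LWLock can be derived on demand with one cheap operation.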