Commit a794fb06 authored by Tom Lane

Convert the lock manager to use the new dynahash.c support for partitioned
hash tables, instead of the previous kluge involving multiple hash tables.
This partially undoes my patch of last December.
parent b25dc481
$PostgreSQL: pgsql/src/backend/storage/lmgr/README,v 1.19 2005/12/11 21:02:18 tgl Exp $
$PostgreSQL: pgsql/src/backend/storage/lmgr/README,v 1.20 2006/07/23 23:08:46 tgl Exp $
LOCKING OVERVIEW
......@@ -148,13 +148,21 @@ The lock manager's PROCLOCK objects contain:
tag -
The key fields that are used for hashing entries in the shared memory
PROCLOCK hash table. This is declared as a separate struct to ensure that
we always zero out the correct number of bytes.
we always zero out the correct number of bytes. It is critical that any
alignment-padding bytes the compiler might insert in the struct be zeroed
out, else the hash computation will be random. (Currently, we are careful
to define struct PROCLOCKTAG so that there are no padding bytes.)
tag.lock
SHMEM offset of the LOCK object this PROCLOCK is for.
tag.myLock
Pointer to the shared LOCK object this PROCLOCK is for.
tag.proc
SHMEM offset of PGPROC of backend process that owns this PROCLOCK.
tag.myProc
Pointer to the PGPROC of backend process that owns this PROCLOCK.
Note: it's OK to use pointers here because a PROCLOCK never outlives
either its lock or its proc. The tag is therefore unique for as long
as it needs to be, even though the same tag values might mean something
else at other times.
holdMask -
A bitmask for the lock modes successfully acquired by this PROCLOCK.
......@@ -191,12 +199,18 @@ Most operations only need to lock the single partition they are working in.
Here are the details:
* Each possible lock is assigned to one partition according to a hash of
its LOCKTAG value (see LockTagToPartition()). The partition's LWLock is
considered to protect all the LOCK objects of that partition as well as
their subsidiary PROCLOCKs. The shared-memory hash tables for LOCKs and
PROCLOCKs are divided into separate hash tables for each partition, and
operations on each hash table are likewise protected by the partition
lock.
its LOCKTAG value. The partition's LWLock is considered to protect all the
LOCK objects of that partition as well as their subsidiary PROCLOCKs.
* The shared-memory hash tables for LOCKs and PROCLOCKs are organized
so that different partitions use different hash chains, and thus there
is no conflict in working with objects in different partitions. This
is supported directly by dynahash.c's "partitioned table" mechanism
for the LOCK table: we need only ensure that the partition number is
taken from the low-order bits of the dynahash hash value for the LOCKTAG.
To make it work for PROCLOCKs, we have to ensure that a PROCLOCK's hash
value has the same low-order bits as its associated LOCK. This requires
a specialized hash function (see proclock_hash).
* Formerly, each PGPROC had a single list of PROCLOCKs belonging to it.
This has now been split into per-partition lists, so that access to a
......@@ -226,9 +240,10 @@ deadlock checking should not occur often enough to be performance-critical,
trying to make this work does not seem a productive use of effort.
A backend's internal LOCALLOCK hash table is not partitioned. We do store
the partition number in LOCALLOCK table entries, but this is a straight
speed-for-space tradeoff: we could instead recalculate the partition
number from the LOCKTAG when needed.
a copy of the locktag hash code in LOCALLOCK table entries, from which the
partition number can be computed, but this is a straight speed-for-space
tradeoff: we could instead recalculate the partition number from the LOCKTAG
when needed.
THE DEADLOCK DETECTION ALGORITHM
......
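The README text above requires a PROCLOCK's hash value to share its low-order bits with the hash value of its LOCK, so that both land in the same partition. The body of proclock_hash does not appear in the hunks shown here, so what follows is only a minimal sketch of one way to get that property, not the committed function. The names hash_partition and sketch_proclock_hash are hypothetical, and the caller is assumed to already have the LOCK's hash code.

```c
#include <assert.h>
#include <stdint.h>

#define LOG2_NUM_LOCK_PARTITIONS 4
#define NUM_LOCK_PARTITIONS (1 << LOG2_NUM_LOCK_PARTITIONS)

/* Partition number is taken from the low-order bits of the hash code. */
static uint32_t
hash_partition(uint32_t hashcode)
{
    return hashcode % NUM_LOCK_PARTITIONS;
}

/*
 * Hypothetical stand-in for a PROCLOCK hash function.  Start from the
 * LOCK's hash code and mix in the owning proc's address, shifted left so
 * that it cannot disturb the low-order (partition) bits.
 */
static uint32_t
sketch_proclock_hash(uint32_t lock_hashcode, const void *proc)
{
    uint32_t procbits = (uint32_t) (uintptr_t) proc;
    uint32_t result = lock_hashcode ^ (procbits << LOG2_NUM_LOCK_PARTITIONS);

    /* The PROCLOCK must fall in the same partition as its LOCK. */
    assert(hash_partition(result) == hash_partition(lock_hashcode));
    return result;
}
```

Because the shifted value has zeros in its low LOG2_NUM_LOCK_PARTITIONS bits, the XOR leaves those bits of the LOCK's hash code untouched, while different procs still tend to spread across different hash chains within the partition.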
......@@ -12,7 +12,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/storage/lmgr/deadlock.c,v 1.40 2006/07/14 14:52:23 momjian Exp $
* $PostgreSQL: pgsql/src/backend/storage/lmgr/deadlock.c,v 1.41 2006/07/23 23:08:46 tgl Exp $
*
* Interface:
*
......@@ -480,7 +480,7 @@ FindLockCycleRecurse(PGPROC *checkProc,
while (proclock)
{
proc = (PGPROC *) MAKE_PTR(proclock->tag.proc);
proc = proclock->tag.myProc;
/* A proc never blocks itself */
if (proc != checkProc)
......
......@@ -8,7 +8,7 @@
*
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/storage/lmgr/proc.c,v 1.177 2006/07/14 14:52:23 momjian Exp $
* $PostgreSQL: pgsql/src/backend/storage/lmgr/proc.c,v 1.178 2006/07/23 23:08:46 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -461,13 +461,13 @@ LockWaitCancel(void)
disable_sig_alarm(false);
/* Unlink myself from the wait queue, if on it (might not be anymore!) */
partitionLock = FirstLockMgrLock + lockAwaited->partition;
partitionLock = LockHashPartitionLock(lockAwaited->hashcode);
LWLockAcquire(partitionLock, LW_EXCLUSIVE);
if (MyProc->links.next != INVALID_OFFSET)
{
/* We could not have been granted the lock yet */
RemoveFromWaitQueue(MyProc, lockAwaited->partition);
RemoveFromWaitQueue(MyProc, lockAwaited->hashcode);
}
else
{
......@@ -673,8 +673,8 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
LOCKMODE lockmode = locallock->tag.mode;
LOCK *lock = locallock->lock;
PROCLOCK *proclock = locallock->proclock;
int partition = locallock->partition;
LWLockId partitionLock = FirstLockMgrLock + partition;
uint32 hashcode = locallock->hashcode;
LWLockId partitionLock = LockHashPartitionLock(hashcode);
PROC_QUEUE *waitQueue = &(lock->waitProcs);
LOCKMASK myHeldLocks = MyProc->heldLocks;
bool early_deadlock = false;
......@@ -776,7 +776,7 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
*/
if (early_deadlock)
{
RemoveFromWaitQueue(MyProc, partition);
RemoveFromWaitQueue(MyProc, hashcode);
return STATUS_ERROR;
}
......@@ -1025,7 +1025,7 @@ CheckDeadLock(void)
* ProcSleep will report an error after we return from the signal handler.
*/
Assert(MyProc->waitLock != NULL);
RemoveFromWaitQueue(MyProc, LockTagToPartition(&(MyProc->waitLock->tag)));
RemoveFromWaitQueue(MyProc, LockTagHashCode(&(MyProc->waitLock->tag)));
/*
* Unlock my semaphore so that the interrupted ProcSleep() call can
......
......@@ -6,7 +6,7 @@
* Copyright (c) 2002-2006, PostgreSQL Global Development Group
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/utils/adt/lockfuncs.c,v 1.23 2006/07/14 14:52:24 momjian Exp $
* $PostgreSQL: pgsql/src/backend/utils/adt/lockfuncs.c,v 1.24 2006/07/23 23:08:46 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -152,7 +152,7 @@ pg_lock_status(PG_FUNCTION_ARGS)
*/
if (!granted)
{
if (proc->waitLock == (LOCK *) MAKE_PTR(proclock->tag.lock))
if (proc->waitLock == proclock->tag.myLock)
{
/* Yes, so report it with proper mode */
mode = proc->waitLockMode;
......
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/storage/lock.h,v 1.95 2006/07/23 03:07:58 tgl Exp $
* $PostgreSQL: pgsql/src/include/storage/lock.h,v 1.96 2006/07/23 23:08:46 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -266,7 +266,9 @@ typedef struct LOCK
*
* PROCLOCKTAG is the key information needed to look up a PROCLOCK item in the
* proclock hashtable. A PROCLOCKTAG value uniquely identifies the combination
* of a lockable object and a holder/waiter for that object.
* of a lockable object and a holder/waiter for that object. (We can use
* pointers here because the PROCLOCKTAG need only be unique for the lifespan
* of the PROCLOCK, and it will never outlive the lock or the proc.)
*
* Internally to a backend, it is possible for the same lock to be held
* for different purposes: the backend tracks transaction locks separately
......@@ -292,8 +294,9 @@ typedef struct LOCK
*/
typedef struct PROCLOCKTAG
{
SHMEM_OFFSET lock; /* link to per-lockable-object information */
SHMEM_OFFSET proc; /* link to PGPROC of owning backend */
/* NB: we assume this struct contains no padding! */
LOCK *myLock; /* link to per-lockable-object information */
PGPROC *myProc; /* link to PGPROC of owning backend */
} PROCLOCKTAG;
typedef struct PROCLOCK
......@@ -309,7 +312,7 @@ typedef struct PROCLOCK
} PROCLOCK;
#define PROCLOCK_LOCKMETHOD(proclock) \
LOCK_LOCKMETHOD(*((LOCK *) MAKE_PTR((proclock).tag.lock)))
LOCK_LOCKMETHOD(*((proclock).tag.myLock))
/*
* Each backend also maintains a local hash table with information about each
......@@ -347,7 +350,7 @@ typedef struct LOCALLOCK
LOCK *lock; /* associated LOCK object in shared mem */
PROCLOCK *proclock; /* associated PROCLOCK object in shmem */
bool isTempObject; /* true if lock is on a temporary object */
int partition; /* ID of partition containing this lock */
uint32 hashcode; /* copy of LOCKTAG's hash value */
int nLocks; /* total number of times lock is held */
int numLockOwners; /* # of relevant ResourceOwners */
int maxLockOwners; /* allocated size of array */
......@@ -360,15 +363,14 @@ typedef struct LOCALLOCK
/*
* This struct holds information passed from lmgr internals to the lock
* listing user-level functions (lockfuncs.c). For each PROCLOCK in the
* system, the SHMEM_OFFSET, PROCLOCK itself, and associated PGPROC and
* LOCK objects are stored. (Note there will often be multiple copies
* of the same PGPROC or LOCK.) We do not store the SHMEM_OFFSET of the
* PGPROC or LOCK separately, since they're in the PROCLOCK's tag fields.
* system, copies of the PROCLOCK object and associated PGPROC and
* LOCK objects are stored. Note there will often be multiple copies
* of the same PGPROC or LOCK --- to detect whether two are the same,
* compare the PROCLOCK tag fields.
*/
typedef struct
typedef struct LockData
{
int nelements; /* The length of each of the arrays */
SHMEM_OFFSET *proclockaddrs;
PROCLOCK *proclocks;
PGPROC *procs;
LOCK *locks;
......@@ -384,12 +386,24 @@ typedef enum
} LockAcquireResult;
/*
* The lockmgr's shared hash tables are partitioned to reduce contention.
* To determine which partition a given locktag belongs to, compute the tag's
* hash code with LockTagHashCode(), then apply one of these macros.
* NB: NUM_LOCK_PARTITIONS must be a power of 2!
*/
#define LockHashPartition(hashcode) \
((hashcode) % NUM_LOCK_PARTITIONS)
#define LockHashPartitionLock(hashcode) \
((LWLockId) (FirstLockMgrLock + LockHashPartition(hashcode)))
/*
* function prototypes
*/
extern void InitLocks(void);
extern LockMethod GetLocksMethodTable(const LOCK *lock);
extern int LockTagToPartition(const LOCKTAG *locktag);
extern uint32 LockTagHashCode(const LOCKTAG *locktag);
extern LockAcquireResult LockAcquire(const LOCKTAG *locktag,
bool isTempObject,
LOCKMODE lockmode,
......@@ -407,7 +421,7 @@ extern int LockCheckConflicts(LockMethod lockMethodTable,
LOCK *lock, PROCLOCK *proclock, PGPROC *proc);
extern void GrantLock(LOCK *lock, PROCLOCK *proclock, LOCKMODE lockmode);
extern void GrantAwaitedLock(void);
extern void RemoveFromWaitQueue(PGPROC *proc, int partition);
extern void RemoveFromWaitQueue(PGPROC *proc, uint32 hashcode);
extern Size LockShmemSize(void);
extern bool DeadLockCheck(PGPROC *proc);
extern void DeadLockReport(void);
......
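The PROCLOCKTAG comment above assumes the struct contains no padding, and the README hunk explains why: any padding bytes that are not zeroed make the hash computation random. The sketch below illustrates both halves of that rule with made-up types. DemoTag, DemoPairTag, hash_bytes and init_tag are all hypothetical, and the byte-wise FNV-1a hash is only a stand-in for whatever hash function the table really uses. Like the README, it relies on the common behaviour that storing to a member leaves the padding bytes alone, which the C standard does not strictly promise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key that has padding after 'kind' on common ABIs. */
typedef struct DemoTag
{
    char     kind;
    uint32_t id;
} DemoTag;

/* Toy byte-wise hash (FNV-1a) over the whole key, padding included. */
static uint32_t
hash_bytes(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Zero the whole struct first so any padding bytes are deterministic. */
static void
init_tag(DemoTag *tag, char kind, uint32_t id)
{
    memset(tag, 0, sizeof(DemoTag));
    tag->kind = kind;
    tag->id = id;
}

/* A key made of two same-sized pointers, checked to have no padding. */
typedef struct DemoPairTag
{
    void *lock;
    void *proc;
} DemoPairTag;

static_assert(sizeof(DemoPairTag) == 2 * sizeof(void *),
              "DemoPairTag must not contain padding");
```

A compile-time check like the last one is one way to turn an "NB: we assume this struct contains no padding!" comment into something the compiler enforces. It is shown here as an option, not as something this commit does.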
......@@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* $PostgreSQL: pgsql/src/include/storage/lwlock.h,v 1.29 2006/07/23 03:07:58 tgl Exp $
* $PostgreSQL: pgsql/src/include/storage/lwlock.h,v 1.30 2006/07/23 23:08:46 tgl Exp $
*
*-------------------------------------------------------------------------
*/
......@@ -24,7 +24,8 @@
#define NUM_BUFFER_PARTITIONS 16
/* Number of partitions the shared lock tables are divided into */
#define NUM_LOCK_PARTITIONS 16
#define LOG2_NUM_LOCK_PARTITIONS 4
#define NUM_LOCK_PARTITIONS (1 << LOG2_NUM_LOCK_PARTITIONS)
/*
* We have a number of predefined LWLocks, plus a bunch of LWLocks that are
......
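The lock.h comment insists that NUM_LOCK_PARTITIONS be a power of 2, and lwlock.h now derives it from LOG2_NUM_LOCK_PARTITIONS so that this holds by construction. The sketch below copies those two definitions and the LockHashPartition macro from the hunks above, then adds a hypothetical helper, partition_by_mask, to show that with a power-of-2 partition count the modulo simply selects the low-order bits of the hash code. The static_assert is an illustration and is not part of the commit.

```c
#include <assert.h>
#include <stdint.h>

#define LOG2_NUM_LOCK_PARTITIONS 4
#define NUM_LOCK_PARTITIONS (1 << LOG2_NUM_LOCK_PARTITIONS)

/* Same shape as the macro added to lock.h in this commit. */
#define LockHashPartition(hashcode) \
    ((hashcode) % NUM_LOCK_PARTITIONS)

/* Illustration only: reject a partition count that is not a power of 2. */
static_assert((NUM_LOCK_PARTITIONS & (NUM_LOCK_PARTITIONS - 1)) == 0,
              "NUM_LOCK_PARTITIONS must be a power of 2");

/* Hypothetical helper: the same partition number, computed by masking. */
static uint32_t
partition_by_mask(uint32_t hashcode)
{
    return hashcode & (NUM_LOCK_PARTITIONS - 1);
}
```

For an unsigned hash code, the modulo and the mask give the same answer only because the partition count is a power of 2. That is also what lets dynahash's partitioned-table support and the lock manager agree on a partition by looking at the same low-order bits.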