Commit e8d029a3 authored by Tom Lane's avatar Tom Lane

Do unlocked prechecks in bufmgr.c loops that scan the whole buffer pool.

DropRelFileNodeBuffers, DropDatabaseBuffers, FlushRelationBuffers, and
FlushDatabaseBuffers have to scan the whole shared_buffers pool because
we have no index structure that would find the target buffers any more
efficiently than that.  This gets expensive with large NBuffers.  We can
shave some cycles from these loops by prechecking to see if the current
buffer is interesting before we acquire the buffer header lock.
Ordinarily such a test would be unsafe, but in these cases it should be
safe because we are already assuming that the caller holds a lock that
prevents any new target pages from being loaded into the buffer pool
concurrently.  Therefore, no buffer tag should be changing to a value of
interest, only away from a value of interest.  So a false negative match
is impossible, while a false positive is safe because we'll recheck after
acquiring the buffer lock.  Initial testing says that this speeds these
loops by a factor of 2X to 3X on common Intel hardware.

Patch for DropRelFileNodeBuffers by Jeff Janes (based on an idea of
Heikki's); extended to the remaining sequential scans by Tom Lane
parent 2c8a4e9b
......@@ -2048,6 +2048,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
{
int i;
/* If it's a local relation, it's localbuf.c's problem. */
if (rnode.backend != InvalidBackendId)
{
if (rnode.backend == MyBackendId)
......@@ -2059,6 +2060,25 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
{
volatile BufferDesc *bufHdr = &BufferDescriptors[i];
/*
* We can make this a tad faster by prechecking the buffer tag before
* we attempt to lock the buffer; this saves a lot of lock
* acquisitions in typical cases. It should be safe because the
* caller must have AccessExclusiveLock on the relation, or some other
* reason to be certain that no one is loading new pages of the rel
* into the buffer pool. (Otherwise we might well miss such pages
* entirely.) Therefore, while the tag might be changing while we
* look at it, it can't be changing *to* a value we care about, only
* *away* from such a value. So false negatives are impossible, and
* false positives are safe because we'll recheck after getting the
* buffer lock.
*
* We could check forkNum and blockNum as well as the rnode, but the
* incremental win from doing so seems small.
*/
if (!RelFileNodeEquals(bufHdr->tag.rnode, rnode.node))
continue;
LockBufHdr(bufHdr);
if (RelFileNodeEquals(bufHdr->tag.rnode, rnode.node) &&
bufHdr->tag.forkNum == forkNum &&
......@@ -2084,7 +2104,6 @@ void
DropDatabaseBuffers(Oid dbid)
{
int i;
volatile BufferDesc *bufHdr;
/*
* We needn't consider local buffers, since by assumption the target
......@@ -2093,7 +2112,15 @@ DropDatabaseBuffers(Oid dbid)
for (i = 0; i < NBuffers; i++)
{
bufHdr = &BufferDescriptors[i];
volatile BufferDesc *bufHdr = &BufferDescriptors[i];
/*
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
if (bufHdr->tag.rnode.dbNode != dbid)
continue;
LockBufHdr(bufHdr);
if (bufHdr->tag.rnode.dbNode == dbid)
InvalidateBuffer(bufHdr); /* releases spinlock */
......@@ -2220,6 +2247,14 @@ FlushRelationBuffers(Relation rel)
for (i = 0; i < NBuffers; i++)
{
bufHdr = &BufferDescriptors[i];
/*
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
if (!RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node))
continue;
LockBufHdr(bufHdr);
if (RelFileNodeEquals(bufHdr->tag.rnode, rel->rd_node) &&
(bufHdr->flags & BM_VALID) && (bufHdr->flags & BM_DIRTY))
......@@ -2262,6 +2297,14 @@ FlushDatabaseBuffers(Oid dbid)
for (i = 0; i < NBuffers; i++)
{
bufHdr = &BufferDescriptors[i];
/*
* As in DropRelFileNodeBuffers, an unlocked precheck should be safe
* and saves some cycles.
*/
if (bufHdr->tag.rnode.dbNode != dbid)
continue;
LockBufHdr(bufHdr);
if (bufHdr->tag.rnode.dbNode == dbid &&
(bufHdr->flags & BM_VALID) && (bufHdr->flags & BM_DIRTY))
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment