src/backend/access/common/heapvalid.c · 159f8c63ad34db0582e375b4e04916128be2a0e4 · Abuhujair Javed / Postgres FD Implementation

From: Dan McGuirk <mcguirk@indirect.com> · 159f8c63

Marc G. Fournier authored 27 years ago

Reply-To: hackers@hub.org, Dan McGuirk <mcguirk@indirect.com>
To: hackers@hub.org
Subject: [HACKERS] tmin writeback optimization

I was doing some profiling of the backend, and noticed that during a certain
benchmark I was running somewhere between 30% and 75% of the backend's CPU
time was being spent in calls to TransactionIdDidCommit() from
HeapTupleSatisfiesNow() or HeapTupleSatisfiesItself() to determine that
changed rows' transactions had in fact been committed even though the rows'
tmin values had not yet been set.

When a query looks at a given row, it needs to figure out whether the
transaction that changed the row has been committed and hence it should pay
attention to the row, or whether on the other hand the transaction is still
in progress or has been aborted and hence the row should be ignored. If
a tmin value is set, it is known definitively that the row's transaction
has been committed. However, if tmin is not set, the transaction
referred to in xmin must be looked up in pg_log, and this is what the
backend was spending a lot of time doing during my benchmark.

So, implementing a method suggested by Vadim, I created the following
patch that, the first time a query finds a committed row whose tmin value
is not set, sets it, and marks the buffer where the row is stored as
dirty. (It works for tmax, too.) This doesn't result in the boost in
real time performance I was hoping for, however it does decrease backend
CPU usage by up to two-thirds in certain situations, so it could be
rather beneficial in high-concurrency settings.

159f8c63

heapvalid.c 3.67 KB

Replace heapvalid.c