Commit 22b27b4c authored by Tom Lane

Avoid useless closely-spaced writes of statistics files.

The original intent in the stats collector was that we should not write out
stats data oftener than every PGSTAT_STAT_INTERVAL msec.  Backends will not
make requests at all if they see the existing data is newer than that, and
the stats collector is supposed to disregard requests having a cutoff_time
older than its most recently written data, so that close-together requests
don't result in multiple writes.  But the latter part of that got broken
in commit 187492b6, so that if two backends concurrently decide
the existing stats are too old, the collector would write the data twice.
(In principle the collector's logic would still merge requests as long as
the second one arrives before we've actually written data ... but since
the message collection loop would write data immediately after processing
a single inquiry message, that never happened in practice, and in any case
the window in which it might work would be much shorter than
PGSTAT_STAT_INTERVAL.)
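
For readers not steeped in pgstat.c, a rough sketch of the backend-side
throttling described above follows.  It is illustrative only: the helper
names stats_file_timestamp and send_inquiry are made up for the sketch and
are not the real identifiers.

#include "postgres.h"
#include "pgstat.h"
#include "utils/timestamp.h"

/*
 * Sketch only: a backend asks the collector to rewrite the stats file for
 * "dbid" only if the existing file looks older than PGSTAT_STAT_INTERVAL
 * msec; the cutoff_time it sends is deliberately a little in the past.
 */
static void
request_stats_if_stale(Oid dbid)
{
	TimestampTz cur_ts = GetCurrentTimestamp();
	TimestampTz min_ts;

	/* oldest acceptable file timestamp: PGSTAT_STAT_INTERVAL msec ago */
	min_ts = TimestampTzPlusMilliseconds(cur_ts, -PGSTAT_STAT_INTERVAL);

	/* existing data is fresh enough: make no request at all */
	if (stats_file_timestamp(dbid) >= min_ts)	/* hypothetical helper */
		return;

	/* otherwise send an inquiry carrying both timestamps */
	send_inquiry(cur_ts, min_ts, dbid);			/* hypothetical helper */
}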

To fix, improve pgstat_recv_inquiry so that it checks whether the cutoff
time is too old, and doesn't add a request to the queue if so.  This means
that we do not need DBWriteRequest.request_time, because the decision is
taken before making a queue entry.  And that means that we don't really
need the DBWriteRequest data structure at all; an OID list of database
OIDs will serve and allow removal of some rather verbose and crufty code.
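
As a concrete illustration, the repaired inquiry handling is roughly as
below.  This is a simplified sketch meant to live inside pgstat.c, not the
committed code verbatim: pending_write_requests stands in for the new
OID-list state, and pgstat_get_db_entry is the collector's existing
per-database hashtable lookup.

#include "postgres.h"
#include "pgstat.h"
#include "nodes/pg_list.h"

/* databases with outstanding, not-yet-satisfied write requests */
static List *pending_write_requests = NIL;

static void
handle_inquiry(PgStat_MsgInquiry *msg)
{
	PgStat_StatDBEntry *dbentry;

	/* nothing to do if this database is already queued for writing */
	if (list_member_oid(pending_write_requests, msg->databaseid))
		return;

	/*
	 * If we already wrote this database's stats at or after the requested
	 * cutoff_time, the file on disk satisfies the requestor, so don't
	 * queue a duplicate write.
	 */
	dbentry = pgstat_get_db_entry(msg->databaseid, false);
	if (dbentry != NULL && dbentry->stats_timestamp >= msg->cutoff_time)
		return;

	/* otherwise remember that this database still needs to be written */
	pending_write_requests = lappend_oid(pending_write_requests,
										 msg->databaseid);
}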

In passing, improve the comments in this area, which have been rather
neglected.  Also change backend_read_statsfile so that it's not silently
relying on MyDatabaseId to have some particular value in the autovacuum
launcher process.  It accidentally worked as desired because MyDatabaseId
is zero in that process; but that does not seem like a dependency we want,
especially with no documentation about it.
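
Very roughly, the idea is to make the launcher's intent explicit, along
these lines (a sketch only; the variable name and elided body are
illustrative, not the committed code):

#include "postgres.h"
#include "miscadmin.h"
#include "postmaster/autovacuum.h"

static void
backend_read_statsfile(void)
{
	Oid			requested_db;

	/* decide explicitly which database's stats we need */
	if (IsAutoVacuumLauncherProcess())
		requested_db = InvalidOid;		/* shared + global stats only */
	else
		requested_db = MyDatabaseId;	/* ordinary backend wants its own DB */

	/* ... use requested_db when sending inquiries and reading the files ... */
}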

Although this patch is mine, it turns out I'd rediscovered a known bug,
for which Tomas Vondra had already submitted a patch that's functionally
equivalent to the non-cosmetic aspects of this patch.  Thanks to Tomas
for reviewing this version.

Back-patch to 9.3 where the bug was introduced.

Prior-Discussion: <1718942738eb65c8407fcd864883f4c8@fuzzy.cz>
Patch: <4625.1464202586@sss.pgh.pa.us>
parent aa14bc41
@@ -219,7 +219,20 @@ typedef struct PgStat_MsgDummy
 /* ----------
  * PgStat_MsgInquiry			Sent by a backend to ask the collector
- *								to write the stats file.
+ *								to write the stats file(s).
+ *
+ * Ordinarily, an inquiry message prompts writing of the global stats file,
+ * the stats file for shared catalogs, and the stats file for the specified
+ * database.  If databaseid is InvalidOid, only the first two are written.
+ *
+ * New file(s) will be written only if the existing file has a timestamp
+ * older than the specified cutoff_time; this prevents duplicated effort
+ * when multiple requests arrive at nearly the same time, assuming that
+ * backends send requests with cutoff_times a little bit in the past.
+ *
+ * clock_time should be the requestor's current local time; the collector
+ * uses this to check for the system clock going backward, but it has no
+ * effect unless that occurs.  We assume clock_time >= cutoff_time, though.
  * ----------
  */
@@ -228,7 +241,7 @@ typedef struct PgStat_MsgInquiry
 	PgStat_MsgHdr m_hdr;
 	TimestampTz clock_time;		/* observed local clock time */
 	TimestampTz cutoff_time;	/* minimum acceptable file timestamp */
-	Oid			databaseid;		/* requested DB (InvalidOid => all DBs) */
+	Oid			databaseid;		/* requested DB (InvalidOid => shared only) */
 } PgStat_MsgInquiry;