Commit bc49d932 authored by Tom Lane's avatar Tom Lane

Fix rare startup failure induced by MVCC-catalog-scans patch.

While a new backend nominally participates in sinval signaling starting
from the SharedInvalBackendInit call near the top of InitPostgres, it
cannot recognize sinval messages for unshared catalogs of its database
until it has set up MyDatabaseId.  This is not problematic for the catcache
or relcache, which by definition won't have loaded any data from or about
such catalogs before that point.  However, commit 568d4138
introduced a mechanism for re-using MVCC snapshots for catalog scans, and
made invalidation of those depend on recognizing relevant sinval messages.
So it's possible to establish a catalog snapshot to read pg_authid and
pg_database, then before we set MyDatabaseId, receive sinval messages that
should result in invalidating that snapshot --- but do not, because we
don't realize they are for our database.  This mechanism explains the
intermittent buildfarm failures we've seen since commit 31eae602.
That commit was not itself at fault, but it introduced a new regression
test that does reconnections concurrently with the "vacuum full pg_am"
command in vacuum.sql.  This allowed the pre-existing error to be exposed,
given just the right timing, because we'd fail to update our information
about how to access pg_am.  In principle any VACUUM FULL on a system
catalog could have created a similar hazard for concurrent incoming
connections.  Perhaps there are more subtle failure cases as well.

To fix, force invalidation of the catalog snapshot as soon as we've
set MyDatabaseId.

Back-patch to 9.4 where the error was introduced.
parent 5e3d289f
...@@ -852,6 +852,14 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username, ...@@ -852,6 +852,14 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
/* (We assume this is an atomic store so no lock is needed) */ /* (We assume this is an atomic store so no lock is needed) */
MyProc->databaseId = MyDatabaseId; MyProc->databaseId = MyDatabaseId;
/*
* We established a catalog snapshot while reading pg_authid and/or
* pg_database; but until we have set up MyDatabaseId, we won't react to
* incoming sinval messages for unshared catalogs, so we won't realize it
* if the snapshot has been invalidated. Assume it's no good anymore.
*/
InvalidateCatalogSnapshot();
/* /*
* Now, take a writer's lock on the database we are trying to connect to. * Now, take a writer's lock on the database we are trying to connect to.
* If there is a concurrently running DROP DATABASE on that database, this * If there is a concurrently running DROP DATABASE on that database, this
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment