Commit f4d7f1ad authored by Tom Lane's avatar Tom Lane

Fix incorrect order of operations during sinval reset processing.

We have to be sure that we have revalidated each nailed-in-cache relcache
entry before we try to use it to load data for some other relcache entry.
The introduction of "mapped relations" in 9.0 broke this, because although
we updated the state kept in relmapper.c early enough, we failed to
propagate that information into relcache entries soon enough; in
particular, we could try to fetch pg_class rows out of pg_class before
we'd updated its relcache entry's rd_node.relNode value from the map.

This bug accounts for Dave Gould's report of failures after "vacuum full
pg_class", and I believe that there is risk for other system catalogs
as well.

The core part of the fix is to copy relmapper data into the relcache
entries during "phase 1" in RelationCacheInvalidate(), before they'll be
used in "phase 2".  To try to future-proof the code against other similar
bugs, I also rearranged the order in which nailed relations are visited
during phase 2: now it's pg_class first, then pg_class_oid_index, then
other nailed relations.  This should ensure that RelationClearRelation can
apply RelationReloadIndexInfo to all nailed indexes without risking use
of not-yet-revalidated relcache entries.

Back-patch to 9.0 where the relation mapper was introduced.
parent 7b0d0e93
...@@ -2104,11 +2104,11 @@ RelationCacheInvalidateEntry(Oid relationId) ...@@ -2104,11 +2104,11 @@ RelationCacheInvalidateEntry(Oid relationId)
* so hash_seq_search will complete safely; (b) during the second pass we * so hash_seq_search will complete safely; (b) during the second pass we
* only hold onto pointers to nondeletable entries. * only hold onto pointers to nondeletable entries.
* *
* The two-phase approach also makes it easy to ensure that we process * The two-phase approach also makes it easy to update relfilenodes for
* nailed-in-cache indexes before other nondeletable items, and that we * mapped relations before we do anything else, and to ensure that the
* process pg_class_oid_index first of all. In scenarios where a nailed * second pass processes nailed-in-cache items before other nondeletable
* index has been given a new relfilenode, we have to detect that update * items. This should ensure that system catalogs are up to date before
* before the nailed index is used in reloading any other relcache entry. * we attempt to use them to reload information about other open relations.
*/ */
void void
RelationCacheInvalidate(void) RelationCacheInvalidate(void)
...@@ -2120,6 +2120,11 @@ RelationCacheInvalidate(void) ...@@ -2120,6 +2120,11 @@ RelationCacheInvalidate(void)
List *rebuildList = NIL; List *rebuildList = NIL;
ListCell *l; ListCell *l;
/*
* Reload relation mapping data before starting to reconstruct cache.
*/
RelationMapInvalidateAll();
/* Phase 1 */ /* Phase 1 */
hash_seq_init(&status, RelationIdCache); hash_seq_init(&status, RelationIdCache);
...@@ -2130,7 +2135,7 @@ RelationCacheInvalidate(void) ...@@ -2130,7 +2135,7 @@ RelationCacheInvalidate(void)
/* Must close all smgr references to avoid leaving dangling ptrs */ /* Must close all smgr references to avoid leaving dangling ptrs */
RelationCloseSmgr(relation); RelationCloseSmgr(relation);
/* Ignore new relations, since they are never SI targets */ /* Ignore new relations, since they are never cross-backend targets */
if (relation->rd_createSubid != InvalidSubTransactionId) if (relation->rd_createSubid != InvalidSubTransactionId)
continue; continue;
...@@ -2144,22 +2149,33 @@ RelationCacheInvalidate(void) ...@@ -2144,22 +2149,33 @@ RelationCacheInvalidate(void)
} }
else else
{ {
/*
* If it's a mapped relation, immediately update its rd_node in
* case its relfilenode changed. We must do this during phase 1
* in case the relation is consulted during rebuild of other
* relcache entries in phase 2. It's safe since consulting the
* map doesn't involve any access to relcache entries.
*/
if (RelationIsMapped(relation))
RelationInitPhysicalAddr(relation);
/* /*
* Add this entry to list of stuff to rebuild in second pass. * Add this entry to list of stuff to rebuild in second pass.
* pg_class_oid_index goes on the front of rebuildFirstList, other * pg_class goes to the front of rebuildFirstList while
* nailed indexes on the back, and everything else into * pg_class_oid_index goes to the back of rebuildFirstList, so
* rebuildList (in no particular order). * they are done first and second respectively. Other nailed
* relations go to the front of rebuildList, so they'll be done
* next in no particular order; and everything else goes to the
* back of rebuildList.
*/ */
if (relation->rd_isnailed && if (RelationGetRelid(relation) == RelationRelationId)
relation->rd_rel->relkind == RELKIND_INDEX) rebuildFirstList = lcons(relation, rebuildFirstList);
{ else if (RelationGetRelid(relation) == ClassOidIndexId)
if (RelationGetRelid(relation) == ClassOidIndexId) rebuildFirstList = lappend(rebuildFirstList, relation);
rebuildFirstList = lcons(relation, rebuildFirstList); else if (relation->rd_isnailed)
else
rebuildFirstList = lappend(rebuildFirstList, relation);
}
else
rebuildList = lcons(relation, rebuildList); rebuildList = lcons(relation, rebuildList);
else
rebuildList = lappend(rebuildList, relation);
} }
} }
...@@ -2170,11 +2186,6 @@ RelationCacheInvalidate(void) ...@@ -2170,11 +2186,6 @@ RelationCacheInvalidate(void)
*/ */
smgrcloseall(); smgrcloseall();
/*
* Reload relation mapping data before starting to reconstruct cache.
*/
RelationMapInvalidateAll();
/* Phase 2: rebuild the items found to need rebuild in phase 1 */ /* Phase 2: rebuild the items found to need rebuild in phase 1 */
foreach(l, rebuildFirstList) foreach(l, rebuildFirstList)
{ {
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment