    Use a safer method for determining whether relcache init file is stale. · f3b5565d
    Tom Lane authored
    When we invalidate the relcache entry for a system catalog or index, we
    must also delete the relcache "init file" if the init file contains a copy
    of that rel's entry.  The old way of doing this relied on a specially
    maintained list of the OIDs of relations present in the init file: we made
    the list either when reading the file in, or when writing the file out.
    The problem is that when writing the file out, we included only rels
    present in our local relcache, which might have already suffered some
    deletions due to relcache inval events.  In such cases we correctly decided
    not to overwrite the real init file with incomplete data --- but we still
    used the incomplete initFileRelationIds list for the rest of the current
    session.  This could result in wrong decisions about whether the session's
    own actions require deletion of the init file, potentially allowing an init
    file created by some other concurrent session to be left around even though
    it's been made stale.
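
As a rough illustration of the failure mode described above, here is a toy model in C. It is not the relcache.c code: the OidList type, the function names, and the OID values are all hypothetical, and the real logic involves considerably more state. The sketch only shows how a session that keeps an incomplete list can wrongly conclude that an init file need not be deleted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;           /* stand-in for PostgreSQL's Oid type */

#define MAX_RELS 8

/* Hypothetical per-session list of OIDs believed to be in the init file. */
typedef struct
{
    Oid     ids[MAX_RELS];
    size_t  n;
} OidList;

/* Linear membership test over the list. */
bool
oid_list_contains(const OidList *list, Oid relid)
{
    for (size_t i = 0; i < list->n; i++)
        if (list->ids[i] == relid)
            return true;
    return false;
}

/*
 * Model of the old decision: delete the init file only if the invalidated
 * relation appears in this session's own list.
 */
bool
old_must_delete_init_file(const OidList *session_list, Oid invalidated)
{
    return oid_list_contains(session_list, invalidated);
}

/* What another session's on-disk init file really contains (made-up OIDs). */
const OidList init_file_on_disk = {{10, 20, 30}, 3};

/*
 * This session built its list while trying to write the file, after its
 * local relcache had already lost the entry for OID 20.  The write was
 * abandoned, but the incomplete list was kept.
 */
const OidList stale_session_list = {{10, 30}, 2};
```

In this model, old_must_delete_init_file(&stale_session_list, 20) returns false even though the on-disk file contains OID 20, so the stale file would be left in place.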
    
    Since we don't support changing the schema of a system catalog at runtime,
    the only likely scenario in which this would cause a problem in the field
    involves a "vacuum full" on a catalog concurrently with other activity, and
    even then it's far from easy to provoke.  Remarkably, this has been broken
    since 2002 (in commit 78634044), but we had
    never seen a reproducible test case until recently.  If it did happen in
    the field, the symptoms would probably involve unexpected "cache lookup
    failed" errors to begin with, then "could not open file" failures after the
    next checkpoint, as all accesses to the affected catalog stopped working.
    Recovery would require manually removing the stale "pg_internal.init" file.
    
    To fix, get rid of the initFileRelationIds list, and instead consult
    syscache.c's list of relations used in catalog caches to decide whether a
    relation is included in the init file.  This should be a tad more efficient
    anyway, since we're replacing linear search of a list with ~100 entries
    with a binary search.  It's a bit ugly that the init file contents are now
    so directly tied to the catalog caches, but in practice that won't make
    much difference.
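
The efficiency point can be sketched as follows. This is a minimal illustration, not the actual syscache.c code: the array name, its contents, and both function names are assumptions made for the example, and the OID values are arbitrary. It assumes only that the set of relations is available as an array sorted by OID, which is what makes a binary search possible.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;           /* stand-in for PostgreSQL's Oid type */

/* Hypothetical sorted array of relation OIDs (values are arbitrary). */
const Oid cache_rel_oids[] = {1247, 1249, 1255, 1259, 2662, 2663};
const size_t n_cache_rel_oids =
    sizeof(cache_rel_oids) / sizeof(cache_rel_oids[0]);

/* Linear search, as a plain list of OIDs requires: O(n) comparisons. */
bool
rel_in_list_linear(const Oid *oids, size_t n, Oid relid)
{
    for (size_t i = 0; i < n; i++)
        if (oids[i] == relid)
            return true;
    return false;
}

/* Binary search over a sorted array: O(log n) comparisons. */
bool
rel_in_sorted_array(const Oid *sorted, size_t n, Oid relid)
{
    size_t  lo = 0;
    size_t  hi = n;             /* search the half-open range [lo, hi) */

    while (lo < hi)
    {
        size_t  mid = lo + (hi - lo) / 2;

        if (sorted[mid] == relid)
            return true;
        if (sorted[mid] < relid)
            lo = mid + 1;
        else
            hi = mid;
    }
    return false;
}
```

With roughly 100 entries, the binary search needs at most about seven comparisons per lookup, against up to 100 for the linear scan.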
    
    Back-patch to all supported branches.