    Fix bogus cache-invalidation logic in logical replication worker. · 3d65b059
    Tom Lane authored
    The code recorded cache invalidation events by zeroing the "localreloid"
    field of affected cache entries.  However, it's possible for an inval
    event to occur even while we have the entry open and locked.  So an
    ill-timed inval could result in "cache lookup failed for relation 0"
    errors, if the worker's code tried to use the cleared field.  We can
    fix that by creating a separate bool field to record whether the entry
    needs to be revalidated.  (In the back branches, cram the bool into
    what had been padding space, to avoid an ABI break in the somewhat
    unlikely event that any extension is looking at this struct.)
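
The difference between the two ways of recording an invalidation can be sketched in C as follows. This is a simplified illustration, not the actual PostgreSQL code: the struct `RelMapEntry`, the field name `localrelvalid`, and all function names are assumptions made for the example. Only the "localreloid" field and the idea of a separate bool come from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical, simplified stand-in for the worker's cache entry. */
typedef struct RelMapEntry
{
    Oid  localreloid;   /* local relation OID; code holding the entry open reads this */
    bool localrelvalid; /* assumed name: false means "revalidate before next use" */
} RelMapEntry;

/* Old approach: record the invalidation by clearing the OID itself. */
void inval_by_zeroing(RelMapEntry *entry)
{
    assert(entry != NULL);
    entry->localreloid = InvalidOid;
}

/* New approach: leave the OID alone and only flag the entry as stale. */
void inval_by_flag(RelMapEntry *entry)
{
    assert(entry != NULL);
    entry->localrelvalid = false;
}

/*
 * Walk-through: an invalidation arrives while the entry is in use.
 * Returns 0 if both approaches behave as described above.
 */
int demo_inval_while_open(void)
{
    RelMapEntry old_style = { 16384, true };
    RelMapEntry new_style = { 16384, true };

    inval_by_zeroing(&old_style);
    inval_by_flag(&new_style);

    if (old_style.localreloid != InvalidOid)
        return 1;   /* zeroing should have destroyed the OID */
    if (new_style.localreloid != 16384)
        return 2;   /* the flag approach must keep the OID usable */
    if (new_style.localrelvalid)
        return 3;   /* ...while still remembering the invalidation */
    return 0;
}
```

With the flag, code that already holds the entry open can keep using the OID, and the next open sees that revalidation is needed.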
    
    Also, rearrange the logic in logicalrep_rel_open so that it
    does the right thing in cases where table_open would fail.
    We should retry the lookup by name in that case, but we didn't.
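
A minimal sketch of the intended control flow, under stated assumptions: the toy catalog and every name here (`ToyCatalog`, `toy_try_open`, `toy_lookup_by_name`, `toy_rel_open`, `localrelvalid`) are hypothetical stand-ins, not the real logicalrep_rel_open or table_open. It shows only the ordering: try the cached OID first, and fall back to a lookup by name if that open fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical toy catalog standing in for the real relation lookup. */
typedef struct ToyRel { const char *name; Oid oid; } ToyRel;
typedef struct ToyCatalog { const ToyRel *rels; size_t nrels; } ToyCatalog;

/* Hypothetical, simplified cache entry. */
typedef struct RelMapEntry
{
    const char *relname;       /* name used for (re)lookup */
    Oid         localreloid;   /* cached OID, possibly stale */
    bool        localrelvalid; /* assumed name: false means "must revalidate" */
} RelMapEntry;

/* Returns true if a relation with this OID still exists in the toy catalog. */
static bool toy_try_open(const ToyCatalog *cat, Oid oid)
{
    for (size_t i = 0; i < cat->nrels; i++)
        if (cat->rels[i].oid == oid)
            return true;
    return false;
}

/* Returns the OID registered under this name, or InvalidOid if none. */
static Oid toy_lookup_by_name(const ToyCatalog *cat, const char *name)
{
    for (size_t i = 0; i < cat->nrels; i++)
        if (strcmp(cat->rels[i].name, name) == 0)
            return cat->rels[i].oid;
    return InvalidOid;
}

/* Open via the cached OID; if that fails, retry the lookup by name. */
Oid toy_rel_open(const ToyCatalog *cat, RelMapEntry *entry)
{
    assert(cat != NULL && entry != NULL);

    /* A failed open of the cached OID is treated like an invalidation. */
    if (entry->localrelvalid && !toy_try_open(cat, entry->localreloid))
        entry->localrelvalid = false;

    if (!entry->localrelvalid)
    {
        Oid relid = toy_lookup_by_name(cat, entry->relname);

        if (relid == InvalidOid || !toy_try_open(cat, relid))
            return InvalidOid;  /* no such relation under this name */

        entry->localreloid = relid;
        entry->localrelvalid = true;
    }
    return entry->localreloid;
}
```

In this sketch, a table that was dropped and recreated under the same name is found again through the name lookup instead of failing on the stale OID.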
    
    The real-world impact of this is probably small.  In the first place,
    the error conditions are very low probability, and in the second place,
    the worker would just exit and get restarted.  We only noticed because
    in a CLOBBER_CACHE_ALWAYS build, the failure can occur repeatedly,
    preventing the worker from making progress.  Nonetheless, it's clearly
    a bug, and it impedes a useful type of testing; so back-patch to v10
    where this code was introduced.
    
    Discussion: https://postgr.es/m/1032727.1600096803@sss.pgh.pa.us
logicalrelation.h 1.68 KB