    Fix transient clobbering of shared buffers during WAL replay. · 17118825
    Tom Lane authored
    RestoreBkpBlocks was in the habit of zeroing and refilling the target
    buffer, which was perfectly safe when the code was written, but is unsafe
    during Hot Standby operation.  The reason is that we have coding rules
    that allow backends to continue accessing a tuple in a heap relation while
    holding only a pin on its buffer.  Such a backend could see transiently
    zeroed data, if WAL replay had occasion to change other data on the page.
    This has been shown to be the cause of bug #6425 from Duncan Rance (who
    deserves kudos for developing a sufficiently-reproducible test case) as
    well as Bridget Frey's re-report of bug #6200.  It most likely explains the
    original report as well, though we don't yet have confirmation of that.
    
    To fix, change the code so that only bytes that are supposed to change will
    change, even transiently.  This actually saves cycles in RestoreBkpBlocks,
    since it's not writing the same bytes twice.
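
A minimal sketch of the idea, not the actual RestoreBkpBlocks code: it assumes a backup image stored with its run of zero bytes (the "hole") removed, and it writes each region of the page exactly once instead of zeroing the whole page and then refilling it. The names restore_page_image and SKETCH_BLCKSZ, and the 8192-byte page size, are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLCKSZ 8192      /* assumed page size, for illustration only */

/*
 * Restore a page from an image that was saved with its "hole" removed.
 * `image` holds SKETCH_BLCKSZ - hole_length bytes.
 *
 * Every byte of `page` is written exactly once with its final value, so a
 * concurrent reader never sees zeroes where the final page has data.
 */
static void
restore_page_image(unsigned char *page, const unsigned char *image,
                   size_t hole_offset, size_t hole_length)
{
    assert(hole_offset <= SKETCH_BLCKSZ);
    assert(hole_length <= SKETCH_BLCKSZ - hole_offset);

    if (hole_length == 0)
    {
        /* No hole: the image is a full page, copy it straight in. */
        memcpy(page, image, SKETCH_BLCKSZ);
        return;
    }

    /* Bytes before the hole. */
    memcpy(page, image, hole_offset);

    /* Only the hole itself is zero-filled. */
    memset(page + hole_offset, 0, hole_length);

    /* Bytes after the hole, which follow directly in the stored image. */
    memcpy(page + hole_offset + hole_length,
           image + hole_offset,
           SKETCH_BLCKSZ - (hole_offset + hole_length));
}
```

Bytes whose old and new values are equal are still rewritten here, but always with the same value, so a backend reading an unrelated tuple on the page sees no change.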
    
    Also fix seq_redo, which has the same disease, though it has to work a bit
    harder to meet the requirement.
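
One way to meet the requirement when a page has to be rebuilt from scratch is to build the new contents in private memory and install them with a single copy. The sketch below illustrates that general technique under stated assumptions; the commit message does not spell out how seq_redo does it, and rebuild_page_via_scratch, SCRATCH_PAGE_SIZE, and the payload parameters are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCRATCH_PAGE_SIZE 8192  /* assumed page size, for illustration only */

/*
 * Reinitialize a shared page and place `payload` at `payload_offset`,
 * without ever exposing an intermediate (zeroed) state in `shared_page`.
 */
static void
rebuild_page_via_scratch(unsigned char *shared_page,
                         const unsigned char *payload,
                         size_t payload_offset, size_t payload_len)
{
    unsigned char scratch[SCRATCH_PAGE_SIZE];

    assert(payload_offset <= SCRATCH_PAGE_SIZE);
    assert(payload_len <= SCRATCH_PAGE_SIZE - payload_offset);

    /* All destructive work happens on the private copy. */
    memset(scratch, 0, sizeof scratch);
    memcpy(scratch + payload_offset, payload, payload_len);

    /* The shared page goes from old contents to new contents in one pass. */
    memcpy(shared_page, scratch, sizeof scratch);
}
```

The cost is one extra page-sized copy, which is the "work a bit harder" trade-off: bytes that are identical before and after are overwritten with the same value and never pass through zero.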
    
    So far as I can tell, no other WAL replay routines have this type of bug.
    In particular, the index-related replay routines, which would certainly be
    broken if they had to meet the same standard, are not at risk because we
    do not have coding rules that allow access to an index page when not
    holding a buffer lock on it.
    
    Back-patch to 9.0 where Hot Standby was added.