    Fix race condition in reading commit timestamps · 8eace46d
    Alvaro Herrera authored
    If a user requests the commit timestamp for a transaction old enough
    that its data is concurrently being truncated away by vacuum at just the
    right time, they would receive an ugly internal file-not-found error
    message from slru.c rather than the expected NULL return value.
    
    In a primary server, the window for the race is very small: the lookup
    has to occur exactly between the two calls made by vacuum (the
    commit-timestamp truncation and the advance of the oldest-XID), and
    there's not a lot that happens between them (mostly just a multixact
    truncate).  In a standby server, however, the window is larger because
    the truncation is executed as soon as the WAL record for it is replayed,
    but the advance of the oldest-XID is not executed until the next
    checkpoint record.
    
    To fix in the primary, simply reverse the order of operations in
    vac_truncate_clog.  To fix in the standby, augment the WAL truncation
    record so that the standby is aware of the new oldest-XID value and can
    apply the update immediately.  WAL version bumped because of this.
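
The ordering argument above can be illustrated with a toy model. This is a minimal sketch, not PostgreSQL code: every name in it (`ToyCommitTs`, `toy_lookup`, `toy_race`, `ToyTruncateRecord`, `toy_redo_truncate`) is hypothetical, XID wraparound is ignored, and the "concurrent" reader is simulated by a single lookup placed between the two steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; these types and functions are hypothetical. */
typedef uint32_t ToyXid;

typedef enum { LOOKUP_FOUND, LOOKUP_NULL, LOOKUP_FILE_MISSING } LookupResult;

typedef struct {
    ToyXid oldest_xid;   /* lookups below this are answered with NULL */
    ToyXid first_stored; /* data below this has been truncated away */
    ToyXid next_xid;     /* first XID not yet assigned */
} ToyCommitTs;

/* Range check first, then "file access": mirrors the failure described above. */
LookupResult toy_lookup(const ToyCommitTs *s, ToyXid xid)
{
    if (xid < s->oldest_xid || xid >= s->next_xid)
        return LOOKUP_NULL;
    if (xid < s->first_stored)
        return LOOKUP_FILE_MISSING; /* stands in for the slru.c error */
    return LOOKUP_FOUND;
}

void toy_advance_oldest(ToyCommitTs *s, ToyXid new_oldest)
{
    if (new_oldest > s->oldest_xid)
        s->oldest_xid = new_oldest;
}

void toy_truncate(ToyCommitTs *s, ToyXid cutoff)
{
    if (cutoff > s->first_stored)
        s->first_stored = cutoff;
}

/* Result seen by a reader that runs exactly between the two steps. */
LookupResult toy_race(bool advance_first, ToyXid cutoff, ToyXid probe)
{
    ToyCommitTs s = { .oldest_xid = 100, .first_stored = 100, .next_xid = 1000 };
    LookupResult seen;

    assert(cutoff >= s.oldest_xid && cutoff <= s.next_xid);

    if (advance_first)
        toy_advance_oldest(&s, cutoff);
    else
        toy_truncate(&s, cutoff);

    seen = toy_lookup(&s, probe); /* the simulated concurrent lookup */

    if (advance_first)
        toy_truncate(&s, cutoff);
    else
        toy_advance_oldest(&s, cutoff);

    return seen;
}

/* Standby side: a truncation record that also carries the new oldest XID. */
typedef struct {
    ToyXid cutoff;
    ToyXid oldest_xid;
} ToyTruncateRecord;

void toy_redo_truncate(ToyCommitTs *s, const ToyTruncateRecord *rec)
{
    toy_advance_oldest(s, rec->oldest_xid); /* apply the limit immediately */
    toy_truncate(s, rec->cutoff);
}
```

In this model, truncating first lets a lookup for an old XID reach missing data, while advancing the oldest XID first turns the same lookup into a NULL result. The replay helper shows the same ordering applied from a single record, so no later checkpoint record is needed to close the window.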
    
    No backpatch, because of the low importance of the bug and its rarity.
    
    Author: Craig Ringer
    Reviewed-By: Petr Jelínek, Peter Eisentraut
    Discussion: https://postgr.es/m/CAMsr+YFhVtRQT1VAwC+WGbbxZZRzNou=N9Ed-FrCqkwQ8H8oJQ@mail.gmail.com
commit_ts.c 28.7 KB