    Improve handling of corrupted two-phase state files at recovery · 8582b4d0
    Michael Paquier authored
    When a corrupted two-phase state file is found by WAL replay, be it for
    crash recovery or archive recovery, the file is simply skipped and a
    WARNING is logged, causing the transaction to be silently lost.  An
    on-disk two-phase state file is as likely to be corrupted as the data
    stored in WAL records, but WAL records already cause a hard failure
    if there is a CRC mismatch.  On-disk two-phase state
    files, on the contrary, are simply ignored if corrupted.  Note that when
    restoring the initial two-phase state at recovery, files newer than the
    horizon XID are discarded, hence no file present in pg_twophase/ should
    be torn: all of them have been made durable by a previous checkpoint.
    By design, recovery should therefore never see a corrupted two-phase
    state file.
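
    The difference between skipping a corrupted file and failing hard on it
    can be sketched as follows.  This is a minimal illustration, not the
    code in twophase.c: it assumes a file image that ends with a 4-byte
    CRC-32C of the preceding bytes, and the names StateResult,
    CorruptionPolicy, seal_state_file() and check_state_file() are
    hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical outcome codes; not PostgreSQL's actual API. */
typedef enum { STATE_OK, STATE_SKIPPED, STATE_FATAL } StateResult;

/* Hypothetical policies for handling a corrupted state file. */
typedef enum { POLICY_WARN_AND_SKIP, POLICY_FAIL_HARD } CorruptionPolicy;

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t
crc32c(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/*
 * Store the checksum of the payload in the last 4 bytes of the image.
 * Assumes total_len >= sizeof(uint32_t).
 */
static void
seal_state_file(unsigned char *file, size_t total_len)
{
    uint32_t crc = crc32c(file, total_len - sizeof(uint32_t));

    memcpy(file + total_len - sizeof(uint32_t), &crc, sizeof(crc));
}

/*
 * Compare the stored checksum with a freshly computed one.  On mismatch,
 * the policy decides whether the file is skipped (the transaction is then
 * silently lost) or whether the caller must stop recovery.
 */
static StateResult
check_state_file(const unsigned char *file, size_t total_len,
                 CorruptionPolicy policy)
{
    uint32_t stored;

    /* A file too short to hold a checksum is treated as corrupted. */
    if (total_len < sizeof(uint32_t))
        return policy == POLICY_FAIL_HARD ? STATE_FATAL : STATE_SKIPPED;

    memcpy(&stored, file + total_len - sizeof(uint32_t), sizeof(stored));
    if (stored == crc32c(file, total_len - sizeof(uint32_t)))
        return STATE_OK;

    return policy == POLICY_FAIL_HARD ? STATE_FATAL : STATE_SKIPPED;
}
```

    In this sketch the check returns a status instead of raising an error,
    so a caller can map STATE_SKIPPED to a logged WARNING and STATE_FATAL
    to an aborted recovery.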
    
    The situation has improved since 978b2f65, which added two-phase state
    information directly in WAL instead of using on-disk files, so the risk
    is limited to two-phase transactions which live for long periods,
    across at least one checkpoint.  Backups holding legitimate two-phase
    state files on disk could also silently lose transactions when restored
    if those files get corrupted.
    
    This behavior has existed since two-phase commit was introduced.  No
    back-patch is done for now, given the lack of complaints about this
    problem.
    
    Author: Michael Paquier
    Discussion: https://postgr.es/m/20180709050309.GM1467@paquier.xyz