    Avoid duplicate XIDs at recovery when building initial snapshot · 1df21ddb
    Michael Paquier authored
    On a primary, XLOG_RUNNING_XACTS records are generated periodically
    to allow recovery to build the initial state of transactions for a
    hot standby.  The set of transaction IDs is built by scanning all the
    entries in ProcArray.  However, this logic never accounted for the
    fact that a two-phase transaction finishing its prepare can put
    ProcArray in a state where two entries share the same transaction
    ID: one for the initial transaction, which gets cleared when prepare
    finishes, and a second, dummy entry that tracks the transaction as
    still running after prepare finishes.  This overlap ensures a
    continuous presence of the transaction, so that callers of, for
    example, TransactionIdIsInProgress() always see it as alive.
    
    So, if an XLOG_RUNNING_XACTS record takes a standby snapshot while a
    two-phase transaction finishes its prepare, the record can end up
    with duplicated XIDs, which is a state expected by design.  If such a
    record gets applied on a standby to initialize its recovery state,
    recovery simply fails, although the odds of facing this failure are
    very low in practice.  It would be tempting to change the generation
    of XLOG_RUNNING_XACTS so that duplicates are removed at the source,
    but this would require holding ProcArrayLock for longer and would
    impact all workloads, particularly those making heavy use of
    two-phase transactions.
    
    XLOG_RUNNING_XACTS is in fact used only to initialize the standby
    state at recovery, so the solution chosen instead is to discard
    duplicates when applying the initial snapshot.
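
    As a rough illustration of discarding duplicates on the applying
    side, the following standalone sketch sorts an array of XIDs and
    skips repeated values.  It is not the code of this commit: xid_t,
    xid_cmp() and sort_and_dedup_xids() are hypothetical names, and the
    comparator uses plain numeric ordering, which ignores the XID
    wraparound that real recovery code has to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for PostgreSQL's TransactionId. */
typedef uint32_t xid_t;

/*
 * Plain numeric comparison.  Assumption for this sketch only: real XIDs
 * wrap around, so production code needs a wraparound-aware comparator.
 */
static int
xid_cmp(const void *a, const void *b)
{
    xid_t xa = *(const xid_t *) a;
    xid_t xb = *(const xid_t *) b;

    return (xa > xb) - (xa < xb);
}

/*
 * Sort the array in place and drop duplicated XIDs.  Returns the number
 * of unique entries, which are left at the front of the array.
 */
size_t
sort_and_dedup_xids(xid_t *xids, size_t n)
{
    size_t out = 0;

    assert(xids != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(xids, n, sizeof(xid_t), xid_cmp);

    for (size_t i = 0; i < n; i++)
    {
        /* After sorting, duplicates are adjacent: skip repeats. */
        if (out > 0 && xids[out - 1] == xids[i])
            continue;
        xids[out++] = xids[i];
    }
    return out;
}
```

    Deduplicating only where the snapshot is applied keeps the extra
    sort-and-compare work off the primary, where ProcArrayLock is held.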
    
    Diagnosed-by: Konstantin Knizhnik
    Author: Michael Paquier
    Discussion: https://postgr.es/m/0c96b653-4696-d4b4-6b5d-78143175d113@postgrespro.ru
    Backpatch-through: 9.3