    Fix a couple of bugs in MultiXactId freezing · 2393c7d1
    Alvaro Herrera authored
    Both heap_freeze_tuple() and heap_tuple_needs_freeze() neglected to look
    into a multixact to check the members against cutoff_xid.  This means
    that a very old Xid could survive hidden within a multi, possibly
    outliving its CLOG storage.  In the distant future, this would cause
    clog lookup failures:
    ERROR:  could not access status of transaction 3883960912
    DETAIL:  Could not open file "pg_clog/0E78": No such file or directory.
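
    The missing check can be sketched as follows. This is a simplified
    model, not the code in heapam.c: the type Xid and the functions
    xid_precedes() and multi_has_member_before() are hypothetical names
    made up for illustration, and the comparison ignores the special
    permanent Xids that real code has to treat separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;           /* hypothetical stand-in for a transaction id */

/*
 * Circular comparison: true if a is logically older than b.
 * Assumption: both are ordinary Xids; special values are not handled here.
 */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * True if at least one member of the multixact is older than cutoff_xid.
 * A freeze check that skips this loop can leave such a member behind.
 */
static bool
multi_has_member_before(const Xid *members, size_t nmembers, Xid cutoff_xid)
{
    assert(members != NULL || nmembers == 0);

    for (size_t i = 0; i < nmembers; i++)
    {
        if (xid_precedes(members[i], cutoff_xid))
            return true;
    }
    return false;
}
```

    Under these assumptions, a tuple whose multixact has any member
    older than cutoff_xid would be reported as needing a freeze, even
    when the multixact id itself is recent.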
    
    This mostly was problematic when the updating transaction aborted, since
    in that case the row wouldn't get pruned away earlier in vacuum and the
    multixact could possibly survive for a long time.  In many cases, data
    that is inaccessible for this reason can be brought back
    heuristically.
    
    As a second bug, heap_freeze_tuple() didn't properly handle multixacts
    that need to be frozen according to cutoff_multi, but whose updater xid
    is still alive.  Instead of preserving the update Xid, it just set Xmax
    invalid, which leads to both old and new tuple versions becoming
    visible.  This is pretty rare in practice, but a real threat
    nonetheless.  Existing corrupted rows, unfortunately, cannot be repaired
    in an automated fashion.
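
    The intended decision for a multixact older than cutoff_multi can be
    sketched as below. This is only an illustration under stated
    assumptions: MultiSketch, FreezeAction, id_precedes() and
    choose_freeze_action() are hypothetical names, and the updater_alive
    flag stands in for whatever liveness test the real code performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;                   /* hypothetical transaction id type */
#define INVALID_XID ((Xid) 0)           /* hypothetical "no Xid" marker */

/* Hypothetical summary of what is known about one multixact. */
typedef struct MultiSketch
{
    uint32_t multi_id;                  /* the multixact id stored in Xmax */
    Xid      update_xid;                /* INVALID_XID if it holds only lockers */
    bool     updater_alive;             /* assumed liveness flag for the updater */
} MultiSketch;

typedef enum FreezeAction
{
    KEEP_MULTI,                         /* newer than the cutoff, leave as is */
    SET_XMAX_INVALID,                   /* nothing left worth preserving */
    SET_XMAX_TO_UPDATER                 /* replace the multi by its update Xid */
} FreezeAction;

/* Circular comparison: true if a is logically older than b. */
static bool
id_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

static FreezeAction
choose_freeze_action(const MultiSketch *m, uint32_t cutoff_multi)
{
    assert(m != NULL);

    if (!id_precedes(m->multi_id, cutoff_multi))
        return KEEP_MULTI;

    /* The bug described above corresponds to skipping this branch. */
    if (m->update_xid != INVALID_XID && m->updater_alive)
        return SET_XMAX_TO_UPDATER;

    return SET_XMAX_INVALID;
}
```

    In this model, setting Xmax invalid is only chosen when the old
    multixact has no live updater; otherwise the update Xid is kept so
    that the old tuple version does not become visible again.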
    
    Existing physical replicas might have already incorrectly frozen tuples
    because of different behavior than in master, which might only become
    apparent in the future once pg_multixact/ is truncated; it is
    recommended that all clones be rebuilt after upgrading.
    
    Found during code analysis prompted by a bug report from J Smith in message
    CADFUPgc5bmtv-yg9znxV-vcfkb+JPRqs7m2OesQXaM_4Z1JpdQ@mail.gmail.com
    and by a private report from F-Secure.
    
    Backpatch to 9.3, where freezing of MultiXactIds was introduced.
    
    Analysis and patch by Andres Freund, with some tweaks by Álvaro.
heapam.c 216 KB