    Rework the way multixact truncations work. · 4f627f89
    Andres Freund authored
    The fact that multixact truncations are not WAL logged has caused a fair
    share of problems. Among other things, it requires doing computations
    during recovery while the database is not in a consistent state,
    delaying truncations until checkpoints, and handling the case where
    members have been truncated but offsets have not.
    
    We tried to put bandaids on many of these issues over the last few
    years, but it seems time to change course. Thus this patch introduces
    WAL logging for multixact truncations.
    
    This allows us to:
    1) perform the truncation directly during VACUUM, instead of delaying it
       to the checkpoint.
    2) avoid looking at the offsets SLRU for truncation during recovery; we
       can just use the master's values.
    3) simplify a fair amount of the logic that keeps the in-memory limits
       straight.
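
    As a rough illustration of the second point, the sketch below models a
    truncation record that carries the primary's values, so that replay can
    apply them without consulting any SLRU. All names here (TruncRecord,
    MultiXactLimits, log_truncation, redo_truncation) are hypothetical toy
    stand-ins and not the structures or functions this patch adds.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins for PostgreSQL types; not the real definitions. */
typedef uint32_t MultiXactId;
typedef uint32_t MultiXactOffset;

/* Hypothetical truncation record: the primary logs the exact ranges. */
typedef struct TruncRecord
{
    MultiXactId     start_off;   /* oldest multixact before truncation */
    MultiXactId     end_off;     /* oldest multixact to keep */
    MultiXactOffset start_memb;  /* oldest member offset before truncation */
    MultiXactOffset end_memb;    /* oldest member offset to keep */
} TruncRecord;

/* Toy state: the oldest multixact and member offset still retained. */
typedef struct MultiXactLimits
{
    MultiXactId     oldest_multi;
    MultiXactOffset oldest_memb;
} MultiXactLimits;

/* Primary side (e.g. during VACUUM): serialize the record into a buffer. */
static size_t
log_truncation(unsigned char *wal, const MultiXactLimits *cur,
               MultiXactId new_oldest, MultiXactOffset new_oldest_memb)
{
    TruncRecord rec = {
        cur->oldest_multi, new_oldest,
        cur->oldest_memb, new_oldest_memb
    };

    memcpy(wal, &rec, sizeof(rec));
    return sizeof(rec);
}

/* Replay side: apply the logged values; no SLRU lookup is needed. */
static void
redo_truncation(const unsigned char *wal, MultiXactLimits *limits)
{
    TruncRecord rec;

    memcpy(&rec, wal, sizeof(rec));
    /* Sanity check that only makes sense in this toy model. */
    assert(rec.start_off == limits->oldest_multi);
    limits->oldest_multi = rec.end_off;
    limits->oldest_memb = rec.end_memb;
}
```

    The point of the design is that the replaying side never has to derive
    the new limits itself; it only copies what the primary computed.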
    
    During the course of fixing this a bunch of additional bugs had to be
    fixed:
    1) Data was not purged from the in-memory members SLRU before deleting
       segments. This happened to be hard or impossible to hit due to the
       interlock between checkpoints and truncation.
    2) find_multixact_start() relied on SimpleLruDoesPhysicalPageExist(), but
       that doesn't work for offsets that haven't yet been flushed to
       disk. Add code to flush the SLRUs to fix this. Not pretty, but it
       feels slightly safer to only make decisions based on actual on-disk
       state.
    3) find_multixact_start() could be called concurrently with a truncation
       and thus fail. Via SetOffsetVacuumLimit() that could lead to a round
       of emergency vacuuming. The problem remains in
       pg_get_multixact_members(), but that's quite harmless.
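
    The second bug can be illustrated with a toy model, under the
    assumption that an SLRU page can be dirty in memory without yet
    existing on disk. ToySlru and the functions below are hypothetical and
    only mimic the idea of flushing before a physical existence check.

```c
#include <assert.h>
#include <stdbool.h>

#define NPAGES 8

/* Toy SLRU: a page may be dirty in memory, written to "disk", or neither. */
typedef struct ToySlru
{
    bool in_memory_dirty[NPAGES];
    bool on_disk[NPAGES];
} ToySlru;

/* Analogue of a physical existence check: consults only on-disk state. */
static bool
page_exists_on_disk(const ToySlru *slru, int pageno)
{
    assert(pageno >= 0 && pageno < NPAGES);
    return slru->on_disk[pageno];
}

/* Write out every dirty buffer so that disk state reflects all pages. */
static void
flush_all(ToySlru *slru)
{
    for (int i = 0; i < NPAGES; i++)
    {
        if (slru->in_memory_dirty[i])
        {
            slru->on_disk[i] = true;
            slru->in_memory_dirty[i] = false;
        }
    }
}

/* Flush first, then decide based on actual on-disk state. */
static bool
page_exists_after_flush(ToySlru *slru, int pageno)
{
    flush_all(slru);
    return page_exists_on_disk(slru, pageno);
}
```

    Without the flush, a page that exists only in a dirty buffer is
    reported as missing; with it, the on-disk check gives the right answer.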
    
    For now this is going to only get applied to 9.5+, leaving the issues in
    the older branches in place. It is quite possible that we need to
    backpatch at a later point though.
    
    In case this gets backpatched, we need to handle an updated standby
    replaying WAL from a not-yet-upgraded primary. We have to recognize
    that situation and use "old style" truncation (i.e. looking at the
    SLRUs) during WAL replay. In contrast to before, this now happens in
    the startup process, when replaying a checkpoint record, instead of in
    the checkpointer. Doing truncation in the restartpoint is incorrect:
    restartpoints can happen much later than the original checkpoint,
    thereby leading to wraparound. To avoid "multixact_redo: unknown op
    code 48" errors, standbys would have to be upgraded before primaries.
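
    To make the upgrade-order requirement concrete, here is a toy redo
    dispatcher. It assumes the op code lives in the high nibble of the
    record's info byte and that the new truncation record uses 0x30
    (decimal 48, matching the error message above); the other constants and
    all function names are illustrative assumptions, not the server's code.

```c
#include <stdint.h>

/* Assumed op codes, stored in the high nibble of the record's info byte. */
#define OP_ZERO_OFF_PAGE 0x00
#define OP_ZERO_MEM_PAGE 0x10
#define OP_CREATE_ID     0x20
#define OP_TRUNCATE_ID   0x30   /* decimal 48, the new record type */

#define INFO_FLAG_MASK   0x0F   /* low bits are not part of the op code */

typedef enum { REDO_OK, REDO_UNKNOWN_OP } RedoResult;

/* Toy redo dispatcher; knows_truncate models an upgraded binary. */
static RedoResult
toy_multixact_redo(uint8_t info_byte, int knows_truncate)
{
    /* Strip the flag bits so only the op code remains. */
    uint8_t op = info_byte & (uint8_t) ~INFO_FLAG_MASK;

    switch (op)
    {
        case OP_ZERO_OFF_PAGE:
        case OP_ZERO_MEM_PAGE:
        case OP_CREATE_ID:
            return REDO_OK;
        case OP_TRUNCATE_ID:
            if (knows_truncate)
                return REDO_OK;
            return REDO_UNKNOWN_OP;   /* old binary: "unknown op code 48" */
        default:
            return REDO_UNKNOWN_OP;
    }
}
```

    A standby running the old binary falls into the unknown-op branch as
    soon as an upgraded primary emits a truncation record, which is why
    standbys would need to be upgraded first.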
    
    A later patch will bump the WAL page magic, and remove the legacy
    truncation codepaths. Legacy truncation support is just included to make
    a possible future backpatch easier.
    
    Discussion: 20150621192409.GA4797@alap3.anarazel.de
    Reviewed-By: Robert Haas, Alvaro Herrera, Thomas Munro
    Backpatch: 9.5 for now