    Fix race when updating a tuple concurrently locked by another process · 1a917ae8
    Alvaro Herrera authored
    If a tuple is locked, and this lock is later upgraded either to an
    update or to a stronger lock, and in the meantime some other process
    tries to lock, update or delete the same tuple, the tuple could end up
    being updated twice, or having conflicting locks held on it.
    
    The reason for this is that the second updater checks for a change in
    Xmax value, or in the HEAP_XMAX_IS_MULTI infomask bit, after noticing
    the first lock; and if there's a change, it restarts and re-evaluates
    its ability to update the tuple.  But it neglected to check for changes
    in lock strength or in lock-vs-update status when the Xmax value and the
    HEAP_XMAX_IS_MULTI bit stayed the same.  This would lead it to make the
    wrong decision and continue with its own update, when in reality it
    should have restarted from the top.
    
    This could lead to either an assertion failure much later (when a
    multixact containing multiple updates is detected), or duplicate copies
    of tuples.
    
    To fix, make sure to compare the other relevant infomask bits alongside
    the Xmax value and HEAP_XMAX_IS_MULTI bit, and restart from the top if
    necessary.
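
    As a rough illustration of the difference between the old and the new
    comparison, here is a minimal standalone sketch.  It is not the patch
    itself: the SketchXmaxState struct, the sketch_* function names and the
    SK_* bit values are all hypothetical and chosen only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative infomask bits; values are assumptions made for this sketch. */
#define SK_XMAX_KEYSHR_LOCK 0x0010
#define SK_XMAX_EXCL_LOCK   0x0040
#define SK_XMAX_LOCK_ONLY   0x0080	/* xmax is a locker, not an updater */
#define SK_XMAX_IS_MULTI    0x1000	/* xmax is a multixact */
#define SK_LOCK_MASK        (SK_XMAX_KEYSHR_LOCK | SK_XMAX_EXCL_LOCK)

/* Hypothetical snapshot of the fields examined before and after waiting. */
typedef struct SketchXmaxState
{
	uint32_t	xmax;
	uint16_t	infomask;
} SketchXmaxState;

/* Narrow check: only the Xmax value and the is-multi bit are compared. */
bool
sketch_narrow_check_changed(const SketchXmaxState *before,
							const SketchXmaxState *after)
{
	if (before->xmax != after->xmax)
		return true;
	return (before->infomask & SK_XMAX_IS_MULTI) !=
		(after->infomask & SK_XMAX_IS_MULTI);
}

/*
 * Wider check: additionally compare lock strength and lock-vs-update
 * status, so that an upgrade under the same Xmax forces a restart.
 */
bool
sketch_full_check_changed(const SketchXmaxState *before,
						  const SketchXmaxState *after)
{
	const uint16_t interesting =
		SK_XMAX_IS_MULTI | SK_XMAX_LOCK_ONLY | SK_LOCK_MASK;

	if (before->xmax != after->xmax)
		return true;
	return (before->infomask & interesting) !=
		(after->infomask & interesting);
}
```

    In this sketch, a key-share lock that becomes an exclusive lock or an
    update under the same Xmax is invisible to the narrow check but is
    caught by the wider one.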
    
    Also, in the belt-and-suspenders spirit, add a check to
    MultiXactCreateFromMembers that a multixact being created does not have
    two or more members that are claimed to be updates.  This should protect
    against other bugs that might cause similar bogus situations.
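
    The sanity check described above could look roughly like the following
    standalone sketch.  The SketchStatus enum, the SketchMember struct and
    the function name are hypothetical, and the assumption that all update
    statuses sort after all lock-only statuses is made only for this
    example; the real check lives inside MultiXactCreateFromMembers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical member statuses: lock-only modes first, then updates. */
typedef enum SketchStatus
{
	SK_FOR_KEY_SHARE,
	SK_FOR_SHARE,
	SK_FOR_NO_KEY_UPDATE,
	SK_FOR_UPDATE,
	SK_NO_KEY_UPDATE,			/* an actual update */
	SK_UPDATE					/* an actual update or delete */
} SketchStatus;

/* Hypothetical multixact member: a transaction id plus what it did. */
typedef struct SketchMember
{
	uint32_t	xid;
	SketchStatus status;
} SketchMember;

/* In this sketch, anything sorting after SK_FOR_UPDATE is an update. */
#define SK_IS_UPDATE(status) ((status) > SK_FOR_UPDATE)

/*
 * Return true if the member list is acceptable, that is, if at most one
 * member claims to be an update.  A caller would refuse to create the
 * multixact (for example by raising an error) when this returns false.
 */
bool
sketch_members_have_single_update(const SketchMember *members, int nmembers)
{
	bool		has_update = false;
	int			i;

	for (i = 0; i < nmembers; i++)
	{
		if (SK_IS_UPDATE(members[i].status))
		{
			if (has_update)
				return false;	/* second updater found: bogus */
			has_update = true;
		}
	}
	return true;
}
```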
    
    Backpatch to 9.3, where the possibility of multixacts containing updates
    was introduced.  (In prior versions it was possible to have the tuple
    lock upgraded from shared to exclusive, and an update would not restart
    from the top; yet we're protected against a bug there because there's
    always a sleep to wait for the locking transaction to complete before
    continuing to do anything.  Really, the fact that tuple locks always
    conflicted with concurrent updates is what protected against bugs here.)
    
    Per report from Andrew Dunstan and Josh Berkus in thread at
    http://www.postgresql.org/message-id/534C8B33.9050807@pgexperts.com
    
    Bug analysis by Andres Freund.