• Tom Lane's avatar
    Fix longstanding crash-safety bug with newly-created-or-reset sequences. · af026b5d
    Tom Lane authored
    If a crash occurred immediately after the first nextval() call for a serial
    column, WAL replay would restore the sequence to a state in which it
    appeared that no nextval() had been done, thus allowing the first sequence
    value to be returned again by the next nextval() call; as reported in
    bug #6748 from Xiangming Mei.
    
    More generally, the problem would occur if an ALTER SEQUENCE was executed
    on a freshly created or reset sequence.  (The manifestation with serial
    columns was introduced in 8.2 when we added an ALTER SEQUENCE OWNED BY step
    to serial column creation.)  The cause is that sequence creation attempted
    to save one WAL entry by writing out a WAL record that made it appear that
    the first nextval() had already happened (viz, with is_called = true),
    while marking the sequence's in-database state with log_cnt = 1 to show
    that the first nextval() need not emit a WAL record.  However, ALTER
    SEQUENCE would emit a new WAL entry reflecting the actual in-database state
    (with is_called = false).  Then, nextval would allocate the first sequence
    value and set is_called = true, but it would trust the log_cnt value and
    not emit any WAL record.  A crash at this point would thus restore the
    sequence to its post-ALTER state, causing the next nextval() call to return
    the first sequence value again.
    
    To fix, get rid of the idea of logging an is_called status different from
    reality.  This means that the first nextval-driven WAL record will happen
    at the first nextval call not the second, but the marginal cost of that is
    pretty negligible.  In addition, make sure that ALTER SEQUENCE resets
    log_cnt to zero in any case where it touches sequence parameters that
    affect future nextval results.  This will result in some user-visible
    changes in the contents of a sequence's log_cnt column, as reflected in the
    patch's regression test changes; but no application should be depending on
    that anyway, since it was already true that log_cnt changes rather
    unpredictably depending on checkpoint timing.
    
    In addition, make some basically-cosmetic improvements to get rid of
    sequence.c's undesirable intimacy with page layout details.  It was always
    really trying to WAL-log the contents of the sequence tuple, so we should
    have it do that directly using a HeapTuple's t_data and t_len, rather than
    backing into it with some magic assumptions about where the tuple would be
    on the sequence's page.
    
    Back-patch to all supported branches.
    af026b5d
sequence_1.out 10.7 KB