    Fix race in TAP test 002_archiving.pl when restoring history file · 8bcf90c7
    Michael Paquier authored
    This test, introduced in df86e52c, uses a second standby to check that
    RECOVERYHISTORY and RECOVERYXLOG are correctly removed at the
    end of recovery.  This standby uses the archives of the primary to
    restore its contents, with some of the archive's contents coming from
    the first standby previously promoted.  In slow environments, it was
    possible that the test did not check what it should, as the history file
    generated by the promotion of the first standby may not be stored yet on
    the archives the second standby feeds on.  So, it could be possible that
    the second standby selects an incorrect timeline, without restoring a
    history file at all.
    
    This commit adds a wait phase to make sure that the history file
    required by the second standby is archived before this cluster is
    created.  This relies on poll_query_until() with pg_stat_file() and an
    absolute path, something not supported in REL_10_STABLE.
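
    The wait phase described above can be sketched as follows.  This is
    not the Perl code of the test: it is a minimal Python illustration of
    the same idea, polling until a file exists at an absolute path before
    the next step is allowed to run.  The helper names, the timeout values
    and the file name used in the usage note are assumptions made for the
    example.

```python
import os
import time


def poll_until(check, timeout_s=180.0, interval_s=0.1):
    """Call check() repeatedly until it returns True or timeout_s elapses.

    Returns True on success and False on timeout.  The default timeout and
    interval are illustrative values, not taken from the test suite.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def wait_for_archived_file(archive_dir, file_name, timeout_s=180.0, interval_s=0.1):
    """Wait for file_name to show up in archive_dir (hypothetical helper).

    The path is made absolute first, mirroring the absolute path that the
    commit message says is passed to pg_stat_file().
    """
    path = os.path.join(os.path.abspath(archive_dir), file_name)
    return poll_until(lambda: os.path.isfile(path), timeout_s, interval_s)
```

    A caller would create the second standby only once
    wait_for_archived_file(archive_dir, "00000002.history") has returned
    True, and fail the test otherwise; the history file name here is an
    illustrative assumption.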
    
    While at it, this adds a new test to check that the history file has
    been restored by looking at the logs of the second standby.  This
    ensures that a RECOVERYHISTORY, whose removal needs to be checked,
    is created in the first place.  This should make the test more robust.
    
    This test was introduced by df86e52c, but the problem came to light as
    a side effect of the bug fixed by acf1dd42, where the extra restore_command
    calls made the test much slower.
    
    Reported-by: Andres Freund
    Discussion: https://postgr.es/m/YlT23IvsXkGuLzFi@paquier.xyz
    Backpatch-through: 11
002_archiving.pl 3.59 KB