1. 16 Jul, 2021 1 commit
  2. 30 Jun, 2021 1 commit
  3. 16 Jun, 2021 1 commit
  4. 09 Jun, 2021 1 commit
      Fix corner case failure of new standby to follow new primary. · caba8f0d
      Robert Haas authored
      This only happens if (1) the new standby has no WAL available locally,
      (2) the new standby is starting from the old timeline, (3) the promotion
      happened in the WAL segment from which the new standby is starting,
      (4) the timeline history file for the new timeline is available from
      the archive but the WAL files for it are not (i.e. this is a race),
      (5) the WAL files for the new timeline are available via streaming,
      and (6) recovery_target_timeline='latest'.
      
      Commit ee994272 introduced this
      logic and was an improvement over the previous code, but it mishandled
      this case. If recovery_target_timeline='latest' and restore_command is
      set, validateRecoveryParameters() can change recoveryTargetTLI to be
      different from receiveTLI. If streaming is then tried afterward,
      expectedTLEs gets initialized with the history of the wrong timeline.
      It's supposed to be a list of entries explaining how to get to the
      target timeline, but in this case it ends up with a list of entries
      explaining how to get to the new standby's original timeline, which
      isn't right.
      
      Dilip Kumar and Robert Haas, reviewed by Kyotaro Horiguchi.
      
      Discussion: http://postgr.es/m/CAFiTN-sE-jr=LB8jQuxeqikd-Ux+jHiXyh4YDiZMPedgQKup0g@mail.gmail.com
  5. 19 May, 2021 1 commit
  6. 12 May, 2021 1 commit
      Initial pgindent and pgperltidy run for v14. · def5b065
      Tom Lane authored
      Also "make reformat-dat-files".
      
      The only change worthy of note is that pgindent messed up the formatting
      of launcher.c's struct LogicalRepWorkerId, which led me to notice that
      that struct wasn't used at all anymore, so I just took it out.
  7. 10 May, 2021 1 commit
  8. 08 Apr, 2021 3 commits
  9. 06 Apr, 2021 1 commit
      Stop archive recovery if WAL generated with wal_level=minimal is found. · 9de9294b
      Fujii Masao authored
      Previously if hot standby was enabled, archive recovery exited with
      an error when it found WAL generated with wal_level=minimal.
      But if hot standby was disabled, it just reported a warning and
      continued in that case, which could lead to data loss or errors
      during normal operation. A warning was emitted, but users could
      easily miss that and not notice this serious situation until
      they encountered the actual errors.
      
      To improve this situation, this commit changes archive recovery
      so that it exits with FATAL error when it finds WAL generated with
      wal_level=minimal whatever the setting of hot standby. This enables
      users to notice the serious situation soon.
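
The behavior change described above can be summarized as a small decision table. The sketch below is illustrative only: `recovery_action` and its parameters are hypothetical names, not PostgreSQL functions, and the returned strings merely label the outcomes this message describes.

```python
def recovery_action(hot_standby_enabled: bool, with_this_commit: bool) -> str:
    """Outcome when archive recovery finds WAL generated with wal_level=minimal.

    Hypothetical helper for illustration; it is not part of PostgreSQL.
    """
    if with_this_commit:
        # New behavior: always stop, whatever the hot standby setting.
        return "FATAL"
    # Old behavior: recovery stopped with an error only when hot standby was
    # enabled; otherwise it warned and kept going.
    return "ERROR" if hot_standby_enabled else "WARNING"


for hot_standby in (True, False):
    print(hot_standby,
          recovery_action(hot_standby, with_this_commit=False),
          recovery_action(hot_standby, with_this_commit=True))
```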
      
      The FATAL error is thrown if archive recovery starts from a base
      backup taken before wal_level is changed to minimal. When archive
      recovery exits with the error, if users have a base backup taken
      after setting wal_level to higher than minimal, they can recover
      the database by starting archive recovery from that newer backup.
      But note that if no such backup exists, there is no easy way to
      complete archive recovery, which may make the database server
      unstartable, and users may lose the whole database. The commit adds
      a note about this risk to the documentation.
      
      Previously, even with an unstartable database server, users could
      avoid the error during archive recovery just by disabling hot standby,
      forcibly start up the server, and salvage data from it.
      But note that this commit makes that procedure unavailable.
      
      Author: Takamichi Osumi
      Reviewed-by: Laurenz Albe, Kyotaro Horiguchi, David Steele, Fujii Masao
      Discussion: https://postgr.es/m/OSBPR01MB4888CBE1DA08818FD2D90ED8EDF90@OSBPR01MB4888.jpnprd01.prod.outlook.com
  10. 17 Mar, 2021 2 commits
      Code review for server's handling of "tablespace map" files. · 8620a7f6
      Tom Lane authored
      While looking at Robert Foggia's report, I noticed a passel of
      other issues in the same area:
      
      * The scheme for backslash-quoting newlines in pathnames is just
      wrong; it will misbehave if the last ordinary character in a pathname
      is a backslash.  I'm not sure why we're bothering to allow newlines
      in tablespace paths, but if we're going to do it we should do it
      without introducing other problems.  Hence, backslashes themselves
      have to be backslashed too.
      
      * The author hadn't read the sscanf man page very carefully, because
      this code would drop any leading whitespace from the path.  (I doubt
      that a tablespace path with leading whitespace could happen in
      practice; but if we're bothering to allow newlines in the path, it
      sure seems like leading whitespace is little less implausible.)  Using
      sscanf for the task of finding the first space is overkill anyway.
      
      * While I'm not 100% sure what the rationale for escaping both \r and
      \n is, if the idea is to allow Windows newlines in the file then this
      code failed, because it'd throw an error if it saw \r followed by \n.
      
      * There's no cross-check for an incomplete final line in the map file,
      which would be a likely apparent symptom of the improper-escaping
      bug.
      
      On the generation end, aside from the escaping issue we have:
      
      * If needtblspcmapfile is true then do_pg_start_backup will pass back
      escaped strings in tablespaceinfo->path values, which no caller wants
      or is prepared to deal with.  I'm not sure if there's a live bug from
      that, but it looks like there might be (given the dubious assumption
      that anyone actually has newlines in their tablespace paths).
      
      * It's not being very paranoid about the possibility of random stuff
      in the pg_tblspc directory.  IMO we should ignore anything without an
      OID-like name.
      
      The escaping rule change doesn't seem back-patchable: it'll require
      doubling of backslashes in the tablespace_map file, which is basically
      a basebackup format change.  The odds of that causing trouble are
      considerably more than the odds of the existing bug causing trouble.
      The rest of this seems somewhat unlikely to cause problems too,
      so no back-patch.
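
The escaping rule and the parsing concerns listed above can be sketched as follows. This is a minimal illustration in Python, not the server's C code: the function names are hypothetical, and the "OID, one space, escaped path, newline" line layout is an assumption based on the description in this message.

```python
from typing import Tuple


def escape_path(path: str) -> str:
    """Backslash-escape backslash, newline, and carriage return."""
    out = []
    for ch in path:
        if ch in ("\\", "\n", "\r"):
            out.append("\\")  # backslashes themselves are escaped too
        out.append(ch)
    return "".join(out)


def unescape_path(escaped: str) -> str:
    """Inverse of escape_path(); rejects a dangling trailing backslash."""
    out = []
    i = 0
    while i < len(escaped):
        ch = escaped[i]
        if ch == "\\":
            if i + 1 >= len(escaped):
                raise ValueError("dangling backslash at end of path")
            i += 1
            ch = escaped[i]  # take the escaped character literally
        out.append(ch)
        i += 1
    return "".join(out)


def format_map_line(oid: int, path: str) -> str:
    return f"{oid} {escape_path(path)}\n"


def parse_map_line(line: str) -> Tuple[int, str]:
    """Parse one complete logical record (it may contain escaped newlines)."""
    if not line.endswith("\n"):
        raise ValueError("incomplete final line")
    # Split at the first space only, so leading whitespace in the path survives.
    oid_text, sep, escaped = line[:-1].partition(" ")
    if not sep or not (oid_text.isascii() and oid_text.isdigit()):
        raise ValueError("malformed line")
    return int(oid_text), unescape_path(escaped)


tricky = " /mnt/t\nbs\\"  # leading space, embedded newline, trailing backslash
print(parse_map_line(format_map_line(16384, tricky)) == (16384, tricky))
```

Because every backslash is doubled, a path whose last character is a backslash can no longer be confused with an escaped line terminator, which is the failure mode the first bullet describes.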
      Prevent buffer overrun in read_tablespace_map(). · a50e4fd0
      Tom Lane authored
      Robert Foggia of Trustwave reported that read_tablespace_map()
      fails to prevent an overrun of its on-stack input buffer.
      Since the tablespace map file is presumed trustworthy, this does
      not seem like an interesting security vulnerability, but still
      we should fix it just in the name of robustness.
      
      While here, document that pg_basebackup's --tablespace-mapping option
      doesn't work with tar-format output, because it doesn't.  To make it
      work, we'd have to modify the tablespace_map file within the tarball
      sent by the server, which might be possible but I'm not volunteering.
      (Less-painful solutions would require changing the basebackup protocol
      so that the source server could adjust the map.  That's not very
      appetizing either.)
  11. 12 Mar, 2021 1 commit
  12. 11 Mar, 2021 1 commit
      Be clear about whether a recovery pause has taken effect. · 32fd2b57
      Robert Haas authored
      Previously, the code and documentation seem to have essentially
      assumed that a call to pg_wal_replay_pause() would take effect
      immediately, but that's not the case, because we only check for a
      pause in certain places. This means that a tool that uses this
      function and then wants to do something else afterward that is
      dependent on the pause having taken effect doesn't know how long it
      needs to wait to be sure that no more WAL is going to be replayed.
      
      To avoid that, add a new function pg_get_wal_replay_pause_state()
      which returns either 'not paused', 'pause requested', or 'paused'.
      After calling pg_wal_replay_pause() the status will immediately change
      from 'not paused' to 'pause requested'; when the startup process
      has noticed this, the status will change to 'paused'.  For backward
      compatibility, pg_is_wal_replay_paused() still exists and returns
      the same thing as before: true if a pause has been requested,
      whether or not it has taken effect yet; and false if not.
      The documentation is updated to clarify.
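
The three-state behavior described above can be modeled as a tiny state machine. The sketch below is a toy model, not the server implementation: the class and method names are hypothetical, and only the three state strings and the meaning of the backward-compatible boolean come from this message.

```python
from enum import Enum


class PauseState(Enum):
    NOT_PAUSED = "not paused"
    PAUSE_REQUESTED = "pause requested"
    PAUSED = "paused"


class ReplayControl:
    """Toy model of the recovery pause state (hypothetical class)."""

    def __init__(self) -> None:
        self.state = PauseState.NOT_PAUSED

    def request_pause(self) -> None:
        # Models pg_wal_replay_pause(): the request is recorded at once,
        # but replay has not necessarily stopped yet.
        if self.state is PauseState.NOT_PAUSED:
            self.state = PauseState.PAUSE_REQUESTED

    def startup_process_notices(self) -> None:
        # Models the startup process reaching one of its pause checks.
        if self.state is PauseState.PAUSE_REQUESTED:
            self.state = PauseState.PAUSED

    def pause_state(self) -> str:
        # Models pg_get_wal_replay_pause_state().
        return self.state.value

    def is_paused(self) -> bool:
        # Models pg_is_wal_replay_paused(): true once a pause is requested,
        # whether or not it has taken effect.
        return self.state is not PauseState.NOT_PAUSED


ctl = ReplayControl()
ctl.request_pause()
print(ctl.pause_state(), ctl.is_paused())
ctl.startup_process_notices()
print(ctl.pause_state(), ctl.is_paused())
```

A tool that must be sure no more WAL will be replayed would wait for the 'paused' state rather than rely on the boolean.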
      
      To improve the chances that a pause request is quickly confirmed
      effective, adjust things so that WaitForWALToBecomeAvailable will
      swiftly reach a call to recoveryPausesHere() when a pause request
      is made.
      
      Dilip Kumar, reviewed by Simon Riggs, Kyotaro Horiguchi, Yugo Nagata,
      Masahiko Sawada, and Bharath Rupireddy.
      
      Discussion: http://postgr.es/m/CAFiTN-vcLLWEm8Zr%3DYK83rgYrT9pbC8VJCfa1kY9vL3AUPfu6g%40mail.gmail.com
  13. 09 Mar, 2021 1 commit
      Track total amounts of times spent writing and syncing WAL data to disk. · ff99918c
      Fujii Masao authored
      This commit adds new GUC track_wal_io_timing. When this is enabled,
      the total amounts of time XLogWrite writes and issue_xlog_fsync syncs
      WAL data to disk are counted in pg_stat_wal. This information would be
      useful to check how much WAL write and sync affect the performance.
      
      Enabling track_wal_io_timing will make the server query the operating
      system for the current time every time WAL is written or synced,
      which may cause significant overhead on some platforms. To avoid such
      additional overhead in the server with track_io_timing enabled,
      this commit introduces track_wal_io_timing as a separate parameter from
      track_io_timing.
      
      Note that WAL write and sync activity by walreceiver has not been tracked yet.
      
      This commit makes the server also track the numbers of times XLogWrite
      writes and issue_xlog_fsync syncs WAL data to disk, in pg_stat_wal,
      regardless of the setting of track_wal_io_timing. These counters can be
      used to calculate the WAL write and sync time per request, for example.
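
As a rough illustration of that per-request calculation, the sketch below divides a total time by the matching request count. The helper name is hypothetical and the sample values are made up. The column names `wal_write`, `wal_sync`, `wal_write_time`, and `wal_sync_time` are not given in this message; they are assumed from the PostgreSQL 14 documentation for pg_stat_wal, which reports the times in milliseconds.

```python
from typing import Optional


def avg_ms_per_request(total_time_ms: float, request_count: int) -> Optional[float]:
    """Average time per request, or None when nothing has been counted yet."""
    if request_count <= 0:
        return None
    return total_time_ms / request_count


# Made-up snapshot of one pg_stat_wal row (column names are an assumption).
row = {
    "wal_write": 200,        # number of WAL writes
    "wal_sync": 50,          # number of WAL syncs
    "wal_write_time": 50.0,  # total write time, ms
    "wal_sync_time": 125.0,  # total sync time, ms
}

write_avg = avg_ms_per_request(row["wal_write_time"], row["wal_write"])
sync_avg = avg_ms_per_request(row["wal_sync_time"], row["wal_sync"])
print(write_avg, sync_avg)
```

The time totals stay at zero unless track_wal_io_timing is enabled, while the counts are tracked regardless, so a zero average may only mean that timing was off.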
      
      Bump PGSTAT_FILE_FORMAT_ID.
      
      Bump catalog version.
      
      Author: Masahiro Ikeda
      Reviewed-By: Japin Li, Hayato Kuroda, Masahiko Sawada, David Johnston, Fujii Masao
      Discussion: https://postgr.es/m/0509ad67b585a5b86a83d445dfa75392@oss.nttdata.com
  14. 23 Feb, 2021 1 commit
  15. 17 Feb, 2021 1 commit
  16. 06 Feb, 2021 1 commit
  17. 29 Jan, 2021 1 commit
  18. 27 Jan, 2021 1 commit
  19. 25 Jan, 2021 1 commit
      Remove CheckpointLock. · d18e7566
      Robert Haas authored
      Up until now, we've held this lock when performing a checkpoint or
      restartpoint, but commit 076a055a back
      in 2004 and commit 7e48b77b from 2009,
      taken together, have removed all need for this. In the present code,
      there's only ever one process entitled to attempt a checkpoint: either
      the checkpointer, during normal operation, or the postmaster, during
      single-user operation. So, we don't need the lock.
      
      One possible concern in making this change is that it means that
      a substantial amount of code where HOLD_INTERRUPTS() was previously
      in effect due to the preceding LWLockAcquire() will now be
      running without that. This could mean that ProcessInterrupts()
      gets called in places from which it didn't before. However, this
      seems unlikely to do very much, because the checkpointer doesn't
      have any signal mapped to die(), so it's not clear how,
      for example, ProcDiePending = true could happen in the first
      place. Similarly with ClientConnectionLost and recovery conflicts.
      
      Also, if there are any such problems, we might want to fix them
      rather than reverting this, since running lots of code with
      interrupt handling suspended is generally bad.
      
      Patch by me, per an inquiry by Amul Sul. Review by Tom Lane
      and Michael Paquier.
      
      Discussion: http://postgr.es/m/CAAJ_b97XnBBfYeSREDJorFsyoD1sHgqnNuCi=02mNQBUMnA=FA@mail.gmail.com
  20. 18 Jan, 2021 1 commit
  21. 15 Jan, 2021 1 commit
  22. 11 Jan, 2021 1 commit
  23. 02 Jan, 2021 1 commit
  24. 28 Dec, 2020 1 commit
  25. 25 Dec, 2020 1 commit
  26. 24 Dec, 2020 1 commit
  27. 17 Dec, 2020 1 commit
  28. 14 Dec, 2020 1 commit
      Add some checkpoint/restartpoint status to ps display · df9274ad
      Michael Paquier authored
      This is done for end-of-recovery and shutdown checkpoints/restartpoints
      (end-of-recovery restartpoints don't exist) rather than all types of
      checkpoints, in cases where it may not be possible to rely on
      pg_stat_activity to get a status from the startup or checkpointer
      processes.
      
      For example, at the end of a crash recovery, this is useful to know if a
      checkpoint is running in the startup process, whereas previously the ps
      display would only show some information about "recovering" something,
      which can be confusing while a checkpoint runs.
      
      Author: Justin Pryzby
      Reviewed-by: Nathan Bossart, Kirk Jamison, Fujii Masao, Michael Paquier
      Discussion: https://postgr.es/m/20200818225238.GP17022@telsasoft.com
  29. 04 Dec, 2020 1 commit
  30. 20 Nov, 2020 1 commit
  31. 16 Nov, 2020 1 commit
      Make the standby server promptly handle interrupt signals. · 2945a488
      Fujii Masao authored
      This commit changes the startup process in the standby server so that
      it handles the interrupt signals after waiting for wal_retrieve_retry_interval
      on the latch and resetting it, before entering another wait on the latch.
      This change causes the standby server to promptly handle interrupt signals.
      
      Previously, there was a case where the standby needed to
      wait an extra five seconds to shut down when the shutdown request arrived
      while the startup process was waiting for wal_retrieve_retry_interval
      on the latch.
      
      Author: Fujii Masao, but implementation idea is from Soumyadeep Chakraborty
      Reviewed-by: Soumyadeep Chakraborty
      Discussion: https://postgr.es/m/9d7e6ab0-8a53-ddb9-63cd-289bcb25fe0e@oss.nttdata.com
  32. 11 Nov, 2020 1 commit
      Fix and simplify some usages of TimestampDifference(). · ec29427c
      Tom Lane authored
      Introduce TimestampDifferenceMilliseconds() to simplify callers
      that would rather have the difference in milliseconds, instead of
      the select()-oriented seconds-and-microseconds format.  This gets
      rid of at least one integer division per call, and it eliminates
      some apparently-easy-to-mess-up arithmetic.
      
      Two of these call sites were in fact wrong:
      
      * pg_prewarm's autoprewarm_main() forgot to multiply the seconds
      by 1000, thus ending up with a delay 1000X shorter than intended.
      That doesn't quite make it a busy-wait, but close.
      
      * postgres_fdw's pgfdw_get_cleanup_result() thought it needed to compute
      microseconds not milliseconds, thus ending up with a delay 1000X longer
      than intended.  Somebody along the way had noticed this problem but
      misdiagnosed the cause, and imposed an ad-hoc 60-second limit rather
      than fixing the units.  This was relatively harmless in context, because
      we don't care that much about exactly how long this delay is; still,
      it's wrong.
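
The two unit mistakes above are easy to reproduce in a few lines. The sketch below is a Python model, not the C source: timestamps are plain integers in microseconds, the function names only mirror the ones mentioned in this message, and rounding a fractional millisecond up is an assumption of the sketch.

```python
USECS_PER_SEC = 1_000_000


def timestamp_difference(start_us: int, stop_us: int):
    """Seconds-and-microseconds form, as wanted by select()-style callers."""
    diff = stop_us - start_us
    if diff <= 0:
        return (0, 0)
    return divmod(diff, USECS_PER_SEC)


def timestamp_difference_milliseconds(start_us: int, stop_us: int) -> int:
    """Millisecond form; a fractional millisecond is rounded up (assumption)."""
    diff = stop_us - start_us
    if diff <= 0:
        return 0
    return (diff + 999) // 1000


secs, usecs = timestamp_difference(0, 3 * USECS_PER_SEC)  # a 3-second delay

correct_ms = secs * 1000 + usecs // 1000       # intended delay in ms
forgot_to_scale = secs + usecs // 1000         # seconds left unscaled
wrong_units = secs * USECS_PER_SEC + usecs     # microseconds used as ms

print(correct_ms, forgot_to_scale, wrong_units)
print(timestamp_difference_milliseconds(0, 3 * USECS_PER_SEC))
```

With a single helper that returns milliseconds directly, callers no longer repeat the hand-written conversion that went wrong in both places.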
      
      There are a few more callers of TimestampDifference() that don't
      have a direct need for seconds-and-microseconds, but can't use
      TimestampDifferenceMilliseconds() either because they do need
      microsecond precision or because they might possibly deal with
      intervals long enough to overflow 32-bit milliseconds.  It might be
      worth inventing another API to improve that, but that seems outside
      the scope of this patch; so those callers are untouched here.
      
      Given the fact that we are fixing some bugs, and the likelihood
      that future patches might want to back-patch code that uses this
      new API, back-patch to all supported branches.
      
      Alexey Kondratov and Tom Lane
      
      Discussion: https://postgr.es/m/3b1c053a21c07c1ed5e00be3b2b855ef@postgrespro.ru
  33. 04 Nov, 2020 2 commits
  34. 06 Oct, 2020 1 commit
  35. 02 Oct, 2020 2 commits
      Add pg_stat_wal statistics view. · 8d9a9359
      Fujii Masao authored
      This view shows the statistics about WAL activity. Currently it has only
      two columns: wal_buffers_full and stats_reset. wal_buffers_full column
      indicates the number of times WAL data was written to the disk because
      WAL buffers got full. This information is useful when tuning wal_buffers.
      stats_reset column indicates the time at which these statistics were
      last reset.
      
      pg_stat_wal view is also the basic infrastructure to expose other
      various statistics about WAL activity later.
      
      Bump PGSTAT_FILE_FORMAT_ID due to the change in pgstat format.
      
      Bump catalog version.
      
      Author: Masahiro Ikeda
      Reviewed-by: Takayuki Tsunakawa, Kyotaro Horiguchi, Amit Kapila, Fujii Masao
      Discussion: https://postgr.es/m/188bd3f2d2233cf97753b5ced02bb050@oss.nttdata.com
      Add block information in error context of WAL REDO apply loop · 9d0bd95f
      Michael Paquier authored
      Providing this information can be useful for example when diagnosing
      problems related to recovery conflicts or for recovery issues without
      having to go through the output generated by pg_waldump to get some
      information about the blocks a WAL record works on.
      
      The block information is printed in the same format as pg_waldump.  This
      already existed in xlog.c for debugging purposes with -DWAL_DEBUG, so
      adding the block information in the callback has required just a small
      refactoring.
      
      Author: Bertrand Drouvot
      Reviewed-by: Michael Paquier, Masahiko Sawada
      Discussion: https://postgr.es/m/c31e2cba-efda-762c-f4ad-5c25e5dac3d0@amazon.com