1. 28 Jan, 2022 1 commit
    • Fujii Masao's avatar
      Prevent memory context logging from sending log message to connected client. · 6e7ee55e
      Fujii Masao authored
      When pg_log_backend_memory_contexts() is executed, the target backend
      should use LOG_SERVER_ONLY to log its memory contexts, to prevent them
      from being sent to its connected client regardless of client_min_messages.
      But previously the backend unexpectedly used LOG to log the message
      "logging memory contexts of PID %d" and it could be sent to the client.
      This is a bug in memory context logging.
      
      To fix the bug, this commit changes that message so that it's logged with
      LOG_SERVER_ONLY.
      
      Back-patch to v14 where pg_log_backend_memory_contexts() was added.
      
      Author: Fujii Masao
      Reviewed-by: Bharath Rupireddy, Atsushi Torikoshi
      Discussion: https://postgr.es/m/82c12f36-86f7-5e72-79af-7f5c37f6cad7@oss.nttdata.com
      6e7ee55e
  2. 27 Jan, 2022 4 commits
    • Tomas Vondra's avatar
      Fix ordering of XIDs in ProcArrayApplyRecoveryInfo · fb2f8e53
      Tomas Vondra authored
      Commit 8431e296 reworked ProcArrayApplyRecoveryInfo to sort XIDs
      before adding them to KnownAssignedXids. But the XIDs are sorted using
      xidComparator, which compares the XIDs simply as uint32 values, not
      logically. KnownAssignedXidsAdd() however expects XIDs in logical order,
      and calls TransactionIdFollowsOrEquals() to enforce that. If there are
      XIDs for which the two orderings disagree, an error is raised and the
      recovery fails/restarts.
      
      Hitting this issue is fairly easy - you just need two transactions, one
      started before the 4B limit (e.g. XID 4294967290), the other sometime
      after it (e.g. XID 1000). Logically (4294967290 <= 1000) but when
      compared using xidComparator we try to add them in the opposite order.
      Which makes KnownAssignedXidsAdd() fail with an error like this:
      
        ERROR: out-of-order XID insertion in KnownAssignedXids
      
      This only happens during replica startup, while processing RUNNING_XACTS
      records to build the snapshot. Once we reach STANDBY_SNAPSHOT_READY, we
      skip these records. So this does not affect already running replicas,
      but if you restart (or create) a replica while there are transactions
      with XIDs for which the two orderings disagree, you may hit this.
      
      Long-running transactions and frequent replica restarts increase the
      likelihood of hitting this issue. Once the replica gets into this state,
      it can't be started (even if the old transactions are terminated).
      
      Fixed by sorting the XIDs logically - this is fine because we're dealing
      with normal XIDs (because it's XIDs assigned to backends) and from the
      same wraparound epoch (otherwise the backends could not be running at
      the same time on the primary node). So there are no problems with the
      triangle inequality, which is why xidComparator compares raw values.
      
      Investigation and root cause analysis by Abhijit Menon-Sen. Patch by me.
      
      This issue is present in all releases since 9.4, however releases up to
      9.6 are EOL already so backpatch to 10 only.
      
      Reviewed-by: Abhijit Menon-Sen
      Reviewed-by: Alvaro Herrera
      Backpatch-through: 10
      Discussion: https://postgr.es/m/36b8a501-5d73-277c-4972-f58a4dce088a%40enterprisedb.com
      fb2f8e53
    • Andrew Dunstan's avatar
      Improve msys2 detection for TAP tests · 999dc1d2
      Andrew Dunstan authored
      Perl instances on some msys toolchains (e.g. UCRT64) have their
      configured osname set to 'MSWin32' rather than 'msys'.  The test for
      the msys2 platform is adjusted accordingly.
      
      Backpatch to release 14.
      999dc1d2
    • Etsuro Fujita's avatar
      postgres_fdw: Fix handling of a pending asynchronous request in postgresReScanForeignScan(). · d1cca944
      Etsuro Fujita authored
      Commit 27e1f145 failed to process a pending asynchronous request made
      for a given ForeignScan node in postgresReScanForeignScan() (if any) in
      cases where we would only reset the next_tuple counter in that function,
      contradicting the assumption that there should be no pending
      asynchronous requests that have been made for async-capable subplans for
      the parent Append node after ReScan.  This led to an assert failure in
      an assert-enabled build.  I think this would also lead to mis-rewinding
      the cursor in that function in the case where we have already fetched
      one batch for the ForeignScan node and the asynchronous request has been
      made for the second batch, because even in that case we would just reset
      the counter when called from that function, so we would fail to execute
      MOVE BACKWARD ALL.
      
      To fix, modify that function to process the asynchronous request before
      restarting the scan.
      
      While at it, add a comment to a function to match other places.
      
      Per bug #17344 from Alexander Lakhin.  Back-patch to v14 where the
      aforesaid commit came in.
      
      Patch by me.  Test case by Alexander Lakhin, adjusted by me.  Reviewed
      and tested by Alexander Lakhin and Dmitry Dolgov.
      
      Discussion: https://postgr.es/m/17344-226b78b00de73a7e@postgresql.org
      d1cca944
    • Noah Misch's avatar
      On sparc64+ext4, suppress test failures from known WAL read failure. · d94a95cc
      Noah Misch authored
      Buildfarm members kittiwake, tadarida and snapper began to fail
      frequently when commits 3cd9c3b921977272e6650a5efbeade4203c4bca2 and
      f47ed79cc8a0cfa154dc7f01faaf59822552363f added tests of concurrency, but
      the problem was reachable before those commits.  Back-patch to v10 (all
      supported versions).
      
      Discussion: https://postgr.es/m/20220116210241.GC756210@rfd.leadboat.com
      d94a95cc
  3. 26 Jan, 2022 1 commit
    • Magnus Hagander's avatar
      Fix pg_hba_file_rules for authentication method cert · 4afae689
      Magnus Hagander authored
      For authentication method cert, clientcert=verify-full is implied. But
      the pg_hba_file_rules entry would incorrectly show clientcert=verify-ca.
      
      Per bug #17354
      
      Reported-By: Feike Steenbergen
      Reviewed-By: Jonathan Katz
      Backpatch-through: 12
      4afae689
  4. 25 Jan, 2022 3 commits
  5. 24 Jan, 2022 2 commits
    • Tom Lane's avatar
      Fix limitations on what SQL commands can be issued to a walsender. · 1efcc594
      Tom Lane authored
      In logical replication mode, a WalSender is supposed to be able
      to execute any regular SQL command, as well as the special
      replication commands.  Poor design of the replication-command
      parser caused it to fail in various cases, notably:
      
      * semicolons embedded in a command, or multiple SQL commands
      sent in a single message;
      
      * dollar-quoted literals containing odd numbers of single
      or double quote marks;
      
      * commands starting with a comment.
      
      The basic problem here is that we're trying to run repl_scanner.l
      across the entire input string even when it's not a replication
      command.  Since repl_scanner.l does not understand all of the
      token types known to the core lexer, this is doomed to have
      failure modes.
      
      We certainly don't want to make repl_scanner.l as big as scan.l,
      so instead rejigger stuff so that we only lex the first token of
      a non-replication command.  That will usually look like an IDENT
      to repl_scanner.l, though a comment would end up getting reported
      as a '-' or '/' single-character token.  If the token is a replication
      command keyword, we push it back and proceed normally with repl_gram.y
      parsing.  Otherwise, we can drop out of exec_replication_command()
      without examining the rest of the string.
      
      (It's still theoretically possible for repl_scanner.l to fail on
      the first token; but that could only happen if it's an unterminated
      single- or double-quoted string, in which case you'd have gotten
      largely the same error from the core lexer too.)
      
      In this way, repl_gram.y isn't involved at all in handling general
      SQL commands, so we can get rid of the SQLCmd node type.  (In
      the back branches, we can't remove it because renumbering enum
      NodeTag would be an ABI break; so just leave it sit there unused.)
      
      I failed to resist the temptation to clean up some other sloppy
      coding in repl_scanner.l while at it.  The only externally-visible
      behavior change from that is it now accepts \r and \f as whitespace,
      same as the core lexer.
      
      Per bug #17379 from Greg Rychlewski.  Back-patch to all supported
      branches.
      
      Discussion: https://postgr.es/m/17379-6a5c6cfb3f1f5e77@postgresql.org
      1efcc594
    • Tom Lane's avatar
      Remember to reset yy_start state when firing up repl_scanner.l. · ef9706bb
      Tom Lane authored
      Without this, we get odd behavior when the previous cycle of
      lexing exited in a non-default exclusive state.  Every other
      copy of this code is aware that it has to do BEGIN(INITIAL),
      but repl_scanner.l did not get that memo.
      
      The real-world impact of this is probably limited, since most
      replication clients would abandon their connection after getting
      a syntax error.  Still, it's a bug.
      
      This mistake is old, so back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/1874781.1643035952@sss.pgh.pa.us
      ef9706bb
  6. 23 Jan, 2022 4 commits
  7. 22 Jan, 2022 1 commit
  8. 21 Jan, 2022 4 commits
  9. 20 Jan, 2022 4 commits
  10. 19 Jan, 2022 2 commits
  11. 18 Jan, 2022 2 commits
    • Thomas Munro's avatar
      Try to stabilize the reloptions test. · ff4d0d8c
      Thomas Munro authored
      Where we test vacuum_truncate's effects, sometimes this is failing to
      truncate as expected on the build farm.  That could be explained by page
      skipping, so disable it explicitly, with the theory that commit fe246d1c
      didn't go far enough.
      
      Back-patch to 12, where the vacuum_truncate tests were added.
      
      Discussion: https://postgr.es/m/CA%2BhUKGLT2UL5_JhmBzUgkdyKfc%3D5J-gJSQJLysMs4rqLUKLAzw%40mail.gmail.com
      ff4d0d8c
    • Tom Lane's avatar
      Fix psql \d's query for identifying parent triggers. · 3886785b
      Tom Lane authored
      The original coding (from c33869cc) failed with "more than one row
      returned by a subquery used as an expression" if there were unrelated
      triggers of the same tgname on parent partitioned tables.  (That's
      possible because statement-level triggers don't get inherited.)  Fix
      by applying LIMIT 1 after sorting the candidates by inheritance level.
      
      Also, wrap the subquery in a CASE so that we don't have to execute it at
      all when the trigger is visibly non-inherited.  Aside from saving some
      cycles, this avoids the need for a confusing and undocumented NULLIF().
      
      While here, tweak the format of the emitted query to look a bit
      nicer for "psql -E", and add some explanation of this subquery,
      because it badly needs it.
      
      Report and patch by Justin Pryzby (with some editing by me).
      Back-patch to v13 where the faulty code came in.
      
      Discussion: https://postgr.es/m/20211217154356.GJ17618@telsasoft.com
      3886785b
  12. 17 Jan, 2022 2 commits
    • Tom Lane's avatar
      Avoid calling gettext() in signal handlers. · 4e872656
      Tom Lane authored
      It seems highly unlikely that gettext() can be relied on to be
      async-signal-safe.  psql used to understand that, but someone got
      it wrong long ago in the src/bin/scripts/ version of handle_sigint,
      and then the bad idea was perpetuated when those two versions were
      unified into src/fe_utils/cancel.c.
      
      I'm unsure why there have not been field complaints about this
      ... maybe gettext() is signal-safe once it's translated at least
      one message?  But we have no business assuming any such thing.
      
      In cancel.c (v13 and up), I preserved our ability to localize
      "Cancel request sent" messages by invoking gettext() before
      the signal handler is set up.  In earlier branches I just made
      src/bin/scripts/ not localize those messages, as psql did then.
      
      (Just for extra unsafety, the src/bin/scripts/ version was
      invoking fprintf() from a signal handler.  Sigh.)
      
      Noted while fixing signal-safety issues in PQcancel() itself.
      Back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/2937814.1641960929@sss.pgh.pa.us
      4e872656
    • Tom Lane's avatar
      Avoid calling strerror[_r] in PQcancel(). · 05094987
      Tom Lane authored
      PQcancel() is supposed to be safe to call from a signal handler,
      and indeed psql uses it that way.  All of the library functions
      it uses are specified to be async-signal-safe by POSIX ...
      except for strerror.  Neither plain strerror nor strerror_r
      are considered safe.  When this code was written, back in the
      dark ages, we probably figured "oh, strerror will just index
      into a constant array of strings" ... but in any locale except C,
      that's unlikely to be true.  Probably the reason we've not heard
      complaints is that (a) this error-handling code is unlikely to be
      reached in normal use, and (b) in many scenarios, localized error
      strings would already have been loaded, after which maybe it's
      safe to call strerror here.  Still, this is clearly unacceptable.
      
      The best we can do without relying on strerror is to print the
      decimal value of errno, so make it do that instead.  (This is
      probably not much loss of user-friendliness, given that it is
      hard to get a failure here.)
      
      Back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/2937814.1641960929@sss.pgh.pa.us
      05094987
  13. 16 Jan, 2022 2 commits
  14. 15 Jan, 2022 2 commits
    • Tomas Vondra's avatar
      Build inherited extended stats on partitioned tables · ea212bd9
      Tomas Vondra authored
      Commit 859b3003de disabled building of extended stats for inheritance
      trees, to prevent updating the same catalog row twice. While that
      resolved the issue, it also means there are no extended stats for
      declaratively partitioned tables, because there are no data in the
      non-leaf relations.
      
      That also means declaratively partitioned tables were not affected by
      the issue 859b3003de addressed, which means this is a regression
      affecting queries that calculate estimates for the whole inheritance
      tree as a whole (which includes e.g. GROUP BY queries).
      
      But because partitioned tables are empty, we can invert the condition
      and build statistics only for the case with inheritance, without losing
      anything. And we can consider them when calculating estimates.
      
      It may be necessary to run ANALYZE on partitioned tables, to collect
      proper statistics. For declarative partitioning there should no prior
      statistics, and it might take time before autoanalyze is triggered. For
      tables partitioned by inheritance the statistics may include data from
      child relations (if built 859b3003de), contradicting the current code.
      
      Report and patch by Justin Pryzby, minor fixes and cleanup by me.
      Backpatch all the way back to PostgreSQL 10, where extended statistics
      were introduced (same as 859b3003de).
      
      Author: Justin Pryzby
      Reported-by: Justin Pryzby
      Backpatch-through: 10
      Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
      ea212bd9
    • Tomas Vondra's avatar
      Ignore extended statistics for inheritance trees · 2cc007fd
      Tomas Vondra authored
      Since commit 859b3003de we only build extended statistics for individual
      relations, ignoring the child relations. This resolved the issue with
      updating catalog tuple twice, but we still tried to use the statistics
      when calculating estimates for the whole inheritance tree. When the
      relations contain very distinct data, it may produce bogus estimates.
      
      This is roughly the same issue 427c6b5b addressed ~15 years ago, and we
      fix it the same way - by ignoring extended statistics when calculating
      estimates for the inheritance tree as a whole. We still consider
      extended statistics when calculating estimates for individual child
      relations, of course.
      
      This may result in plan changes due to different estimates, but if the
      old statistics were not describing the inheritance tree particularly
      well it's quite likely the new plans is actually better.
      
      Report and patch by Justin Pryzby, minor fixes and cleanup by me.
      Backpatch all the way back to PostgreSQL 10, where extended statistics
      were introduced (same as 859b3003de).
      
      Author: Justin Pryzby
      Reported-by: Justin Pryzby
      Backpatch-through: 10
      Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
      2cc007fd
  15. 14 Jan, 2022 2 commits
    • Andres Freund's avatar
      Fix possible HOT corruption when RECENTLY_DEAD changes to DEAD while pruning. · dad1539a
      Andres Freund authored
      Since dc7420c2 the horizon used for pruning is determined "lazily". A more
      accurate horizon is built on-demand, rather than in GetSnapshotData(). If a
      horizon computation is triggered between two HeapTupleSatisfiesVacuum() calls
      for the same tuple, the result can change from RECENTLY_DEAD to DEAD.
      
      heap_page_prune() can process the same tid multiple times (once following an
      update chain, once "directly"). When the result of HeapTupleSatisfiesVacuum()
      of a tuple changes from RECENTLY_DEAD during the first access, to DEAD in the
      second, the "tuple is DEAD and doesn't chain to anything else" path in
      heap_prune_chain() can end up marking the target of a LP_REDIRECT ItemId
      unused.
      
      Initially not easily visible,
      Once the target of a LP_REDIRECT ItemId is marked unused, a new tuple version
      can reuse it. At that point the corruption may become visible, as index
      entries pointing to the "original" redirect item, now point to a unrelated
      tuple.
      
      To fix, compute HTSV for all tuples on a page only once. This fixes the entire
      class of problems of HTSV changing inside heap_page_prune(). However,
      visibility changes can obviously still occur between HTSV checks inside
      heap_page_prune() and outside (e.g. in lazy_scan_prune()).
      
      The computation of HTSV is now done in bulk, in heap_page_prune(), rather than
      on-demand in heap_prune_chain(). Besides being a bit simpler, it also is
      faster: Memory accesses can happen sequentially, rather than in the order of
      HOT chains.
      
      There are other causes of HeapTupleSatisfiesVacuum() results changing between
      two visibility checks for the same tuple, even before dc7420c2. E.g.
      HEAPTUPLE_INSERT_IN_PROGRESS can change to HEAPTUPLE_DEAD when a transaction
      aborts between the two checks. None of the these other visibility status
      changes are known to cause corruption, but heap_page_prune()'s approach makes
      it hard to be confident.
      
      A patch implementing a more fundamental redesign of heap_page_prune(), which
      fixes this bug and simplifies pruning substantially, has been proposed by
      Peter Geoghegan in
      https://postgr.es/m/CAH2-WzmNk6V6tqzuuabxoxM8HJRaWU6h12toaS-bqYcLiht16A@mail.gmail.com
      
      However, that redesign is larger change than desirable for backpatching. As
      the new design still benefits from the batched visibility determination
      introduced in this commit, it makes sense to commit this narrower fix to 14
      and master, and then commit Peter's improvement in master.
      
      The precise sequence required to trigger the bug is complicated and hard to do
      exercise in an isolation test (until we have wait points). Due to that the
      isolation test initially posted at
      https://postgr.es/m/20211119003623.d3jusiytzjqwb62p%40alap3.anarazel.de
      and updated in
      https://postgr.es/m/20211122175914.ayk6gg6nvdwuhrzb%40alap3.anarazel.de
      isn't committable.
      
      A followup commit will introduce additional assertions, to detect problems
      like this more easily.
      
      Bug: #17255
      Reported-By: default avatarAlexander Lakhin <exclusion@gmail.com>
      Debugged-By: default avatarAndres Freund <andres@anarazel.de>
      Debugged-By: default avatarPeter Geoghegan <pg@bowt.ie>
      Author: Andres Freund <andres@andres@anarazel.de>
      Reviewed-By: default avatarPeter Geoghegan <pg@bowt.ie>
      Discussion: https://postgr.es/m/20211122175914.ayk6gg6nvdwuhrzb@alap3.anarazel.de
      Backpatch: 14-, the oldest branch containing dc7420c2
      dad1539a
    • Michael Paquier's avatar
      Revert error handling improvements for cryptohashes · ad5b6f24
      Michael Paquier authored
      This reverts commits ab27df24, af8d530e and 3a0cced8, that introduced
      pg_cryptohash_error().  In order to make the core code able to pass down
      the new error types that this introduced, some of the MD5-related
      routines had to be reworked, causing an ABI breakage, but we found that
      some external extensions rely on them.  Maintaining compatibility
      outweights the error report benefits, so just revert the change in v14.
      
      Reported-by: Laurenz Albe
      Discussion: https://postgr.es/m/9f0c0a96d28cf14fc87296bbe67061c14eb53ae8.camel@cybertec.at
      ad5b6f24
  16. 13 Jan, 2022 2 commits
  17. 12 Jan, 2022 2 commits
    • Peter Geoghegan's avatar
      Fix memory leak in indexUnchanged hint mechanism. · 41ee68a9
      Peter Geoghegan authored
      Commit 9dc718bd added a "logically unchanged by UPDATE" hinting
      mechanism, which is currently used within nbtree indexes only (see
      commit d168b666).  This mechanism determined whether or not the incoming
      item is a logically unchanged duplicate (a duplicate needed only for
      MVCC versioning purposes) once per row updated per non-HOT update.  This
      approach led to memory leaks which were noticeable with an UPDATE
      statement that updated sufficiently many rows, at least on tables that
      happen to have an expression index.
      
      On HEAD, fix the issue by adding a cache to the executor's per-index
      IndexInfo struct.
      
      Take a different approach on Postgres 14 to avoid an ABI break: simply
      pass down the hint to all indexes unconditionally with non-HOT UPDATEs.
      This is deemed acceptable because the hint is currently interpreted
      within btinsert() as "perform a bottom-up index deletion pass if and
      when the only alternative is splitting the leaf page -- prefer to delete
      any LP_DEAD-set items first".  nbtree must always treat the hint as a
      noisy signal about what might work, as a strategy of last resort, with
      costs imposed on non-HOT updaters.  (The same thing might not be true
      within another index AM that applies the hint, which is why the original
      behavior is preserved on HEAD.)
      
      Author: Peter Geoghegan <pg@bowt.ie>
      Reported-By: default avatarKlaudie Willis <Klaudie.Willis@protonmail.com>
      Diagnosed-By: default avatarTom Lane <tgl@sss.pgh.pa.us>
      Discussion: https://postgr.es/m/261065.1639497535@sss.pgh.pa.us
      Backpatch: 14-, where the hinting mechanism was added.
      41ee68a9
    • Michael Paquier's avatar
      Fix comment related to pg_cryptohash_error() · af8d530e
      Michael Paquier authored
      One of the comments introduced in b69aba7 was worded a bit weirdly, so
      improve it.
      
      Reported-by: Sergey Shinderuk
      Discussion: https://postgr.es/m/71b9a5d2-a3bf-83bc-a243-93dcf0bcfb3b@postgrespro.ru
      Backpatch-through: 14
      af8d530e