1. 17 Jan, 2022 1 commit
    • Tom Lane's avatar
      Avoid calling strerror[_r] in PQcancel(). · 05094987
      Tom Lane authored
      PQcancel() is supposed to be safe to call from a signal handler,
      and indeed psql uses it that way.  All of the library functions
      it uses are specified to be async-signal-safe by POSIX ...
      except for strerror.  Neither plain strerror nor strerror_r
      are considered safe.  When this code was written, back in the
      dark ages, we probably figured "oh, strerror will just index
      into a constant array of strings" ... but in any locale except C,
      that's unlikely to be true.  Probably the reason we've not heard
      complaints is that (a) this error-handling code is unlikely to be
      reached in normal use, and (b) in many scenarios, localized error
      strings would already have been loaded, after which maybe it's
      safe to call strerror here.  Still, this is clearly unacceptable.
      
      The best we can do without relying on strerror is to print the
      decimal value of errno, so make it do that instead.  (This is
      probably not much loss of user-friendliness, given that it is
      hard to get a failure here.)
      
      Back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/2937814.1641960929@sss.pgh.pa.us
      05094987
  2. 16 Jan, 2022 2 commits
  3. 15 Jan, 2022 2 commits
    • Tomas Vondra's avatar
      Build inherited extended stats on partitioned tables · ea212bd9
      Tomas Vondra authored
      Commit 859b3003de disabled building of extended stats for inheritance
      trees, to prevent updating the same catalog row twice. While that
      resolved the issue, it also means there are no extended stats for
      declaratively partitioned tables, because there are no data in the
      non-leaf relations.
      
      That also means declaratively partitioned tables were not affected by
      the issue 859b3003de addressed, which means this is a regression
      affecting queries that calculate estimates for the whole inheritance
      tree as a whole (which includes e.g. GROUP BY queries).
      
      But because partitioned tables are empty, we can invert the condition
      and build statistics only for the case with inheritance, without losing
      anything. And we can consider them when calculating estimates.
      
      It may be necessary to run ANALYZE on partitioned tables, to collect
      proper statistics. For declarative partitioning there should no prior
      statistics, and it might take time before autoanalyze is triggered. For
      tables partitioned by inheritance the statistics may include data from
      child relations (if built 859b3003de), contradicting the current code.
      
      Report and patch by Justin Pryzby, minor fixes and cleanup by me.
      Backpatch all the way back to PostgreSQL 10, where extended statistics
      were introduced (same as 859b3003de).
      
      Author: Justin Pryzby
      Reported-by: Justin Pryzby
      Backpatch-through: 10
      Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
      ea212bd9
    • Tomas Vondra's avatar
      Ignore extended statistics for inheritance trees · 2cc007fd
      Tomas Vondra authored
      Since commit 859b3003de we only build extended statistics for individual
      relations, ignoring the child relations. This resolved the issue with
      updating catalog tuple twice, but we still tried to use the statistics
      when calculating estimates for the whole inheritance tree. When the
      relations contain very distinct data, it may produce bogus estimates.
      
      This is roughly the same issue 427c6b5b addressed ~15 years ago, and we
      fix it the same way - by ignoring extended statistics when calculating
      estimates for the inheritance tree as a whole. We still consider
      extended statistics when calculating estimates for individual child
      relations, of course.
      
      This may result in plan changes due to different estimates, but if the
      old statistics were not describing the inheritance tree particularly
      well it's quite likely the new plans is actually better.
      
      Report and patch by Justin Pryzby, minor fixes and cleanup by me.
      Backpatch all the way back to PostgreSQL 10, where extended statistics
      were introduced (same as 859b3003de).
      
      Author: Justin Pryzby
      Reported-by: Justin Pryzby
      Backpatch-through: 10
      Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
      2cc007fd
  4. 14 Jan, 2022 2 commits
    • Andres Freund's avatar
      Fix possible HOT corruption when RECENTLY_DEAD changes to DEAD while pruning. · dad1539a
      Andres Freund authored
      Since dc7420c2 the horizon used for pruning is determined "lazily". A more
      accurate horizon is built on-demand, rather than in GetSnapshotData(). If a
      horizon computation is triggered between two HeapTupleSatisfiesVacuum() calls
      for the same tuple, the result can change from RECENTLY_DEAD to DEAD.
      
      heap_page_prune() can process the same tid multiple times (once following an
      update chain, once "directly"). When the result of HeapTupleSatisfiesVacuum()
      of a tuple changes from RECENTLY_DEAD during the first access, to DEAD in the
      second, the "tuple is DEAD and doesn't chain to anything else" path in
      heap_prune_chain() can end up marking the target of a LP_REDIRECT ItemId
      unused.
      
      Initially not easily visible,
      Once the target of a LP_REDIRECT ItemId is marked unused, a new tuple version
      can reuse it. At that point the corruption may become visible, as index
      entries pointing to the "original" redirect item, now point to a unrelated
      tuple.
      
      To fix, compute HTSV for all tuples on a page only once. This fixes the entire
      class of problems of HTSV changing inside heap_page_prune(). However,
      visibility changes can obviously still occur between HTSV checks inside
      heap_page_prune() and outside (e.g. in lazy_scan_prune()).
      
      The computation of HTSV is now done in bulk, in heap_page_prune(), rather than
      on-demand in heap_prune_chain(). Besides being a bit simpler, it also is
      faster: Memory accesses can happen sequentially, rather than in the order of
      HOT chains.
      
      There are other causes of HeapTupleSatisfiesVacuum() results changing between
      two visibility checks for the same tuple, even before dc7420c2. E.g.
      HEAPTUPLE_INSERT_IN_PROGRESS can change to HEAPTUPLE_DEAD when a transaction
      aborts between the two checks. None of the these other visibility status
      changes are known to cause corruption, but heap_page_prune()'s approach makes
      it hard to be confident.
      
      A patch implementing a more fundamental redesign of heap_page_prune(), which
      fixes this bug and simplifies pruning substantially, has been proposed by
      Peter Geoghegan in
      https://postgr.es/m/CAH2-WzmNk6V6tqzuuabxoxM8HJRaWU6h12toaS-bqYcLiht16A@mail.gmail.com
      
      However, that redesign is larger change than desirable for backpatching. As
      the new design still benefits from the batched visibility determination
      introduced in this commit, it makes sense to commit this narrower fix to 14
      and master, and then commit Peter's improvement in master.
      
      The precise sequence required to trigger the bug is complicated and hard to do
      exercise in an isolation test (until we have wait points). Due to that the
      isolation test initially posted at
      https://postgr.es/m/20211119003623.d3jusiytzjqwb62p%40alap3.anarazel.de
      and updated in
      https://postgr.es/m/20211122175914.ayk6gg6nvdwuhrzb%40alap3.anarazel.de
      isn't committable.
      
      A followup commit will introduce additional assertions, to detect problems
      like this more easily.
      
      Bug: #17255
      Reported-By: default avatarAlexander Lakhin <exclusion@gmail.com>
      Debugged-By: default avatarAndres Freund <andres@anarazel.de>
      Debugged-By: default avatarPeter Geoghegan <pg@bowt.ie>
      Author: Andres Freund <andres@andres@anarazel.de>
      Reviewed-By: default avatarPeter Geoghegan <pg@bowt.ie>
      Discussion: https://postgr.es/m/20211122175914.ayk6gg6nvdwuhrzb@alap3.anarazel.de
      Backpatch: 14-, the oldest branch containing dc7420c2
      dad1539a
    • Michael Paquier's avatar
      Revert error handling improvements for cryptohashes · ad5b6f24
      Michael Paquier authored
      This reverts commits ab27df24, af8d530e and 3a0cced8, that introduced
      pg_cryptohash_error().  In order to make the core code able to pass down
      the new error types that this introduced, some of the MD5-related
      routines had to be reworked, causing an ABI breakage, but we found that
      some external extensions rely on them.  Maintaining compatibility
      outweights the error report benefits, so just revert the change in v14.
      
      Reported-by: Laurenz Albe
      Discussion: https://postgr.es/m/9f0c0a96d28cf14fc87296bbe67061c14eb53ae8.camel@cybertec.at
      ad5b6f24
  5. 13 Jan, 2022 2 commits
  6. 12 Jan, 2022 2 commits
    • Peter Geoghegan's avatar
      Fix memory leak in indexUnchanged hint mechanism. · 41ee68a9
      Peter Geoghegan authored
      Commit 9dc718bd added a "logically unchanged by UPDATE" hinting
      mechanism, which is currently used within nbtree indexes only (see
      commit d168b666).  This mechanism determined whether or not the incoming
      item is a logically unchanged duplicate (a duplicate needed only for
      MVCC versioning purposes) once per row updated per non-HOT update.  This
      approach led to memory leaks which were noticeable with an UPDATE
      statement that updated sufficiently many rows, at least on tables that
      happen to have an expression index.
      
      On HEAD, fix the issue by adding a cache to the executor's per-index
      IndexInfo struct.
      
      Take a different approach on Postgres 14 to avoid an ABI break: simply
      pass down the hint to all indexes unconditionally with non-HOT UPDATEs.
      This is deemed acceptable because the hint is currently interpreted
      within btinsert() as "perform a bottom-up index deletion pass if and
      when the only alternative is splitting the leaf page -- prefer to delete
      any LP_DEAD-set items first".  nbtree must always treat the hint as a
      noisy signal about what might work, as a strategy of last resort, with
      costs imposed on non-HOT updaters.  (The same thing might not be true
      within another index AM that applies the hint, which is why the original
      behavior is preserved on HEAD.)
      
      Author: Peter Geoghegan <pg@bowt.ie>
      Reported-By: default avatarKlaudie Willis <Klaudie.Willis@protonmail.com>
      Diagnosed-By: default avatarTom Lane <tgl@sss.pgh.pa.us>
      Discussion: https://postgr.es/m/261065.1639497535@sss.pgh.pa.us
      Backpatch: 14-, where the hinting mechanism was added.
      41ee68a9
    • Michael Paquier's avatar
      Fix comment related to pg_cryptohash_error() · af8d530e
      Michael Paquier authored
      One of the comments introduced in b69aba7 was worded a bit weirdly, so
      improve it.
      
      Reported-by: Sergey Shinderuk
      Discussion: https://postgr.es/m/71b9a5d2-a3bf-83bc-a243-93dcf0bcfb3b@postgrespro.ru
      Backpatch-through: 14
      af8d530e
  7. 11 Jan, 2022 2 commits
    • Tom Lane's avatar
      Clean up error message reported after \password encryption failure. · ab27df24
      Tom Lane authored
      Experimenting with FIPS mode enabled, I saw
      
      regression=# \password joe
      Enter new password for user "joe":
      Enter it again:
      could not encrypt password: disabled for FIPS
      out of memory
      
      because PQencryptPasswordConn was still of the opinion that "out of
      memory" is always appropriate to print.
      
      Minor oversight in b69aba745.  Like that one, back-patch to v14.
      ab27df24
    • Michael Paquier's avatar
      Improve error handling of cryptohash computations · 3a0cced8
      Michael Paquier authored
      The existing cryptohash facility was causing problems in some code paths
      related to MD5 (frontend and backend) that relied on the fact that the
      only type of error that could happen would be an OOM, as the MD5
      implementation used in PostgreSQL ~13 (the in-core implementation is
      used when compiling with or without OpenSSL in those older versions),
      could fail only under this circumstance.
      
      The new cryptohash facilities can fail for reasons other than OOMs, like
      attempting MD5 when FIPS is enabled (upstream OpenSSL allows that up to
      1.0.2, Fedora and Photon patch OpenSSL 1.1.1 to allow that), so this
      would cause incorrect reports to show up.
      
      This commit extends the cryptohash APIs so as callers of those routines
      can fetch more context when an error happens, by using a new routine
      called pg_cryptohash_error().  The error states are stored within each
      implementation's internal context data, so as it is possible to extend
      the logic depending on what's suited for an implementation.  The default
      implementation requires few error states, but OpenSSL could report
      various issues depending on its internal state so more is needed in
      cryptohash_openssl.c, and the code is shaped so as we are always able to
      grab the necessary information.
      
      The core code is changed to adapt to the new error routine, painting
      more "const" across the call stack where the static errors are stored,
      particularly in authentication code paths on variables that provide
      log details.  This way, any future changes would warn if attempting to
      free these strings.  The MD5 authentication code was also a bit blurry
      about the handling of "logdetail" (LOG sent to the postmaster), so
      improve the comments related that, while on it.
      
      The origin of the problem is 87ae9691, that introduced the centralized
      cryptohash facility.  Extra changes are done for pgcrypto in v14 for the
      non-OpenSSL code path to cope with the improvements done by this
      commit.
      
      Reported-by: Michael Mühlbeyer
      Author: Michael Paquier
      Reviewed-by: Tom Lane
      Discussion: https://postgr.es/m/89B7F072-5BBE-4C92-903E-D83E865D9367@trivadis.com
      Backpatch-through: 14
      3a0cced8
  8. 10 Jan, 2022 2 commits
  9. 08 Jan, 2022 3 commits
    • Tom Lane's avatar
      Fix results of index-only scans on btree_gist char(N) indexes. · 043c1e1a
      Tom Lane authored
      If contrib/btree_gist is used to make a GIST index on a char(N)
      (bpchar) column, and that column is retrieved via an index-only
      scan, what came out had all trailing spaces removed.  Since
      that doesn't happen in any other kind of table scan, this is
      clearly a bug.  The cause is that gbt_bpchar_compress() strips
      trailing spaces (using rtrim1) before a new index entry is made.
      That was probably a good idea when this code was first written,
      but since we invented index-only scans, it's not so good.
      
      One answer could be to mark this opclass as incapable of index-only
      scans.  But to do so, we'd need an extension module version bump,
      followed by manual action by DBAs to install the updated version
      of btree_gist.  And it's not really a desirable place to end up,
      anyway.
      
      Instead, let's fix the code by removing the unwanted space-stripping
      action and adjusting the opclass's comparison logic to ignore
      trailing spaces as bpchar normally does.  This will not hinder
      cases that work today, since index searches with this logic will
      act the same whether trailing spaces are stored or not.  It will
      not by itself fix the problem of getting space-stripped results
      from index-only scans, of course.  Users who care about that can
      REINDEX affected indexes after installing this update, to immediately
      replace all improperly-truncated index entries.  Otherwise, it can
      be expected that the index's behavior will change incrementally as
      old entries are replaced by new ones.
      
      Per report from Alexander Lakhin.  Back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/696c995b-b37f-5526-f45d-04abe713179f@gmail.com
      043c1e1a
    • Michael Paquier's avatar
      Fix issues with describe queries of extended statistics in psql · f5bea836
      Michael Paquier authored
      This addresses some problems in the describe queries used for extended
      statistics:
      - Two schema qualifications for the text type were missing for \dX.
      - The list of extended statistics listed for a table through \d was
      ordered based on the object OIDs, but it is more consistent with the
      other commands to order by namespace and then by object name.
      - A couple of aliases were not used in \d.  These are removed.
      
      This is similar to commits 1f092a3 and 07f8a9e.
      
      Author: Justin Pryzby
      Discussion: https://postgr.es/m/20220107022235.GA14051@telsasoft.com
      Backpatch-through: 14
      f5bea836
    • Bruce Momjian's avatar
      Update copyright for 2022 · 61c8da50
      Bruce Momjian authored
      Backpatch-through: 10
      61c8da50
  10. 07 Jan, 2022 1 commit
  11. 06 Jan, 2022 2 commits
  12. 05 Jan, 2022 1 commit
    • Michael Paquier's avatar
      Reduce relcache access in WAL sender streaming logical changes · 5ddfebde
      Michael Paquier authored
      get_rel_sync_entry(), which is called each time a change needs to be
      logically replicated, is a rather hot code path in the WAL sender
      sending logical changes.  This code path was doing a relcache access on
      relkind and relpartition for each logical change, but we only need to
      know this information when building or re-building the cached
      information for a relation.
      
      Some measurements prove that this is noticeable in perf profiles,
      particularly when attempting to replicate changes from relations that
      are not published as these cause less overhead in the WAL sender,
      delaying further the replication of changes for relations that are
      published.
      
      Issue introduced in 83fd4532.
      
      Author: Hou Zhijie
      Reviewed-by: Kyotaro Horiguchi, Euler Taveira
      Discussion: https://postgr.es/m/OS0PR01MB5716E863AA9E591C1F010F7A947D9@OS0PR01MB5716.jpnprd01.prod.outlook.com
      Backpatch-through: 13
      5ddfebde
  13. 04 Jan, 2022 2 commits
  14. 03 Jan, 2022 2 commits
    • Tom Lane's avatar
      Fix index-only scan plans, take 2. · d228af79
      Tom Lane authored
      Commit 4ace45677 failed to fix the problem fully, because the
      same issue of attempting to fetch a non-returnable index column
      can occur when rechecking the indexqual after using a lossy index
      operator.  Moreover, it broke EXPLAIN for such indexquals (which
      indicates a gap in our test cases :-().
      
      Revert the code changes of 4ace45677 in favor of adding a new field
      to struct IndexOnlyScan, containing a version of the indexqual that
      can be executed against the index-returned tuple without using any
      non-returnable columns.  (The restrictions imposed by check_index_only
      guarantee this is possible, although we may have to recompute indexed
      expressions.)  Support construction of that during setrefs.c
      processing by marking IndexOnlyScan.indextlist entries as resjunk
      if they can't be returned, rather than removing them entirely.
      (We could alternatively require setrefs.c to look up the IndexOptInfo
      again, but abusing resjunk this way seems like a reasonably safe way
      to avoid needing to do that.)
      
      This solution isn't great from an API-stability standpoint: if there
      are any extensions out there that build IndexOnlyScan structs directly,
      they'll be broken in the next minor releases.  However, only a very
      invasive extension would be likely to do such a thing.  There's no
      change in the Path representation, so typical planner extensions
      shouldn't have a problem.
      
      As before, back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/3179992.1641150853@sss.pgh.pa.us
      Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org
      d228af79
    • Michael Paquier's avatar
      pg_stat_statements: Remove obsolete comment · 52d50261
      Michael Paquier authored
      Since 4f0b0966, pgss_store() does nothing if compute_query_id is disabled
      or if no other module computed a query identifier, but the top comment
      of this function did not reflect that.  This behavior is already
      documented in its own code path, and this just removes the inconsistent
      comment.
      
      Author: Kyotaro Horiguchi
      Reviewed-by: Julien Rouhaud
      Discussion: https://postgr.es/m/20211122.153823.1325120762360533122.horikyota.ntt@gmail.com
      Backpatch-through: 14
      52d50261
  15. 02 Jan, 2022 1 commit
  16. 01 Jan, 2022 1 commit
    • Tom Lane's avatar
      Fix index-only scan plans when not all index columns can be returned. · cabea571
      Tom Lane authored
      If an index has both returnable and non-returnable columns, and one of
      the non-returnable columns is an expression using a Var that is in a
      returnable column, then a query returning that expression could result
      in an index-only scan plan that attempts to read the non-returnable
      column, instead of recomputing the expression from the returnable
      column as intended.
      
      To fix, redefine the "indextlist" list of an IndexOnlyScan plan node
      as containing null Consts in place of any non-returnable columns.
      This solves the problem by preventing setrefs.c from falsely matching
      to such entries.  The executor is happy since it only cares about the
      exposed types of the entries, and ruleutils.c doesn't care because a
      correct plan won't reference those entries.  I considered some other
      ways to prevent setrefs.c from doing the wrong thing, but this way
      seems good since (a) it allows a very localized fix, (b) it makes
      the indextlist structure more compact in many cases, and (c) the
      indextlist is now a more faithful representation of what the index AM
      will actually produce, viz. nulls for any non-returnable columns.
      
      This is easier to hit since we introduced included columns, but it's
      possible to construct failing examples without that, as per the
      added regression test.  Hence, back-patch to all supported branches.
      
      Per bug #17350 from Louis Jachiet.
      
      Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org
      cabea571
  17. 30 Dec, 2021 2 commits
  18. 22 Dec, 2021 3 commits
  19. 16 Dec, 2021 1 commit
    • Tom Lane's avatar
      Ensure casting to typmod -1 generates a RelabelType. · f9a8bc9f
      Tom Lane authored
      Fix the code changed by commit 5c056b0c2 so that we always generate
      RelabelType, not something else, for a cast to unspecified typmod.
      Otherwise planner optimizations might not happen.
      
      It appears we missed this point because the previous experiments were
      done on type numeric: the parser undesirably generates a call on the
      numeric() length-coercion function, but then numeric_support()
      optimizes that down to a RelabelType, so that everything seems fine.
      It misbehaves for types that have a non-optimized length coercion
      function, such as bpchar.
      
      Per report from John Naylor.  Back-patch to all supported branches,
      as the previous patch eventually was.  Unfortunately, that no longer
      includes 9.6 ... we really shouldn't put this type of change into a
      nearly-EOL branch.
      
      Discussion: https://postgr.es/m/CAFBsxsEfbFHEkouc+FSj+3K1sHipLPbEC67L0SAe-9-da8QtYg@mail.gmail.com
      f9a8bc9f
  20. 15 Dec, 2021 1 commit
    • Michael Paquier's avatar
      Adjust behavior of some env settings for the TAP tests of MSVC · 4ddbd745
      Michael Paquier authored
      edc2332 has introduced in vcregress.pl some control on the environment
      variables LZ4, TAR and GZIP_PROGRAM to allow any TAP tests to be able
      use those commands.  This makes the settings more consistent with
      src/Makefile.global.in, as the same default gets used for Make and MSVC
      builds.
      
      Each parameter can be changed in buildenv.pl, but as a default gets
      assigned after loading buldenv.pl, it is not possible to unset any of
      these, and using an empty value would not work with "||=" either.  As
      some environments may not have a compatible command in their PATH (tar
      coming from MinGW is an issue, for one), this could break tests without
      an exit path to bypass any failing test.  This commit changes things so
      as the default values for LZ4, TAR and GZIP_PROGRAM are assigned before
      loading buildenv.pl, not after.  This way, we keep the same amount of
      compatibility as a GNU build with the same defaults, and it becomes
      possible to unset any of those values.
      
      While on it, this adds some documentation about those three variables in
      the section dedicated to the TAP tests for MSVC.
      
      Per discussion with Andrew Dunstan.
      
      Discussion: https://postgr.es/m/YbGYe483803il3X7@paquier.xyz
      Backpatch-through: 10
      4ddbd745
  21. 14 Dec, 2021 2 commits
  22. 13 Dec, 2021 3 commits