1. 02 Jul, 2022 1 commit
    • ecpglib: call newlocale() once per process. · 5b94e2bd
      Noah Misch authored
      ecpglib has been calling it once per SQL query and once per EXEC SQL GET
      DESCRIPTOR.  Instead, if newlocale() has not succeeded before, call it
      while establishing a connection.  This mitigates three problems:
      - If newlocale() failed in EXEC SQL GET DESCRIPTOR, the command silently
        proceeded without the intended locale change.
      - On AIX, each newlocale()+freelocale() cycle leaked memory.
      - newlocale() CPU usage may have been nontrivial.
      
      Fail the connection attempt if newlocale() fails.  Rearrange
      ecpg_do_prologue() to validate the connection before its uselocale().
      
      The sort of program that may regress is one running in an environment
      where newlocale() fails.  If that program establishes connections
      without running SQL statements, it will stop working in response to this
      change.  I'm betting against the importance of such an ECPG use case.
      Most SQL execution (any using ECPGdo()) has long required newlocale()
      success, so there's little a connection could do without newlocale().
      
      Back-patch to v10 (all supported versions).
      
      Reviewed by Tom Lane.  Reported by Guillaume Lelarge.
      
      Discussion: https://postgr.es/m/20220101074055.GA54621@rfd.leadboat.com
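The once-per-process pattern described above can be sketched as a lazily initialized, cached locale handle. This is an illustrative sketch only: `get_c_locale` and the `init_calls` counter are hypothetical names, not ecpglib's actual code, which creates the locale while establishing a connection and fails the connection attempt on error.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Cache the "C" locale handle so newlocale() runs at most once per
 * process, instead of once per SQL query. */
static locale_t cached_c_locale = (locale_t) 0;
static int init_calls = 0;      /* just to observe how often we init */

locale_t get_c_locale(void)
{
    if (cached_c_locale == (locale_t) 0)
    {
        init_calls++;
        /* "C" numeric formatting, as ecpglib needs for data I/O */
        cached_c_locale = newlocale(LC_NUMERIC_MASK, "C", (locale_t) 0);
    }
    return cached_c_locale;
}
```

Repeated calls return the same handle without further newlocale() work, which also avoids the per-call AIX memory leak mentioned above.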
  2. 01 Jul, 2022 1 commit
    • Harden dsm_impl.c against unexpected EEXIST. · fb81a93a
      Thomas Munro authored
      Previously, we trusted the OS not to report EEXIST unless we'd passed in
      IPC_CREAT | IPC_EXCL or O_CREAT | O_EXCL, as appropriate.  Solaris's
      shm_open() can in fact do that, causing us to crash because we didn't
      ereport the error and instead blithely assumed the mapping was successful.
      
      Let's treat EEXIST just like any other error, unless we're actually
      trying to create a new segment.  This applies to shm_open(), where this
      behavior has been seen, and also to the equivalent operations for our
      sysv and mmap modes just on principle.
      
      Based on the underlying reason for the error, namely contention on a
      lock file managed by Solaris librt for each distinct name, this problem
      is only likely to happen on 15 and later, because the new shared memory
      stats system produces shm_open() calls for the same path from
      potentially large numbers of backends concurrently during
      authentication.  Earlier releases shared memory segments only among a
      small number of parallel workers under one Gather node.  You could
      probably hit it if you tried hard enough though, and we should have been
      more defensive in the first place.  Therefore, back-patch to all
      supported releases.
      
      Per build farm animal margay.  This isn't the end of the story, though;
      it just changes random crashes into random "File exists" errors.  More
      work is needed for a green build farm.
      Reviewed-by: Robert Haas <robertmhaas@gmail.com>
      Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com
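The rule the fix adopts can be condensed into a small predicate: EEXIST is meaningful only when we explicitly asked for exclusive creation. `eexist_means_collision` is a hypothetical helper written for illustration, not the actual dsm_impl.c code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/* EEXIST is a "segment already exists" signal only when the caller
 * requested O_CREAT | O_EXCL; in any other mode it is an unexpected
 * error (as seen with Solaris shm_open()) and must be reported like
 * any other failure rather than silently treated as success. */
bool eexist_means_collision(int oflags, int saved_errno)
{
    return saved_errno == EEXIST &&
           (oflags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL);
}
```

The same principle applies, mutatis mutandis, to IPC_CREAT | IPC_EXCL in the sysv mode.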
  3. 27 Jun, 2022 1 commit
    • Fix visibility check when XID is committed in CLOG but not in procarray. · e24615a0
      Heikki Linnakangas authored
      TransactionIdIsInProgress had a fast path to return 'false' if the
      single-item CLOG cache said that the transaction was known to be
      committed. However, that was wrong, because a transaction is first
      marked as committed in the CLOG but doesn't become visible to others
      until it has removed its XID from the proc array. That could lead to an
      error:
      
          ERROR:  t_xmin is uncommitted in tuple to be updated
      
      or for an UPDATE to go ahead without blocking, before the previous
      UPDATE on the same row was made visible.
      
      The window is usually very short, but synchronous replication makes it
      much wider, because the wait for the synchronous replica happens in
      that window.
      
      Another thing that makes it hard to hit is that it's hard to get such
      a commit-in-progress transaction into the single item CLOG cache.
      Normally, if you call TransactionIdIsInProgress on such a transaction,
      it determines that the XID is in progress without checking the CLOG
      and without populating the cache. One way to prime the cache is to
      explicitly call pg_xact_status() on the XID. Another way is to use a
      lot of subtransactions, so that the subxid cache in the proc array
      overflows, making TransactionIdIsInProgress rely on pg_subtrans and
      CLOG checks.
      
      This has been broken ever since it was introduced in 2008, but the race
      condition is very hard to hit, especially without synchronous
      replication. There were a couple of reports of the error starting from
      summer 2021, but no one was able to find the root cause then.
      
      TransactionIdIsKnownCompleted() is now unused.  Remove it in 'master',
      but leave it in place in back-branches in case extensions use it.
      
      Also change pg_xact_status() to check TransactionIdIsInProgress().
      Previously, it only checked the CLOG, and returned "committed" before
      the transaction was actually made visible to other queries. Note that
      this also means that you cannot use pg_xact_status() to reproduce the
      bug anymore, even if the code wasn't fixed.
      
      Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with
      the pg_xact_status() change added by me.
      
      Author: Simon Riggs
      Reviewed-by: Andres Freund
      Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru
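A toy model of the race may help: a transaction is marked committed in the CLOG first and only later removes its XID from the proc array, so during that window a CLOG-based fast path gives the wrong answer. The function names below are illustrative, not the server's actual code, and the real TransactionIdIsInProgress consults several more sources than this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* The commit window: CLOG already says "committed", but the XID has
 * not yet been removed from the proc array, so the transaction must
 * still be reported as in progress. */
bool buggy_in_progress(bool clog_committed, bool in_procarray)
{
    if (clog_committed)       /* single-item CLOG cache fast path */
        return false;         /* wrong inside the commit window */
    return in_procarray;
}

bool fixed_in_progress(bool clog_committed, bool in_procarray)
{
    (void) clog_committed;    /* CLOG state alone cannot decide this */
    return in_procarray;      /* proc array is authoritative here */
}
```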
  4. 26 Jun, 2022 1 commit
  5. 25 Jun, 2022 4 commits
  6. 23 Jun, 2022 1 commit
  7. 22 Jun, 2022 3 commits
    • doc: improve wording of plpgsql RAISE format text · f1e3a707
      Bruce Momjian authored
      Reported-by: pg@kirasoft.com
      
      Discussion: https://postgr.es/m/165455351426.573551.7050474465030525109@wrigleys.postgresql.org
      
      Backpatch-through: 10
    • doc: clarify wording about phantom reads · 1463f22d
      Bruce Momjian authored
      Reported-by: akhilhello@gmail.com
      
      Discussion: https://postgr.es/m/165222922369.669.10475917322916060899@wrigleys.postgresql.org
      
      Backpatch-through: 10
    • Fix SPI's handling of errors during transaction commit. · 60465188
      Tom Lane authored
      SPI_commit previously left it up to the caller to recover from any error
      occurring during commit.  Since that's complicated and requires use of
      low-level xact.c facilities, it's not too surprising that no caller got
      it right.  Let's move the responsibility for cleanup into spi.c.  Doing
      that requires redefining SPI_commit as starting a new transaction, so
      that it becomes equivalent to SPI_commit_and_chain except that you get
      default transaction characteristics instead of preserving the prior
      transaction's characteristics.  We can make this pretty transparent
      API-wise by redefining SPI_start_transaction() as a no-op.  Callers
      that expect to do something in between might be surprised, but
      available evidence is that no callers do so.
      
      Having made that API redefinition, we can fix this mess by having
      SPI_commit[_and_chain] trap errors and start a new, clean transaction
      before re-throwing the error.  Likewise for SPI_rollback[_and_chain].
      Some cleanup is also needed in AtEOXact_SPI, which was nowhere near
      smart enough to deal with SPI contexts nested inside a committing
      context.
      
      While plperl and pltcl need no changes beyond removing their now-useless
      SPI_start_transaction() calls, plpython needs some more work because it
      hadn't gotten the memo about catching commit/rollback errors in the
      first place.  Such an error resulted in longjmp'ing out of the Python
      interpreter, which leaks Python stack entries at present and is reported
      to crash Python 3.11 altogether.  Add the missing logic to catch such
      errors and convert them into Python exceptions.
      
      This is a back-patch of commit 2e517818f.  That's now aged long enough
      to reduce the concerns about whether it will break something, and we
      do need to ensure that supported branches will work with Python 3.11.
      
      Peter Eisentraut and Tom Lane
      
      Discussion: https://postgr.es/m/3375ffd8-d71c-2565-e348-a597d6e739e3@enterprisedb.com
      Discussion: https://postgr.es/m/17416-ed8fe5d7213d6c25@postgresql.org
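The redefined semantics can be modeled as a tiny state machine: whether the commit succeeds or fails, the caller always ends up in a fresh transaction. This is a toy sketch with made-up names (`spi_commit_sketch`, `TxState`), not the real spi.c code.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { TX_NONE, TX_ACTIVE, TX_ABORTED } TxState;

static TxState tx = TX_ACTIVE;

/* Model of the fixed SPI_commit: on success, commit and chain into a
 * new transaction (SPI_commit now behaves like SPI_commit_and_chain);
 * on failure, clean up the failed transaction and start a new, clean
 * one before reporting the error. */
bool spi_commit_sketch(bool commit_fails)
{
    if (commit_fails)
    {
        tx = TX_ABORTED;   /* the commit attempt failed ...         */
        tx = TX_NONE;      /* ... so abort and end the transaction  */
        tx = TX_ACTIVE;    /* ... and start a new, clean one        */
        return false;      /* then re-throw (here: report) the error */
    }
    tx = TX_NONE;          /* commit completed ...                  */
    tx = TX_ACTIVE;        /* ... and we chain into a new transaction */
    return true;
}

TxState current_tx(void) { return tx; }
```

This is why SPI_start_transaction() can become a no-op: the caller is always left inside an active transaction already.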
  8. 21 Jun, 2022 2 commits
    • Fix stale values in partition map entries on subscribers. · f0022a77
      Amit Kapila authored
      We build the partition map entries on subscribers while applying the
      changes for update/delete on partitions.  The component relation in
      each entry is closed after use, so it needs to be updated on each
      successive use of the cache entry.
      
      This problem has existed since the original commit f1ac27bf that
      introduced this code, but we didn't notice it until the recent commit
      26b3455afa started to use the component relation of the partition map
      cache entry.
      
      Reported-by: Tom Lane, as per buildfarm
      Author: Amit Langote, Hou Zhijie
      Reviewed-by: Amit Kapila, Shi Yu
      Backpatch-through: 13, where it was introduced
      Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
    • Fix partition table's REPLICA IDENTITY checking on the subscriber. · 52d5ea9a
      Amit Kapila authored
      In logical replication, we will check if the target table on the
      subscriber is updatable by comparing the replica identity of the table on
      the publisher with that of the table on the subscriber.  When the
      target table is a partitioned table, we check only its replica
      identity, not that of its partitions.  This leads to an assertion
      failure while applying changes for update/delete, as we expect those
      to succeed only when the corresponding partition has a primary key or
      a replica identity defined.
      
      Fix it by checking the replica identity of the partition table while
      applying changes.
      
      Reported-by: Shi Yu
      Author: Shi Yu, Hou Zhijie
      Reviewed-by: Amit Langote, Amit Kapila
      Backpatch-through: 13, where it was introduced
      Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
  9. 16 Jun, 2022 1 commit
  10. 15 Jun, 2022 1 commit
  11. 14 Jun, 2022 2 commits
    • Avoid ecpglib core dump with out-of-order operations. · 7bc21ed8
      Tom Lane authored
      If an application executed operations like EXEC SQL PREPARE
      without having first established a database connection, it could
      get a core dump instead of the expected clean failure.  This
      occurred because we did "pthread_getspecific(actual_connection_key)"
      without ever having initialized the TSD key actual_connection_key.
      The results of that are probably platform-specific, but at least
      on Linux it often leads to a crash.
      
      To fix, add calls to ecpg_pthreads_init() in the code paths that
      might use actual_connection_key uninitialized.  It's harmless
      (and hopefully inexpensive) to do that more than once.
      
      Per bug #17514 from Okano Naoki.  The problem's ancient, so
      back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/17514-edd4fad547c5692c@postgresql.org
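The fix pattern here — guaranteeing TSD-key initialization before any pthread_getspecific() — is the standard pthread_once idiom. The sketch below uses hypothetical names (`connection_key`, `get_current_connection`) rather than ecpglib's actual ecpg_pthreads_init() and actual_connection_key.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_key_t connection_key;             /* hypothetical TSD key */
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

static void key_init(void)
{
    (void) pthread_key_create(&connection_key, NULL);
}

/* Safe lookup: ensure the key exists before pthread_getspecific(), so
 * an out-of-order call (e.g. PREPARE before CONNECT) fails cleanly by
 * returning NULL instead of reading an uninitialized key and possibly
 * crashing. pthread_once() is cheap to call repeatedly. */
void *get_current_connection(void)
{
    (void) pthread_once(&key_once, key_init);
    return pthread_getspecific(connection_key);
}
```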
    • Doc: clarify the default collation behavior of domains. · be35a645
      Tom Lane authored
      The previous wording was "the underlying data type's default collation
      is used", which is wrong or at least misleading.  The domain inherits
      the base type's collation behavior, which, if that is "default", can
      actually mean that we use some non-default collation obtained from
      elsewhere.
      
      Per complaint from Jian He.
      
      Discussion: https://postgr.es/m/CACJufxHMR8_4WooDPjjvEdaxB2hQ5a49qthci8fpKP0MKemVRQ@mail.gmail.com
  12. 13 Jun, 2022 2 commits
  13. 10 Jun, 2022 4 commits
  14. 08 Jun, 2022 3 commits
    • Doc: copy-edit "jsonb Indexing" section. · 0ccef410
      Tom Lane authored
      The patch introducing jsonpath dropped a para about that between
      two related examples, and didn't bother updating the introductory
      sentences that it falsified.  The grammar was pretty shaky as well.
    • Fix whitespace · 804a5079
      Peter Eisentraut authored
    • Harden Memoization code against broken data types · cbcea3b9
      David Rowley authored
      Bug #17512 highlighted that a suitably broken data type could cause the
      backend to crash if either the hash function or equality function were
      in some way non-deterministic based on their input values.  The crash
      came from code which assumes that we'll always find a hash table entry
      corresponding to an item in the Memoize LRU list.
      
      Here we remove the assumption that we'll always find the entry
      corresponding to the given LRU list item and add run-time checks to verify
      we have found the given item in the cache.
      
      This is not a fix for bug #17512, but it will turn the crash reported by
      that bug report into an internal ERROR.
      
      Reported-by: Ales Zeleny
      Reviewed-by: Tom Lane
      Discussion: https://postgr.es/m/CAApHDvpxFSTwvoYWT7kmFVSZ9zLAeHb=S9vrz=RExMgSkQNWqw@mail.gmail.com
      Backpatch-through: 14, where Memoize was added.
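The hardening amounts to replacing an assumption (Assert that the hash entry exists) with a run-time check. A toy model, not the real Memoize code — `cache_lookup`, `touch_lru_item`, and `CacheEntry` are made-up names:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int key; int hits; } CacheEntry;

/* Hypothetical lookup: with a non-deterministic hash or equality
 * function, the entry for an LRU item may simply not be found. */
CacheEntry *cache_lookup(CacheEntry *table, size_t n, int key)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].key == key)
            return &table[i];
    return NULL;
}

/* Before the fix this was effectively Assert(entry != NULL), crashing
 * on a miss; now a missing entry is reported as an error instead. */
int touch_lru_item(CacheEntry *table, size_t n, int key)
{
    CacheEntry *entry = cache_lookup(table, n, key);
    if (entry == NULL)
        return -1;      /* internal ERROR instead of a crash */
    entry->hits++;
    return 0;
}
```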
  15. 07 Jun, 2022 1 commit
    • Fix off-by-one loop termination condition in pg_stat_get_subscription(). · 5c3b5f7d
      Tom Lane authored
      pg_stat_get_subscription scanned one more LogicalRepWorker array entry
      than is really allocated.  In the worst case this could lead to SIGSEGV,
      if the LogicalRepCtx data structure is near the end of shared memory.
      That seems quite unlikely though (thanks to the ordering of calls in
      CreateSharedMemoryAndSemaphores) and we've heard no field reports of it.
      A more likely misbehavior is one row of garbage data in the function's
      result, but even that is not very likely because of the check that the
      pid field matches some live backend.
      
      Report and fix by Kuntal Ghosh.  This bug is old, so back-patch
      to all supported branches.
      
      Discussion: https://postgr.es/m/CAGz5QCJykEDzW6jQK6Yz7Qh_PMtD=95de_7QoocbVR2Qy8hWZA@mail.gmail.com
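The bug class is the classic `<=` versus `<` loop bound. A minimal illustration with made-up names (`count_live_workers`, `MAX_WORKERS`), not the actual pg_stat_get_subscription() code:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORKERS 4

/* Count live workers in a fixed-size array.  The buggy form of the
 * loop used i <= MAX_WORKERS, reading one element past the end of the
 * array; the fix is i < MAX_WORKERS, visiting exactly the allocated
 * entries. */
int count_live_workers(const int pids[MAX_WORKERS])
{
    int n = 0;
    for (int i = 0; i < MAX_WORKERS; i++)
        if (pids[i] != 0)
            n++;
    return n;
}
```

Whether the one-past-the-end read crashes or just returns garbage depends on what happens to live at that address, which matches the range of misbehaviors described above.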
  16. 06 Jun, 2022 3 commits
    • Don't fail on libpq-generated error reports in pg_amcheck. · 32a85ee4
      Tom Lane authored
      An error PGresult generated by libpq itself, such as a report of
      connection loss, won't have broken-down error fields.
      should_processing_continue() blithely assumed that
      PG_DIAG_SEVERITY_NONLOCALIZED would always be present, and would
      dump core if it wasn't.
      
      Per grepping to see if 6d157e7cb's mistake was repeated elsewhere.
    • Don't fail on libpq-generated error reports in ecpg_raise_backend(). · a5dbca46
      Tom Lane authored
      An error PGresult generated by libpq itself, such as a report of
      connection loss, won't have broken-down error fields.
      ecpg_raise_backend() blithely assumed that PG_DIAG_MESSAGE_PRIMARY
      would always be present, and would end up passing a NULL string
      pointer to snprintf when it isn't.  That would typically crash
      before 3779ac62d, and it would fail to provide a useful error report
      in any case.  Best practice is to substitute PQerrorMessage(conn)
      in such cases, so do that.
      
      Per bug #17421 from Masayuki Hirose.  Back-patch to all supported
      branches.
      
      Discussion: https://postgr.es/m/17421-790ff887e3188874@postgresql.org
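The defensive pattern in both of these fixes: never hand a possibly-NULL error field to a %s conversion; fall back to the connection-level message instead. `format_backend_error` is an illustrative helper, not the actual ecpg_raise_backend() code; the fallback string stands in for what PQerrorMessage(conn) would return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* If the per-field message (e.g. PG_DIAG_MESSAGE_PRIMARY) is missing,
 * as happens for errors libpq generates itself such as connection
 * loss, substitute the connection-level message rather than passing a
 * NULL pointer to snprintf's %s (undefined behavior). */
void format_backend_error(char *buf, size_t buflen,
                          const char *primary, const char *conn_msg)
{
    const char *msg = (primary != NULL) ? primary : conn_msg;
    snprintf(buf, buflen, "ERROR: %s", msg);
}
```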
    • Fix psql's single transaction mode on client-side errors with -c/-f switches · a04ccf6d
      Michael Paquier authored
      psql --single-transaction is able to handle multiple -c and -f switches
      in a single transaction since d5563d7d, but this had the surprising
      behavior of forcing a transaction COMMIT even if psql failed with an
      error on the client side (for example, an incorrect path given to
      \copy), still committing any changes already applied in the backend.
      This commit makes the behavior more consistent by enforcing a
      transaction ROLLBACK if any command fails, whether client-side or
      backend-side, so that no changes are applied if an error happens in
      either of them.
      
      Some tests are added on HEAD to provide coverage for all this.
      Backend-side errors are unreliable to test, as IPC::Run can complain
      about SIGPIPE if psql quits before reading a query result, but that
      should work properly in the case where the errors come from psql
      itself, which is what the original report is about.
      
      Reported-by: Christoph Berg
      Author: Kyotaro Horiguchi, Michael Paquier
      Discussion: https://postgr.es/m/17504-76b68018e130415e@postgresql.org
      Backpatch-through: 10
  17. 03 Jun, 2022 2 commits
  18. 02 Jun, 2022 1 commit
  19. 01 Jun, 2022 4 commits
  20. 31 May, 2022 2 commits