1. 14 Jul, 2022 9 commits
  2. 13 Jul, 2022 1 commit
    • Alvaro Herrera's avatar
      Plug memory leak · 9e038d69
      Alvaro Herrera authored
      Commit 054325c5eeb3 created a memory leak in PQsendQueryInternal in case
      an error occurs while sending the message.  Repair.
      
      Backpatch to 14, like that commit.  Reported by Coverity.
      9e038d69
  3. 12 Jul, 2022 1 commit
    • Tom Lane's avatar
      Invent qsort_interruptible(). · af72b088
      Tom Lane authored
      Justin Pryzby reported that some scenarios could cause gathering
      of extended statistics to spend many seconds in an un-cancelable
      qsort() operation.  To fix, invent qsort_interruptible(), which is
      just like qsort_arg() except that it will also do CHECK_FOR_INTERRUPTS
      every so often.  This bloats the backend by a couple of kB, which
      seems like a good investment.  (We considered just enabling
      CHECK_FOR_INTERRUPTS in the existing qsort and qsort_arg functions,
      but there are some callers for which that'd demonstrably be unsafe.
      Opt-in seems like a better way.)
      
      For now, just apply qsort_interruptible() in statistics collection.
      There's probably more places where it could be useful, but we can
      always change other call sites as we find problems.
      
      Back-patch to v14.  Before that we didn't have extended stats on
      expressions, so that the problem was less severe.  Also, this patch
      depends on the sort_template infrastructure introduced in v14.
      
      Tom Lane and Justin Pryzby
      
      Discussion: https://postgr.es/m/20220509000108.GQ28830@telsasoft.com
      af72b088
  4. 11 Jul, 2022 2 commits
  5. 10 Jul, 2022 1 commit
  6. 09 Jul, 2022 1 commit
  7. 08 Jul, 2022 1 commit
  8. 07 Jul, 2022 1 commit
    • Dean Rasheed's avatar
      Fix alias matching in transformLockingClause(). · 8d846444
      Dean Rasheed authored
      When locking a specific named relation for a FOR [KEY] UPDATE/SHARE
      clause, transformLockingClause() finds the relation to lock by
      scanning the rangetable for an RTE with a matching eref->aliasname.
      However, it failed to account for the visibility rules of a join RTE.
      
      If a join RTE doesn't have a user-supplied alias, it will have a
      generated eref->aliasname of "unnamed_join" that is not visible as a
      relation name in the parse namespace. Such an RTE needs to be skipped,
      otherwise it might be found in preference to a regular base relation
      with a user-supplied alias of "unnamed_join", preventing it from being
      locked.
      
      In addition, if a join RTE doesn't have a user-supplied alias, but
      does have a join_using_alias, then the RTE needs to be matched using
      that alias rather than the generated eref->aliasname, otherwise a
      misleading "relation not found" error will be reported rather than a
      "join cannot be locked" error.
      
      Backpatch all the way, except for the second part which only goes back
      to 14, where JOIN USING aliases were added.
      
      Dean Rasheed, reviewed by Tom Lane.
      
      Discussion: https://postgr.es/m/CAEZATCUY_KOBnqxbTSPf=7fz9HWPnZ5Xgb9SwYzZ8rFXe7nb=w@mail.gmail.com
      8d846444
  9. 05 Jul, 2022 4 commits
    • Tom Lane's avatar
      Tighten pg_upgrade's new check for non-upgradable anyarray usages. · 9783413c
      Tom Lane authored
      We only need to reject cases when the aggregate or operator is
      itself declared with a polymorphic type.  Per buildfarm.
      
      Discussion: https://postgr.es/m/3383880.QJadu78ljV@vejsadalnx
      9783413c
    • Tom Lane's avatar
      Fix pg_upgrade to detect non-upgradable anyarray usages. · 175e60a5
      Tom Lane authored
      When we changed some built-in functions to use anycompatiblearray
      instead of anyarray, we created a dump/restore hazard for user-defined
      operators and aggregates relying on those functions: the user objects
      have to be modified to change their signatures similarly.  This causes
      pg_upgrade to fail partway through if the source installation contains
      such objects.  We generally try to have pg_upgrade detect such hazards
      and fail before it does anything exciting, so add logic to detect
      this case too.
      
      Back-patch to v14 where the change was made.
      
      Justin Pryzby, reviewed by Andrey Borodin
      
      Discussion: https://postgr.es/m/3383880.QJadu78ljV@vejsadalnx
      175e60a5
    • Alvaro Herrera's avatar
      libpq: Improve idle state handling in pipeline mode · 7c1f4261
      Alvaro Herrera authored
      We were going into IDLE state too soon when executing queries via
      PQsendQuery in pipeline mode, causing several scenarios to misbehave in
      different ways -- most notably, as reported by Daniele Varrazzo, that a
      warning message is produced by libpq:
        message type 0x33 arrived from server while idle
      But it is also possible, if queries are sent and results consumed not in
      lockstep, for the expected mediating NULL result values from PQgetResult
      to be lost (a problem which has not been reported, but which is more
      serious).
      
      Fix this by introducing two new concepts: one is a command queue element
      PGQUERY_CLOSE to tell libpq to wait for the CloseComplete server
      response to the Close message that is sent by PQsendQuery.  Because the
      application is not expecting any PGresult from this, the mechanism to
      consume it is a bit hackish.
      
      The other concept, authored by Horiguchi-san, is a PGASYNC_PIPELINE_IDLE
      state for libpq's state machine to differentiate "really idle" from
      merely "the idle state that occurs in between reading results from the
      server for elements in the pipeline".  This makes libpq not go fully
      IDLE when the libpq command queue contains entries; in normal cases, we
      only go IDLE once at the end of the pipeline, when the server response
      to the final SYNC message is received.  (However, there are corner cases
      it doesn't fix, such as terminating the query sequence by
      PQsendFlushRequest instead of PQpipelineSync; this sort of scenario is
      what requires PGQUERY_CLOSE bit above.)
      
      This last bit helps make the libpq state machine clearer; in particular
      we can get rid of an ugly hack in pqParseInput3 to avoid considering
      IDLE as such when the command queue contains entries.
      
      A new test mode is added to libpq_pipeline.c to tickle some related
      problematic cases.
      Reported-by: default avatarDaniele Varrazzo <daniele.varrazzo@gmail.com>
      Co-authored-by: default avatarKyotaro Horiguchi <horikyota.ntt@gmail.com>
      Discussion: https://postgr.es/m/CA+mi_8bvD0_CW3sumgwPvWdNzXY32itoG_16tDYRu_1S2gV2iw@mail.gmail.com
      7c1f4261
    • Alvaro Herrera's avatar
      BRIN: improve documentation on summarization · 0b71e43c
      Alvaro Herrera authored
      The existing wording wasn't clear enough and some details weren't
      anywhere, such as the fact that autosummarization is off by default.
      Improve.
      
      Authors: Roberto Mello, Jaime Casanova, Justin Pryzby, Álvaro Herrera
      Discussion: https://postgr.es/m/CAKz==bK_NoJytRyQfX8K-erCW3Ff7--oGYpiB8+ePVS7dRVW_A@mail.gmail.com
      Discussion: https://postgr.es/m/20220224193520.GY9008@telsasoft.com
      0b71e43c
  10. 03 Jul, 2022 2 commits
  11. 02 Jul, 2022 1 commit
    • Noah Misch's avatar
      ecpglib: call newlocale() once per process. · 5b94e2bd
      Noah Misch authored
      ecpglib has been calling it once per SQL query and once per EXEC SQL GET
      DESCRIPTOR.  Instead, if newlocale() has not succeeded before, call it
      while establishing a connection.  This mitigates three problems:
      - If newlocale() failed in EXEC SQL GET DESCRIPTOR, the command silently
        proceeded without the intended locale change.
      - On AIX, each newlocale()+freelocale() cycle leaked memory.
      - newlocale() CPU usage may have been nontrivial.
      
      Fail the connection attempt if newlocale() fails.  Rearrange
      ecpg_do_prologue() to validate the connection before its uselocale().
      
      The sort of program that may regress is one running in an environment
      where newlocale() fails.  If that program establishes connections
      without running SQL statements, it will stop working in response to this
      change.  I'm betting against the importance of such an ECPG use case.
      Most SQL execution (any using ECPGdo()) has long required newlocale()
      success, so there's little a connection could do without newlocale().
      
      Back-patch to v10 (all supported versions).
      
      Reviewed by Tom Lane.  Reported by Guillaume Lelarge.
      
      Discussion: https://postgr.es/m/20220101074055.GA54621@rfd.leadboat.com
      5b94e2bd
  12. 01 Jul, 2022 1 commit
    • Thomas Munro's avatar
      Harden dsm_impl.c against unexpected EEXIST. · fb81a93a
      Thomas Munro authored
      Previously, we trusted the OS not to report EEXIST unless we'd passed in
      IPC_CREAT | IPC_EXCL or O_CREAT | O_EXCL, as appropriate.  Solaris's
      shm_open() can in fact do that, causing us to crash because we didn't
      ereport and then we blithely assumed the mapping was successful.
      
      Let's treat EEXIST just like any other error, unless we're actually
      trying to create a new segment.  This applies to shm_open(), where this
      behavior has been seen, and also to the equivalent operations for our
      sysv and mmap modes just on principle.
      
      Based on the underlying reason for the error, namely contention on a
      lock file managed by Solaris librt for each distinct name, this problem
      is only likely to happen on 15 and later, because the new shared memory
      stats system produces shm_open() calls for the same path from
      potentially large numbers of backends concurrently during
      authentication.  Earlier releases only shared memory segments between a
      small number of parallel workers under one Gather node.  You could
      probably hit it if you tried hard enough though, and we should have been
      more defensive in the first place.  Therefore, back-patch to all
      supported releases.
      
      Per build farm animal margay.  This isn't the end of the story, though,
      it just changes random crashes into random "File exists" errors; more
      work needed for a green build farm.
      Reviewed-by: default avatarRobert Haas <robertmhaas@gmail.com>
      Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com
      fb81a93a
  13. 27 Jun, 2022 1 commit
    • Heikki Linnakangas's avatar
      Fix visibility check when XID is committed in CLOG but not in procarray. · e24615a0
      Heikki Linnakangas authored
      TransactionIdIsInProgress had a fast path to return 'false' if the
      single-item CLOG cache said that the transaction was known to be
      committed. However, that was wrong, because a transaction is first
      marked as committed in the CLOG but doesn't become visible to others
      until it has removed its XID from the proc array. That could lead to an
      error:
      
          ERROR:  t_xmin is uncommitted in tuple to be updated
      
      or for an UPDATE to go ahead without blocking, before the previous
      UPDATE on the same row was made visible.
      
      The window is usually very short, but synchronous replication makes it
      much wider, because the wait for synchronous replica happens in that
      window.
      
      Another thing that makes it hard to hit is that it's hard to get such
      a commit-in-progress transaction into the single item CLOG cache.
      Normally, if you call TransactionIdIsInProgress on such a transaction,
      it determines that the XID is in progress without checking the CLOG
      and without populating the cache. One way to prime the cache is to
      explicitly call pg_xact_status() on the XID. Another way is to use a
      lot of subtransactions, so that the subxid cache in the proc array is
      overflown, making TransactionIdIsInProgress rely on pg_subtrans and
      CLOG checks.
      
      This has been broken ever since it was introduced in 2008, but the race
      condition is very hard to hit, especially without synchronous
      replication. There were a couple of reports of the error starting from
      summer 2021, but no one was able to find the root cause then.
      
      TransactionIdIsKnownCompleted() is now unused. In 'master', remove it,
      but I left it in place in backbranches in case it's used by extensions.
      
      Also change pg_xact_status() to check TransactionIdIsInProgress().
      Previously, it only checked the CLOG, and returned "committed" before
      the transaction was actually made visible to other queries. Note that
      this also means that you cannot use pg_xact_status() to reproduce the
      bug anymore, even if the code wasn't fixed.
      
      Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with
      the pg_xact_status() change added by me.
      
      Author: Simon Riggs
      Reviewed-by: Andres Freund
      Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru
      e24615a0
  14. 26 Jun, 2022 1 commit
  15. 25 Jun, 2022 4 commits
  16. 23 Jun, 2022 1 commit
  17. 22 Jun, 2022 3 commits
    • Bruce Momjian's avatar
      doc: improve wording of plpgsql RAISE format text · f1e3a707
      Bruce Momjian authored
      Reported-by: pg@kirasoft.com
      
      Discussion: https://postgr.es/m/165455351426.573551.7050474465030525109@wrigleys.postgresql.org
      
      Backpatch-through: 10
      f1e3a707
    • Bruce Momjian's avatar
      doc: clarify wording about phantom reads · 1463f22d
      Bruce Momjian authored
      Reported-by: akhilhello@gmail.com
      
      Discussion: https://postgr.es/m/165222922369.669.10475917322916060899@wrigleys.postgresql.org
      
      Backpatch-through: 10
      1463f22d
    • Tom Lane's avatar
      Fix SPI's handling of errors during transaction commit. · 60465188
      Tom Lane authored
      SPI_commit previously left it up to the caller to recover from any error
      occurring during commit.  Since that's complicated and requires use of
      low-level xact.c facilities, it's not too surprising that no caller got
      it right.  Let's move the responsibility for cleanup into spi.c.  Doing
      that requires redefining SPI_commit as starting a new transaction, so
      that it becomes equivalent to SPI_commit_and_chain except that you get
      default transaction characteristics instead of preserving the prior
      transaction's characteristics.  We can make this pretty transparent
      API-wise by redefining SPI_start_transaction() as a no-op.  Callers
      that expect to do something in between might be surprised, but
      available evidence is that no callers do so.
      
      Having made that API redefinition, we can fix this mess by having
      SPI_commit[_and_chain] trap errors and start a new, clean transaction
      before re-throwing the error.  Likewise for SPI_rollback[_and_chain].
      Some cleanup is also needed in AtEOXact_SPI, which was nowhere near
      smart enough to deal with SPI contexts nested inside a committing
      context.
      
      While plperl and pltcl need no changes beyond removing their now-useless
      SPI_start_transaction() calls, plpython needs some more work because it
      hadn't gotten the memo about catching commit/rollback errors in the
      first place.  Such an error resulted in longjmp'ing out of the Python
      interpreter, which leaks Python stack entries at present and is reported
      to crash Python 3.11 altogether.  Add the missing logic to catch such
      errors and convert them into Python exceptions.
      
      This is a back-patch of commit 2e517818f.  That's now aged long enough
      to reduce the concerns about whether it will break something, and we
      do need to ensure that supported branches will work with Python 3.11.
      
      Peter Eisentraut and Tom Lane
      
      Discussion: https://postgr.es/m/3375ffd8-d71c-2565-e348-a597d6e739e3@enterprisedb.com
      Discussion: https://postgr.es/m/17416-ed8fe5d7213d6c25@postgresql.org
      60465188
  18. 21 Jun, 2022 2 commits
    • Amit Kapila's avatar
      Fix stale values in partition map entries on subscribers. · f0022a77
      Amit Kapila authored
      We build the partition map entries on subscribers while applying the
      changes for update/delete on partitions. The component relation in each
      entry is closed after its use so we need to update it on successive use of
      cache entries.
      
      This problem was there since the original commit f1ac27bf that
      introduced this code but we didn't notice it till the recent commit
      26b3455afa started to use the component relation of partition map cache
      entry.
      
      Reported-by: Tom Lane, as per buildfarm
      Author: Amit Langote, Hou Zhijie
      Reviewed-by: Amit Kapila, Shi Yu
      Backpatch-through: 13, where it was introduced
      Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
      f0022a77
    • Amit Kapila's avatar
      Fix partition table's REPLICA IDENTITY checking on the subscriber. · 52d5ea9a
      Amit Kapila authored
      In logical replication, we will check if the target table on the
      subscriber is updatable by comparing the replica identity of the table on
      the publisher with the table on the subscriber. When the target table is a
      partitioned table, we only check its replica identity but not for the
      partition tables. This leads to assertion failure while applying changes
      for update/delete as we expect those to succeed only when the
      corresponding partition table has a primary key or has a replica
      identity defined.
      
      Fix it by checking the replica identity of the partition table while
      applying changes.
      
      Reported-by: Shi Yu
      Author: Shi Yu, Hou Zhijie
      Reviewed-by: Amit Langote, Amit Kapila
      Backpatch-through: 13, where it was introduced
      Discussion: https://postgr.es/m/OSZPR01MB6310F46CD425A967E4AEF736FDA49@OSZPR01MB6310.jpnprd01.prod.outlook.com
      52d5ea9a
  19. 16 Jun, 2022 1 commit
  20. 15 Jun, 2022 1 commit
  21. 14 Jun, 2022 1 commit
    • Tom Lane's avatar
      Avoid ecpglib core dump with out-of-order operations. · 7bc21ed8
      Tom Lane authored
      If an application executed operations like EXEC SQL PREPARE
      without having first established a database connection, it could
      get a core dump instead of the expected clean failure.  This
      occurred because we did "pthread_getspecific(actual_connection_key)"
      without ever having initialized the TSD key actual_connection_key.
      The results of that are probably platform-specific, but at least
      on Linux it often leads to a crash.
      
      To fix, add calls to ecpg_pthreads_init() in the code paths that
      might use actual_connection_key uninitialized.  It's harmless
      (and hopefully inexpensive) to do that more than once.
      
      Per bug #17514 from Okano Naoki.  The problem's ancient, so
      back-patch to all supported branches.
      
      Discussion: https://postgr.es/m/17514-edd4fad547c5692c@postgresql.org
      7bc21ed8