1. 20 Jul, 2022 1 commit
    • Fujii Masao's avatar
      Prevent BASE_BACKUP in the middle of another backup in the same session. · 2aedf25e
      Fujii Masao authored
      Multiple non-exclusive backups are able to be run conrrently in different
      sessions. But, in the same session, only one non-exclusive backup can be
      run at the same moment. If pg_backup_start (pg_start_backup in v14 or before)
      is called in the middle of another non-exclusive backup in the same session,
      an error is thrown.
      
      However, previously, in logical replication walsender mode, even if that
      walsender session had already called pg_backup_start and started
      a non-exclusive backup, it could execute BASE_BACKUP command and
      start another non-exclusive backup. Which caused subsequent pg_backup_stop
      to throw an error because BASE_BACKUP unexpectedly reset the session state
      marked by pg_backup_start.
      
      This commit prevents BASE_BACKUP command in the middle of another
      non-exclusive backup in the same session.
      
      Back-patch to all supported branches.
      
      Author: Fujii Masao
      Reviewed-by: Kyotaro Horiguchi, Masahiko Sawada, Michael Paquier, Robert Haas
      Discussion: https://postgr.es/m/3374718f-9fbf-a950-6d66-d973e027f44c@oss.nttdata.com
      2aedf25e
  2. 18 Jul, 2022 2 commits
  3. 17 Jul, 2022 2 commits
  4. 16 Jul, 2022 1 commit
  5. 15 Jul, 2022 2 commits
  6. 14 Jul, 2022 9 commits
  7. 13 Jul, 2022 1 commit
    • Alvaro Herrera's avatar
      Plug memory leak · 9e038d69
      Alvaro Herrera authored
      Commit 054325c5eeb3 created a memory leak in PQsendQueryInternal in case
      an error occurs while sending the message.  Repair.
      
      Backpatch to 14, like that commit.  Reported by Coverity.
      9e038d69
  8. 12 Jul, 2022 1 commit
    • Tom Lane's avatar
      Invent qsort_interruptible(). · af72b088
      Tom Lane authored
      Justin Pryzby reported that some scenarios could cause gathering
      of extended statistics to spend many seconds in an un-cancelable
      qsort() operation.  To fix, invent qsort_interruptible(), which is
      just like qsort_arg() except that it will also do CHECK_FOR_INTERRUPTS
      every so often.  This bloats the backend by a couple of kB, which
      seems like a good investment.  (We considered just enabling
      CHECK_FOR_INTERRUPTS in the existing qsort and qsort_arg functions,
      but there are some callers for which that'd demonstrably be unsafe.
      Opt-in seems like a better way.)
      
      For now, just apply qsort_interruptible() in statistics collection.
      There's probably more places where it could be useful, but we can
      always change other call sites as we find problems.
      
      Back-patch to v14.  Before that we didn't have extended stats on
      expressions, so that the problem was less severe.  Also, this patch
      depends on the sort_template infrastructure introduced in v14.
      
      Tom Lane and Justin Pryzby
      
      Discussion: https://postgr.es/m/20220509000108.GQ28830@telsasoft.com
      af72b088
  9. 11 Jul, 2022 2 commits
  10. 10 Jul, 2022 1 commit
  11. 09 Jul, 2022 1 commit
  12. 08 Jul, 2022 1 commit
  13. 07 Jul, 2022 1 commit
    • Dean Rasheed's avatar
      Fix alias matching in transformLockingClause(). · 8d846444
      Dean Rasheed authored
      When locking a specific named relation for a FOR [KEY] UPDATE/SHARE
      clause, transformLockingClause() finds the relation to lock by
      scanning the rangetable for an RTE with a matching eref->aliasname.
      However, it failed to account for the visibility rules of a join RTE.
      
      If a join RTE doesn't have a user-supplied alias, it will have a
      generated eref->aliasname of "unnamed_join" that is not visible as a
      relation name in the parse namespace. Such an RTE needs to be skipped,
      otherwise it might be found in preference to a regular base relation
      with a user-supplied alias of "unnamed_join", preventing it from being
      locked.
      
      In addition, if a join RTE doesn't have a user-supplied alias, but
      does have a join_using_alias, then the RTE needs to be matched using
      that alias rather than the generated eref->aliasname, otherwise a
      misleading "relation not found" error will be reported rather than a
      "join cannot be locked" error.
      
      Backpatch all the way, except for the second part which only goes back
      to 14, where JOIN USING aliases were added.
      
      Dean Rasheed, reviewed by Tom Lane.
      
      Discussion: https://postgr.es/m/CAEZATCUY_KOBnqxbTSPf=7fz9HWPnZ5Xgb9SwYzZ8rFXe7nb=w@mail.gmail.com
      8d846444
  14. 05 Jul, 2022 4 commits
    • Tom Lane's avatar
      Tighten pg_upgrade's new check for non-upgradable anyarray usages. · 9783413c
      Tom Lane authored
      We only need to reject cases when the aggregate or operator is
      itself declared with a polymorphic type.  Per buildfarm.
      
      Discussion: https://postgr.es/m/3383880.QJadu78ljV@vejsadalnx
      9783413c
    • Tom Lane's avatar
      Fix pg_upgrade to detect non-upgradable anyarray usages. · 175e60a5
      Tom Lane authored
      When we changed some built-in functions to use anycompatiblearray
      instead of anyarray, we created a dump/restore hazard for user-defined
      operators and aggregates relying on those functions: the user objects
      have to be modified to change their signatures similarly.  This causes
      pg_upgrade to fail partway through if the source installation contains
      such objects.  We generally try to have pg_upgrade detect such hazards
      and fail before it does anything exciting, so add logic to detect
      this case too.
      
      Back-patch to v14 where the change was made.
      
      Justin Pryzby, reviewed by Andrey Borodin
      
      Discussion: https://postgr.es/m/3383880.QJadu78ljV@vejsadalnx
      175e60a5
    • Alvaro Herrera's avatar
      libpq: Improve idle state handling in pipeline mode · 7c1f4261
      Alvaro Herrera authored
      We were going into IDLE state too soon when executing queries via
      PQsendQuery in pipeline mode, causing several scenarios to misbehave in
      different ways -- most notably, as reported by Daniele Varrazzo, that a
      warning message is produced by libpq:
        message type 0x33 arrived from server while idle
      But it is also possible, if queries are sent and results consumed not in
      lockstep, for the expected mediating NULL result values from PQgetResult
      to be lost (a problem which has not been reported, but which is more
      serious).
      
      Fix this by introducing two new concepts: one is a command queue element
      PGQUERY_CLOSE to tell libpq to wait for the CloseComplete server
      response to the Close message that is sent by PQsendQuery.  Because the
      application is not expecting any PGresult from this, the mechanism to
      consume it is a bit hackish.
      
      The other concept, authored by Horiguchi-san, is a PGASYNC_PIPELINE_IDLE
      state for libpq's state machine to differentiate "really idle" from
      merely "the idle state that occurs in between reading results from the
      server for elements in the pipeline".  This makes libpq not go fully
      IDLE when the libpq command queue contains entries; in normal cases, we
      only go IDLE once at the end of the pipeline, when the server response
      to the final SYNC message is received.  (However, there are corner cases
      it doesn't fix, such as terminating the query sequence by
      PQsendFlushRequest instead of PQpipelineSync; this sort of scenario is
      what requires PGQUERY_CLOSE bit above.)
      
      This last bit helps make the libpq state machine clearer; in particular
      we can get rid of an ugly hack in pqParseInput3 to avoid considering
      IDLE as such when the command queue contains entries.
      
      A new test mode is added to libpq_pipeline.c to tickle some related
      problematic cases.
      Reported-by: default avatarDaniele Varrazzo <daniele.varrazzo@gmail.com>
      Co-authored-by: default avatarKyotaro Horiguchi <horikyota.ntt@gmail.com>
      Discussion: https://postgr.es/m/CA+mi_8bvD0_CW3sumgwPvWdNzXY32itoG_16tDYRu_1S2gV2iw@mail.gmail.com
      7c1f4261
    • Alvaro Herrera's avatar
      BRIN: improve documentation on summarization · 0b71e43c
      Alvaro Herrera authored
      The existing wording wasn't clear enough and some details weren't
      anywhere, such as the fact that autosummarization is off by default.
      Improve.
      
      Authors: Roberto Mello, Jaime Casanova, Justin Pryzby, Álvaro Herrera
      Discussion: https://postgr.es/m/CAKz==bK_NoJytRyQfX8K-erCW3Ff7--oGYpiB8+ePVS7dRVW_A@mail.gmail.com
      Discussion: https://postgr.es/m/20220224193520.GY9008@telsasoft.com
      0b71e43c
  15. 03 Jul, 2022 2 commits
  16. 02 Jul, 2022 1 commit
    • Noah Misch's avatar
      ecpglib: call newlocale() once per process. · 5b94e2bd
      Noah Misch authored
      ecpglib has been calling it once per SQL query and once per EXEC SQL GET
      DESCRIPTOR.  Instead, if newlocale() has not succeeded before, call it
      while establishing a connection.  This mitigates three problems:
      - If newlocale() failed in EXEC SQL GET DESCRIPTOR, the command silently
        proceeded without the intended locale change.
      - On AIX, each newlocale()+freelocale() cycle leaked memory.
      - newlocale() CPU usage may have been nontrivial.
      
      Fail the connection attempt if newlocale() fails.  Rearrange
      ecpg_do_prologue() to validate the connection before its uselocale().
      
      The sort of program that may regress is one running in an environment
      where newlocale() fails.  If that program establishes connections
      without running SQL statements, it will stop working in response to this
      change.  I'm betting against the importance of such an ECPG use case.
      Most SQL execution (any using ECPGdo()) has long required newlocale()
      success, so there's little a connection could do without newlocale().
      
      Back-patch to v10 (all supported versions).
      
      Reviewed by Tom Lane.  Reported by Guillaume Lelarge.
      
      Discussion: https://postgr.es/m/20220101074055.GA54621@rfd.leadboat.com
      5b94e2bd
  17. 01 Jul, 2022 1 commit
    • Thomas Munro's avatar
      Harden dsm_impl.c against unexpected EEXIST. · fb81a93a
      Thomas Munro authored
      Previously, we trusted the OS not to report EEXIST unless we'd passed in
      IPC_CREAT | IPC_EXCL or O_CREAT | O_EXCL, as appropriate.  Solaris's
      shm_open() can in fact do that, causing us to crash because we didn't
      ereport and then we blithely assumed the mapping was successful.
      
      Let's treat EEXIST just like any other error, unless we're actually
      trying to create a new segment.  This applies to shm_open(), where this
      behavior has been seen, and also to the equivalent operations for our
      sysv and mmap modes just on principle.
      
      Based on the underlying reason for the error, namely contention on a
      lock file managed by Solaris librt for each distinct name, this problem
      is only likely to happen on 15 and later, because the new shared memory
      stats system produces shm_open() calls for the same path from
      potentially large numbers of backends concurrently during
      authentication.  Earlier releases only shared memory segments between a
      small number of parallel workers under one Gather node.  You could
      probably hit it if you tried hard enough though, and we should have been
      more defensive in the first place.  Therefore, back-patch to all
      supported releases.
      
      Per build farm animal margay.  This isn't the end of the story, though,
      it just changes random crashes into random "File exists" errors; more
      work needed for a green build farm.
      Reviewed-by: default avatarRobert Haas <robertmhaas@gmail.com>
      Discussion: https://postgr.es/m/CA%2BhUKGKqKrCV5xKWfh9rnm%3Do%3DDwZLTLtnsj_XpUi9g5%3DV%2B9oyg%40mail.gmail.com
      fb81a93a
  18. 27 Jun, 2022 1 commit
    • Heikki Linnakangas's avatar
      Fix visibility check when XID is committed in CLOG but not in procarray. · e24615a0
      Heikki Linnakangas authored
      TransactionIdIsInProgress had a fast path to return 'false' if the
      single-item CLOG cache said that the transaction was known to be
      committed. However, that was wrong, because a transaction is first
      marked as committed in the CLOG but doesn't become visible to others
      until it has removed its XID from the proc array. That could lead to an
      error:
      
          ERROR:  t_xmin is uncommitted in tuple to be updated
      
      or for an UPDATE to go ahead without blocking, before the previous
      UPDATE on the same row was made visible.
      
      The window is usually very short, but synchronous replication makes it
      much wider, because the wait for synchronous replica happens in that
      window.
      
      Another thing that makes it hard to hit is that it's hard to get such
      a commit-in-progress transaction into the single item CLOG cache.
      Normally, if you call TransactionIdIsInProgress on such a transaction,
      it determines that the XID is in progress without checking the CLOG
      and without populating the cache. One way to prime the cache is to
      explicitly call pg_xact_status() on the XID. Another way is to use a
      lot of subtransactions, so that the subxid cache in the proc array is
      overflown, making TransactionIdIsInProgress rely on pg_subtrans and
      CLOG checks.
      
      This has been broken ever since it was introduced in 2008, but the race
      condition is very hard to hit, especially without synchronous
      replication. There were a couple of reports of the error starting from
      summer 2021, but no one was able to find the root cause then.
      
      TransactionIdIsKnownCompleted() is now unused. In 'master', remove it,
      but I left it in place in backbranches in case it's used by extensions.
      
      Also change pg_xact_status() to check TransactionIdIsInProgress().
      Previously, it only checked the CLOG, and returned "committed" before
      the transaction was actually made visible to other queries. Note that
      this also means that you cannot use pg_xact_status() to reproduce the
      bug anymore, even if the code wasn't fixed.
      
      Report and analysis by Konstantin Knizhnik. Patch by Simon Riggs, with
      the pg_xact_status() change added by me.
      
      Author: Simon Riggs
      Reviewed-by: Andres Freund
      Discussion: https://www.postgresql.org/message-id/flat/4da7913d-398c-e2ad-d777-f752cf7f0bbb%40garret.ru
      e24615a0
  19. 26 Jun, 2022 1 commit
  20. 25 Jun, 2022 4 commits
  21. 23 Jun, 2022 1 commit