1. 17 Jun, 2021 2 commits
    • Heikki Linnakangas's avatar
      Tidy up GetMultiXactIdMembers()'s behavior on error · d24c5658
      Heikki Linnakangas authored
      One of the error paths left *members uninitialized. That's not a live
      bug, because most callers don't look at *members when the function
      returns -1, but let's be tidy. One caller, in heap_lock_tuple(), does
      "if (members != NULL) pfree(members)", but AFAICS it never passes an
      invalid 'multi' value so it should not reach that error case.
      
      The callers are also a bit inconsistent in their expectations.
      heap_lock_tuple() pfrees the 'members' array if it's not-NULL, others
      pfree() it if "nmembers >= 0", and others if "nmembers > 0". That's
      not a live bug either, because the function should never return 0, but
      add an Assert for that to make it more clear. I left the callers alone
      for now.
      
      I also moved the line where we set *nmembers. It wasn't wrong before,
      but I like to do that right next to the 'return' statement, to make it
      clear that it's always set on return.
      
      Also remove one unreachable return statement after ereport(ERROR), for
      brevity and for consistency with the similar if-block right after it.
      
      Author: Greg Nancarrow with the additional changes by me
      Backpatch-through: 9.6, all supported versions
      d24c5658
    • Amit Kapila's avatar
      Document a few caveats in synchronous logical replication. · 3cb828db
      Amit Kapila authored
      In a synchronous logical setup, locking [user] catalog tables can cause
      deadlock. This is because logical decoding of transactions can lock
      catalog tables to access them so exclusively locking those in transactions
      can lead to deadlock. To avoid this users must refrain from having
      exclusive locks on catalog tables.
      
      Author: Takamichi Osumi
      Reviewed-by: Vignesh C, Amit Kapila
      Backpatch-through: 9.6
      Discussion: https://www.postgresql.org/message-id/20210222222847.tpnb6eg3yiykzpky%40alap3.anarazel.de
      3cb828db
  2. 16 Jun, 2021 4 commits
    • Tom Lane's avatar
      Fix plancache refcount leak after error in ExecuteQuery. · 131ea3e9
      Tom Lane authored
      When stuffing a plan from the plancache into a Portal, one is
      not supposed to risk throwing an error between GetCachedPlan and
      PortalDefineQuery; if that happens, the plan refcount incremented
      by GetCachedPlan will be leaked.  I managed to break this rule
      while refactoring code in 9dbf2b7d.  There is no visible
      consequence other than some memory leakage, and since nobody is
      very likely to trigger the relevant error conditions many times
      in a row, it's not surprising we haven't noticed.  Nonetheless,
      it's a bug, so rearrange the order of operations to remove the
      hazard.
      
      Noted on the way to looking for a better fix for bug #17053.
      This mistake is pretty old, so back-patch to all supported
      branches.
      131ea3e9
    • Tomas Vondra's avatar
      Fix copying data into slots with FDW batching · 99cea49d
      Tomas Vondra authored
      Commit b676ac44 optimized handling of tuple slots with bulk inserts
      into foreign tables, so that the slots are initialized only once and
      reused for all batches. The data was however copied into the slots only
      after the initialization, inserting duplicate values when the slot gets
      reused. Fixed by moving the ExecCopySlot outside the init branch.
      
      The existing postgres_fdw tests failed to catch this due to inserting
      data into foreign tables without unique indexes, and then checking only
      the number of inserted rows. This adds a new test with both a unique
      index and a check of inserted values.
      
      Reported-by: Alexander Pyhalov
      Discussion: https://postgr.es/m/7a8cf8d56b3d18e5c0bccd6cd42d04ac%40postgrespro.ru
      99cea49d
    • Tom Lane's avatar
      Improve SQLSTATE reporting in some replication-related code. · 6b787d9e
      Tom Lane authored
      I started out with the goal of reporting ERRCODE_CONNECTION_FAILURE
      when walrcv_connect() fails, but as I looked around I realized that
      whoever wrote this code was of the opinion that errcodes are purely
      optional.  That's not my understanding of our project policy.  Hence,
      make sure that an errcode is provided in each ereport that (a) is
      ERROR or higher level and (b) isn't arguably an internal logic error.
      Also fix some very dubious existing errcode assignments.
      
      While this is not per policy, it's also largely cosmetic, since few
      of these cases could get reported to applications.  So I don't
      feel a need to back-patch.
      
      Discussion: https://postgr.es/m/2189704.1623512522@sss.pgh.pa.us
      6b787d9e
    • Heikki Linnakangas's avatar
      Fix outdated comment that talked about seek position of WAL file. · d0303bc8
      Heikki Linnakangas authored
      Since commit c24dcd0c, we have been using pg_pread() to read the WAL
      file, which doesn't change the seek position (unless we fall back to
      the implementation in src/port/pread.c). Update comment accordingly.
      
      Backpatch-through: 12, where we started to use pg_pread()
      d0303bc8
  3. 15 Jun, 2021 12 commits
  4. 14 Jun, 2021 6 commits
  5. 13 Jun, 2021 3 commits
    • Tom Lane's avatar
      Work around portability issue with newer versions of mktime(). · f807e341
      Tom Lane authored
      Recent glibc versions have made mktime() fail if tm_isdst is
      inconsistent with the prevailing timezone; in particular it fails for
      tm_isdst = 1 when the zone is UTC.  (This seems wildly inconsistent
      with the POSIX-mandated treatment of "incorrect" values for the other
      fields of struct tm, so if you ask me it's a bug, but I bet they'll
      say it's intentional.)  This has been observed to cause cosmetic
      problems when pg_restore'ing an archive created in a different
      timezone.
      
      To fix, do mktime() using the field values from the archive, and if
      that fails try again with tm_isdst = -1.  This will give a result
      that's off by the UTC-offset difference from the original zone, but
      that was true before, too.  It's not terribly critical since we don't
      do anything with the result except possibly print it.  (Someday we
      should flush this entire bit of logic and record a standard-format
      timestamp in the archive instead.  That's not okay for a back-patched
      bug fix, though.)
      
      Also, guard our only other use of mktime() by having initdb's
      build_time_t() set tm_isdst = -1 not 0.  This case could only have
      an issue in zones that are DST year-round; but I think some do exist,
      or could in future.
      
      Per report from Wells Oliver.  Back-patch to all supported
      versions, since any of them might need to run with a newer glibc.
      
      Discussion: https://postgr.es/m/CAOC+FBWDhDHO7G-i1_n_hjRzCnUeFO+H-Czi1y10mFhRWpBrew@mail.gmail.com
      f807e341
    • Andrew Dunstan's avatar
      Further tweaks to stuck_on_old_timeline recovery test · 9d97c340
      Andrew Dunstan authored
      Translate path slashes on target directory path. This was confusing old
      branches, but is applied to all branches for the sake of uniformity.
      Perl is perfectly able to understand paths with forward slashes.
      
      Along the way, restore the previous archive_wait query, for the sake of
      uniformity with other tests, per gripe from Tom Lane.
      9d97c340
    • Michael Paquier's avatar
      Ignore more environment variables in pg_regress.c · a9e0b3b0
      Michael Paquier authored
      This is similar to the work done in 8279f68a for TestLib.pm, where
      environment variables set may cause unwanted failures if using a
      temporary installation with pg_regress.  The list of variables reset is
      adjusted in each stable branch depending on what is supported.
      
      Comments are added to remember that the lists in TestLib.pm and
      pg_regress.c had better be kept in sync.
      
      Reviewed-by: Álvaro Herrera
      Discussion: https://postgr.es/m/YMNR9GYDn+fHlMta@paquier.xyz
      Backpatch-through: 9.6
      a9e0b3b0
  6. 12 Jun, 2021 8 commits
    • Tom Lane's avatar
      Restore robustness of TAP tests that wait for postmaster restart. · f452aaf7
      Tom Lane authored
      Several TAP tests use poll_query_until() to wait for the postmaster
      to restart.  They were checking to see if a trivial query
      (e.g. "SELECT 1") succeeds.  However, that's problematic in the wake
      of commit 11e9caff, because now that we feed said query to psql
      via stdin, we risk IPC::Run whining about a SIGPIPE failure if psql
      quits before reading the query.  Hence, we can't use a nonempty
      query in cases where we need to wait for connection failures to
      stop happening.
      
      Per the precedent of commits c757a3da and 6d41dd04, we can pass
      "undef" as the query in such cases to ensure that IPC::Run has
      nothing to write.  However, then we have to say that the expected
      output is empty, and this exposes a deficiency in poll_query_until:
      if psql fails altogether and returns empty stdout, poll_query_until
      will treat that as a success!  That's because, contrary to its
      documentation, it makes no actual check for psql failure, looking
      neither at the exit status nor at stderr.
      
      To fix that, adjust poll_query_until to insist on empty stderr as
      well as a stdout match.  (I experimented with checking exit status
      instead, but it seems that psql often does exit(1) in cases that we
      need to consider successes.  That might be something to fix someday,
      but it would be a non-back-patchable behavior change.)
      
      Back-patch to v10.  The test cases needing this exist only as far
      back as v11, but it seems wise to keep poll_query_until's behavior
      the same in v10, in case we back-patch another such test case in
      future.  (9.6 does not currently need this change, because in that
      branch poll_query_until can't be told to accept empty stdout as
      a success case.)
      
      Per assorted buildfarm failures, mostly on hoverfly.
      
      Discussion: https://postgr.es/m/CAA4eK1+zM6L4QSA1XMvXY_qqWwdUmqkOS1+hWvL8QcYEBGA1Uw@mail.gmail.com
      f452aaf7
    • Tom Lane's avatar
      Ensure pg_filenode_relation(0, 0) returns NULL. · 1250aad4
      Tom Lane authored
      Previously, a zero value for the relfilenode resulted in
      a confusing error message about "unexpected duplicate".
      This function returns NULL for other invalid relfilenode
      values, so zero should be treated likewise.
      
      It's been like this all along, so back-patch to all supported
      branches.
      
      Justin Pryzby
      
      Discussion: https://postgr.es/m/20210612023324.GT16435@telsasoft.com
      1250aad4
    • Tom Lane's avatar
      Don't use Asserts to check for violations of replication protocol. · fe6a20ce
      Tom Lane authored
      Using an Assert to check the validity of incoming messages is an
      extremely poor decision.  In a debug build, it should not be that easy
      for a broken or malicious remote client to crash the logrep worker.
      The consequences could be even worse in non-debug builds, which will
      fail to make such checks at all, leading to who-knows-what misbehavior.
      Hence, promote every Assert that could possibly be triggered by wrong
      or out-of-order replication messages to a full test-and-ereport.
      
      To avoid bloating the set of messages the translation team has to cope
      with, establish a policy that replication protocol violation error
      reports don't need to be translated.  Hence, all the new messages here
      use errmsg_internal().  A couple of old messages are changed likewise
      for consistency.
      
      Along the way, fix some non-idiomatic or outright wrong uses of
      hash_search().
      
      Most of these mistakes are new with the "streaming replication"
      patch (commit 46482432), but a couple go back a long way.
      Back-patch as appropriate.
      
      Discussion: https://postgr.es/m/1719083.1623351052@sss.pgh.pa.us
      fe6a20ce
    • Andrew Dunstan's avatar
      Fix new recovery test for use under msys · c3652f97
      Andrew Dunstan authored
      Commit caba8f0d wasn't quite right for msys, as demonstrated by
      several buildfarm animals, including jacana and fairywren. We need to
      use the msys perl in the archive command, but call it in such a way that
      Windows will understand the path. Furthermore, inside the copy script we
      need to convert a Windows path to an msys path.
      c3652f97
    • Michael Paquier's avatar
      Simplify some code in getObjectTypeDescription() · b56b83aa
      Michael Paquier authored
      This routine is designed to never return an empty description or NULL,
      providing description fallbacks even if missing objects are accepted,
      but it included a code path where this was considered possible.  All the
      callers of this routine already consider NULL as not possible, so
      change a bit the code to map with the assumptions of the callers, and
      add more comments close to the callers of this routine to outline the
      behavior expected.
      
      This code is new as of 2a10fdc4, so no backpatch is needed.
      
      Discussion: https://postgr.es/m/YMNY6RGPBRCeLmFb@paquier.xyz
      b56b83aa
    • Michael Paquier's avatar
      Improve log pattern detection in recently-added TAP tests · bfd96b7a
      Michael Paquier authored
      ab55d742 has introduced some tests with rows found as missing in logical
      replication subscriptions for partitioned tables, relying on a logic
      with a lookup of the logs generated, scanning the whole file.  This
      commit makes the logic more precise, by scanning the logs only from the
      position before the key queries are run to the position where we check
      for the logs.  This will reduce the risk of issues with log patterns
      overlapping with each other if those tests get more complicated in the
      future.
      
      Per discussion with Tom Lane.
      
      Discussion: https://postgr.es/m/YMP+Gx2S8meYYHW4@paquier.xyz
      Backpatch-through: 13
      bfd96b7a
    • Andres Freund's avatar
      Improve and cleanup ProcArrayAdd(), ProcArrayRemove(). · d8e950d3
      Andres Freund authored
      941697c3 changed ProcArrayAdd()/Remove() substantially. As reported by
      zhanyi, I missed that due to the introduction of the PGPROC->pgxactoff
      ProcArrayRemove() does not need to search for the position in
      procArray->pgprocnos anymore - that's pgxactoff. Remove the search loop.
      
      The removal of the search loop reduces assertion coverage a bit - but we can
      easily do better than before by adding assertions to other loops over
      pgprocnos elements.
      
      Also do a bit of cleanup, mostly by reducing line lengths by introducing local
      helper variables and adding newlines.
      
      Author: zhanyi <w@hidva.com>
      Author: Andres Freund <andres@anarazel.de>
      Discussion: https://postgr.es/m/tencent_5624AA3B116B3D1C31CA9744@qq.com
      d8e950d3
    • Bruce Momjian's avatar
      doc: remove extra right angle bracket in PG 14 relnotes · d120e66f
      Bruce Momjian authored
      Reported-by: Justin Pryzby
      d120e66f
  7. 11 Jun, 2021 5 commits