1. 25 May, 2021 3 commits
    • Michael Paquier's avatar
      Disallow SSL renegotiation · 01e6f1a8
      Michael Paquier authored
      SSL renegotiation is already disabled as of 48d23c72; however, this does
      not prevent the server from complying with a client that wants to use
      renegotiation.  In the last couple of years, renegotiation had its set
      of security issues and flaws (like the recent CVE-2021-3449), and it
      could be possible to crash the backend with a client attempting
      renegotiation.
      
      This commit takes one extra step by disabling renegotiation in the
      backend in the same way as SSL compression (f9264d15) or tickets
      (97d3a0b0).  OpenSSL 1.1.0h has added an option named
      SSL_OP_NO_RENEGOTIATION able to achieve that.  In older versions
      there is an option called SSL3_FLAGS_NO_RENEGOTIATE_CIPHERS that
      was undocumented, and could be set within the SSL object created when
      the TLS connection opens, but I have decided not to use it, as it feels
      trickier to rely on, and it is not official.  Note that this option is
      not usable in OpenSSL < 1.1.0h as the internal contents of the *SSL
      object are hidden to applications.
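
The same idea can be sketched with Python's standard `ssl` module, which exposes the corresponding OpenSSL options. This is only an analogue of the approach described above, not the backend's C code; the function name `make_server_context` is made up for illustration.

```python
import ssl


def make_server_context() -> ssl.SSLContext:
    """Build a server-side context with renegotiation, compression and tickets off."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # OP_NO_RENEGOTIATION is only exposed when the underlying OpenSSL is
    # 1.1.0h or newer, so fall back to 0 (a no-op) when it is missing.
    ctx.options |= getattr(ssl, "OP_NO_RENEGOTIATION", 0)
    # Same treatment as for compression and session tickets.
    ctx.options |= ssl.OP_NO_COMPRESSION
    ctx.options |= ssl.OP_NO_TICKET
    return ctx


ctx = make_server_context()
```

The `getattr` fallback mirrors the conditional use of the option: on builds where it does not exist, the context is simply left as it was.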
      
      SSL renegotiation concerns protocols up to TLSv1.2.
      
      Per original report from Robert Haas, with a patch based on a suggestion
      by Andres Freund.
      
      Author: Michael Paquier
      Reviewed-by: Daniel Gustafsson
      Discussion: https://postgr.es/m/YKZBXx7RhU74FlTE@paquier.xyz
      Backpatch-through: 9.6
      01e6f1a8
    • David Rowley's avatar
      Fix setrefs.c code for Result Cache nodes · cba5c70b
      David Rowley authored
      Result Cache, added in 9eacee2e, neglected to properly adjust the plan
      references in setrefs.c.  This could lead to the following error during
      EXPLAIN:
      
      ERROR:  cannot decompile join alias var in plan tree
      
      Fix that.
      
      Bug: 17030
      Reported-by: Hans Buschmann
      Discussion: https://postgr.es/m/17030-5844aecae42fe223@postgresql.org
      cba5c70b
    • Peter Geoghegan's avatar
      Consider triggering VACUUM failsafe during scan. · c242baa4
      Peter Geoghegan authored
      The wraparound failsafe mechanism added by commit 1e55e7d1 handled the
      one-pass strategy case (i.e. the "table has no indexes" case) by adding
      a dedicated failsafe check.  This made up for the fact that the usual
      one-pass checks inside lazy_vacuum_all_indexes() cannot ever be reached
      during a one-pass strategy VACUUM.
      
      This approach failed to account for two-pass VACUUMs that opt out of
      index vacuuming up-front.  The INDEX_CLEANUP off case is the only case
      that works like that.
      
      Fix this by performing a failsafe check every 4GB during the first scan
      of the heap, regardless of the details of the VACUUM.  This eliminates
      the special case, and will make the failsafe trigger more reliably.
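
A minimal sketch of a check that fires at a fixed byte interval during a block-by-block scan. It assumes an 8 kB block size, and the function name and the `every_pages` parameter are hypothetical; the real logic lives in the C vacuum code.

```python
BLOCK_SIZE = 8192                       # assumed heap block size in bytes
FAILSAFE_EVERY_BYTES = 4 * 1024 ** 3    # 4GB, as stated in the commit message
FAILSAFE_EVERY_PAGES = FAILSAFE_EVERY_BYTES // BLOCK_SIZE


def failsafe_check_points(total_pages, every_pages=FAILSAFE_EVERY_PAGES):
    """Return the block numbers at which a failsafe check would run."""
    points = []
    next_check = every_pages
    for blkno in range(total_pages):
        if blkno == next_check:
            # The check runs here no matter how many passes the VACUUM uses
            # or whether index vacuuming was disabled up-front.
            points.append(blkno)
            next_check += every_pages
        # ... scan and prune the page here ...
    return points
```

Because the check is tied to scan progress alone, no separate code path is needed for tables without indexes.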
      
      Author: Peter Geoghegan <pg@bowt.ie>
      Reported-By: Andres Freund <andres@anarazel.de>
      Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
      Discussion: https://postgr.es/m/20210424002921.pb3t7h6frupdqnkp@alap3.anarazel.de
      c242baa4
  2. 24 May, 2021 2 commits
  3. 23 May, 2021 5 commits
    • Tom Lane's avatar
      Re-order pg_attribute columns to eliminate some padding space. · f5024d8d
      Tom Lane authored
      Now that attcompression is just a char, there's a lot of wasted
      padding space after it.  Move it into the group of char-wide
      columns to save a net of 4 bytes per pg_attribute entry.  While
      we're at it, swap the order of attstorage and attalign to make for
      a more logical grouping of these columns.
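
The padding effect can be illustrated with `ctypes`, which lays out structures using the platform's native C alignment rules. The two layouts below are hypothetical and much smaller than the real pg_attribute row; they only show how moving a lone char into a group of chars removes alignment padding.

```python
import ctypes


class Before(ctypes.Structure):
    # A single char stranded after a 4-byte field forces trailing padding.
    _fields_ = [
        ("a", ctypes.c_int32),
        ("c1", ctypes.c_char),
        ("c2", ctypes.c_char),
        ("b", ctypes.c_int32),
        ("c3", ctypes.c_char),   # followed by 3 bytes of padding
    ]


class After(ctypes.Structure):
    # All char-wide fields grouped together share one alignment slot.
    _fields_ = [
        ("a", ctypes.c_int32),
        ("c1", ctypes.c_char),
        ("c2", ctypes.c_char),
        ("c3", ctypes.c_char),
        ("b", ctypes.c_int32),
    ]


saved = ctypes.sizeof(Before) - ctypes.sizeof(After)
```

On platforms where a 32-bit integer is 4-byte aligned, `saved` comes out as 4 for these toy layouts.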
      
      Also re-order actions in related code to match the new field ordering.
      
      This patch also fixes one outright bug: equalTupleDescs() failed to
      compare attcompression.  That could, for example, cause relcache
      reload to fail to adopt a new value following a change.
      
      Michael Paquier and Tom Lane, per a gripe from Andres Freund.
      
      Discussion: https://postgr.es/m/20210517204803.iyk5wwvwgtjcmc5w@alap3.anarazel.de
      f5024d8d
    • Tom Lane's avatar
      Be more verbose when the postmaster unexpectedly quits. · bc2a389e
      Tom Lane authored
      Emit a LOG message when the postmaster stops because of a failure in
      the startup process.  There already is a similar message if we exit
      for that reason during PM_STARTUP phase, so it seems inconsistent
      that there was none if the startup process fails later on.
      
      Also emit a LOG message when the postmaster stops after a crash
      because restart_after_crash is disabled.  This seems potentially
      helpful in case DBAs (or developers) forget that that's set.
      Also, it was the only remaining place where the postmaster would
      do an abnormal exit without any comment as to why.
      
      In passing, remove an unreachable call of ExitPostmaster(0).
      
      Discussion: https://postgr.es/m/194914.1621641288@sss.pgh.pa.us
      bc2a389e
    • Bruce Momjian's avatar
      doc: word-wrap and indent PG 14 relnotes · 8f73ed6b
      Bruce Momjian authored
      8f73ed6b
    • Tom Lane's avatar
      Fix access to no-longer-open relcache entry in logical-rep worker. · b39630fd
      Tom Lane authored
      If we redirected a replicated tuple operation into a partition child
      table, and then tried to fire AFTER triggers for that event, the
      relation cache entry for the child table was already closed.  This has
      no visible ill effects as long as the entry is still there and still
      valid, but an unluckily-timed cache flush could result in a crash or
      other misbehavior.
      
      To fix, postpone the ExecCleanupTupleRouting call (which is what
      closes the child table) until after we've fired triggers.  This
      requires a bit of refactoring so that the cleanup function can
      have access to the necessary state.
      
      In HEAD, I took the opportunity to simplify some of worker.c's
      function APIs based on use of the new ApplyExecutionData struct.
      However, it doesn't seem safe/practical to back-patch that aspect,
      at least not without a lot of analysis of possible interactions
      with a04daa97.
      
      In passing, add an Assert to afterTriggerInvokeEvents to catch
      such cases.  This seems worthwhile because we've grown a number
      of fairly unstructured ways of calling AfterTriggerEndQuery.
      
      Back-patch to v13, where worker.c grew the ability to deal with
      partitioned target tables.
      
      Discussion: https://postgr.es/m/3382681.1621381328@sss.pgh.pa.us
      b39630fd
  4. 22 May, 2021 4 commits
    • Tom Lane's avatar
      Remove plpgsql's special-case code paths for SET/RESET. · 30168be8
      Tom Lane authored
      In the wake of 84f5c290, it's no longer necessary for plpgsql to
      handle SET/RESET specially.  The point of that was just to avoid
      taking a new transaction snapshot prematurely, which the regular code
      path through _SPI_execute_plan() now does just fine (in fact better,
      since it now does the right thing for LOCK too).  Hence, rip out a
      few lines of code, going back to the old way of treating SET/RESET
      as a generic SQL command.  This essentially reverts all but the
      test cases from b981275b.
      
      Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
      30168be8
    • David Rowley's avatar
      Fix planner's use of Result Cache with unique joins · 9e215378
      David Rowley authored
      When the planner considered using a Result Cache node to cache results
      from the inner side of a Nested Loop Join, it failed to consider that the
      inner path's parameterization may not be the entire join condition.  If
      the join was marked as inner_unique then we may accidentally put the cache
      in singlerow mode.  This meant that entries would be marked as complete
      after caching the first row.  That was wrong because, if only part of the
      join condition was parameterized, the uniqueness of the unique join was not
      guaranteed at the Result Cache's level.  The uniqueness is only guaranteed
      after Nested Loop applies the join filter.  If subsequent rows were found,
      this would lead to:
      
      ERROR: cache entry already complete
      
      This could have been fixed by only putting the cache in singlerow mode if
      the entire join condition was parameterized.  However, Nested Loop will
      only read its inner side so far as the first matching row when the join is
      unique, so that might mean we never get an opportunity to mark cache
      entries as complete.  Since non-complete cache entries are useless for
      subsequent lookups, we just don't bother considering a Result Cache path
      in this case.
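
The failure mode can be sketched with a toy cache. The class and method names are invented for illustration and do not correspond to the executor's data structures; the assumed scenario is a join that is unique on (a, b) while only `a` is used as the cache key.

```python
class CacheEntry:
    def __init__(self):
        self.rows = []
        self.complete = False


class ResultCacheSketch:
    """Toy cache keyed by the parameterized part of the join condition."""

    def __init__(self, singlerow):
        self.singlerow = singlerow
        self.entries = {}

    def store(self, key, row):
        entry = self.entries.setdefault(key, CacheEntry())
        if entry.complete:
            raise RuntimeError("cache entry already complete")
        entry.rows.append(row)
        if self.singlerow:
            # Singlerow mode assumes at most one row can exist per key.
            entry.complete = True


# Two inner rows share a = 1, so keying on `a` alone is not unique.
inner_rows = [(1, 10), (1, 20)]
cache = ResultCacheSketch(singlerow=True)
try:
    for row in inner_rows:
        cache.store(key=row[0], row=row)
    outcome = "ok"
except RuntimeError as exc:
    outcome = str(exc)
```

In this sketch the second row arrives after the entry was already marked complete, which is the situation the ERROR above reports.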
      
      In passing, remove the XXX comment that claimed the above ERROR might be
      better suited to be an Assert.  After there being an actual case which
      triggered it, it seems better to keep it an ERROR.
      
      Reported-by: David Christensen
      Discussion: https://postgr.es/m/CAOxo6X+dy-V58iEPFgst8ahPKEU+38NZzUuc+a7wDBZd4TrHMQ@mail.gmail.com
      9e215378
    • Bruce Momjian's avatar
      0cdaa05b
  5. 21 May, 2021 6 commits
    • Bruce Momjian's avatar
      55370f8d
    • Tom Lane's avatar
      Disallow whole-row variables in GENERATED expressions. · 4b100744
      Tom Lane authored
      This was previously allowed, but I think that was just an oversight.
      It's a clear violation of the rule that a generated column cannot
      depend on itself or other generated columns.  Moreover, because the
      code was relying on the assumption that no such cross-references
      exist, it was pretty easy to crash ALTER TABLE and perhaps other
      places.  Even if you managed not to crash, you got quite unstable,
      implementation-dependent results.
      
      Per report from Vitaly Ustinov.
      Back-patch to v12 where GENERATED came in.
      
      Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com
      4b100744
    • Tom Lane's avatar
      Fix usage of "tableoid" in GENERATED expressions. · 2b0ee126
      Tom Lane authored
      We consider this supported (though I've got my doubts that it's a
      good idea, because tableoid is not immutable).  However, several
      code paths failed to fill the field in soon enough, causing such
      a GENERATED expression to see zero or the wrong value.  This
      occurred when ALTER TABLE adds a new GENERATED column to a table
      with existing rows, and during regular INSERT or UPDATE on a
      foreign table with GENERATED columns.
      
      Noted during investigation of a report from Vitaly Ustinov.
      Back-patch to v12 where GENERATED came in.
      
      Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com
      2b0ee126
    • Tom Lane's avatar
      Restore the portal-level snapshot after procedure COMMIT/ROLLBACK. · 84f5c290
      Tom Lane authored
      COMMIT/ROLLBACK necessarily destroys all snapshots within the session.
      The original implementation of intra-procedure transactions just
      cavalierly did that, ignoring the fact that this left us executing in
      a rather different environment than normal.  In particular, it turns
      out that handling of toasted datums depends rather critically on there
      being an outer ActiveSnapshot: otherwise, when SPI or the core
      executor pop whatever snapshot they used and return, it's unsafe to
      dereference any toasted datums that may appear in the query result.
      It's possible to demonstrate "no known snapshots" and "missing chunk
      number N for toast value" errors as a result of this oversight.
      
      Historically this outer snapshot has been held by the Portal code,
      and that seems like a good plan to preserve.  So add infrastructure
      to pquery.c to allow re-establishing the Portal-owned snapshot if it's
      not there anymore, and add enough bookkeeping support that we can tell
      whether it is or not.
      
      We can't, however, just re-establish the Portal snapshot as part of
      COMMIT/ROLLBACK.  As in normal transaction start, acquiring the first
      snapshot should wait until after SET and LOCK commands.  Hence, teach
      spi.c about doing this at the right time.  (Note that this patch
      doesn't fix the problem for any PLs that try to run intra-procedure
      transactions without using SPI to execute SQL commands.)
      
      This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD,
      rename that to allow_nonatomic.
      
      replication/logical/worker.c also needs some fixes, because it wasn't
      careful to hold a snapshot open around AFTER trigger execution.
      That code doesn't use a Portal, which I suspect someday we're gonna
      have to fix.  But for now, just rearrange the order of operations.
      This includes back-patching the recent addition of finish_estate()
      to centralize the cleanup logic there.
      
      This also back-patches commit 2ecfeda3 into v13, to improve the
      test coverage for worker.c (it was that test that exposed that
      worker.c's snapshot management is wrong).
      
      Per bug #15990 from Andreas Wicht.  Back-patch to v11 where
      intra-procedure COMMIT was added.
      
      Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
      84f5c290
    • Amit Kapila's avatar
      Fix deadlock for multiple replicating truncates of the same table. · 6d0eb385
      Amit Kapila authored
      While applying the truncate change, the logical apply worker acquires
      RowExclusiveLock on the relation being truncated. This allowed two apply
      workers to truncate the relation at the same time, which led to a deadlock.
      The reason was that one of the workers, after updating the pg_class tuple,
      tried to acquire SHARE lock on the relation and started to wait for the
      second worker, which had acquired RowExclusiveLock on the relation. When the
      second worker then tried to update the pg_class tuple, it started to wait for
      the first worker, which led to a deadlock. Fix it by acquiring
      AccessExclusiveLock on the relation before applying the truncate change as
      we do for normal truncate operation.
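
Why the stronger lock closes the window can be seen from a reduced lock-conflict table. The sketch below covers only the three lock modes mentioned above, and the helper `can_grant` is a hypothetical name, not a PostgreSQL function.

```python
# Conflicts among the three table-level lock modes discussed here
# (a subset of the full conflict table).
CONFLICTS = {
    "RowExclusiveLock": {"ShareLock", "AccessExclusiveLock"},
    "ShareLock": {"RowExclusiveLock", "AccessExclusiveLock"},
    "AccessExclusiveLock": {
        "RowExclusiveLock",
        "ShareLock",
        "AccessExclusiveLock",
    },
}


def can_grant(requested, held_by_others):
    """True if `requested` conflicts with none of the locks others hold."""
    return all(held not in CONFLICTS[requested] for held in held_by_others)


# Two workers can both hold RowExclusiveLock on the same relation ...
both_enter = can_grant("RowExclusiveLock", ["RowExclusiveLock"])
# ... but a later SHARE request then has to wait for the other worker.
share_waits = not can_grant("ShareLock", ["RowExclusiveLock"])
# With AccessExclusiveLock the second worker is blocked up-front instead.
second_blocked = not can_grant("AccessExclusiveLock", ["AccessExclusiveLock"])
```

Since RowExclusiveLock does not conflict with itself, both workers get past the first lock and only collide later; AccessExclusiveLock serializes them before either touches pg_class.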
      
      Author: Peter Smith, test case by Haiying Tang
      Reviewed-by: Dilip Kumar, Amit Kapila
      Backpatch-through: 11
      Discussion: https://postgr.es/m/CAHut+PsNm43p0jM+idTvWwiGZPcP0hGrHMPK9TOAkc+a4UpUqw@mail.gmail.com
      6d0eb385
  6. 20 May, 2021 5 commits
    • Tom Lane's avatar
      Avoid detoasting failure after COMMIT inside a plpgsql FOR loop. · f21fadaf
      Tom Lane authored
      exec_for_query() normally tries to prefetch a few rows at a time
      from the query being iterated over, so as to reduce executor
      entry/exit overhead.  Unfortunately this is unsafe if we have
      COMMIT or ROLLBACK within the loop, because there might be
      TOAST references in the data that we prefetched but haven't
      yet examined.  Immediately after the COMMIT/ROLLBACK, we have
      no snapshots in the session, meaning that VACUUM is at liberty
      to remove recently-deleted TOAST rows.
      
      This was originally reported as a case triggering the "no known
      snapshots" error in init_toast_snapshot(), but even if you miss
      hitting that, you can get "missing toast chunk", as illustrated
      by the added isolation test case.
      
      To fix, just disable prefetching in non-atomic contexts.  Maybe
      there will be performance complaints prompting us to work harder
      later, but it's not clear at the moment that this really costs
      much, and I doubt we'd want to back-patch any complicated fix.
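
A rough sketch of the fix's shape: fetch in batches only when no COMMIT or ROLLBACK can happen between rows. The function name, the `atomic` flag and the batch size are assumptions for illustration and are not taken from exec_for_query().

```python
from itertools import islice


def fetch_batches(rows, atomic, prefetch=10):
    """Yield rows in batches; fetch one at a time in non-atomic contexts."""
    it = iter(rows)
    # In a non-atomic context, a COMMIT may run between loop iterations,
    # so nothing beyond the current row is fetched ahead of time.
    batch_size = prefetch if atomic else 1
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch
```

With one row per fetch, no unexamined row is held across a transaction boundary, at the cost of more executor entries and exits.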
      
      In passing, adjust that error message in init_toast_snapshot()
      to be a little clearer about the likely cause of the problem.
      
      Patch by me, based on earlier investigation by Konstantin Knizhnik.
      
      Per bug #15990 from Andreas Wicht.  Back-patch to v11 where
      intra-procedure COMMIT was added.
      
      Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
      f21fadaf
    • Bruce Momjian's avatar
      4f586fe2
    • Andrew Dunstan's avatar
      Install PostgresVersion.pm · bdbb2ce7
      Andrew Dunstan authored
      A lamentable oversight on my part meant that when PostgresVersion.pm was
      added in commit 4c4eaf3d, provision to install it was not added to the
      Makefile, so it was not installed along with the other perl modules.
      bdbb2ce7
    • Tom Lane's avatar
      Clean up cpluspluscheck violation. · 6d59a218
      Tom Lane authored
      "typename" is a C++ keyword, so pg_upgrade.h fails to compile in C++.
      Fortunately, there seems no likely reason for somebody to need to
      do that.  Nonetheless, it's project policy that all .h files should
      pass cpluspluscheck, so rename the argument to fix that.
      
      Oversight in 57c081de; back-patch as that was.  (The policy requiring
      pg_upgrade.h to pass cpluspluscheck only goes back to v12, but it
      seems best to keep this code looking the same in all branches.)
      6d59a218
    • Andrew Dunstan's avatar
      Use a more portable way to get the version string in PostgresNode · 8bdd6f56
      Andrew Dunstan authored
      Older versions of perl on Windows don't like the list form of pipe open,
      and perlcritic doesn't like the string form of open, so we avoid both
      with a simpler formulation using qx{}.
      
      Per complaint from Amit Kapila.
      8bdd6f56
  7. 19 May, 2021 9 commits
    • Tom Lane's avatar
      Avoid creating testtablespace directories where not wanted. · 413c1ef9
      Tom Lane authored
      Recently we refactored things so that pg_regress makes the
      "testtablespace" subdirectory used by the core regression tests,
      instead of doing that in the makefiles.  That had the undesirable
      side effect of making such a subdirectory in every directory that
      has "input" or "output" test files.  Since these subdirectories
      remain empty, git doesn't complain about them, but nonetheless
      they're clutter.
      
      To fix, invent an explicit --make-testtablespace-dir switch,
      so that pg_regress only makes the subdirectory when explicitly
      told to.
      
      Discussion: https://postgr.es/m/2854388.1621284789@sss.pgh.pa.us
      413c1ef9
    • Bruce Momjian's avatar
      doc: revert 1e7d53bd so libpq chapter number is accessible · 4f7d1c30
      Bruce Momjian authored
      Fix PG 14 relnotes to use <link> instead of <xref>.  This was discussed
      in commit message 59fa7eb6.
      4f7d1c30
    • Dean Rasheed's avatar
      Fix pgbench permute tests. · 0f516d03
      Dean Rasheed authored
      One of the tests for the pgbench permute() function added by
      6b258e3d fails on some 32-bit platforms, due to variations in the
      floating point computations in getrand(). The remaining tests give
      sufficient coverage, so just remove the failing test.
      
      Reported by Christoph Berg. Analysis by Thomas Munro and Tom Lane.
      Based on patch by Fabien Coelho.
      
      Discussion: https://postgr.es/m/YKQnUoYV63GRJBDD@msg.df7cb.de
      0f516d03
    • Fujii Masao's avatar
      Make standby promotion reset the recovery pause state to 'not paused'. · 167bd480
      Fujii Masao authored
      If a promotion is triggered while recovery is paused, the paused state ends
      and promotion continues. But previously in that case
      pg_get_wal_replay_pause_state() returned 'paused' wrongly while a promotion
      was ongoing.
      
      This commit changes a standby promotion so that it marks the recovery
      pause state as 'not paused' when it's triggered, to fix the issue.
      
      Author: Fujii Masao
      Reviewed-by: Dilip Kumar, Kyotaro Horiguchi
      Discussion: https://postgr.es/m/f706876c-4894-0ba5-6f4d-79803eeea21b@oss.nttdata.com
      167bd480
    • Amit Kapila's avatar
      Fix 020_messages.pl test. · 0a442a40
      Amit Kapila authored
      We were not waiting for a publisher to catch up with the subscriber after
      creating a subscription. Now, it can happen that apply worker starts
      replication even after we have disabled the subscription in the test. This
      will make the test expect that there is no active slot whereas there
      exists one. Fix this symptom by allowing the publisher to wait for
      catching up with the subscription.
      
      It is not a good idea to check whether the slot is still active by looking
      for walsender existence, as we release the slot after we clean up the
      walsender related memory. Fix that by checking the slot status in
      pg_replication_slots.
      
      Also, it is better to avoid repeated enabling/disabling of the
      subscription.
      
      Finally, we turn autovacuum off for this test to avoid any empty
      transaction appearing in the test while consuming changes.
      
      Reported-by: as per buildfarm
      Author: Vignesh C
      Reviewed-by: Amit Kapila, Michael Paquier
      Discussion: https://postgr.es/m/CAA4eK1+uW1UGDHDz-HWMHMen76mKP7NJebOTZN4uwbyMjaYVww@mail.gmail.com
      0a442a40
    • Fujii Masao's avatar
      Fix issues in pg_stat_wal. · d8735b8b
      Fujii Masao authored
      1) Previously there were both pgstat_send_wal() and pgstat_report_wal()
         in order to send WAL activity to the stats collector, with the former being
         used by the WAL writer and the latter by most other processes. They were a bit
         redundant and so this commit merges them into pgstat_send_wal() to
         simplify the code.
      
      2) Previously WAL global statistics counters were calculated and then
         compared with zero-filled buffer in order to determine whether any WAL
         activity has happened since the last submission. This calculation and
         comparison were not cheap, and they were regularly exercised even in
         read-only workloads. This commit fixes the issue by checking some WAL
         activity counters directly to determine whether there are WAL activity
         stats to send.
      
      3) Previously pgstat_report_stat() did not check if there's WAL activity
         stats to send as part of the "Don't expend a clock check if nothing to do"
         check at the top. It's probably rare to have pending WAL stats without
         also passing one of the other conditions, but for safety this commit
         changes pgstat_report_stat() so that it also checks some WAL activity
         counters at the top.
      
      This commit also adds the comments about the design of WAL stats.
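
The difference described in item 2 can be sketched as follows. The counter set and both function names are hypothetical, and the cheap check assumes that any WAL activity changes at least one of the counters it inspects.

```python
from dataclasses import dataclass, astuple


@dataclass
class WalCounters:
    # Hypothetical counter set; the real struct has different fields.
    wal_records: int = 0
    wal_bytes: int = 0
    wal_write: int = 0
    wal_sync: int = 0


def has_activity_by_diff(current, previous):
    """Old style: build the full difference, then compare it with all zeros."""
    diff = WalCounters(
        *(c - p for c, p in zip(astuple(current), astuple(previous)))
    )
    return diff != WalCounters()


def has_activity_by_counters(current, previous):
    """New style: look at a few counters directly, with no temporary struct."""
    return (
        current.wal_records != previous.wal_records
        or current.wal_write != previous.wal_write
        or current.wal_sync != previous.wal_sync
    )


idle = WalCounters(5, 400, 1, 1)
busy = WalCounters(6, 480, 1, 1)
```

Both functions agree on these inputs, but the second avoids building and comparing a whole structure on every call, which matters when the check runs even in read-only workloads.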
      
      Reported-by: Andres Freund
      Author: Masahiro Ikeda
      Reviewed-by: Kyotaro Horiguchi, Atsushi Torikoshi, Andres Freund, Fujii Masao
      Discussion: https://postgr.es/m/20210324232224.vrfiij2rxxwqqjjb@alap3.anarazel.de
      d8735b8b
    • Michael Paquier's avatar
      Add --no-toast-compression to pg_dumpall · 694da198
      Michael Paquier authored
      This is an oversight from bbe0a81d, where the equivalent option exists
      in pg_dump.  This is useful to be able to reset the compression methods
      cluster-wide when restoring the data based on default_toast_compression.
      
      Reviewed-by: Daniel Gustafsson, Tom Lane
      Discussion: https://postgr.es/m/YKHC+qCJvzCRVCpY@paquier.xyz
      694da198
  8. 18 May, 2021 1 commit
  9. 17 May, 2021 5 commits