1. 28 Dec, 2020 9 commits
    • Tom Lane's avatar
      Improve log messages related to pg_hba.conf not matching a connection. · 3995c424
      Tom Lane authored
      Include details on whether GSS encryption has been activated;
      since we added "hostgssenc" type HBA entries, that's relevant info.
      
      Kyotaro Horiguchi and Tom Lane.  Back-patch to v12 where
      GSS encryption was introduced.
      
      Discussion: https://postgr.es/m/e5b0b6ed05764324a2f3fe7acfc766d5@smhi.se
      3995c424
    • Tom Lane's avatar
      Fix assorted issues in backend's GSSAPI encryption support. · 622ae462
      Tom Lane authored
      Unrecoverable errors detected by GSSAPI encryption can't just be
      reported with elog(ERROR) or elog(FATAL), because attempting to
      send the error report to the client is likely to lead to infinite
      recursion or loss of protocol sync.  Instead make this code do what
      the SSL encryption code has long done, which is to just report any
      such failure to the server log (with elevel COMMERROR), then pretend
      we've lost the connection by returning errno = ECONNRESET.
      
      Along the way, fix confusion about whether message translation is done
      by pg_GSS_error() or its callers (the latter should do it), and make
      the backend version of that function work more like the frontend
      version.
      
      Avoid allocating the port->gss struct until it's needed; we surely
      don't need to allocate it in the postmaster.
      
      Improve logging of "connection authorized" messages with GSS enabled.
      (As part of this, I back-patched the code changes from dc11f31a.)
      
      Make BackendStatusShmemSize() account for the GSS-related space that
      will be allocated by CreateSharedBackendStatus().  This omission
      could possibly cause out-of-shared-memory problems with very high
      max_connections settings.
      
      Remove arbitrary, pointless restriction that only GSS authentication
      can be used on a GSS-encrypted connection.
      
      Improve documentation; notably, document the fact that libpq now
      prefers GSS encryption over SSL encryption if both are possible.
      
      Per report from Mikael Gustavsson.  Back-patch to v12 where
      this code was introduced.
      
      Discussion: https://postgr.es/m/e5b0b6ed05764324a2f3fe7acfc766d5@smhi.se
      622ae462
    • Tom Lane's avatar
      Fix bugs in libpq's GSSAPI encryption support. · ff6ce9a3
      Tom Lane authored
      The critical issue fixed here is that if a GSSAPI-encrypted connection
      is successfully made, pqsecure_open_gss() cleared conn->allow_ssl_try,
      as an admittedly-hacky way of preventing us from then trying to tunnel
      SSL encryption over the already-encrypted connection.  The problem
      with that is that if we abandon the GSSAPI connection because of a
      failure during authentication, we would not attempt SSL encryption
      in the next try with the same server.  This can lead to unexpected
      connection failure, or silently getting a non-encrypted connection
      where an encrypted one is expected.
      
      Fortunately, we'd only manage to make a GSSAPI-encrypted connection
      if both client and server hold valid tickets in the same Kerberos
      infrastructure, which is a relatively uncommon environment.
      Nonetheless this is a very nasty bug with potential security
      consequences.  To fix, don't reset the flag, instead adding a
      check for conn->gssenc being already true when deciding whether
      to try to initiate SSL.
      
      While here, fix some lesser issues in libpq's GSSAPI code:
      
      * Use the need_new_connection stanza when dropping an attempted
      GSSAPI connection, instead of partially duplicating that code.
      The consequences of this are pretty minor: AFAICS it could only
      lead to auth_req_received or password_needed remaining set when
      they shouldn't, which is not too harmful.
      
      * Fix pg_GSS_error() to not repeat the "mprefix" it's given multiple
      times, and to notice any failure return from gss_display_status().
      
      * Avoid gratuitous dependency on NI_MAXHOST in
      pg_GSS_load_servicename().
      
      Per report from Mikael Gustavsson.  Back-patch to v12 where
      this code was introduced.
      
      Discussion: https://postgr.es/m/e5b0b6ed05764324a2f3fe7acfc766d5@smhi.se
      ff6ce9a3
    • Tom Lane's avatar
      Expose the default for channel_binding in PQconndefaults(). · cf61b073
      Tom Lane authored
      If there's a static default value for a connection option,
      it should be shown in the PQconninfoOptions array.
      
      Daniele Varrazzo
      
      Discussion: https://postgr.es/m/CA+mi_8Zo8Rgn7p+6ZRY7QdDu+23ukT9AvoHNyPbgKACxwgGhZA@mail.gmail.com
      cf61b073
    • Tom Lane's avatar
      Further fix thinko in plpgsql memory leak fix. · 5f2e09bc
      Tom Lane authored
      There's a second call of get_eval_mcontext() that should also be
      get_stmt_mcontext().  This is actually dead code, since no
      interesting allocations happen before switching back to the
      original context, but we should keep it in sync with the other
      call to forestall possible future bugs.
      
      Discussion: https://postgr.es/m/f075f7be-c654-9aa8-3ffc-e9214622f02a@enterprisedb.com
      5f2e09bc
    • Tom Lane's avatar
      Fix thinko in plpgsql memory leak fix. · ea80d8d9
      Tom Lane authored
      Commit a6b1f536 intended to place the transient "target" list of
      a CALL statement in the function's statement-lifespan context,
      but I fat-fingered that and used get_eval_mcontext() instead of
      get_stmt_mcontext().  The eval_mcontext belongs to the "simple
      expression" infrastructure, which is destroyed at transaction end.
      The net effect is that a CALL in a procedure to another procedure
      that has OUT or INOUT parameters would fail if the called procedure
      did a COMMIT.
      
      Per report from Peter Eisentraut.  Back-patch to v11, like the
      prior patch.
      
      Discussion: https://postgr.es/m/f075f7be-c654-9aa8-3ffc-e9214622f02a@enterprisedb.com
      ea80d8d9
    • Michael Paquier's avatar
      Fix inconsistent code with shared invalidations of snapshots · 643428c5
      Michael Paquier authored
      The code in charge of processing a single invalidation message has been
      using since 568d4138 the structure for relation mapping messages.  This
      had fortunately no consequence as both locate the database ID at the
      same location, but it could become a problem in the future if this area
      of the code changes.
      
      Author: Konstantin Knizhnik
      Discussion: https://postgr.es/m/8044c223-4d3a-2cdb-42bf-29940840ce94@postgrespro.ru
      Backpatch-through: 9.5
      643428c5
    • Fujii Masao's avatar
      postgres_fdw: Fix connection leak. · e3ebcca8
      Fujii Masao authored
      In postgres_fdw, the cached connections to foreign servers will not be
      closed until the local session exits if the user mappings or foreign servers
      that those connections depend on are dropped. Those connections can be
      leaked.
      
      To fix that connection leak issue, after a change to a pg_foreign_server
      or pg_user_mapping catalog entry, this commit makes postgres_fdw close
      the connections depending on that entry immediately if current
      transaction has not used those connections yet. Otherwise, mark those
      connections as invalid and then close them at the end of current transaction,
      since they cannot be closed in the midst of the transaction using them.
      Closed connections will be remade at the next opportunity if necessary.
      
      Back-patch to all supported branches.
      
      Author: Bharath Rupireddy
      Reviewed-by: Zhihong Yu, Zhijie Hou, Fujii Masao
      Discussion: https://postgr.es/m/CALj2ACVNcGH_6qLY-4_tXz8JLvA+4yeBThRfxMz7Oxbk1aHcpQ@mail.gmail.com
      e3ebcca8
    • Bruce Momjian's avatar
      Revert "Add key management system" (978f869b) & later commits · 3187ef7c
      Bruce Momjian authored
      The patch needs test cases, reorganization, and cfbot testing.
      Technically reverts commits 5c31afc4..e35b2bad (exclusive/inclusive)
      and 08db7c63..ccbe3413.
      
      Reported-by: Tom Lane, Michael Paquier
      
      Discussion: https://postgr.es/m/E1ktAAG-0002V2-VB@gemulon.postgresql.org
      3187ef7c
  2. 27 Dec, 2020 3 commits
    • Jeff Davis's avatar
      Second attempt to stabilize 05c02589. · facad314
      Jeff Davis authored
      Removing the EXPLAIN test to stabilize the buildfarm. The execution
      test should still be effective to catch the bug even if the plan is
      slightly different on different platforms.
      facad314
    • Jeff Davis's avatar
      Stabilize test introduced in 05c02589, per buildfarm. · fa0fdf05
      Jeff Davis authored
      In passing, make the capitalization match the rest of the file.
      
      Reported-by: Tom Lane
      fa0fdf05
    • Jeff Davis's avatar
      Fix bug #16784 in Disk-based Hash Aggregation. · 05c02589
      Jeff Davis authored
      Before processing tuples, agg_refill_hash_table() was setting all
      pergroup pointers to NULL to signal to advance_aggregates() that it
      should not attempt to advance groups that had spilled.
      
      The problem was that it also set the pergroups for sorted grouping
      sets to NULL, which caused rescanning to fail.
      
      Instead, change agg_refill_hash_table() to only set the pergroups for
      hashed grouping sets to NULL; and when compiling the expression, pass
      doSort=false.
      
      Reported-by: Alexander Lakhin
      Discussion: https://postgr.es/m/16784-7ff169bf2c3d1588%40postgresql.org
      Backpatch-through: 13
      05c02589
  3. 26 Dec, 2020 9 commits
  4. 25 Dec, 2020 9 commits
  5. 24 Dec, 2020 5 commits
    • Bruce Momjian's avatar
      revert removal of hex_decode() from ecpg from commit c3826f83 · 558a6e8e
      Bruce Momjian authored
      ecpglib on certain platforms can't handle the pg_log_fatal calls from
      libraries.  This was reported by the buildfarm.  It needs a refactoring
      and return value change if it is later removed.
      
      Backpatch-through: master
      558a6e8e
    • Bruce Momjian's avatar
      move hex_decode() to /common so it can be called from frontend · c3826f83
      Bruce Momjian authored
      This allows removal of a copy of hex_decode() from ecpg, and will be
      used by the soon-to-be added pg_alterckey command.
      
      Backpatch-through: master
      c3826f83
    • Tom Lane's avatar
      Fix race condition between shutdown and unstarted background workers. · 7519bd16
      Tom Lane authored
      If a database shutdown (smart or fast) is commanded between the time
      some process decides to request a new background worker and the time
      that the postmaster can launch that worker, then nothing happens
      because the postmaster won't launch any bgworkers once it's exited
      PM_RUN state.  This is fine ... unless the requesting process is
      waiting for that worker to finish (or even for it to start); in that
      case the requestor is stuck, and only manual intervention will get us
      to the point of being able to shut down.
      
      To fix, cancel pending requests for workers when the postmaster sends
      shutdown (SIGTERM) signals, and similarly cancel any new requests that
      arrive after that point.  (We can optimize things slightly by only
      doing the cancellation for workers that have waiters.)  To fit within
      the existing bgworker APIs, the "cancel" is made to look like the
      worker was started and immediately stopped, causing deregistration of
      the bgworker entry.  Waiting processes would have to deal with
      premature worker exit anyway, so this should introduce no bugs that
      weren't there before.  We do have a side effect that registration
      records for restartable bgworkers might disappear when theoretically
      they should have remained in place; but since we're shutting down,
      that shouldn't matter.
      
      Back-patch to v10.  There might be value in putting this into 9.6
      as well, but the management of bgworkers is a bit different there
      (notably see 8ff51869) and I'm not convinced it's worth the effort
      to validate the patch for that branch.
      
      Discussion: https://postgr.es/m/661570.1608673226@sss.pgh.pa.us
      7519bd16
    • Tom Lane's avatar
      Improve client error messages for immediate-stop situations. · 7e784d1d
      Tom Lane authored
      Up to now, if the DBA issued "pg_ctl stop -m immediate", the message
      sent to clients was the same as for a crash-and-restart situation.
      This is confusing, not least because the message claims that the
      database will soon be up again, something we have no business
      predicting.
      
      Improve things so that we can generate distinct messages for the two
      cases (and also recognize an ad-hoc SIGQUIT, should somebody try that).
      To do that, add a field to pmsignal.c's shared memory data structure
      that the postmaster sets just before broadcasting SIGQUIT to its
      children.  No interlocking seems to be necessary; the intervening
      signal-sending and signal-receipt should sufficiently serialize accesses
      to the field.  Hence, this isn't any riskier than the existing usages
      of pmsignal.c.
      
      We might in future extend this idea to improve other
      postmaster-to-children signal scenarios, although none of them
      currently seem to be as badly overloaded as SIGQUIT.
      
      Discussion: https://postgr.es/m/559291.1608587013@sss.pgh.pa.us
      7e784d1d
    • Michael Paquier's avatar
      Fix typos and grammar in docs and comments · 90fbf7c5
      Michael Paquier authored
      This fixes several areas of the documentation and some comments in
      matters of style, grammar, or even format.
      
      Author: Justin Pryzby
      Discussion: https://postgr.es/m/20201222041153.GK30237@telsasoft.com
      90fbf7c5
  6. 23 Dec, 2020 2 commits
    • Bruce Momjian's avatar
      dummy commit · 6ecf488d
      Bruce Momjian authored
      6ecf488d
    • Michael Paquier's avatar
      Fix portability issues with parsing of recovery_target_xid · 6db27037
      Michael Paquier authored
      The parsing of this parameter has been using strtoul(), which is not
      portable across platforms.  On most Unix platforms, unsigned long has a
      size of 64 bits, while on Windows it is 32 bits.  It is common in
      recovery scenarios to rely on the output of txid_current() or even the
      newer pg_current_xact_id() to get a transaction ID for setting up
      recovery_target_xid.  The value returned by those functions includes the
      epoch in the computed result, which would cause strtoul() to fail where
      unsigned long has a size of 32 bits once the epoch is incremented.
      
      WAL records and 2PC data include only information about 32-bit XIDs and
      it is not possible to have XIDs across more than one epoch, so
      discarding the high bits from the transaction ID set has no impact on
      recovery.  On the contrary, the use of strtoul() prevents a consistent
      behavior across platforms depending on the size of unsigned long.
      
      This commit changes the parsing of recovery_target_xid to use
      pg_strtouint64() instead, available down to 9.6.  There is one TAP test
      stressing recovery with recovery_target_xid, where a tweak based on
      pg_reset{xlog,wal} is added to bump the XID epoch so as this change gets
      tested, as per an idea from Alexander Lakhin.
      
      Reported-by: Alexander Lakhin
      Discussion: https://postgr.es/m/16780-107fd0c0385b1035@postgresql.org
      Backpatch-through: 9.6
      6db27037
  7. 22 Dec, 2020 3 commits
    • Tom Lane's avatar
      Improve autoprewarm's handling of early-shutdown scenarios. · ff769831
      Tom Lane authored
      Bad things happen if the DBA issues "pg_ctl stop -m fast" before
      autoprewarm finishes loading its list of blocks to prewarm.
      The current worker process successfully terminates early, but
      (if this wasn't the last database with blocks to prewarm) the
      leader process will just try to launch another worker for the
      next database.  Since the postmaster is now in PM_WAIT_BACKENDS
      state, it ignores the launch request, and the leader just sits
      until it's killed manually.
      
      This is mostly the fault of our half-baked design for launching
      background workers, but a proper fix for that is likely to be
      too invasive to be back-patchable.  To ameliorate the situation,
      fix apw_load_buffers() to check whether SIGTERM has arrived
      just before trying to launch another worker.  That leaves us with
      only a very narrow window in each worker launch where SIGTERM
      could occur between the launch request and successful worker start.
      
      Another issue is that if the leader process does manage to exit,
      it unconditionally rewrites autoprewarm.blocks with only the
      blocks currently in shared buffers, thus forgetting any blocks
      that we hadn't reached yet while prewarming.  This seems quite
      unhelpful, since the next database start will then not have the
      expected prewarming benefit.  Fix it to not modify the file if
      we shut down before the initial load attempt is complete.
      
      Per bug #16785 from John Thompson.  Back-patch to v11 where
      the autoprewarm code was introduced.
      
      Discussion: https://postgr.es/m/16785-c0207d8c67fb5f25@postgresql.org
      ff769831
    • Tom Lane's avatar
      Increase timeout in 021_row_visibility.pl. · 08dde1b3
      Tom Lane authored
      Commit 7b28913b figured 30 seconds is long enough for anybody,
      but in contexts like valgrind runs, it isn't necessarily.
      08dde1b3
    • Tomas Vondra's avatar
      Improve find_em_expr_usable_for_sorting_rel comment · 1ca2eb10
      Tomas Vondra authored
      Clarify the relationship between find_em_expr_usable_for_sorting_rel and
      prepare_sort_from_pathkeys, i.e. what restrictions need to be shared
      between those two places.
      
      Author: James Coleman
      Reviewed-by: Tomas Vondra
      Backpatch-through: 13
      Discussion: https://postgr.es/m/CAAaqYe8cK3g5CfLC4w7bs%3DhC0mSksZC%3DH5M8LSchj5e5OxpTAg%40mail.gmail.com
      1ca2eb10