Commits · 3995c424984e991b1069a2869af972dc07574c0b · Abuhujair Javed / Postgres FD Implementation

28 Dec, 2020 9 commits

Improve log messages related to pg_hba.conf not matching a connection. · 3995c424

Tom Lane authored Dec 28, 2020

Include details on whether GSS encryption has been activated;
since we added "hostgssenc" type HBA entries, that's relevant info.

Kyotaro Horiguchi and Tom Lane.  Back-patch to v12 where
GSS encryption was introduced.

Discussion: https://postgr.es/m/e5b0b6ed05764324a2f3fe7acfc766d5@smhi.se

3995c424

Fix assorted issues in backend's GSSAPI encryption support. · 622ae462

Tom Lane authored Dec 28, 2020

Unrecoverable errors detected by GSSAPI encryption can't just be
reported with elog(ERROR) or elog(FATAL), because attempting to
send the error report to the client is likely to lead to infinite
recursion or loss of protocol sync.  Instead make this code do what
the SSL encryption code has long done, which is to just report any
such failure to the server log (with elevel COMMERROR), then pretend
we've lost the connection by returning errno = ECONNRESET.

Along the way, fix confusion about whether message translation is done
by pg_GSS_error() or its callers (the latter should do it), and make
the backend version of that function work more like the frontend
version.

Avoid allocating the port->gss struct until it's needed; we surely
don't need to allocate it in the postmaster.

Improve logging of "connection authorized" messages with GSS enabled.
(As part of this, I back-patched the code changes from dc11f31a.)

Make BackendStatusShmemSize() account for the GSS-related space that
will be allocated by CreateSharedBackendStatus().  This omission
could possibly cause out-of-shared-memory problems with very high
max_connections settings.

Remove arbitrary, pointless restriction that only GSS authentication
can be used on a GSS-encrypted connection.

Improve documentation; notably, document the fact that libpq now
prefers GSS encryption over SSL encryption if both are possible.

Per report from Mikael Gustavsson.  Back-patch to v12 where
this code was introduced.

Discussion: https://postgr.es/m/e5b0b6ed05764324a2f3fe7acfc766d5@smhi.se

622ae462

Fix bugs in libpq's GSSAPI encryption support. · ff6ce9a3

Tom Lane authored Dec 28, 2020

The critical issue fixed here is that if a GSSAPI-encrypted connection
is successfully made, pqsecure_open_gss() cleared conn->allow_ssl_try,
as an admittedly-hacky way of preventing us from then trying to tunnel
SSL encryption over the already-encrypted connection.  The problem
with that is that if we abandon the GSSAPI connection because of a
failure during authentication, we would not attempt SSL encryption
in the next try with the same server.  This can lead to unexpected
connection failure, or silently getting a non-encrypted connection
where an encrypted one is expected.

Fortunately, we'd only manage to make a GSSAPI-encrypted connection
if both client and server hold valid tickets in the same Kerberos
infrastructure, which is a relatively uncommon environment.
Nonetheless this is a very nasty bug with potential security
consequences.  To fix, don't reset the flag, instead adding a
check for conn->gssenc being already true when deciding whether
to try to initiate SSL.

While here, fix some lesser issues in libpq's GSSAPI code:

* Use the need_new_connection stanza when dropping an attempted
GSSAPI connection, instead of partially duplicating that code.
The consequences of this are pretty minor: AFAICS it could only
lead to auth_req_received or password_needed remaining set when
they shouldn't, which is not too harmful.

* Fix pg_GSS_error() to not repeat the "mprefix" it's given multiple
times, and to notice any failure return from gss_display_status().

* Avoid gratuitous dependency on NI_MAXHOST in
pg_GSS_load_servicename().

Per report from Mikael Gustavsson.  Back-patch to v12 where
this code was introduced.

Discussion: https://postgr.es/m/e5b0b6ed05764324a2f3fe7acfc766d5@smhi.se

ff6ce9a3

Expose the default for channel_binding in PQconndefaults(). · cf61b073

Tom Lane authored Dec 28, 2020

If there's a static default value for a connection option,
it should be shown in the PQconninfoOptions array.

Daniele Varrazzo

Discussion: https://postgr.es/m/CA+mi_8Zo8Rgn7p+6ZRY7QdDu+23ukT9AvoHNyPbgKACxwgGhZA@mail.gmail.com

cf61b073

Further fix thinko in plpgsql memory leak fix. · 5f2e09bc

Tom Lane authored Dec 28, 2020

There's a second call of get_eval_mcontext() that should also be
get_stmt_mcontext(). This is actually dead code, since no
interesting allocations happen before switching back to the
original context, but we should keep it in sync with the other
call to forestall possible future bugs.

Discussion: https://postgr.es/m/f075f7be-c654-9aa8-3ffc-e9214622f02a@enterprisedb.com

5f2e09bc

Fix thinko in plpgsql memory leak fix. · ea80d8d9

Tom Lane authored Dec 28, 2020

Commit a6b1f536 intended to place the transient "target" list of
a CALL statement in the function's statement-lifespan context,
but I fat-fingered that and used get_eval_mcontext() instead of
get_stmt_mcontext().  The eval_mcontext belongs to the "simple
expression" infrastructure, which is destroyed at transaction end.
The net effect is that a CALL in a procedure to another procedure
that has OUT or INOUT parameters would fail if the called procedure
did a COMMIT.

Per report from Peter Eisentraut.  Back-patch to v11, like the
prior patch.

Discussion: https://postgr.es/m/f075f7be-c654-9aa8-3ffc-e9214622f02a@enterprisedb.com

ea80d8d9

Fix inconsistent code with shared invalidations of snapshots · 643428c5

Michael Paquier authored Dec 28, 2020

The code in charge of processing a single invalidation message has been
using since 568d4138 the structure for relation mapping messages.  This
had fortunately no consequence as both locate the database ID at the
same location, but it could become a problem in the future if this area
of the code changes.

Author: Konstantin Knizhnik
Discussion: https://postgr.es/m/8044c223-4d3a-2cdb-42bf-29940840ce94@postgrespro.ru
Backpatch-through: 9.5

643428c5

postgres_fdw: Fix connection leak. · e3ebcca8

Fujii Masao authored Dec 28, 2020

In postgres_fdw, the cached connections to foreign servers will not be
closed until the local session exits if the user mappings or foreign servers
that those connections depend on are dropped. Those connections can be
leaked.

To fix that connection leak issue, after a change to a pg_foreign_server
or pg_user_mapping catalog entry, this commit makes postgres_fdw close
the connections depending on that entry immediately if current
transaction has not used those connections yet. Otherwise, mark those
connections as invalid and then close them at the end of current transaction,
since they cannot be closed in the midst of the transaction using them.
Closed connections will be remade at the next opportunity if necessary.

Back-patch to all supported branches.

Author: Bharath Rupireddy
Reviewed-by: Zhihong Yu, Zhijie Hou, Fujii Masao
Discussion: https://postgr.es/m/CALj2ACVNcGH_6qLY-4_tXz8JLvA+4yeBThRfxMz7Oxbk1aHcpQ@mail.gmail.com

e3ebcca8

Revert "Add key management system" () & later commits · 3187ef7c

Bruce Momjian authored Dec 27, 2020

The patch needs test cases, reorganization, and cfbot testing.
Technically reverts commits 5c31afc4..e35b2bad (exclusive/inclusive)
and 08db7c63..ccbe3413.

Reported-by: Tom Lane, Michael Paquier

Discussion: https://postgr.es/m/E1ktAAG-0002V2-VB@gemulon.postgresql.org

3187ef7c

27 Dec, 2020 3 commits

Second attempt to stabilize . · facad314

Jeff Davis authored Dec 27, 2020

Removing the EXPLAIN test to stabilize the buildfarm. The execution
test should still be effective to catch the bug even if the plan is
slightly different on different platforms.

facad314

Stabilize test introduced in 05c02589, per buildfarm. · fa0fdf05
Jeff Davis authored Dec 27, 2020
```
In passing, make the capitalization match the rest of the file.

Reported-by: Tom Lane
```
fa0fdf05

Fix bug #16784 in Disk-based Hash Aggregation. · 05c02589

Jeff Davis authored Dec 26, 2020

Before processing tuples, agg_refill_hash_table() was setting all
pergroup pointers to NULL to signal to advance_aggregates() that it
should not attempt to advance groups that had spilled.

The problem was that it also set the pergroups for sorted grouping
sets to NULL, which caused rescanning to fail.

Instead, change agg_refill_hash_table() to only set the pergroups for
hashed grouping sets to NULL; and when compiling the expression, pass
doSort=false.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/16784-7ff169bf2c3d1588%40postgresql.org
Backpatch-through: 13

05c02589

26 Dec, 2020 9 commits

initdb: document that -K requires an argument · ccbe3413

Bruce Momjian authored Dec 26, 2020

Reported-by: "Shinoda, Noriyoshi"

Discussion: https://postgr.es/m/TU4PR8401MB1152E92B4D44C81E496D6032EEDB0@TU4PR8401MB1152.NAMPRD84.PROD.OUTLOOK.COM

Author: "Shinoda, Noriyoshi"

Backpatch-through: msater

ccbe3413

auth commands: list specific commands to install in Makefile · ba6725df
Bruce Momjian authored Dec 26, 2020
```
Previously I used Makefile functions.

Backpatch-through: master
```
ba6725df

pg_alterckey: remove TAP check rules from Makefile · e174a6f1

Bruce Momjian authored Dec 26, 2020

Reported-by: Pavel Stehule, Michael Paquier

Discussion: https://postgr.es/m/CAFj8pRBRNo4co5bqCx4BLx1ZZ45Z_T-opPxA+u7SLp7gAtBpNA@mail.gmail.com

Backpatch-through: master

e174a6f1

Add scripts for retrieving the cluster file encryption key · d7602afa
Bruce Momjian authored Dec 26, 2020
```
Scripts are passphrase, direct, AWS, and two Yubikey ones.

Backpatch-through: master
```
d7602afa
doc: fix SGML markup for pg_alterckey from commit 62afb42a · 3d4843ba
Bruce Momjian authored Dec 26, 2020
```
Backpatch-through: master
```
3d4843ba

pg_alterckey: adjust doc build and Win32 sleep/open build fails · 82f8c45b

Bruce Momjian authored Dec 25, 2020

Fix for commit 62afb42a.

Reported-by: Tom Lane

Discussion: https://postgr.es/m/1252111.1608953815@sss.pgh.pa.us

Backpatch-through: master

82f8c45b

Fix function call typo in frontend Win32 code, commit 978f869b · 7705f8ca
Bruce Momjian authored Dec 25, 2020
```
Reported-by: buildfarm member walleye

Backpatch-through: master
```
7705f8ca
Allow ssl_passphrase_command to prompt the terminal · 300e430c
Bruce Momjian authored Dec 25, 2020
```
Previously the command could not access the terminal for a passphrase.

Backpatch-through: master
```
300e430c

Add pg_alterckey utility to change the cluster key · 62afb42a

Bruce Momjian authored Dec 25, 2020

This can change the key that encrypts the data encryption keys used for
cluster file encryption.

Discussion: https://postgr.es/m/20201202213814.GG20285@momjian.us

Backpatch-through: master

62afb42a

25 Dec, 2020 9 commits

remove missing reference to crypto test from patch · f2348993

Bruce Momjian authored Dec 25, 2020

Reported-by: Tom Lane

Discussion: https://postgr.es/m/1205031.1608925990@sss.pgh.pa.us

Backpatch-through: master

f2348993

Really fix the dummy implementations in cipher.c. · 0848cf4f
Tom Lane authored Dec 25, 2020
```
945083b2 wasn't enough to silence compiler warnings.
```
0848cf4f
fix no-return function call in cipher.c from commit 978f869b · 8e59813e
Bruce Momjian authored Dec 25, 2020
```
Reported-by: buildfarm member sifaka

Backpatch-through: master
```
8e59813e

fixes docs and missing initdb help option for commit · 26d60f2a

Bruce Momjian authored Dec 25, 2020

Reported-by: Erik Rijkers

Discussion: https://postgr.es/m/a27e7bb60fc4c4a1fe960f7b055ba822@xs4all.nl

Backpatch-through: master

26d60f2a

Invalidate acl.c caches when pg_authid changes. · 08db7c63

Noah Misch authored Dec 25, 2020

This makes existing sessions reflect "ALTER ROLE ... [NO]INHERIT" as
quickly as they have been reflecting "GRANT role_name".  Back-patch to
9.5 (all supported versions).

Reviewed by Nathan Bossart.

Discussion: https://postgr.es/m/20201221095028.GB3777719@rfd.leadboat.com

08db7c63

remove uint128 requirement from patch 978f869b (CFE) · e35b2bad
Bruce Momjian authored Dec 25, 2020
```
Used char[16] instead.

Reported-by: buildfarm member florican

Backpatch-through: master
```
e35b2bad
Fix return value and const declaration from commit 978f869b · 945083b2
Bruce Momjian authored Dec 25, 2020
```
This fixes the non-OpenSSL compile case.

Reported-by: buildfarm member sifaka

Backpatch-through: master
```
945083b2

Add key management system · 978f869b

Bruce Momjian authored Dec 25, 2020

This adds a key management system that stores (currently) two data
encryption keys of length 128, 192, or 256 bits. The data keys are
AES256 encrypted using a key encryption key, and validated via GCM
cipher mode. A command to obtain the key encryption key must be
specified at initdb time, and will be run at every database server
start. New parameters allow a file descriptor open to the terminal to
be passed. pg_upgrade support has also been added.

Discussion: https://postgr.es/m/CA+fd4k7q5o6Nc_AaX6BcYM9yqTbC6_pnH-6nSD=54Zp6NBQTCQ@mail.gmail.com
Discussion: https://postgr.es/m/20201202213814.GG20285@momjian.us

Author: Masahiko Sawada, me, Stephen Frost

978f869b

Avoid time-of-day-dependent failure in log rotation test. · 5c31afc4

Tom Lane authored Dec 24, 2020

Buildfarm members pogona and petalura have shown a failure when
pg_ctl/t/004_logrotate.pl starts just before local midnight.
The default rotate-at-midnight behavior occurs just before the
Perl script examines current_logfiles, so it figures that the
rotation it's already requested has occurred ... but in reality,
that rotation happens just after it looks, so the expected new
log data goes into a different file than the one it's examining.

In HEAD, src/test/kerberos/t/001_auth.pl has acquired similar code
that evidently has a related failure mode. Besides being quite new,
few buildfarm critters run that test, so it's unsurprising that
we've not yet seen a failure there.

Fix both cases by setting log_rotation_age = 0 so that no time-based
rotation can occur. Also absorb 004_logrotate.pl's decision to
set lc_messages = 'C' into the kerberos test, in hopes that it will
work in non-English prevailing locales.

Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=pogona&dt=2020-12-24%2022%3A10%3A04
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2020-02-01%2022%3A20%3A04

5c31afc4

24 Dec, 2020 5 commits

revert removal of hex_decode() from ecpg from commit · 558a6e8e

Bruce Momjian authored Dec 24, 2020

ecpglib on certain platforms can't handle the pg_log_fatal calls from
libraries.  This was reported by the buildfarm.  It needs a refactoring
and return value change if it is later removed.

Backpatch-through: master

558a6e8e

move hex_decode() to /common so it can be called from frontend · c3826f83

Bruce Momjian authored Dec 24, 2020

This allows removal of a copy of hex_decode() from ecpg, and will be
used by the soon-to-be added pg_alterckey command.

Backpatch-through: master

c3826f83

Fix race condition between shutdown and unstarted background workers. · 7519bd16

Tom Lane authored Dec 24, 2020

If a database shutdown (smart or fast) is commanded between the time
some process decides to request a new background worker and the time
that the postmaster can launch that worker, then nothing happens
because the postmaster won't launch any bgworkers once it's exited
PM_RUN state. This is fine ... unless the requesting process is
waiting for that worker to finish (or even for it to start); in that
case the requestor is stuck, and only manual intervention will get us
to the point of being able to shut down.

To fix, cancel pending requests for workers when the postmaster sends
shutdown (SIGTERM) signals, and similarly cancel any new requests that
arrive after that point. (We can optimize things slightly by only
doing the cancellation for workers that have waiters.) To fit within
the existing bgworker APIs, the "cancel" is made to look like the
worker was started and immediately stopped, causing deregistration of
the bgworker entry. Waiting processes would have to deal with
premature worker exit anyway, so this should introduce no bugs that
weren't there before. We do have a side effect that registration
records for restartable bgworkers might disappear when theoretically
they should have remained in place; but since we're shutting down,
that shouldn't matter.

Back-patch to v10. There might be value in putting this into 9.6
as well, but the management of bgworkers is a bit different there
(notably see 8ff51869) and I'm not convinced it's worth the effort
to validate the patch for that branch.

Discussion: https://postgr.es/m/661570.1608673226@sss.pgh.pa.us

7519bd16

Improve client error messages for immediate-stop situations. · 7e784d1d

Tom Lane authored Dec 24, 2020

Up to now, if the DBA issued "pg_ctl stop -m immediate", the message
sent to clients was the same as for a crash-and-restart situation.
This is confusing, not least because the message claims that the
database will soon be up again, something we have no business
predicting.

Improve things so that we can generate distinct messages for the two
cases (and also recognize an ad-hoc SIGQUIT, should somebody try that).
To do that, add a field to pmsignal.c's shared memory data structure
that the postmaster sets just before broadcasting SIGQUIT to its
children.  No interlocking seems to be necessary; the intervening
signal-sending and signal-receipt should sufficiently serialize accesses
to the field.  Hence, this isn't any riskier than the existing usages
of pmsignal.c.

We might in future extend this idea to improve other
postmaster-to-children signal scenarios, although none of them
currently seem to be as badly overloaded as SIGQUIT.

Discussion: https://postgr.es/m/559291.1608587013@sss.pgh.pa.us

7e784d1d

Fix typos and grammar in docs and comments · 90fbf7c5

Michael Paquier authored Dec 24, 2020

This fixes several areas of the documentation and some comments in
matters of style, grammar, or even format.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20201222041153.GK30237@telsasoft.com

90fbf7c5

23 Dec, 2020 2 commits

dummy commit · 6ecf488d
Bruce Momjian authored Nov 25, 2020

6ecf488d

Fix portability issues with parsing of recovery_target_xid · 6db27037

Michael Paquier authored Dec 23, 2020

The parsing of this parameter has been using strtoul(), which is not
portable across platforms. On most Unix platforms, unsigned long has a
size of 64 bits, while on Windows it is 32 bits. It is common in
recovery scenarios to rely on the output of txid_current() or even the
newer pg_current_xact_id() to get a transaction ID for setting up
recovery_target_xid. The value returned by those functions includes the
epoch in the computed result, which would cause strtoul() to fail where
unsigned long has a size of 32 bits once the epoch is incremented.

WAL records and 2PC data include only information about 32-bit XIDs and
it is not possible to have XIDs across more than one epoch, so
discarding the high bits from the transaction ID set has no impact on
recovery. On the contrary, the use of strtoul() prevents a consistent
behavior across platforms depending on the size of unsigned long.

This commit changes the parsing of recovery_target_xid to use
pg_strtouint64() instead, available down to 9.6. There is one TAP test
stressing recovery with recovery_target_xid, where a tweak based on
pg_reset{xlog,wal} is added to bump the XID epoch so as this change gets
tested, as per an idea from Alexander Lakhin.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/16780-107fd0c0385b1035@postgresql.org
Backpatch-through: 9.6

6db27037

22 Dec, 2020 3 commits

Improve autoprewarm's handling of early-shutdown scenarios. · ff769831

Tom Lane authored Dec 22, 2020

Bad things happen if the DBA issues "pg_ctl stop -m fast" before
autoprewarm finishes loading its list of blocks to prewarm.
The current worker process successfully terminates early, but
(if this wasn't the last database with blocks to prewarm) the
leader process will just try to launch another worker for the
next database.  Since the postmaster is now in PM_WAIT_BACKENDS
state, it ignores the launch request, and the leader just sits
until it's killed manually.

This is mostly the fault of our half-baked design for launching
background workers, but a proper fix for that is likely to be
too invasive to be back-patchable.  To ameliorate the situation,
fix apw_load_buffers() to check whether SIGTERM has arrived
just before trying to launch another worker.  That leaves us with
only a very narrow window in each worker launch where SIGTERM
could occur between the launch request and successful worker start.

Another issue is that if the leader process does manage to exit,
it unconditionally rewrites autoprewarm.blocks with only the
blocks currently in shared buffers, thus forgetting any blocks
that we hadn't reached yet while prewarming.  This seems quite
unhelpful, since the next database start will then not have the
expected prewarming benefit.  Fix it to not modify the file if
we shut down before the initial load attempt is complete.

Per bug #16785 from John Thompson.  Back-patch to v11 where
the autoprewarm code was introduced.

Discussion: https://postgr.es/m/16785-c0207d8c67fb5f25@postgresql.org

ff769831

Increase timeout in 021_row_visibility.pl. · 08dde1b3

Tom Lane authored Dec 22, 2020

Commit 7b28913b figured 30 seconds is long enough for anybody,
but in contexts like valgrind runs, it isn't necessarily.

08dde1b3

Improve find_em_expr_usable_for_sorting_rel comment · 1ca2eb10

Tomas Vondra authored Dec 22, 2020

Clarify the relationship between find_em_expr_usable_for_sorting_rel and
prepare_sort_from_pathkeys, i.e. what restrictions need to be shared
between those two places.

Author: James Coleman
Reviewed-by: Tomas Vondra
Backpatch-through: 13
Discussion: https://postgr.es/m/CAAaqYe8cK3g5CfLC4w7bs%3DhC0mSksZC%3DH5M8LSchj5e5OxpTAg%40mail.gmail.com

1ca2eb10