Commits · 627c79a1e87d9ec4a8a8a0c5be8564ba74e221ea · Abuhujair Javed / Postgres FD Implementation

22 Feb, 2022 1 commit

Add compute_query_id = regress · 627c79a1

Michael Paquier authored Feb 22, 2022

"regress" is a new mode added to compute_query_id aimed at facilitating
regression testing when a module computing query IDs is loaded into the
backend, like pg_stat_statements.  It works the same way as "auto",
meaning that query IDs are computed if a module enables it, except that
query IDs are hidden in EXPLAIN outputs to ensure regression output
stability.

Like any GUCs of the kind (force_parallel_mode, etc.), this new
configuration can be added to an instance's postgresql.conf, or just
passed down with PGOPTIONS at command level.  compute_query_id uses an
enum for its set of option values, meaning that this addition ensures
ABI compatibility.

Using this new configuration mode allows installcheck-world to pass when
running the tests on an instance with pg_stat_statements enabled,
stabilizing the test output while checking the paths doing query ID
computations.

Reported-by: Anton Melnikov
Reviewed-by: Julien Rouhaud
Discussion: https://postgr.es/m/1634283396.372373993@f75.i.mail.ru
Discussion: https://postgr.es/m/YgHlxgc/OimuPYhH@paquier.xyz
Backpatch-through: 14

627c79a1

21 Feb, 2022 1 commit

Fix temporary object cleanup failing due to toast access without snapshot. · 7bbfe599

Andres Freund authored Feb 19, 2022

When cleaning up temporary objects during process exit the cleanup could fail
with:
  FATAL: cannot fetch toast data without an active snapshot

The bug is caused by RemoveTempRelationsCallback() not setting up a
snapshot. If an object with toasted catalog data needs to be cleaned up,
init_toast_snapshot() could fail with the above error.

Most of the time however the the problem is masked due to cached catalog
snapshots being returned by GetOldestSnapshot(). But dropping an object can
cause catalog invalidations to be emitted. If no further catalog accesses are
necessary between the invalidation processing and the next toast datum
deletion, the bug becomes visible.

It's easy to miss this bug because it typically happens after clients
disconnect and the FATAL error just ends up in the log.

Luckily temporary table cleanup at the next use of the same temporary schema
or during DISCARD ALL does not have the same problem.

Fix the bug by pushing a snapshot in RemoveTempRelationsCallback(). Also add
isolation tests for temporary object cleanup, including objects with toasted
catalog data.

A future HEAD only commit will add more assertions.

Reported-By: Miles Delahunty
Author: Andres Freund
Discussion: https://postgr.es/m/CAOFAq3BU5Mf2TTvu8D9n_ZOoFAeQswuzk7yziAb7xuw_qyw5gw@mail.gmail.com
Backpatch: 10-

7bbfe599

20 Feb, 2022 2 commits

Remove most msys special processing in TAP tests · 8b5cd373

Andrew Dunstan authored Feb 20, 2022

Following migration of Windows buildfarm members running TAP tests to
use of ucrt64 perl for those tests, special processing for msys perl is
no longer necessary and so is removed.

Backpatch to release 10

Discussion: https://postgr.es/m/c65a8781-77ac-ea95-d185-6db291e1baeb@dunslane.net

8b5cd373

Remove PostgreSQL::Test::Utils::perl2host completely · 652ff988

Andrew Dunstan authored Feb 20, 2022

Commit f1ac4a74de disabled this processing, and as nothing has broken (as
expected) here we proceed to remove the routine and adjust all the call
sites.

Backpatch to release 10

Discussion: https://postgr.es/m/0ba775a2-8aa0-0d56-d780-69427cf6f33d@dunslane.net
Discussion: https://postgr.es/m/20220125023609.5ohu3nslxgoygihl@alap3.anarazel.de

652ff988

18 Feb, 2022 1 commit

Suppress warning about stack_base_ptr with late-model GCC. · 2e30d77a

Tom Lane authored Feb 17, 2022

GCC 12 complains that set_stack_base is storing the address of
a local variable in a long-lived pointer.  This is an entirely
reasonable warning (indeed, it just helped us find a bug);
but that behavior is intentional here.  We can work around it
by using __builtin_frame_address(0) instead of a specific local
variable; that produces an address a dozen or so bytes different,
in my testing, but we don't care about such a small difference.
Maybe someday a compiler lacking that function will start to issue
a similar warning, but we'll worry about that when it happens.

Patch by me, per a suggestion from Andres Freund.  Back-patch to
v12, which is as far back as the patch will go without some pain.
(Recently-established project policy would permit a back-patch as
far as 9.2, but I'm disinclined to expend the work until GCC 12
is much more widespread.)

Discussion: https://postgr.es/m/3773792.1645141467@sss.pgh.pa.us

2e30d77a

16 Feb, 2022 1 commit

Doc: Update documentation for modifying postgres_fdw foreign tables. · a9e186da

Etsuro Fujita authored Feb 16, 2022

Document that they can be modified using COPY as well.

Back-patch to v11 where commit 3d956d95 added support for COPY in
postgres_fdw.

a9e186da

14 Feb, 2022 2 commits

WAL log unchanged toasted replica identity key attributes. · 04645bbc

Amit Kapila authored Feb 14, 2022

Currently, during UPDATE, the unchanged replica identity key attributes
are not logged separately because they are getting logged as part of the
new tuple. But if they are stored externally then the untoasted values are
not getting logged as part of the new tuple and logical replication won't
be able to replicate such UPDATEs. So we need to log such attributes as
part of the old_key_tuple during UPDATE.

Reported-by: Haiying Tang
Author: Dilip Kumar and Amit Kapila
Reviewed-by: Alvaro Herrera, Haiying Tang, Andres Freund
Backpatch-through: 10
Discussion: https://postgr.es/m/OS0PR01MB611342D0A92D4F4BF26C0F47FB229@OS0PR01MB6113.jpnprd01.prod.outlook.com

04645bbc

Fix memory leak in IndexScan node with reordering · c76665ed

Alexander Korotkov authored Feb 14, 2022

Fix ExecReScanIndexScan() to free the referenced tuples while emptying the
priority queue.  Backpatch to all supported versions.

Discussion: https://postgr.es/m/CAHqSB9gECMENBQmpbv5rvmT3HTaORmMK3Ukg73DsX5H7EJV7jw%40mail.gmail.com
Author: Aliaksandr Kalenik
Reviewed-by: Tom Lane, Alexander Korotkov
Backpatch-through: 10

c76665ed

12 Feb, 2022 1 commit

Fix thinko in PQisBusy(). · ae27b1ac

Tom Lane authored Feb 12, 2022

In commit 1f39a1c0 I made PQisBusy consider conn->write_failed, but
that is now looking like complete brain fade. In the first place, the
logic is quite wrong: it ought to be like "and not" rather than "or".
This meant that once we'd gotten into a write_failed state, PQisBusy
would always return true, probably causing the calling application to
iterate its loop until PQconsumeInput returns a hard failure thanks
to connection loss. That's not what we want: the intended behavior
is to return an error PGresult, which the application probably has
much cleaner support for.

But in the second place, checking write_failed here seems like the
wrong thing anyway. The idea of the write_failed mechanism is to
postpone handling of a write failure until we've read all we can from
the server; so that flag should not interfere with input-processing
behavior. (Compare 7247e243.) What we *should* check for is
status = CONNECTION_BAD, ie, socket already closed. (Most places that
close the socket don't touch asyncStatus, but they do reset status.)
This primarily ensures that if PQisBusy() returns true then there is
an open socket, which is assumed by several call sites in our own
code, and probably other applications too.

While at it, fix a nearby thinko in libpq's my_sock_write: we should
only consult errno for res < 0, not res == 0. This is harmless since
pqsecure_raw_write would force errno to zero in such a case, but it
still could confuse readers.

Noted by Andres Freund. Backpatch to v12 where 1f39a1c0 came in.

Discussion: https://postgr.es/m/20220211011025.ek7exh6owpzjyudn@alap3.anarazel.de

ae27b1ac

11 Feb, 2022 1 commit

Don't use_physical_tlist for an IOS with non-returnable columns. · 277e744a

Tom Lane authored Feb 11, 2022

createplan.c tries to save a runtime projection step by specifying
a scan plan node's output as being exactly the table's columns, or
index's columns in the case of an index-only scan, if there is not a
reason to do otherwise. This logic did not previously pay attention
to whether an index's columns are returnable. That worked, sort of
accidentally, until commit 9a3ddeb51 taught setrefs.c to reject plans
that try to read a non-returnable column. I have no desire to loosen
setrefs.c's new check, so instead adjust use_physical_tlist() to not
try to optimize this way when there are non-returnable column(s).

Per report from Ryan Kelly. Like the previous patch, back-patch
to all supported branches.

Discussion: https://postgr.es/m/CAHUie24ddN+pDNw7fkhNrjrwAX=fXXfGZZEHhRuofV_N_ftaSg@mail.gmail.com

277e744a

10 Feb, 2022 5 commits

Make pg_ctl stop/restart/promote recheck postmaster aliveness. · 1e8c5cf7

Tom Lane authored Feb 10, 2022

"pg_ctl stop/restart" checked that the postmaster PID is valid just
once, as a side-effect of sending the stop signal, and then would
wait-till-timeout for the postmaster.pid file to go away.  This
neglects the case wherein the postmaster dies uncleanly after we
signal it.  Similarly, once "pg_ctl promote" has sent the signal,
it'd wait for the corresponding on-disk state change to occur
even if the postmaster dies.

I'm not sure how we've managed not to notice this problem, but it
seems to explain slow execution of the 017_shm.pl test script on AIX
since commit 4fdbf9af5, which added a speculative "pg_ctl stop" with
the idea of making real sure that the postmaster isn't there.  In the
test steps that kill-9 and then restart the postmaster, it's possible
to get past the initial signal attempt before kill() stops working
for the doomed postmaster.  If that happens, pg_ctl waited till
PGCTLTIMEOUT before giving up ... and the buildfarm's AIX members
have that set very high.

To fix, include a "kill(pid, 0)" test (similar to what
postmaster_is_alive uses) in these wait loops, so that we'll
give up immediately if the postmaster PID disappears.

While here, I chose to refactor those loops out of where they were.
do_stop() and do_restart() can perfectly well share one copy of the
wait-for-stop loop, and it seems desirable to put a similar function
beside that for wait-for-promote.

Back-patch to all supported versions, since pg_ctl's wait logic
is substantially identical in all, and we're seeing the slow test
behavior in all branches.

Discussion: https://postgr.es/m/20220210023537.GA3222837@rfd.leadboat.com

1e8c5cf7

Use gendef instead of pexports for building windows .def files · 92f60f53

Andrew Dunstan authored Feb 10, 2022

Modern msys systems lack pexports but have gendef instead, so use that.

Discussion: https://postgr.es/m/3ccde7a9-e4f9-e194-30e0-0936e6ad68ba@dunslane.net

Backpatch to release 9.4 to enable building with perl on older branches.
Before that pexports is not used for plperl.

92f60f53

Make timeout.c more robust against missed timer interrupts. · 2e211c16

Tom Lane authored Feb 10, 2022

Commit 09cf1d52 taught schedule_alarm() to not do anything if
the next requested event is after when we expect the next interrupt
to fire. However, if somehow an interrupt gets lost, we'll continue
to not do anything indefinitely, even after the "next interrupt" time
is obviously in the past. Thus, one missed interrupt can break
timeout scheduling for the life of the session. Michael Harris
reported a scenario where a bug in a user-defined function caused this
to happen, so you don't even need to assume kernel bugs exist to think
this is worth fixing. We can make things more robust at little cost
by detecting the case where signal_due_at is before "now" and forcing
a new setitimer call to occur. This isn't a completely bulletproof
fix of course; but in our typical usage pattern where we frequently set
timeouts and clear them before they are reached, the interrupt will
get re-enabled after at most one timeout interval, which with a little
luck will be before we really need it.

While here, let's mark signal_due_at as volatile, since the signal
handler can both examine and set it. I'm not sure there's any
actual risk given that signal_pending is already volatile, but
it's surely questionable.

Backpatch to v14 where this logic came in.

Michael Harris and Tom Lane

Discussion: https://postgr.es/m/CADofcAWbMrvgwSMqO4iG_iD3E2v8ZUrC-_crB41my=VMM02-CA@mail.gmail.com

2e211c16

Set SNI ClientHello extension to localhost in tests · 5f00ef06

Daniel Gustafsson authored Feb 10, 2022

The connection strings in the SSL client tests were using the host
set up from Cluster.pm which is a temporary pathname. When SNI is
enabled we pass the host to OpenSSL in order to set the server name
indication ClientHello extension via SSL_set_tlsext_host_name.

OpenSSL doesn't validate the hostname apart from checking the max
length, but LibreSSL checks for RFC 5890 conformance which results
in errors during testing as the pathname from Cluster.pm is not a
valid hostname.

Fix by setting the host explicitly to localhost, as that's closer
to the intent of the test.

Backpatch through 14 where SNI support came in.
Reported-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/17391-304f81bcf724b58b@postgresql.org
Backpatch-through: 14

5f00ef06

Use Test::Builder::todo_start(), replacing $::TODO. · 1a83297d

Noah Misch authored Feb 09, 2022

Some pre-2017 Test::More versions need perfect $Test::Builder::Level
maintenance to find the variable. Buildfarm member snapper reported an
overall failure that the file intended to hide via the TODO construct.
That trouble was reachable in v11 and v10. For later branches, this
serves as defense in depth. Back-patch to v10 (all supported versions).

Discussion: https://postgr.es/m/20220202055556.GB2745933@rfd.leadboat.com

1a83297d

09 Feb, 2022 2 commits

Test honestly for <sys/signalfd.h>. · c23461a2

Tom Lane authored Feb 09, 2022

Commit 6a2a70a0 supposed that any platform having <sys/epoll.h>
would also have <sys/signalfd.h>.  It turns out there are still a
few people using platforms where that's not so, so we'd better make
a separate configure probe for it.  But since it took this long to
notice, I'm content with the decision to not have a separate code
path for epoll-only machines; we'll just fall back to using poll()
for these stragglers.

Per gripe from Gabriela Serventi.  Back-patch to v14 where this
code came in.

Discussion: https://postgr.es/m/CAHOHWE-JjJDfcYuLAAEO7Jk07atFAU47z8TzHzg71gbC0aMy=g@mail.gmail.com

c23461a2

Remove ppport.h's broken re-implementation of eval_pv(). · e327291e

Tom Lane authored Feb 08, 2022

Recent versions of Devel::PPPort try to redefine eval_pv() to
dodge a bug in pre-5.31 Perl versions. Unfortunately the redefinition
fails on compilers that don't support statements nested within
expressions. However, we aren't actually interested in this bug fix,
since we always call eval_pv() with croak_on_error = FALSE.
So, until there's an upstream fix for this breakage, just comment
out the macro to revert to the older behavior.

Per report from Wei Sun, as well as previous buildfarm failure
on pademelon (which I'd unfortunately not looked at carefully
enough to understand the cause). Back-patch to all supported
versions, since we're using the same ppport.h in all.

Discussion: https://postgr.es/m/tencent_2EFCC8BA0107B6EC0F97179E019A8A43C806@qq.com
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=pademelon&dt=2022-02-02%2001%3A22%3A58

e327291e

07 Feb, 2022 2 commits
- Stamp 14.2. · 78b1ddc7
  Tom Lane authored Feb 07, 2022
  
  78b1ddc7
- Translation updates · 3c879e61
  Peter Eisentraut authored Feb 07, 2022
```
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 063c497a909612d444c7c7188482db9aef86200f
```
  3c879e61
06 Feb, 2022 1 commit
- Release notes for 14.2, 13.6, 12.10, 11.15, 10.20. · 1d9ccc0a
  Tom Lane authored Feb 06, 2022
  
  1d9ccc0a
05 Feb, 2022 2 commits

Doc: be clearer that foreign-table partitions need user-added constraints. · d0cd7b77

Tom Lane authored Feb 05, 2022

A very well-informed user might deduce this from what we said already,
but I'd bet against it.  Lay it out explicitly.

While here, rewrite the comment about tuple routing to be more
intelligible to an average SQL user.

Per bug #17395 from Alexander Lakhin.  Back-patch to v11.  (The text
in this area is different in v10 and I'm not sufficiently excited
about this point to adapt the patch.)

Discussion: https://postgr.es/m/17395-8c326292078d1a57@postgresql.org

d0cd7b77

Test, don't just Assert, that mergejoin's inputs are in order. · d13a838e

Tom Lane authored Feb 05, 2022

There are two Asserts in nodeMergejoin.c that are reachable if
the input data is not in the expected order.  This seems way too
fragile.  Alexander Lakhin reported a case where the assertions
could be triggered with misconfigured foreign-table partitions,
and bitter experience with unstable operating system collation
definitions suggests another easy route to hitting them.  Neither
Assert is in a place where we can't afford one more test-and-branch,
so replace 'em with plain test-and-elog logic.

Per bug #17395.  While the reported symptom is relatively recent,
collation changes could happen anytime, so back-patch to all
supported branches.

Discussion: https://postgr.es/m/17395-8c326292078d1a57@postgresql.org

d13a838e

04 Feb, 2022 1 commit

First-draft release notes for 14.2. · ab22eea8

Tom Lane authored Feb 04, 2022

As usual, the release notes for older branches will be made by cutting
these down, but put them up for community review first.

ab22eea8

03 Feb, 2022 3 commits

Fix compiler warning in non-assert builds, introduced in f862d57057f. · 2a3958e4
Andres Freund authored Feb 03, 2022
```
Discussion: https://postgr.es/m/20220203183655.ralgkh54sdcgysmn@alap3.anarazel.de
Backpatch: 14-, like f862d57057f
```
2a3958e4

Further fix for EvalPlanQual with mix of local and foreign partitions. · 7b0cec2f

Etsuro Fujita authored Feb 03, 2022

We assume that direct-modify ForeignScan nodes cannot be re-evaluated
during EvalPlanQual processing, but the rework for inherited
UPDATE/DELETE in commit 86dc9005 changed things, without considering
that, so that such ForeignScan nodes get called as part of the
EvalPlanQual subtree during EvalPlanQual processing in the case of an
inherited UPDATE/DELETE where the inheritance set contains foreign
target relations. To avoid re-evaluating such ForeignScan nodes during
EvalPlanQual processing, commit c3928b467 modified nodeForeignscan.c,
but the assumption made there that ExecForeignScan() should never be
called for such ForeignScan nodes during EvalPlanQual processing turned
out to be wrong in some cases, leading to a segmentation fault or a
"cannot re-evaluate a Foreign Update or Delete during EvalPlanQual"
error.

Fix by modifying nodeForeignscan.c further to avoid re-evaluating such
ForeignScan nodes even in ExecForeignScan()/ExecReScanForeignScan()
during EvalPlanQual processing. Since this makes non-reachable the
test-and-elog added to ForeignNext() by commit c3928b467 that produced
the aforesaid error, convert the test-and-elog to an Assert.

Per bug #17355 from Alexander Lakhin. Back-patch to v14 where both
commits came in.

Patch by me, reviewed and tested by Alexander Lakhin and Amit Langote.

Discussion: https://postgr.es/m/17355-de8e362eb7001a96@postgresql.org

7b0cec2f

doc: clarify syntax notation, particularly parentheses · 25560761

Bruce Momjian authored Feb 02, 2022

Also move TCL syntax to the PL/tcl section.

Reported-by: davs2rt@gmail.com

Discussion: https://postgr.es/m/164308146320.12460.3590769444508751574@wrigleys.postgresql.org

Backpatch-through: 10

25560761

02 Feb, 2022 2 commits

doc: Fix mistake in PL/Python documentation · ee57467c
Peter Eisentraut authored Feb 02, 2022
```
Small thinko introduced by 94aceed3

Reported-by: nassehk@gmail.com
```
ee57467c

Replace use of deprecated Python module distutils.sysconfig, take 2. · 803f0b17

Tom Lane authored Feb 01, 2022

With Python 3.10, configure spits out warnings about the module
distutils.sysconfig being deprecated and scheduled for removal in
Python 3.12. Change the uses in configure to use the module sysconfig
instead. The logic stays largely the same, although we have to
rely on INCLUDEPY instead of the deprecated get_python_inc function.

Note that sysconfig exists since Python 2.7, so this moves the
minimum required version up from Python 2.6 (or 2.4, before v13).
Also, sysconfig didn't exist in Python 3.1, so the minimum 3.x
version is now 3.2.

Back-patch of commit bd233bdd8 into all supported branches.

In v10, this also includes back-patching v11's beff4bb9, primarily
because this opinion is clearly out-of-date:

While at it, get rid of the code's assumption that both the major and
minor numbers contain exactly one digit. That will foreseeably be
broken by Python 3.10 in perhaps four or five years. That's far enough
out that we probably don't need to back-patch this.

Peter Eisentraut, Tom Lane, Andres Freund

Discussion: https://postgr.es/m/c74add3c-09c4-a9dd-1a03-a846e5b2fc52@enterprisedb.com

803f0b17

31 Jan, 2022 4 commits

Revert "plperl: Fix breakage of c89f409749c in back branches." · 00fdfde6

Tom Lane authored Jan 31, 2022

This reverts commits d81cac47 et al.  We shouldn't need that
hack after the preceding commits.

Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de

00fdfde6

plperl: update ppport.h to Perl 5.34.0. · 2d44912c

Tom Lane authored Jan 31, 2022

Also apply the changes suggested by running
perl ppport.h --compat-version=5.8.0

And remove some no-longer-required NEED_foo declarations.

Dagfinn Ilmari Mannsåker

Back-patch of commit 05798c9f7 into all supported branches.
At the time we thought this update was mostly cosmetic, but the
lack of it has caused trouble, while the patch itself hasn't.

Discussion: https://postgr.es/m/87y278s6iq.fsf@wibble.ilmari.org
Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de

2d44912c

plperl: Fix breakage of c89f409749c in back branches. · d81cac47

Andres Freund authored Jan 30, 2022

ppport.h was only updated in 05798c9f7f0 (master). Unfortunately my commit
c89f409749c uses PERL_VERSION_LT which came in with that update. Breaking most
buildfarm animals.

I should have noticed that...

We might want to backpatch the ppport update instead, but for now lets get the
buildfarm green again.

Discussion: https://postgr.es/m/20220131015130.shn6wr2fzuymerf6@alap3.anarazel.de
Backpatch: 10-14, master doesn't need it

d81cac47

plperl: windows: Use Perl_setlocale on 5.28+, fixing compile failure. · 8484e381

Andres Freund authored Jan 30, 2022

For older versions we need our own copy of perl's setlocale(), because it was
not exposed (why we need the setlocale in the first place is explained in
plperl_init_interp) . The copy stopped working in 5.28, as some of the used
macros are not public anymore.  But Perl_setlocale is available in 5.28, so
use that.

Author: Victor Wagner <vitus@wagner.pp.ru>
Reviewed-By: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://postgr.es/m/20200501134711.08750c5f@antares.wagner.home
Backpatch: all versions

8484e381

29 Jan, 2022 2 commits

Fix failure to validate the result of select_common_type(). · c025067f

Tom Lane authored Jan 29, 2022

Although select_common_type() has a failure-return convention, an
apparent successful return just provides a type OID that *might* work
as a common supertype; we've not validated that the required casts
actually exist. In the mainstream use-cases that doesn't matter,
because we'll proceed to invoke coerce_to_common_type() on each input,
which will fail appropriately if the proposed common type doesn't
actually work. However, a few callers didn't read the (nonexistent)
fine print, and thought that if they got back a nonzero OID then the
coercions were sure to work.

This affects in particular the recently-added "anycompatible"
polymorphic types; we might think that a function/operator using
such types matches cases it really doesn't. A likely end result
of that is unexpected "ambiguous operator" errors, as for example
in bug #17387 from James Inform. Another, much older, case is that
the parser might try to transform an "x IN (list)" construct to
a ScalarArrayOpExpr even when the list elements don't actually have
a common supertype.

It doesn't seem desirable to add more checking to select_common_type
itself, as that'd just slow down the mainstream use-cases. Instead,
write a separate function verify_common_type that performs the
missing checks, and add a call to that where necessary. Likewise add
verify_common_type_from_oids to go with select_common_type_from_oids.

Back-patch to v13 where the "anycompatible" types came in. (The
symptom complained of in bug #17387 doesn't appear till v14, but
that's just because we didn't get around to converting || to use
anycompatible till then.) In principle the "x IN (list)" fix could
go back all the way, but I'm not currently convinced that it makes
much difference in real-world cases, so I won't bother for now.

Discussion: https://postgr.es/m/17387-5dfe54b988444963@postgresql.org

c025067f

Fix incorrect memory context switch in COPY TO execution · b30282fc

Michael Paquier authored Jan 29, 2022

c532d15d has split the logic of COPY commands into multiple files, one
change being to move the internals of BeginCopy() to BeginCopyTo().
Originally the code was written so as we'd switch back-and-forth between
the current execution memory context and the dedicated memory context
for the COPY command, and this refactoring has introduced an extra
switch to the current memory context from the COPY context once
BeginCopyTo() is done with the past logic coming from BeginCopy().

The code was correctly doing the analyze, rewrite and planning phases in
the COPY context, but it was not assigning "copy_file" (FILE* used when
copying to a source file) and "filename" in the COPY context, making the
COPY status data inconsistent.

Author: Bharath Rupireddy
Reviewed-by: Japin Li
Discussion: https://postgr.es/m/CALj2ACWvVa69foi9jhHFY=2BuHxAoYboyE+vXQTARwxZcJnVrQ@mail.gmail.com
Backpatch-through: 14

b30282fc

28 Jan, 2022 2 commits

Fix typo in comment. · d99166ed
Etsuro Fujita authored Jan 28, 2022

d99166ed

Prevent memory context logging from sending log message to connected client. · 6e7ee55e

Fujii Masao authored Jan 28, 2022

When pg_log_backend_memory_contexts() is executed, the target backend
should use LOG_SERVER_ONLY to log its memory contexts, to prevent them
from being sent to its connected client regardless of client_min_messages.
But previously the backend unexpectedly used LOG to log the message
"logging memory contexts of PID %d" and it could be sent to the client.
This is a bug in memory context logging.

To fix the bug, this commit changes that message so that it's logged with
LOG_SERVER_ONLY.

Back-patch to v14 where pg_log_backend_memory_contexts() was added.

Author: Fujii Masao
Reviewed-by: Bharath Rupireddy, Atsushi Torikoshi
Discussion: https://postgr.es/m/82c12f36-86f7-5e72-79af-7f5c37f6cad7@oss.nttdata.com

6e7ee55e

27 Jan, 2022 4 commits

Fix ordering of XIDs in ProcArrayApplyRecoveryInfo · fb2f8e53

Tomas Vondra authored Jan 27, 2022

Commit 8431e296 reworked ProcArrayApplyRecoveryInfo to sort XIDs
before adding them to KnownAssignedXids. But the XIDs are sorted using
xidComparator, which compares the XIDs simply as uint32 values, not
logically. KnownAssignedXidsAdd() however expects XIDs in logical order,
and calls TransactionIdFollowsOrEquals() to enforce that. If there are
XIDs for which the two orderings disagree, an error is raised and the
recovery fails/restarts.

Hitting this issue is fairly easy - you just need two transactions, one
started before the 4B limit (e.g. XID 4294967290), the other sometime
after it (e.g. XID 1000). Logically (4294967290 <= 1000) but when
compared using xidComparator we try to add them in the opposite order.
Which makes KnownAssignedXidsAdd() fail with an error like this:

ERROR: out-of-order XID insertion in KnownAssignedXids

This only happens during replica startup, while processing RUNNING_XACTS
records to build the snapshot. Once we reach STANDBY_SNAPSHOT_READY, we
skip these records. So this does not affect already running replicas,
but if you restart (or create) a replica while there are transactions
with XIDs for which the two orderings disagree, you may hit this.

Long-running transactions and frequent replica restarts increase the
likelihood of hitting this issue. Once the replica gets into this state,
it can't be started (even if the old transactions are terminated).

Fixed by sorting the XIDs logically - this is fine because we're dealing
with normal XIDs (because it's XIDs assigned to backends) and from the
same wraparound epoch (otherwise the backends could not be running at
the same time on the primary node). So there are no problems with the
triangle inequality, which is why xidComparator compares raw values.

Investigation and root cause analysis by Abhijit Menon-Sen. Patch by me.

This issue is present in all releases since 9.4, however releases up to
9.6 are EOL already so backpatch to 10 only.

Reviewed-by: Abhijit Menon-Sen
Reviewed-by: Alvaro Herrera
Backpatch-through: 10
Discussion: https://postgr.es/m/36b8a501-5d73-277c-4972-f58a4dce088a%40enterprisedb.com

fb2f8e53

Improve msys2 detection for TAP tests · 999dc1d2

Andrew Dunstan authored Jan 27, 2022

Perl instances on some msys toolchains (e.g. UCRT64) have their
configured osname set to 'MSWin32' rather than 'msys'.  The test for
the msys2 platform is adjusted accordingly.

Backpatch to release 14.

999dc1d2

postgres_fdw: Fix handling of a pending asynchronous request in postgresReScanForeignScan(). · d1cca944

Etsuro Fujita authored Jan 27, 2022

Commit 27e1f145 failed to process a pending asynchronous request made
for a given ForeignScan node in postgresReScanForeignScan() (if any) in
cases where we would only reset the next_tuple counter in that function,
contradicting the assumption that there should be no pending
asynchronous requests that have been made for async-capable subplans for
the parent Append node after ReScan. This led to an assert failure in
an assert-enabled build. I think this would also lead to mis-rewinding
the cursor in that function in the case where we have already fetched
one batch for the ForeignScan node and the asynchronous request has been
made for the second batch, because even in that case we would just reset
the counter when called from that function, so we would fail to execute
MOVE BACKWARD ALL.

To fix, modify that function to process the asynchronous request before
restarting the scan.

While at it, add a comment to a function to match other places.

Per bug #17344 from Alexander Lakhin. Back-patch to v14 where the
aforesaid commit came in.

Patch by me. Test case by Alexander Lakhin, adjusted by me. Reviewed
and tested by Alexander Lakhin and Dmitry Dolgov.

Discussion: https://postgr.es/m/17344-226b78b00de73a7e@postgresql.org

d1cca944

On sparc64+ext4, suppress test failures from known WAL read failure. · d94a95cc

Noah Misch authored Jan 26, 2022

Buildfarm members kittiwake, tadarida and snapper began to fail
frequently when commits 3cd9c3b921977272e6650a5efbeade4203c4bca2 and
f47ed79cc8a0cfa154dc7f01faaf59822552363f added tests of concurrency, but
the problem was reachable before those commits. Back-patch to v10 (all
supported versions).

Discussion: https://postgr.es/m/20220116210241.GC756210@rfd.leadboat.com

d94a95cc