Commits · 08a9e7a8c7917233926802aaea94a5529a747a50 · Abuhujair Javed / Postgres FD Implementation

20 Apr, 2022 1 commit

Fix breakage in AlterFunction(). · 08a9e7a8

Tom Lane authored Apr 19, 2022

An ALTER FUNCTION command that tried to update both the function's
proparallel property and its proconfig list failed to do the former,
because it stored the new proparallel value into a tuple that was
no longer the interesting one.  Carelessness in 7aea8e4f.

(I did not bother with a regression test, because the only likely
future breakage would be for someone to ignore the comment I added
and add some other field update after the heap_modify_tuple step.
A test using existing function properties could not catch that.)

Per report from Bryn Llewellyn.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/8AC9A37F-99BD-446F-A2F7-B89AD0022774@yugabyte.com

08a9e7a8
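
The failure mode described in 08a9e7a8 can be illustrated outside PostgreSQL. What follows is a minimal sketch, not the actual AlterFunction() code: `FuncRow`, `row_with_config`, `alter_buggy` and `alter_fixed` are hypothetical names, and the struct is not the real pg_proc layout. It assumes only what the message states: a modified copy of the tuple is built, and any field written to the old tuple after that point is lost.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a catalog row; NOT the real pg_proc layout. */
typedef struct FuncRow
{
    char parallel;      /* e.g. 's', 'r' or 'u' */
    char config[32];
} FuncRow;

/* Build a modified copy of the row, the way a "modify tuple" step would. */
FuncRow *row_with_config(const FuncRow *old, const char *config)
{
    FuncRow *copy = malloc(sizeof(FuncRow));

    assert(copy != NULL);
    *copy = *old;
    strncpy(copy->config, config, sizeof(copy->config) - 1);
    copy->config[sizeof(copy->config) - 1] = '\0';
    return copy;
}

/* Buggy order: the field update lands in the old row, after the copy. */
FuncRow *alter_buggy(FuncRow *row, char parallel, const char *config)
{
    FuncRow *result = row_with_config(row, config);

    row->parallel = parallel;   /* lost: 'result' was copied before this */
    return result;
}

/* Fixed order: finish all in-place field updates, then build the copy. */
FuncRow *alter_fixed(FuncRow *row, char parallel, const char *config)
{
    row->parallel = parallel;
    return row_with_config(row, config);
}
```

With a row whose `parallel` starts as `'u'`, `alter_buggy` returns a copy that still carries `'u'`, while `alter_fixed` returns one carrying the new value. The caller owns the returned row and must `free()` it.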

19 Apr, 2022 2 commits

Fix extract epoch from interval calculation · 7a8d8219

Peter Eisentraut authored Apr 19, 2022

The new numeric code for extract epoch from interval accidentally
truncated the DAYS_PER_YEAR value to an integer, leading to results
that mismatched the floating-point interval_part calculations.

The commit a2da77cd that introduced this actually contains the
regression test change that this reverts.  I suppose this was missed
at the time.

Reported-by: Joseph Koshakow <koshy44@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/CAAvxfHd5n%3D13NYA2q_tUq%3D3%3DSuWU-CufmTf-Ozj%3DfrEgt7pXwQ%40mail.gmail.com

7a8d8219
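
The size of the discrepancy in 7a8d8219 is easy to work out. The sketch below is a simplified illustration that handles whole years only; it is not the numeric code from the commit. `DAYS_PER_YEAR` is 365.25 and a day is 86400 seconds; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DAYS_PER_YEAR 365.25    /* fractional on purpose */
#define SECS_PER_DAY  86400

/* Truncating DAYS_PER_YEAR to an integer silently drops a quarter day. */
int64_t epoch_years_truncated(int years)
{
    /* keep the product well inside the int64 range */
    assert(years > -100000 && years < 100000);
    return (int64_t) years * (int64_t) DAYS_PER_YEAR * SECS_PER_DAY;
}

/* Floating-point reference that keeps the fractional day. */
double epoch_years_float(int years)
{
    return years * DAYS_PER_YEAR * SECS_PER_DAY;
}
```

For a one-year interval the truncated form gives 365 * 86400 = 31536000 seconds, while the floating-point form gives 365.25 * 86400 = 31557600, a difference of 21600 seconds (six hours) per year.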

Fix the check to limit sync workers. · c9dea58e

Amit Kapila authored Apr 19, 2022

We don't allow more sync workers to be launched once the per-subscription
sync worker limit has been reached. But the check that enforces this also
prevents an apply worker from being launched if it gets restarted.

This code was introduced by commit de438971 but we caught the problem
only with the test added by recent commit c91f71b9dc which started failing
occasionally in the buildfarm.

As per buildfarm.
Diagnosed-by: Amit Kapila, Masahiko Sawada, Tomas Vondra
Author: Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/CAH2L28vddB_NFdRVpuyRBJEBWjz4BSyTB=_ektNRH8NJ1jf95g@mail.gmail.com
	    https://postgr.es/m/f90d2b03-4462-ce95-a524-d91464e797c8@enterprisedb.com

c9dea58e
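
The shape of the problem in c9dea58e can be sketched as a predicate. This is an assumption-laden illustration, not the launcher code: `WorkerKind`, `may_launch_buggy` and `may_launch_fixed` are hypothetical names, and the only behaviour taken from the message is that the sync worker limit must not block an apply worker.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical worker classification, for illustration only. */
typedef enum WorkerKind
{
    APPLY_WORKER,
    SYNC_WORKER
} WorkerKind;

/* Buggy shape: the limit is applied regardless of the worker kind. */
bool may_launch_buggy(WorkerKind kind, int nsync, int max_sync)
{
    (void) kind;                /* the kind is never consulted */
    assert(nsync >= 0 && max_sync >= 0);
    return nsync < max_sync;
}

/* Fixed shape: only sync workers count against the sync worker limit. */
bool may_launch_fixed(WorkerKind kind, int nsync, int max_sync)
{
    assert(nsync >= 0 && max_sync >= 0);
    if (kind == SYNC_WORKER && nsync >= max_sync)
        return false;
    return true;
}
```

With the limit already reached, the buggy predicate refuses a restarted apply worker, whereas the fixed one still refuses further sync workers but lets the apply worker start.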

18 Apr, 2022 2 commits

Avoid invalid array reference in transformAlterTableStmt(). · e805735a

Tom Lane authored Apr 18, 2022

Don't try to look at the attidentity field of system attributes,
because they're not there in the TupleDescAttr array.  Sometimes
this is harmless because we accidentally pick up a zero, but
otherwise we'll report "no owned sequence found" from an attempt
to alter a system attribute.  (It seems possible that a SIGSEGV
could occur, too, though I've not seen it in testing.)

It's not in this function's charter to complain that you can't
alter a system column, so instead just hard-wire an assumption
that system attributes aren't identities.  I didn't bother with
a regression test because the appearance of the bug is very
erratic.

Per bug #17465 from Roman Zharkov.  Back-patch to all supported
branches.  (There's not actually a live bug before v12, because
before that get_attidentity() did the right thing anyway.
But for consistency I changed the test in the older branches too.)

Discussion: https://postgr.es/m/17465-f2a554a6cb5740d3@postgresql.org

e805735a
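
The invalid array reference fixed in e805735a comes from system attributes having non-positive attribute numbers, so indexing an array of user attributes with `attnum - 1` reads before its start. A minimal sketch of the guard, with a hypothetical `Attr` struct and `attr_identity` function; the upper-bound check is an addition made for this example, not something the commit message mentions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-column descriptor; only the field needed here. */
typedef struct Attr
{
    char identity;              /* '\0' means "not an identity column" */
} Attr;

/*
 * Look up the identity marker for a 1-based attribute number.
 * System attributes (attnum <= 0) are not in the array, so hard-wire
 * the assumption that they are never identity columns.
 */
char attr_identity(const Attr *attrs, int natts, int attnum)
{
    assert(attrs != NULL && natts >= 0);
    if (attnum <= 0 || attnum > natts)
        return '\0';
    return attrs[attnum - 1].identity;
}
```

Calling it with a negative `attnum` returns `'\0'` without touching memory outside the array, rather than reporting whatever happens to sit there.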

Fix race in TAP test 002_archiving.pl when restoring history file · 8bcf90c7

Michael Paquier authored Apr 18, 2022

This test, introduced in df86e52c, uses a second standby to check that
it is able to correctly remove RECOVERYHISTORY and RECOVERYXLOG at the
end of recovery.  This standby uses the archives of the primary to
restore its contents, with some of the archive's contents coming from
the first standby previously promoted.  In slow environments, it was
possible that the test did not check what it should, as the history file
generated by the promotion of the first standby may not be stored yet on
the archives the second standby feeds on.  So, it could be possible that
the second standby selects an incorrect timeline, without restoring a
history file at all.

This commit adds a wait phase to make sure that the history file
required by the second standby is archived before this cluster is
created.  This relies on poll_query_until() with pg_stat_file() and an
absolute path, something not supported in REL_10_STABLE.

While on it, this adds a new test to check that the history file has
been restored by looking at the logs of the second standby.  This
ensures that a RECOVERYHISTORY, whose removal needs to be checked,
is created in the first place.  This should make the test more robust.

This test was introduced by df86e52c, but the problem came to light as an
effect of the bug fixed by acf1dd42, where the extra restore_command
calls made the test much slower.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/YlT23IvsXkGuLzFi@paquier.xyz
Backpatch-through: 11

8bcf90c7

17 Apr, 2022 1 commit

Add a temp-install prerequisite to src/interfaces/ecpg "checktcp". · acd0eb63

Noah Misch authored Apr 16, 2022

The target failed, tested $PATH binaries, or tested a stale temporary
installation.  Commit c66b438d missed this.  Back-patch to v10 (all
supported versions).

acd0eb63

14 Apr, 2022 2 commits

Rethink the delay-checkpoint-end mechanism in the back-branches. · 10520f43

Robert Haas authored Apr 14, 2022

The back-patch of commit bbace569 had
the unfortunate effect of changing the layout of PGPROC in the
back-branches, which could break extensions. This happened because it
changed the delayChkpt from type bool to type int. So, change it back,
and add a new bool delayChkptEnd field instead. The new field should
fall within what used to be padding space within the struct, and so
hopefully won't cause any extensions to break.

Per report from Markus Wanner and discussion with Tom Lane and others.

Patch originally by me, somewhat revised by Markus Wanner per a
suggestion from Michael Paquier. A very similar patch was developed
by Kyotaro Horiguchi, but I failed to see the email in which that was
posted before writing one of my own.

Discussion: http://postgr.es/m/CA+Tgmoao-kUD9c5nG5sub3F7tbo39+cdr8jKaOVEs_1aBWcJ3Q@mail.gmail.com
Discussion: http://postgr.es/m/20220406.164521.17171257901083417.horikyota.ntt@gmail.com

10520f43
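
Why retyping a field moves everything after it, while a new bool can hide in padding, can be shown with three toy structs. This is a sketch under stated assumptions: the structs are hypothetical and far smaller than PGPROC, and the offsets depend on the platform ABI (the comparison below holds where `bool` is 1 byte and `int` is 4-byte aligned, as on common 64-bit platforms).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Layout before the back-patch: two bools, then padding, then an int. */
typedef struct ProcOriginal
{
    bool otherFlag;
    bool delayChkpt;
    int  lxid;
} ProcOriginal;

/* Retyping the bool as int pushes the following field further out. */
typedef struct ProcRetyped
{
    bool otherFlag;
    int  delayChkpt;
    int  lxid;
} ProcRetyped;

/* Adding a second bool instead consumes what used to be padding. */
typedef struct ProcPadded
{
    bool otherFlag;
    bool delayChkpt;
    bool delayChkptEnd;
    int  lxid;
} ProcPadded;

/* True if fields after delayChkpt keep their offsets and the size holds. */
bool retyped_layout_compatible(void)
{
    return offsetof(ProcRetyped, lxid) == offsetof(ProcOriginal, lxid) &&
           sizeof(ProcRetyped) == sizeof(ProcOriginal);
}

bool padded_layout_compatible(void)
{
    return offsetof(ProcPadded, lxid) == offsetof(ProcOriginal, lxid) &&
           sizeof(ProcPadded) == sizeof(ProcOriginal);
}

/* Sanity check of the kind an extension author might run at startup. */
void check_padded_layout(void)
{
    assert(padded_layout_compatible());
}
```

An extension compiled against the original struct reads `lxid` at a fixed offset, so only the padded variant stays compatible with already-built binaries.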

pageinspect: Fix handling of all-zero pages · df6bbe73

Michael Paquier authored Apr 14, 2022

Getting from get_raw_page() an all-zero page is considered as a valid
case by the buffer manager and it can happen for example when finding a
corrupted page with zero_damaged_pages enabled (using zero_damaged_pages
to look at corrupted pages happens), or after a crash when a relation
file is extended before any WAL for its new data is generated (before a
vacuum or autovacuum job comes in to do some cleanup).

However, none of the pageinspect functions, whether for the index AMs
(except hash, which has its own idea of new pages), heap, the FSM or the
page header, have ever worked with all-zero pages, causing various
crashes when going through the page internals.

This commit changes all the pageinspect functions to be compliant with
all-zero pages, where the choice is made to return NULL or no rows for
SRFs when finding a new page.  get_raw_page() still works the same way,
returning a batch of zeros in the bytea of the page retrieved.  A hard
error could be used but NULL, while more invasive, is useful when
scanning relation files in full to get a batch of results for a single
relation in one query.  Tests are added for all the code paths
impacted.

Reported-by: Daria Lepikhova
Author: Michael Paquier
Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru
Backpatch-through: 10

df6bbe73

13 Apr, 2022 2 commits

Prevent access to no-longer-pinned buffer in heapam_tuple_lock(). · c590e514

Tom Lane authored Apr 13, 2022

heap_fetch() used to have a "keep_buf" parameter that told it to return
ownership of the buffer pin to the caller after finding that the
requested tuple TID exists but is invisible to the specified snapshot.
This was thoughtlessly removed in commit 5db6df0c, which broke
heapam_tuple_lock() (formerly EvalPlanQualFetch) because that function
needs to do more accesses to the tuple even if it's invisible. The net
effect is that we would continue to touch the page for a microsecond or
two after releasing pin on the buffer. Usually no harm would result;
but if a different session decided to defragment the page concurrently,
we could see garbage data and mistakenly conclude that there's no newer
tuple version to chain up to. (It's hard to say whether this has
happened in the field. The bug was actually found thanks to a later
change that allowed valgrind to detect accesses to non-pinned buffers.)

The most reasonable way to fix this is to reintroduce keep_buf,
although I made it behave slightly differently: buffer ownership
is passed back only if there is a valid tuple at the requested TID.
In HEAD, we can just add the parameter back to heap_fetch().
To avoid an API break in the back branches, introduce an additional
function heap_fetch_extended() in those branches.

In HEAD there is an additional, less obvious API change: tuple->t_data
will be set to NULL in all cases where buffer ownership is not returned,
in particular when the tuple exists but fails the time qual (and
!keep_buf). This is to defend against any other callers attempting to
access non-pinned buffers. We concluded that making that change in back
branches would be more likely to introduce problems than cure any.

In passing, remove a comment about heap_fetch that was obsoleted by
9a8ee1dc.

Per bug #17462 from Daniil Anisimov. Back-patch to v12 where the bug
was introduced.

Discussion: https://postgr.es/m/17462-9c98a0f00df9bd36@postgresql.org

c590e514

Docs: wording improvement for compute_query_id = regress · ea669b80

David Rowley authored Apr 13, 2022

It's more accurate to say that the query identifier is not shown when
compute_query_id = regress rather than to say it is hidden.

This change (ebf6c5249) appeared in v14, so it makes sense to backpatch
this small adjustment to keep the documents consistent between v14 and
master.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
Backpatch-through: 14, where compute_query_id = regress was added

ea669b80

12 Apr, 2022 3 commits

Docs: adjust pg_upgrade syntax to mark -B as optional · e286be5d

David Rowley authored Apr 13, 2022

This was made optional in 959f6d6a.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
Backpatch-through: 13, where -B was made optional

e286be5d

Doc: tweak textsearch.sgml for SEO purposes. · 8320a34d

Tom Lane authored Apr 12, 2022

Google seems to like to return textsearch.html for queries about
GIN and GiST indexes, even though it's not a primary reference
for either.  It seems likely that that's because those keywords
appear in the page title.  Since "GIN and GiST Index Types" is
not a very apposite title for this material anyway, rename the
section in hopes of stopping that.

Also provide explicit links to the GIN and GiST chapters, to help
anyone who finds their way to this page regardless.

Per gripe from Jan Piotrowski.  Back-patch to supported branches.
(Unfortunately Google is likely to continue returning the 9.1
version of this page, but improving that situation is a matter
for the www team.)

Discussion: https://postgr.es/m/164978902252.1276550.9330175733459697101@wrigleys.postgresql.org

8320a34d

Docs: avoid confusing use of the word "synchronized" · 3a95dfe4

David Rowley authored Apr 13, 2022

It's misleading to call the data directory the "synchronized data
directory" when discussing a crash scenario when using pg_rewind's
--no-sync option.  Here we just remove the word "synchronized" to avoid
any possible confusion.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com
Backpatch-through: 12, where --no-sync was added

3a95dfe4

06 Apr, 2022 2 commits

Suppress "variable 'pagesaving' set but not used" warning. · a65747b1

Tom Lane authored Apr 06, 2022

With asserts disabled, late-model clang notices that this variable
is incremented but never otherwise read.

Discussion: https://postgr.es/m/3171401.1649275153@sss.pgh.pa.us

a65747b1

Remove race condition in 022_crash_temp_files.pl test. · 9a722994

Tom Lane authored Apr 05, 2022

It's possible for the query that "waits for restart" to complete a
successful iteration before the postmaster has noticed its SIGKILL'd
child and begun the restart cycle.  (This is a bit hard to believe
perhaps, but it's been seen at least twice in the buildfarm, mainly
on ancient platforms that likely have quirky schedulers.)

To provide a more secure interlock, wait for the other session
we're using to report that it's been forcibly shut down.

Patch by me, based on a suggestion from Andres Freund.
Back-patch to v14 where this test case came in.

Discussion: https://postgr.es/m/1801850.1649047827@sss.pgh.pa.us

9a722994

05 Apr, 2022 1 commit

Update some tests in 013_crash_restart.pl. · 8803df4e

Tom Lane authored Apr 04, 2022

The expected backend message after SIGQUIT changed in commit
7e784d1d, but we missed updating this test case.  Also, experience
shows that we might sometimes get "could not send data to server"
instead of either of the libpq messages the test is looking for.

Per report from Mark Dilger.  Back-patch to v14 where the
backend message changed.

Discussion: https://postgr.es/m/17BD82D7-49AC-40C9-8204-E7ADD30321A0@enterprisedb.com

8803df4e

02 Apr, 2022 2 commits

Doc: Remove MultiXact wraparound section link. · 32558a8b

Peter Geoghegan authored Apr 02, 2022

Remove circular "25.1.5.1. Multixacts And Wraparound" link that
references the section that the link itself appears in.  An explanation
of MultiXactId age appears only a few sentences before the link, so
there's no question that the link is superfluous at best.

Oversight in commit d5409295.

Author: Peter Geoghegan <pg@bowt.ie>
Backpatch: 14-

32558a8b

Remove obsolete comment · d480ae06

Peter Eisentraut authored Apr 02, 2022

accidentally left behind by 4cb658af

d480ae06

01 Apr, 2022 1 commit

libpq: Fix pkg-config without OpenSSL · 7a278927

Peter Eisentraut authored Apr 01, 2022

Do not add OpenSSL dependencies to libpq pkg-config file if OpenSSL is
not enabled. Oversight in beff361b.

Author: Fabrice Fontaine <fontaine.fabrice@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/20220331163759.32665-1-fontaine.fabrice%40gmail.com

7a278927

31 Mar, 2022 3 commits

Fix postgres_fdw to check shippability of sort clauses properly. · 9f9489aa

Tom Lane authored Mar 31, 2022

postgres_fdw would push ORDER BY clauses to the remote side without
verifying that the sort operator is safe to ship.  Moreover, it failed
to print a suitable USING clause if the sort operator isn't default
for the sort expression's type.  The net result of this is that the
remote sort might not have anywhere near the semantics we expect,
which'd be disastrous for locally-performed merge joins in particular.

We addressed similar issues in the context of ORDER BY within an
aggregate function call in commit 7012b132, but failed to notice
that query-level ORDER BY was broken.  Thus, much of the necessary
logic already existed, but it requires refactoring to be usable
in both cases.

Back-patch to all supported branches.  In HEAD only, remove the
core code's copy of find_em_expr_for_rel, which is no longer used
and really should never have been pushed into equivclass.c in the
first place.

Ronan Dunklau, per report from David Rowley;
reviews by David Rowley, Ranier Vilela, and myself

Discussion: https://postgr.es/m/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com

9f9489aa

Add missing newline in one libpq error message. · 402279af

Tom Lane authored Mar 31, 2022

Oversight in commit a59c79564.  Back-patch, as that was.
Noted by Peter Eisentraut.

Discussion: https://postgr.es/m/7f85ef6d-250b-f5ec-9867-89f0b16d019f@enterprisedb.com

402279af

doc: Fix typo in ANALYZE documentation · c5479178

Daniel Gustafsson authored Mar 31, 2022

Commit 61fa6ca79b3 accidentally wrote constrast instead of contrast.

Backpatch-through: 10
Discussion: https://postgr.es/m/88903179-5ce2-3d4d-af43-7830372bdcb6@enterprisedb.com

c5479178

30 Mar, 2022 1 commit

Fix typo in comment. · 637afee3

Etsuro Fujita authored Mar 30, 2022

637afee3

29 Mar, 2022 1 commit

Revert "Fix replay of create database records on standby" · adc943b4

Alvaro Herrera authored Mar 29, 2022

This reverts commit 49d9cfc68bf4. The approach taken by this patch has
problems, so we'll come up with a radically different fix.

Discussion: https://postgr.es/m/CA+TgmoYcUPL+WOJL2ZzhH=zmrhj0iOQ=iCFM0SuYqBbqZEamEg@mail.gmail.com

adc943b4

28 Mar, 2022 3 commits

Document autoanalyze limitations for partitioned tables · 6b262f35

Tomas Vondra authored Mar 28, 2022

When dealing with partitioned tables, counters for partitioned tables
are not updated when modifying child tables. This means autoanalyze may
not update optimizer statistics for the parent relations, which can
result in poor plans for some queries.

It's worth documenting this limitation, so that people are aware of it
and can take steps to mitigate it (e.g. by setting up a script executing
ANALYZE regularly).

Backpatch to v10. Older branches are affected too, of course, but we no
longer maintain those.

Author: Justin Pryzby
Reviewed-by: Zhihong Yu, Tomas Vondra
Backpatch-through: 10
Discussion: https://postgr.es/m/20210913035409.GA10647%40telsasoft.com

6b262f35

Fix NULL input behaviour of pg_stat_get_replication_slot(). · c1a0d7d1

Andres Freund authored Mar 27, 2022

pg_stat_get_replication_slot() accidentally was marked as non-strict, crashing
when called with NULL input. As it's already released, introduce an explicit
NULL check in 14, fix the catalog in HEAD.

Bumps catversion in HEAD.

Discussion: https://postgr.es/m/20220326212432.s5n2maw6kugnpyxw@alap3.anarazel.de
Backpatch: 14-, where replication slot stats were introduced

c1a0d7d1

waldump: fix use-after-free in search_directory(). · 6839aa7a

Andres Freund authored Mar 23, 2022

After closedir() dirent->d_name is not valid anymore. As there already are a
few places relying on the limited lifetime of pg_waldump, do so here as well,
and just pg_strdup() the string.

The bug was introduced in fc49e24f.

Found by UBSan, run locally.

Backpatch: 11-, like fc49e24f itself.

6839aa7a
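
The pattern behind 6839aa7a is general POSIX behaviour: the `struct dirent` returned by `readdir()` belongs to the directory stream, so its `d_name` must be copied before `closedir()`. A minimal sketch, assuming a hypothetical helper `find_first_with_suffix` and using plain `strdup()` where pg_waldump uses `pg_strdup()`:

```c
#define _POSIX_C_SOURCE 200809L     /* for strdup() and the dirent API */

#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'd copy of the first entry name in 'dirpath' that ends
 * with 'suffix', or NULL if the directory cannot be opened or nothing
 * matches.  The caller must free() the result.
 */
char *find_first_with_suffix(const char *dirpath, const char *suffix)
{
    DIR           *dir;
    struct dirent *de;
    char          *found = NULL;
    size_t         slen;

    assert(dirpath != NULL && suffix != NULL);
    slen = strlen(suffix);

    dir = opendir(dirpath);
    if (dir == NULL)
        return NULL;

    while ((de = readdir(dir)) != NULL)
    {
        size_t nlen = strlen(de->d_name);

        if (nlen >= slen && strcmp(de->d_name + nlen - slen, suffix) == 0)
        {
            /* Copy now: de->d_name is invalid once closedir() runs. */
            found = strdup(de->d_name);
            break;
        }
    }

    closedir(dir);
    return found;               /* safe to use after closedir() */
}
```

Returning `de->d_name` itself instead of the copy would hand the caller a pointer into storage that `closedir()` is free to release.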

27 Mar, 2022 2 commits

Fix breakage of get_ps_display() in the PS_USE_NONE case. · 3f7a59c5

Tom Lane authored Mar 27, 2022

Commit 8c6d30f2 caused this function to fail to set *displen
in the PS_USE_NONE code path.  If the variable's previous value
had been negative, that'd lead to a memory clobber at some call
sites.  We'd managed not to notice due to very thin test coverage
of such configurations, but this appears to explain buildfarm member
lorikeet's recent struggles.

Credit to Andrew Dunstan for spotting the problem.  Back-patch
to v13 where the bug was introduced.

Discussion: https://postgr.es/m/136102.1648320427@sss.pgh.pa.us

3f7a59c5
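
The bug class fixed in 3f7a59c5 is an output parameter left unset on one code path. A minimal sketch, with a hypothetical `get_display` function and a `supported` flag standing in for the compile-time PS_USE_NONE case; it is not the real get_ps_display():

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical process-title buffer, for illustration only. */
static char display_buf[64] = "postgres: idle";

/*
 * Return the display string and store its length in *displen.
 * Every path must set *displen, including the "not supported" one;
 * otherwise the caller keeps whatever value the variable held before.
 */
const char *get_display(bool supported, int *displen)
{
    assert(displen != NULL);

    if (!supported)
    {
        *displen = 0;           /* the step that was missing */
        return "";
    }

    *displen = (int) strlen(display_buf);
    return display_buf;
}
```

A caller that passes in a variable holding a stale negative value and then uses it as a length is only safe because the unsupported path resets it to zero.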

pageinspect: Add more sanity checks to prevent out-of-bound reads · 27d38444

Michael Paquier authored Mar 27, 2022

A couple of code paths use the special area on the page passed by the
function caller, expecting to find some data in it.  However, feeding
an incorrect page can lead to out-of-bound reads when trying to access
the page special area (like a heap page that has no special area,
leading PageGetSpecialPointer() to grab a pointer outside the allocated
page).

The functions used for hash and btree indexes have some protection
already against that, while some other functions using a relation OID
as argument would make sure that the access method involved is correct,
but functions taking in input a raw page without knowing the relation
the page is attached to would run into problems.

This commit improves the set of checks used in the code paths of BRIN,
btree (including one check if a leaf page is found with a non-zero
level), GIN and GiST to verify that the page given in input has a
special area size that fits with each access method, which is done
through PageGetSpecialSize(), before calling PageGetSpecialPointer().

The scope of the checks done is limited to work with pages that one
would pass after getting a block with get_raw_page(), as it is possible
to craft byteas that could bypass existing code paths.  Having too many
checks would also impact the usability of pageinspect, as the existing
code is very useful to look at the content details in a corrupted page,
so the focus is really to avoid out-of-bound reads as this is never a
good thing even with functions whose execution is limited to
superusers.

The safest approach could be to rework the functions so that they fetch a
block using a relation OID and a block number, but there are also cases
where using a raw page is useful.

Tests are added to cover all the code paths that needed such checks, and
an error message for hash indexes is reworded to fit better with what
this commit adds.

Reported-By: Alexander Lakhin
Author: Julien Rouhaud, Michael Paquier
Discussion: https://postgr.es/m/16527-ef7606186f0610a1@postgresql.org
Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru
Backpatch-through: 10

27d38444
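
The check order described in 27d38444 (verify the special-area size first, only then read the special area) can be sketched with a toy page format. Everything here is hypothetical: the page keeps its special-area offset in its first two bytes, `IndexOpaque` is an invented 8-byte trailer, and `page_special_size` and `read_index_opaque` merely play the roles of PageGetSpecialSize() and PageGetSpecialPointer().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 8192

/* Hypothetical access-method trailer stored at the end of the page. */
typedef struct IndexOpaque
{
    uint32_t flags;
    uint32_t level;
} IndexOpaque;

/* Size of the special area, or 0 if the stored offset is out of range. */
size_t page_special_size(const unsigned char *page)
{
    uint16_t off;

    memcpy(&off, page, sizeof(off));
    return off <= PAGE_SIZE ? (size_t) (PAGE_SIZE - off) : 0;
}

/*
 * Copy the trailer into *out only if the special area has exactly the
 * size this access method expects; otherwise reject the page.
 */
bool read_index_opaque(const unsigned char *page, IndexOpaque *out)
{
    assert(page != NULL && out != NULL);

    if (page_special_size(page) != sizeof(IndexOpaque))
        return false;

    memcpy(out, page + PAGE_SIZE - sizeof(IndexOpaque), sizeof(IndexOpaque));
    return true;
}
```

A page with no special area (offset equal to the page size) and an all-zero page are both rejected before any trailer bytes are read.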

26 Mar, 2022 1 commit

Suppress compiler warning in relptr_store(). · 0144c9c7

Tom Lane authored Mar 26, 2022

clang 13 with -Wextra warns that "performing pointer subtraction with
a null pointer has undefined behavior" in the places where freepage.c
tries to set a relptr variable to constant NULL.  This appears to be
a compiler bug, but it's unlikely to get fixed instantly.  Fortunately,
we can work around it by introducing an inline support function, which
seems like a good change anyway because it removes the macro's existing
double-evaluation hazard.

Backpatch to v10 where this code was introduced.

Patch by me, based on an idea of Andres Freund's.

Discussion: https://postgr.es/m/48826.1648310694@sss.pgh.pa.us

0144c9c7
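
The double-evaluation hazard mentioned in 0144c9c7 is easy to demonstrate. The sketch below is not the relptr code: `OFFSET_STORE_MACRO`, `offset_store_inline`, `counted_ptr` and `eval_count` are hypothetical names, and the macro only mimics the general "NULL maps to 0, otherwise pointer minus base" shape.

```c
#include <assert.h>
#include <stddef.h>

/* Macro form: 'val' appears twice, so it is evaluated twice if non-NULL. */
#define OFFSET_STORE_MACRO(base, val) \
    ((val) == NULL ? (size_t) 0 : (size_t) ((char *) (val) - (base)))

/* Inline form: 'val' is evaluated once, when the argument is passed. */
static inline size_t offset_store_inline(const char *base, const char *val)
{
    assert(base != NULL);
    if (val == NULL)
        return 0;               /* no pointer subtraction on NULL */
    return (size_t) (val - base);
}

/* Counts how often the argument expression is evaluated. */
int eval_count = 0;

char *counted_ptr(char *base, size_t off)
{
    eval_count++;
    return base + off;
}
```

Passing `counted_ptr(buf, 8)` to the macro bumps `eval_count` twice, while the inline function evaluates it once and also never subtracts from a null pointer.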

25 Mar, 2022 2 commits

Harden TAP tests that intentionally corrupt page checksums. · 579cef5f

Tom Lane authored Mar 25, 2022

The previous method for doing that was to write zeroes into a
predetermined set of page locations.  However, there's a roughly
1-in-64K chance that the existing checksum will match by chance,
and yesterday several buildfarm animals started to reproducibly
see that, resulting in test failures because no checksum mismatch
was reported.

Since the checksum includes the page LSN, test success depends on
the length of the installation's WAL history, which is affected by
(at least) the initial catalog contents, the set of locales installed
on the system, and the length of the pathname of the test directory.
Sooner or later we were going to hit a chance match, and today is
that day.

Harden these tests by specifically inverting the checksum field and
leaving all else alone, thereby guaranteeing that the checksum is
incorrect.

In passing, fix places that were using seek() to set up for syswrite(),
a combination that the Perl docs very explicitly warn against.  We've
probably escaped problems because no regular buffered I/O is done on
these filehandles; but if it ever breaks, we wouldn't deserve or get
much sympathy.

Although we've only seen problems in HEAD, now that we recognize the
environmental dependencies it seems like it might be just a matter
of time until someone manages to hit this in back-branch testing.
Hence, back-patch to v11 where we started doing this kind of test.

Discussion: https://postgr.es/m/3192026.1648185780@sss.pgh.pa.us

579cef5f
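
The guarantee that 579cef5f relies on is that inverting a 16-bit field always yields a different value. The TAP tests themselves are Perl; the C sketch below only demonstrates the bit-level argument. The checksum offset of 8 bytes (a 16-bit field following an 8-byte LSN) and all function names are assumptions made for this example, as is the premise that recomputing the checksum is unaffected by the stored field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed position of the 16-bit checksum within the page header. */
#define CHECKSUM_OFFSET 8

/* Flip every bit of the stored checksum and leave all else alone. */
void invert_checksum(unsigned char *page)
{
    assert(page != NULL);
    page[CHECKSUM_OFFSET] ^= 0xFF;
    page[CHECKSUM_OFFSET + 1] ^= 0xFF;
}

uint16_t stored_checksum(const unsigned char *page)
{
    uint16_t v;

    memcpy(&v, page + CHECKSUM_OFFSET, sizeof(v));
    return v;
}

/* Exhaustively confirm that inversion changes all 65536 possible values. */
bool inversion_always_changes_checksum(void)
{
    unsigned char page[16];
    uint32_t      v;

    for (v = 0; v <= 0xFFFF; v++)
    {
        uint16_t before = (uint16_t) v;

        memset(page, 0, sizeof(page));
        memcpy(page + CHECKSUM_OFFSET, &before, sizeof(before));
        invert_checksum(page);
        if (stored_checksum(page) == before)
            return false;
    }
    return true;
}
```

Zeroing page contents leaves roughly a 1-in-64K chance that the stored checksum still matches, whereas an inverted checksum can never equal a correct one.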

Fix replay of create database records on standby · ffd28516

Alvaro Herrera authored Mar 25, 2022

Crash recovery on standby may encounter missing directories when
replaying create database WAL records. Prior to this patch, the standby
would fail to recover in such a case. However, the directories could be
legitimately missing. Consider a sequence of WAL records as follows:

CREATE DATABASE
DROP DATABASE
DROP TABLESPACE

If, after replaying the last WAL record and removing the tablespace
directory, the standby crashes and has to replay the create database
record again, the crash recovery must be able to move on.

This patch adds a mechanism similar to invalid-page tracking, to keep a
tally of missing directories during crash recovery. If all the missing
directory references are matched with corresponding drop records at the
end of crash recovery, the standby can safely continue following the
primary.

Backpatch to 13, at least for now. The bug is older, but fixing it in
older branches requires more careful study of the interactions with
commit e6d80695, which appeared in 13.

A new TAP test file is added to verify the condition. However, because
it depends on commit d6d317dbf615, it can only be added to branch
master. I (Álvaro) manually verified that the code behaves as expected
in branch 14. It's a bit nervous-making to leave the code uncovered by
tests in older branches, but leaving the bug unfixed is even worse.
Also, the main reason this fix took so long is precisely that we
couldn't agree on a good strategy to approach testing for the bug, so
perhaps this is the best we can do.

Diagnosed-by: Paul Guo <paulguo@gmail.com>
Author: Paul Guo <paulguo@gmail.com>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com

ffd28516

24 Mar, 2022 1 commit

Fix possible recovery trouble if TRUNCATE overlaps a checkpoint. · bbace569

Robert Haas authored Mar 24, 2022

If TRUNCATE causes some buffers to be invalidated and thus the
checkpoint does not flush them, TRUNCATE must also ensure that the
corresponding files are truncated on disk. Otherwise, a replay
from the checkpoint might find that the buffers exist but have
the wrong contents, which may cause replay to fail.

Report by Teja Mupparti. Patch by Kyotaro Horiguchi, per a design
suggestion from Heikki Linnakangas, with some changes to the
comments by me. Review of this and a prior patch that approached
the issue differently by Heikki Linnakangas, Andres Freund, Álvaro
Herrera, Masahiko Sawada, and Tom Lane.

Discussion: http://postgr.es/m/BYAPR06MB6373BF50B469CA393C614257ABF00@BYAPR06MB6373.namprd06.prod.outlook.com

bbace569

23 Mar, 2022 6 commits

Don't try to translate NULL in GetConfigOptionByNum(). · 81045e1e

Andres Freund authored Mar 23, 2022

Noticed via -fsanitize=undefined. Introduced when a few columns in
GetConfigOptionByNum() / pg_settings started to be translated in 72be8c29 /
PG 12.

Backpatch to all affected branches, for the same reasons as 46ab07ffda9.

Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 12-

81045e1e

Don't call fwrite() with len == 0 when writing out relcache init file. · 89a94c24

Andres Freund authored Mar 23, 2022

Noticed via -fsanitize=undefined.

Backpatch to all branches, for the same reasons as 46ab07ffda9.

Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 10-

89a94c24

configure: check for dlsym instead of dlopen. · e52e9bd5

Andres Freund authored Mar 23, 2022

When building with sanitizers the sanitizer library provides dlopen, but not
dlsym(), making configure think that -ldl isn't needed. Just checking for
dlsym() ought to suffice, hard to see dlsym() being provided without dlopen()
also being provided.

Backpatch to all branches, for the same reasons as 46ab07ffda9.

Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20220323173537.ll7klrglnp4gn2um@alap3.anarazel.de
Backpatch: 10-

e52e9bd5

pg_upgrade: Upgrade an Assert to a real 'if' test · 9814c708

Alvaro Herrera authored Mar 23, 2022

It seems possible for the condition being tested to be true in
production, and nobody would ever know (except when some data
eventually becomes corrupt?).

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m//202109040001.zky3wgv2qeqg@alvherre.pgsql

9814c708

Fix "missing continuation record" after standby promotion · caaeb88f

Alvaro Herrera authored Mar 23, 2022

Invalidate abortedRecPtr and missingContrecPtr after a missing
continuation record is successfully skipped on a standby. This fixes a
PANIC caused when a recently promoted standby attempts to write an
OVERWRITE_RECORD with an LSN of the previously read aborted record.

Backpatch to 10 (all stable versions).

Author: Sami Imseih <simseih@amazon.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/44D259DE-7542-49C4-8A52-2AB01534DCA9@amazon.com

caaeb88f

Try to stabilize vacuum test. · cd3a5055

Thomas Munro authored Mar 23, 2022

As commits b700f96c and 3414099c did for the reloptions test, make
sure VACUUM can always truncate the table as expected.

Back-patch to 12, where vacuum_truncate arrived.

Discussion: https://postgr.es/m/CAD21AoCNoWjYkdEtr%2BVDoF9v__V905AedKZ9iF%3DArgCtrbxZqw%40mail.gmail.com

cd3a5055

22 Mar, 2022 1 commit

Add missing dependency of pg_dumpall to WIN32RES. · 2d608c96

Andres Freund authored Mar 22, 2022

When cross-building to windows, or building with mingw on windows, the build
could fail with
  x86_64-w64-mingw32-gcc: error: win32ver.o: No such file or director
because pg_dumpall didn't depend on WIN32RES, but its recipe references
it. The build nevertheless succeeded most of the time, due to
pg_dump/pg_restore having the required dependency, causing win32ver.o to be
built.

Reported-By: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA+hUKGJeekpUPWW6yCVdf9=oBAcCp86RrBivo4Y4cwazAzGPng@mail.gmail.com
Backpatch: 10-, omission present on all live branches

2d608c96