Commits · 5b6289c1e07dc45f09c3169a189e60d2fcaec2b3 · Abuhujair Javed / Postgres FD Implementation

14 Aug, 2017 5 commits

Handle elog(FATAL) during ROLLBACK more robustly. · 5b6289c1

Tom Lane authored Aug 14, 2017

Stress testing by Andreas Seltenreich disclosed longstanding problems that
occur if a FATAL exit (e.g. due to receipt of SIGTERM) occurs while we are
trying to execute a ROLLBACK of an already-failed transaction. In such a
case, xact.c is in TBLOCK_ABORT state, so that AbortOutOfAnyTransaction
would skip AbortTransaction and go straight to CleanupTransaction. This
led to an assertion failure in an assert-enabled build (due to the ROLLBACK's
portal still having a cleanup hook) or, without assertions, to a FATAL exit
complaining about "cannot drop active portal". The latter's not
disastrous, perhaps, but it's messy enough to want to improve it.

We don't really want to run all of AbortTransaction in this code path.
The minimum required to clean up the open portal safely is to do
AtAbort_Memory and AtAbort_Portals. It seems like a good idea to
do AtAbort_Memory unconditionally, to be entirely sure that we are
starting with a safe CurrentMemoryContext. That means that if the
main loop in AbortOutOfAnyTransaction does nothing, we need an extra
step at the bottom to restore CurrentMemoryContext = TopMemoryContext,
which I chose to do by invoking AtCleanup_Memory. This'll result in
calling AtCleanup_Memory twice in many of the paths through this function,
but that seems harmless and reasonably inexpensive.

The original motivation for the assertion in AtCleanup_Portals was that
we wanted to be sure that any user-defined code executed as a consequence
of the cleanup hook runs during AbortTransaction not CleanupTransaction.
That still seems like a valid concern, and now that we've seen one case
of the assertion firing --- which means that exactly that would have
happened in a production build --- let's replace the Assert with a runtime
check. If we see the cleanup hook still set, we'll emit a WARNING and
just drop the hook unexecuted.
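
The Assert-to-runtime-check change described above might look roughly like
the following standalone sketch. It is not the PostgreSQL source: DemoPortal,
demo_hook, and demo_cleanup_portal are hypothetical names used only for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a portal; not PostgreSQL's Portal struct. */
typedef struct DemoPortal
{
    const char *name;
    void      (*cleanup) (struct DemoPortal *portal);  /* may run user code */
} DemoPortal;

/* Counts how often the demo hook actually ran. */
int demo_hook_calls = 0;

void
demo_hook(DemoPortal *portal)
{
    (void) portal;
    demo_hook_calls++;
}

/*
 * Cleanup-phase handling of one portal.  Rather than asserting that the
 * hook was already run during abort, check at run time: if it is still
 * set, warn and drop it without calling it.  Returns 1 if a hook had to
 * be dropped, 0 otherwise.
 */
int
demo_cleanup_portal(DemoPortal *portal)
{
    assert(portal != NULL);

    if (portal->cleanup != NULL)
    {
        fprintf(stderr, "WARNING: skipping cleanup for portal \"%s\"\n",
                portal->name);
        portal->cleanup = NULL;     /* drop the hook unexecuted */
        return 1;
    }
    return 0;
}
```

The point of the sketch is that the hook is cleared rather than invoked, so
no user-defined code runs in the cleanup phase even in a production build.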

This has been like this a long time, so back-patch to all supported
branches.

Discussion: https://postgr.es/m/877ey7bmun.fsf@ansel.ydns.eu

5b6289c1

Fix typo · 7f1bb1d7

Peter Eisentraut authored Aug 14, 2017

Author: Masahiko Sawada <sawada.mshk@gmail.com>

7f1bb1d7

doc: Fix logical replication protocol doc detail · 79e5de69

Peter Eisentraut authored Aug 14, 2017

Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-by: Kyle Conroy <kyle@kyleconroy.com>
Bug: #14775

79e5de69

Absorb -D_USE_32BIT_TIME_T switch from Perl, if relevant. · 5a5c2fec

Tom Lane authored Aug 14, 2017

Commit 3c163a7f's original choice to ignore all #define symbols whose
names begin with underscore turns out to be too simplistic. On Windows,
some Perl installations are built with -D_USE_32BIT_TIME_T, and we must
absorb that or we get the wrong result for sizeof(PerlInterpreter).
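
A minimal sketch of why the time_t width matters for sizeof: the two structs
below are hypothetical stand-ins for one struct compiled with a 32-bit and a
64-bit time type. They are not PerlInterpreter, whose real layout is not
shown in this commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct as seen by a build using a 32-bit time type. */
typedef struct DemoInterp32
{
    int32_t     created;    /* stands in for a 32-bit time_t field */
    int32_t     flags;
    int32_t     refcount;
} DemoInterp32;

/* The same struct as seen by a build using a 64-bit time type. */
typedef struct DemoInterp64
{
    int64_t     created;    /* stands in for a 64-bit time_t field */
    int32_t     flags;
    int32_t     refcount;
} DemoInterp64;

/* Returns nonzero if the two builds disagree about the struct layout. */
int
demo_layout_mismatch(void)
{
    return sizeof(DemoInterp32) != sizeof(DemoInterp64) ||
        offsetof(DemoInterp32, flags) != offsetof(DemoInterp64, flags);
}
```

If the library and its caller were compiled with different settings, each
side would use a different size and different field offsets for the same
object.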

This effectively reverts commit ef58b87d, which injected that symbol
in a hacky way, making it apply to all of Postgres not just PL/Perl.
More significantly, it did so on *all* 32-bit Windows builds, even when
the Perl build to be used did not select this option; so that it fails
to work properly with some newer Perl builds.

By making this change, we would be introducing an ABI break in 32-bit
Windows builds; but fortunately we have not used type time_t in any
exported Postgres APIs in a long time. So it should be OK, both for
PL/Perl itself and for third-party extensions, if an extension library
is built with a different _USE_32BIT_TIME_T setting than the core code.

Patch by me, based on research by Ashutosh Sharma and Robert Haas.
Back-patch to all supported branches, as commit 3c163a7f was.

Discussion: https://postgr.es/m/CANFyU97OVQ3+Mzfmt3MhuUm5NwPU=-FtbNH5Eb7nZL9ua8=rcA@mail.gmail.com

5a5c2fec

Changed ecpg parser to allow RETURNING clauses without attached C variables. · ea0ca75d

Michael Meskes authored Aug 14, 2017

ea0ca75d

13 Aug, 2017 3 commits

Remove AtEOXact_CatCache(). · 004a9702

Tom Lane authored Aug 13, 2017

The sole useful effect of this function, to check that no catcache
entries have positive refcounts at transaction end, has really been
obsolete since we introduced ResourceOwners in PG 8.1. We reduced the
checks to assertions years ago, so that the function was a complete
no-op in production builds. There have been previous discussions about
removing it entirely, but consensus up to now was that it had some small
value as a cross-check for bugs in the ResourceOwner logic.

However, it now emerges that it's possible to trigger these assertions
if you hit an assert-enabled backend with SIGTERM during a call to
SearchCatCacheList, because that function temporarily increases the
refcounts of entries it's intending to add to a catcache list construct.
In a normal ERROR scenario, the extra refcounts are cleaned up by
SearchCatCacheList's PG_CATCH block; but in a FATAL exit we do a
transaction abort and exit without ever executing PG_CATCH handlers.

There's a case to be made that this is a generic hazard and we should
consider restructuring elog(FATAL) handling so that pending PG_CATCH
handlers do get run. That's pretty scary though: it could easily create
more problems than it solves. Preliminary stress testing by Andreas
Seltenreich suggests that there are not many live problems of this ilk,
so we rejected that idea.

There are more-localized ways to fix the problem; the most principled
one would be to use PG_ENSURE_ERROR_CLEANUP instead of plain PG_TRY.
But adding cycles to SearchCatCacheList isn't very appealing. We could
also weaken the assertions in AtEOXact_CatCache in some more or less
ad-hoc way, but that just makes its raison d'etre even less compelling.
In the end, the most reasonable solution seems to be to just remove
AtEOXact_CatCache altogether, on the grounds that it's not worth trying
to fix it. It hasn't found any bugs for us in many years.

Per report from Jeevan Chalke. Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAM2+6=VEE30YtRQCZX7_sCFsEpoUkFBV1gZazL70fqLn8rcvBA@mail.gmail.com

004a9702

Reword comment for clarity · 2336f842

Alvaro Herrera authored Aug 12, 2017

Reported by Masahiko Sawada
Discussion: https://postgr.es/m/CAD21AoB+ycZ2z-4Ye=6MfQ_r0aV5r6cvVPw4kOyPdp6bHqQoBQ@mail.gmail.com

2336f842

Fix vertical spanning in table "wait_event Description". · e88928c5

Noah Misch authored Aug 12, 2017

Michael Paquier

Discussion: https://postgr.es/m/CAB7nPqQr3KEQvXeuUNYcm7tDK2Fb9oLUQ8DU0+y0RZEoN_1_gg@mail.gmail.com

e88928c5

12 Aug, 2017 1 commit

Simplify fetch-slot-xmins logic in recovery TAP tests. · 3043c1dd

Tom Lane authored Aug 12, 2017

Merge wait_slot_xmins() into get_slot_xmins().  At this point the only
place that wasn't doing a wait was the initial-state test, and a wait
there seems pretty harmless.

Michael Paquier

Discussion: https://postgr.es/m/CAB7nPqSp_SLQb2uU7am+sn4V3g1UKv8j3yZU385oAG1cG_BN9Q@mail.gmail.com

3043c1dd

11 Aug, 2017 11 commits

Be more thorough about cleaning out gcov litter. · d6ecad81

Tom Lane authored Aug 11, 2017

At least on my machine, a run with code coverage enabled produces some
".gcov" files whose names begin with ".". "rm -f *.gcov" fails to match
those, so they don't get cleaned up by "make clean". Fix it.
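
The globbing rule behind this can be reproduced in C with POSIX fnmatch(),
whose FNM_PERIOD flag applies the same rule the shell does: a leading "."
in a file name must be matched explicitly. The helper name and the file
names in the usage note are made up for illustration.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/*
 * Returns 1 if a shell-style glob would match the file name, else 0.
 * FNM_PERIOD makes a leading "." in the name match only a literal "."
 * in the pattern, as in shell pathname expansion.
 */
int
demo_glob_matches(const char *pattern, const char *filename)
{
    assert(pattern != NULL && filename != NULL);
    return fnmatch(pattern, filename, FNM_PERIOD) == 0;
}
```

With this helper, "*.gcov" matches "nodeLimit.c.gcov" but not
".libs_foo.gcov", while a second pattern such as ".*.gcov" does match the
dot-file.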

d6ecad81

Add regression tests exercising more code paths in nodeLimit.c. · 3c8de959

Tom Lane authored Aug 11, 2017

Perusal of the code coverage report shows that the existing regression
test cases for LIMIT/OFFSET don't exercise the nodeLimit code paths
involving backwards scan, empty results, or null values of LIMIT/OFFSET.
Improve the coverage.

3c8de959

Add regression tests exercising the non-hashed code paths in nodeSetop.c. · 6efca23c

Tom Lane authored Aug 11, 2017

Perusal of the code coverage report shows that the existing regression
test cases for INTERSECT and EXCEPT seemingly all prefer the SETOP_HASHED
implementation.  Add some test cases in which we force use of the
SETOP_SORTED mode.

6efca23c

doc: Add example for inet vs cidr difference · ee844bb4

Peter Eisentraut authored Aug 11, 2017

Reported-by: kes-kes@yandex.ru

ee844bb4

doc: Update description of rolreplication column · fa65c8c7

Peter Eisentraut authored Aug 11, 2017

Since PostgreSQL 9.6, rolreplication no longer determines whether a role
can run pg_start_backup() and pg_stop_backup(), so remove that.

Add that this attribute determines whether a role can create and drop
replication slots.

Reported-by: Fujii Masao <masao.fujii@gmail.com>

fa65c8c7

doc: Small wording improvement · 22701a7e

Peter Eisentraut authored Aug 11, 2017

Author: Jeff Janes <jeff.janes@gmail.com>

22701a7e

pg_upgrade: Clarify one message · d4ede668

Peter Eisentraut authored Aug 11, 2017

Reported-by: Dennis Björklund <db@zigo.dhs.org>

d4ede668

Remove pgbench's restriction on placement of -M switch. · 79681844

Tom Lane authored Aug 11, 2017

Previously the -M switch had to appear before any switch that directly
or indirectly specified a benchmarking script. This was both confusing
and inadequately documented, as per gripe from Tatsuo Ishii. We can
remove the restriction at the cost of making an extra pass over the
lists of SQL commands, which seems like a cheap price (the string scans
themselves likely cost much more). The change is just to not extract
parameters from the SQL commands until we have finished parsing the
switches and know the final value of -M.
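
The deferred-processing idea can be sketched as a toy two-pass parser. This
is not pgbench's code: DemoConfig and demo_parse are hypothetical, only the
-M and -f switches are handled, and recording the mode per script stands in
for the real parameter extraction.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAX_SCRIPTS 8

typedef struct DemoConfig
{
    const char *mode;                           /* final value of -M */
    const char *scripts[DEMO_MAX_SCRIPTS];      /* scripts in command-line order */
    const char *script_mode[DEMO_MAX_SCRIPTS];  /* mode each script was prepared with */
    int         nscripts;
} DemoConfig;

/*
 * Pass 1 only records switches.  Pass 2 runs once every switch has been
 * seen and prepares each script with the final mode, so -M may appear
 * before or after the scripts it affects.
 */
void
demo_parse(DemoConfig *cfg, int argc, const char *const *argv)
{
    int         i;

    cfg->mode = "simple";
    cfg->nscripts = 0;

    for (i = 0; i + 1 < argc; i += 2)
    {
        if (strcmp(argv[i], "-M") == 0)
            cfg->mode = argv[i + 1];
        else if (strcmp(argv[i], "-f") == 0)
        {
            assert(cfg->nscripts < DEMO_MAX_SCRIPTS);
            cfg->scripts[cfg->nscripts++] = argv[i + 1];
        }
    }

    /* The extra pass: cheap compared with scanning the SQL text itself. */
    for (i = 0; i < cfg->nscripts; i++)
        cfg->script_mode[i] = cfg->mode;
}
```

Given the arguments -f a.sql -M prepared -f b.sql, both scripts end up
prepared with mode "prepared", even though the first -f precedes -M.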

Per discussion, we'll treat this as a low-grade bug fix and sneak it
into v10, rather than holding it for v11.

Tom Lane, reviewed by Tatsuo Ishii and Fabien Coelho

Discussion: https://postgr.es/m/20170802.110328.1963639094551443169.t-ishii@sraoss.co.jp
Discussion: https://postgr.es/m/10208.1502465077@sss.pgh.pa.us

79681844

Remove uses of "slave" in replication contexts · a1ef920e

Peter Eisentraut authored Aug 07, 2017

This affects mostly code comments, some documentation, and tests.
Official APIs already used "standby".

a1ef920e

Reject use of ucol_strcollUTF8() before ICU 53 · d6391b03

Peter Eisentraut authored Aug 09, 2017

Various bugs can cause crashes, so don't use that function before ICU
53.  It will fall back to the code path used for other encodings.

Since we now tie the function availability to an ICU version, we don't
need the configure test anymore.  That also resolves the issue that the
test result was previously hardcoded for Windows.

researched by Daniel Verite <daniel@manitou-mail.org>, Peter Geoghegan
<pg@bowt.ie>, Tom Lane <tgl@sss.pgh.pa.us>

Discussion: https://www.postgresql.org/message-id/flat/f1438ec6-22aa-4029-9a3b-26f79d330e72%40manitou-mail.org

d6391b03

Fix order of ICU_CFLAGS · b83e5456

Peter Eisentraut authored Aug 09, 2017

It must be before CPPFLAGS so that an ICU installation in a nonstandard
path can take precedence over one in the system path.

b83e5456

10 Aug, 2017 5 commits

Improve the error message when creating an empty range partition. · bb5d6e80

Robert Haas authored Aug 10, 2017

The previous message didn't mention the name of the table or the
bounds. Put the table name in the primary error message and the
bounds in the detail message.

Amit Langote, changed slightly by me. Suggestions on the exact
phrasing from Tom Lane, David G. Johnston, and Dean Rasheed.

Discussion: http://postgr.es/m/CA+Tgmoae6bpwVa-1BMaVcwvCCeOoJ5B9Q9-RHWo-1gJxfPBZ5Q@mail.gmail.com

bb5d6e80

Make some more improvements to parallel query documentation. · c1ef4e5c

Robert Haas authored Aug 10, 2017

Many places that mentioned only Gather should also mention Gather
Merge, or should be phrased in a more neutral way. Be more clear
about the fact that max_parallel_workers_per_gather affects the number
of workers the planner may want to use. Fix a typo. Explain how
Gather Merge works. Adjust wording around parallel scans to be a bit
more clear. Adjust wording around parallel-restricted operations for
the fact that uncorrelated subplans are no longer restricted.

Patch by me, reviewed by Erik Rijkers

Discussion: http://postgr.es/m/CA+TgmoZsTjgVGn=ei5ht-1qGFKy_m1VgB3d8+Rg304hz91N5ww@mail.gmail.com

c1ef4e5c

Fix typo in comment. · e6940107

Robert Haas authored Aug 10, 2017

Etsuro Fujita

Discussion: http://postgr.es/m/5f794b91-67df-1ac6-8a4f-069f8e8e169d@lab.ntt.co.jp

e6940107

pgstatindex: Insert some casts to prevent overflow. · 0b7ba3d6

Robert Haas authored Aug 10, 2017

The overflow could cause hash indexes to report greater than 100% free space.
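
A sketch of how a missing cast can produce such a result, assuming the
overflow came from a 32-bit multiplication of a page count by a block size.
The actual expression in pgstatindex is not shown in this commit message, and
both function names below are hypothetical.

```c
#include <stdint.h>

/* Buggy form: the product is computed in 32 bits and can wrap around. */
double
demo_free_percent_narrow(uint32_t pages, uint32_t block_size,
                         uint64_t free_bytes)
{
    uint32_t    total = pages * block_size;     /* wraps modulo 2^32 */

    return 100.0 * (double) free_bytes / (double) total;
}

/* Fixed form: widen one operand first so the product is 64 bits wide. */
double
demo_free_percent_wide(uint32_t pages, uint32_t block_size,
                       uint64_t free_bytes)
{
    uint64_t    total = (uint64_t) pages * block_size;

    return 100.0 * (double) free_bytes / (double) total;
}
```

With 200000 pages of 32768 bytes and 3000000000 free bytes, the narrow
version divides by a wrapped total and reports more than 100%, while the
widened version reports a value between 45% and 46%.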

Ashutosh Sharma, reviewed by Amit Kapila

Discussion: http://postgr.es/m/CAE9k0PnCKfg-ZK1CwGZJPF1yKcG2A=GUgC3BMdNMzLAXVOo4Eg@mail.gmail.com

0b7ba3d6

Remove incorrect assertion in clog.c · ec99dd5a

Robert Haas authored Aug 10, 2017

We must advance the oldest XID that can be safely looked up in clog
*before* truncating CLOG, and the oldest XID that can't be reused
*after* truncating CLOG.  This assertion, and the accompanying
comment, are confused; remove them.

Reported by Neha Sharma.

Discussion: http://postgr.es/m/CANiYTQumC3T=UMBMd1Hor=5XWZYuCEQBioL3ug0YtNQCMMT5wQ@mail.gmail.com

ec99dd5a

09 Aug, 2017 2 commits

Fix handling of container types in find_composite_type_dependencies. · 749c7c41

Tom Lane authored Aug 09, 2017

find_composite_type_dependencies correctly found columns that are of
the specified type, and columns that are of arrays of that type, but
not columns that are domains or ranges over the given type, its array
type, etc. The most general way to handle this seems to be to assume
that any type that is directly dependent on the specified type can be
treated as a container type, and processed recursively (allowing us
to handle nested cases such as ranges over domains over arrays ...).
Since a type's array type already has such a dependency, we can drop
the existing special case for the array type.
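
The recursive treatment of container types can be sketched over a tiny
made-up catalog. The table and function below are hypothetical and only
illustrate the traversal; the real function works on catalog dependency
entries.

```c
#include <assert.h>

#define DEMO_NTYPES 6
#define DEMO_NONE   (-1)

/*
 * Hypothetical catalog: demo_depends_on[t] is the type that t is directly
 * built on (element type, base type, or subtype), or DEMO_NONE.
 *   0 = composite "c"       1 = array of c       2 = domain over c
 *   3 = range over type 2   4 = unrelated type   5 = array of type 3
 */
static const int demo_depends_on[DEMO_NTYPES] = {DEMO_NONE, 0, 0, 2, DEMO_NONE, 3};

/*
 * Mark "target" and every type that directly or transitively depends on
 * it.  found[] must be zero-initialized by the caller.  Returns the number
 * of types marked by this call.
 */
int
demo_find_dependents(int target, int found[DEMO_NTYPES])
{
    int         count = 1;
    int         t;

    assert(target >= 0 && target < DEMO_NTYPES);
    found[target] = 1;

    for (t = 0; t < DEMO_NTYPES; t++)
    {
        /* Any directly dependent type is treated as a container. */
        if (demo_depends_on[t] == target && !found[t])
            count += demo_find_dependents(t, found);
    }
    return count;
}
```

Starting from type 0, the walk reaches the array, the domain, the range over
the domain, and the array over that range, with no special case for arrays.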

The very similar logic in get_rels_with_domain was likewise a few
bricks shy of a load, as it supposed that a directly dependent type
could *only* be a sub-domain. This is already wrong for ranges over
domains, and it'll someday be wrong for arrays over domains.

Add test cases illustrating the problems, and back-patch to all
supported branches.

Discussion: https://postgr.es/m/15268.1502309024@sss.pgh.pa.us

749c7c41

Prevent passing down MAKELEVEL/MAKEFLAGS from non-GNU make to GNU make. · a76200de

Tom Lane authored Aug 09, 2017

FreeBSD's make, for one, sets the MAKELEVEL environment variable when
invoking commands. In the special Makefile we provide to hand off control
from a non-GNU make to GNU make, this causes GNU make to think it is a
child make invocation rather than top-level. That interferes with the hack
added in commit dcae5fac to cause the temp-install tree to be made only by
the top-level invocation of gmake. Unset the variable to prevent that.

Likewise unset MAKEFLAGS, which FreeBSD's make also sets, and which could
easily confuse gmake. There are no reports of actual trouble from that,
but it seems better to be proactive.

Back-patch to 9.5 where dcae5fac came in.

Thomas Munro, hacked a bit more by me

Discussion: https://postgr.es/m/CAEepm=1ueww35AXTkt1A3gyzZUqv5XCzh8RUNvJZAQAW=eOhVw@mail.gmail.com

a76200de

08 Aug, 2017 8 commits

doc: Add missing pieces to logical replication protocol doc · 13f03a00

Peter Eisentraut authored Aug 08, 2017

Reported-by: Kyle Conroy <kyle@kyleconroy.com>

13f03a00

Fix datumSerialize infrastructure to not crash on non-varlena data. · 9bf4068c

Tom Lane authored Aug 08, 2017

Commit 1efc7e53 did a poor job of emulating existing logic for touching
Datums that might be expanded-object pointers. It didn't check for typlen
being -1 first, which meant it could crash on fixed-length pass-by-ref
values, and probably on cstring values as well. It also didn't use
DatumGetPointer before VARATT_IS_EXTERNAL_EXPANDED, which while currently
harmless is not according to documentation nor prevailing style.
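
The ordering problem can be illustrated with a simplified sketch. It assumes
a made-up one-byte tag in place of the real varlena header, and
demo_is_expanded is a hypothetical helper, not the datumSerialize code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header byte marking an "expanded" variable-length value. */
#define DEMO_EXPANDED_TAG 0x01

/*
 * Returns 1 only for variable-length values (typlen == -1) whose first
 * byte carries the expanded tag.  The typlen test must come first:
 * fixed-length pass-by-reference values and C strings (typlen == -2) have
 * no such header, so their bytes must not be interpreted as one.
 */
int
demo_is_expanded(const void *value, int typlen)
{
    const uint8_t *bytes;

    if (typlen != -1)
        return 0;               /* not varlena: never look at the header */

    assert(value != NULL);
    bytes = (const uint8_t *) value;
    return bytes[0] == DEMO_EXPANDED_TAG;
}
```

A fixed-length value whose first byte happens to equal the tag is still
reported as not expanded, because its bytes are never examined.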

I also think the lack of any explanation as to why datumSerialize makes
these particular nonobvious choices is pretty awful, so fix that.

Per report from Jarred Ward. Back-patch to 9.6 where this code came in.

Discussion: https://postgr.es/m/6F61E6D2-2F5E-4794-9479-A429BE1CEA4B@simple.com

9bf4068c

Reword some unclear comments · 77d2c00a

Alvaro Herrera authored Aug 08, 2017

77d2c00a

Fix typo in comment · f5d54ef9

Alvaro Herrera authored Aug 08, 2017

f5d54ef9

Fix yet another race condition in recovery/t/001_stream_rep.pl. · 4576a693

Tom Lane authored Aug 08, 2017

In commit 5c77690f, we added polling in front of most of the
get_slot_xmins calls in 001_stream_rep.pl, but today's results from
buildfarm member nightjar show that at least one more poll loop
is needed.

Proactively add a poll loop before the next-to-last get_slot_xmins call
as well.  It may be that there is no race condition there because the
standby_2 server is shut down at that point, but I'm quite tired of
fighting with this test script.  The empirical evidence that it's safe,
from the buildfarm, is no stronger than the evidence for the other
call that nightjar just proved unsafe.

The only remaining get_slot_xmins calls without wait_slot_xmins
protection are the first two, which should be OK since nothing has
happened at that point.  It's tempting to ignore that special case
and merge get_slot_xmins and wait_slot_xmins into a single function.
I didn't go that far though.

Discussion: https://postgr.es/m/18436.1502228036@sss.pgh.pa.us

4576a693

Fix replication origin-related race conditions · b2c95a37

Alvaro Herrera authored Aug 08, 2017

Similar to what was fixed in commit 9915de6c for replication slots,
but this time it's related to replication origins: DROP SUBSCRIPTION
attempts to drop the replication origin, but that fails if the
replication worker process hasn't yet marked it unused. This causes
failures in the buildfarm:
ERROR: could not drop replication origin with OID 1, in use by PID 34069

Like the aforementioned commit, fix by having the process running DROP
SUBSCRIPTION sleep until the worker marks the replication origin
struct as free. This uses a condition variable on each replication
origin shmem state struct, so that the session trying to drop can sleep
and expect to be awakened by the process keeping the origin open.

Also fix SGML markup in the previous commit.

Discussion: https://postgr.es/m/20170808001433.rozlseaf4m2wkw3n@alvherre.pgsql

b2c95a37

Fix inadequacies in recently added wait events · 030273b7

Alvaro Herrera authored Aug 08, 2017

In commit 9915de6c, we introduced a new wait point for replication
slots and incorrectly labelled it as wait event PG_WAIT_LOCK.  That's
wrong, so invent an appropriate new wait event instead, and document it
properly.

While at it, fix numerous other problems in the vicinity:
- two different walreceiver wait events were being mixed up in a single
  wait event (which wasn't documented either); split it out so that they
  can be distinguished, and document the new events properly.

- ParallelBitmapPopulate was documented but didn't exist.

- ParallelBitmapScan was not documented (I think this should be called
  "ParallelBitmapScanInit" instead.)

- Logical replication wait events weren't documented

- various symbols had been added in dartboard order in various places.
  Put them in alphabetical order instead, as was originally intended.

Discussion: https://postgr.es/m/20170808181131.mu4fjepuh5m75cyq@alvherre.pgsql

030273b7

Disclaim xmltable() support for non-UTF8 databases. · b4a2eea0

Noah Misch authored Aug 07, 2017

The xmltable() implementation mirrors xpath(), including its lack of
character encoding awareness.

b4a2eea0

07 Aug, 2017 5 commits

Stamp 10beta3. · 8d644237

Tom Lane authored Aug 07, 2017

8d644237

Skip test for IPC::Run if user is overriding our search for PROVE. · 8014d2af

Tom Lane authored Aug 07, 2017

The check for IPC::Run we added in commit c254970a is useful in simple
cases, but there are real use-cases where "prove" is coming from a
different Perl installation than the "perl" we want to use to build.
In such cases asking whether "perl" knows about IPC::Run is irrelevant
and can cause an unnecessary configure failure. Hence, if user has
specified a value for PROVE, skip the IPC::Run check. Per discussion
with Andrew Dunstan.

Discussion: https://postgr.es/m/E1dcE5n-0005Sk-UE@gemulon.postgresql.org

8014d2af

Update SQL features list · cdc47d1f

Peter Eisentraut authored Aug 07, 2017

cdc47d1f

Translation updates · f7668b2b

Peter Eisentraut authored Aug 07, 2017

Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 1a0b5e655d7871506c2b1c7ba562c2de6b6a55de

f7668b2b

Last-minute updates for release notes. · a8b37ebe

Tom Lane authored Aug 07, 2017

Security: CVE-2017-7546, CVE-2017-7547, CVE-2017-7548

a8b37ebe