Commits · 9b3d02c2a9eb93cc4754857361abee449a3fe0cb · Abuhujair Javed / Postgres FD Implementation

12 Dec, 2016 12 commits

Catversion bump for temporary replication slots. · 9b3d02c2
Tom Lane authored Dec 12, 2016
```
Missed in commit a924c327.
Per Fujii Masao.
```
9b3d02c2

Fix race condition in test_decoding "slot" test. · 23f722ba

Tom Lane authored Dec 12, 2016

This test, just added in commit a924c327, sometimes fails because
the old backend hasn't finished dropping the temporary replication slot
when the new backend looks. Borrow the previously-invented methodology
for waiting for the old process to disappear from pg_stat_activity.

Petr Jelinek

Discussion: https://postgr.es/m/62935e6f-4f1b-c433-e0fa-7f936a38b3e5@2ndquadrant.com

23f722ba

doc: Fix purported type of pg_am.amhandler to match reality. · b4630e01
Robert Haas authored Dec 12, 2016
```
Joel Jacobson
```
b4630e01

Make the different Unix-y semaphore implementations ABI-compatible. · be7b2848

Tom Lane authored Dec 12, 2016

Previously, the "sem" field of PGPROC varied in size depending on which
kernel semaphore API we were using. That was okay as long as there was
only one likely choice per platform, but in the wake of commit ecb0d20a,
that assumption seems rather shaky. It doesn't seem out of the question
anymore that an extension compiled against one API choice might be loaded
into a postmaster built with another choice. Moreover, this prevents any
possibility of selecting the semaphore API at postmaster startup, which
might be something we want to do in future.

Hence, change PGPROC.sem to be PGSemaphore (i.e. a pointer) for all Unix
semaphore APIs, and turn the pointed-to data into an opaque struct whose
contents are only known within the responsible modules.

For the SysV and unnamed-POSIX APIs, the pointed-to data has to be
allocated elsewhere in shared memory, which takes a little bit of
rejiggering of the InitShmemAllocation code sequence. (I invented a
ShmemAllocUnlocked() function to make that a little cleaner than it used
to be. That function is not meant for any uses other than the ones it
has now, but it beats having InitShmemAllocation() know explicitly about
allocation of space for semaphores and spinlocks.) This change means an
extra indirection to access the semaphore data, but since we only touch
that when blocking or awakening a process, there shouldn't be any
meaningful performance penalty. Moreover, at least for the unnamed-POSIX
case on Linux, the sem_t type is quite a bit wider than a pointer, so this
reduces sizeof(PGPROC) which seems like a good thing.

For the named-POSIX API, there's effectively no change: the PGPROC.sem
field was and still is a pointer to something returned by sem_open() in
the postmaster's memory space. Document and check the pre-existing
limitation that this case can't work in EXEC_BACKEND mode.

It did not seem worth unifying the Windows semaphore ABI with the Unix
cases, since there's no likelihood of needing ABI compatibility much less
runtime switching across those cases. However, we can simplify the Windows
code a bit if we define PGSemaphore as being directly a HANDLE, rather than
pointer to HANDLE, so let's do that while we're here. (This also ends up
being no change in what's physically stored in PGPROC.sem. We're just
moving the HANDLE fetch from callees to callers.)

It would take a bunch of additional code shuffling to get to the point of
actually choosing a semaphore API at postmaster start, but the effects
of that would now be localized in the port/XXX_sema.c files, so it seems
like fit material for a separate patch. The need for it is unproven as
yet, anyhow, whereas the ABI risk to extensions seems real enough.

Discussion: https://postgr.es/m/4029.1481413370@sss.pgh.pa.us

be7b2848

psql: Fix incorrect version check for table partitining. · 06e18487
Robert Haas authored Dec 12, 2016
```
Table partitioning was added in 10, not 9.6.

Fabrízio de Royes Mello, per report from Jeff Janes
```
06e18487

Fix creative, but unportable, spelling of "ptr != NULL". · 563d575f

Tom Lane authored Dec 12, 2016

Or at least I suppose that's what was really meant here.  But even
aside from the not-per-project-style use of "0" to mean "NULL",
I doubt it's safe to assume that all valid pointers are > NULL.
Per buildfarm member pademelon.

563d575f

Add support for temporary replication slots · a924c327

Peter Eisentraut authored Dec 08, 2016

This allows creating temporary replication slots that are removed
automatically at the end of the session or on error.

From: Petr Jelinek <petr.jelinek@2ndquadrant.com>

a924c327

Refactor the code for verifying user's password. · e7f051b8

Heikki Linnakangas authored Dec 12, 2016

Split md5_crypt_verify() into three functions:
* get_role_password() to fetch user's password from pg_authid, and check
  its expiration.
* md5_crypt_verify() to check an MD5 authentication challenge
* plain_crypt_verify() to check a plaintext password.

get_role_password() will be needed as a separate function by the upcoming
SCRAM authentication patch set. Most of the remaining functionality in
md5_crypt_verify() was different for MD5 and plaintext authentication, so
split that for readability.

While we're at it, simplify the *_crypt_verify functions by using
stack-allocated buffers to hold the temporary MD5 hashes, instead of
pallocing.

Reviewed by Michael Paquier.

Discussion: https://www.postgresql.org/message-id/3029e460-d47c-710e-507e-d8ba759d7cbb@iki.fi

e7f051b8

Further cleanup from the strong-random patch. · 58445c5c

Heikki Linnakangas authored Dec 12, 2016

Also use the new facility for generating RADIUS authenticator requests,
and salt in chkpass extension.

Reword the error messages to be nicer. Fix bogus error code used in the
message in BackendStartup.

58445c5c

Fix pgcrypto compilation with OpenSSL 1.1.0. · 9bbbf029

Heikki Linnakangas authored Dec 12, 2016

Was broken by the switch to using OpenSSL's EVP interface for ciphers, in
commit 5ff4a67f.

Reported by Andres Freund. Fix by Michael Paquier with some kibitzing by me.

Discussion: https://www.postgresql.org/message-id/20161201014826.ic72tfkahmevpwz7@alap3.anarazel.de

9bbbf029

Fix two thinkos related to strong random keys. · 41493bac

Heikki Linnakangas authored Dec 12, 2016

pg_backend_random() is used for MD5 salt generation, but it can fail, and
no checks were done on its status code.

Fix memory leak, if generating a random number for a cancel key failed.

Both issues were spotted by Coverity. Fix by Michael Paquier.

41493bac

Fix broken autoconf test for random number source. · ad365b2f

Heikki Linnakangas authored Dec 12, 2016

Hopefully this fixes buildfarm member jacana.

Discussion: https://www.postgresql.org/message-id/be25aa16-2f06-b7d1-8810-c69489a0e70b@dunslane.net

ad365b2f

11 Dec, 2016 2 commits

Use "%option prefix" to set API names in ecpg's lexer. · 92fb6498

Tom Lane authored Dec 11, 2016

Clean up some technical debt left behind by commit 72b1e3a2: instead of
quickly hacking the name of base_yylex() with a #define, set it properly
with "%option prefix". This causes the names of pgc.l's other exported
symbols to change as well, so run around and modify the outside references
to them as needed. Similarly, make pgc.l's external references to
base_yylval use that variable's true name instead of a macro.

The reason for doing this now is that the quick-hack solution will fail
with future versions of flex, as reported by Дилян Палаузов.
Hence, back-patch into 9.6 where the previous commit appeared, since
it's likely people will build 9.6 with newer flex versions during
its lifetime.

Discussion: https://postgr.es/m/d845c1af-e18d-6651-178f-9f08cdf37e10@aegee.org

92fb6498

Prevent crash when ts_rewrite() replaces a non-top-level subtree with null. · 0eaaaf00

Tom Lane authored Dec 11, 2016

When ts_rewrite()'s replacement argument is an empty tsquery, it's supposed
to simplify any operator nodes whose operand(s) become NULL; but it failed
to do that reliably, because dropvoidsubtree() only examined the top level
of the result tree. Rather than make a second recursive pass, let's just
give the responsibility to dofindsubquery() to simplify while it's doing
the main replacement pass. Per report from Andreas Seltenreich.

Artur Zakirov, with some cosmetic changes by me. Back-patch to all
supported branches.

Discussion: https://postgr.es/m/8737i01dew.fsf@credativ.de

0eaaaf00

09 Dec, 2016 2 commits

Be more careful about Python refcounts while creating exception objects. · 9cda81f0

Tom Lane authored Dec 09, 2016

PLy_generate_spi_exceptions neglected to do Py_INCREF on the new exception
objects, evidently supposing that PyModule_AddObject would do that --- but
it doesn't. This left us in a situation where a Python garbage collection
cycle could result in deletion of exception object(s), causing server
crashes or wrong answers if the exception objects are used later in the
session.

In addition, PLy_generate_spi_exceptions didn't bother to test for
a null result from PyErr_NewException, which at best is inconsistent
with the code in PLy_add_exceptions. And PLy_add_exceptions, while it
did do Py_INCREF on the exceptions it makes, waited to do that till
after some PyModule_AddObject calls, creating a similar risk for
failure if garbage collection happened within those calls.

To fix, refactor to have just one piece of code that creates an
exception object and adds it to the spiexceptions module, bumping the
refcount first.

Also, let's add an additional refcount to represent the pointer we're
going to store in a C global variable or hash table. This should only
matter if the user does something weird like delete the spiexceptions
Python module, but lack of paranoia has caused us enough problems in
PL/Python already.

The fact that PyModule_AddObject doesn't do a Py_INCREF of its own
explains the need for the Py_INCREF added in commit 4c966d92, so we
can improve the comment about that; also, this means we really want
to do that before not after the PyModule_AddObject call.

The missing Py_INCREF in PLy_generate_spi_exceptions was reported and
diagnosed by Rafa de la Torre; the other fixes by me. Back-patch
to all supported branches.

Discussion: https://postgr.es/m/CA+Fz15kR1OXZv43mDrJb3XY+1MuQYWhx5kx3ea6BRKQp6ezGkg@mail.gmail.com

9cda81f0

Fix crasher bug in array_position(s) · a73491e5

Alvaro Herrera authored Dec 09, 2016

array_position and its cousin array_positions were caching the element
type equality function's FmgrInfo without being careful enough to put it
in a long-lived context. This is obviously broken but it didn't matter
in most cases; only when using arrays of records (involving record_eq)
it becomes a problem. The fix is to ensure that the type's equality
function's FmgrInfo is cached in the array_position's flinfo->fn_mcxt
rather than the current memory context.

Apart from record types, the only other case that seems complex enough
to possibly cause the same problem are range types. I didn't find a way
to reproduce the problem with those, so I only include the test case
submitted with the bug report as regression test.

Bug report and patch: Junseok Yang
Discussion: https://postgr.es/m/CAE+byMupUURYiZ6bKYgMZb9pgV1CYAijJGqWj-90W=nS7uEOeA@mail.gmail.com
Backpatch to 9.5, where array_position appeared.

a73491e5

08 Dec, 2016 8 commits

Fix thinko in safeguard for negative availMem. · 64bc26f9

Heikki Linnakangas authored Dec 08, 2016

Also, use pass read_buffer_size * numInputTapes rather than just availMem
to USEMEM, to be neat.

Peter Geoghegan.

64bc26f9

Fix bogus comment. · 01ae881e

Robert Haas authored Dec 08, 2016

Commit 4212cb73 rendered a comment
in execMain.c incorrect.  Per complaint from Tom Lane, repair.

Patch from Amit Kapila, per wording suggested by Tom Lane and me.

01ae881e

Silence compiler warning. · ab4575dc
Robert Haas authored Dec 08, 2016
```
Per report from Stephen Frost.
```
ab4575dc

Log the creation of an init fork unconditionally. · fa0f466d

Robert Haas authored Dec 08, 2016

Previously, it was thought that this only needed to be done for the
benefit of possible standbys, so wal_level = minimal skipped it.
But that's not safe, because during crash recovery we might replay
XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE record which recursively
removes the directory that contains the new init fork.  So log it
always.

The user-visible effect of this bug is that if you create a database
or tablespace, then create an unlogged table, then crash without
checkpointing, then restart, accessing the table will fail, because
the it won't have been properly reset.  This commit fixes that.

Michael Paquier, per a report from Konstantin Knizhnik.  Wording of
the comments per a suggestion from me.

fa0f466d

Fix reporting of column typmods for multi-row VALUES constructs. · 0b78106c

Tom Lane authored Dec 08, 2016

expandRTE() and get_rte_attribute_type() reported the exprType() and
exprTypmod() values of the expressions in the first row of the VALUES as
being the column type/typmod returned by the VALUES RTE. That's fine for
the data type, since we coerce all expressions in a column to have the same
common type. But we don't coerce them to have a common typmod, so it was
possible for rows after the first one to return values that violate the
claimed column typmod. This leads to the incorrect result seen in bug
#14448 from Hassan Mahmood, as well as some other corner-case misbehaviors.

The desired behavior is the same as we use in other type-unification
cases: report the common typmod if there is one, but otherwise return -1
indicating no particular constraint. It's cheap for transformValuesClause
to determine the common typmod while transforming a multi-row VALUES, but
it'd be less cheap for expandRTE() and get_rte_attribute_type() to
re-determine that info every time they're asked --- possibly a lot less
cheap, if the VALUES has many rows. Therefore, the best fix is to record
the common typmods explicitly in a list in the VALUES RTE, as we were
already doing for column collations. This looks quite a bit like what
we're doing for CTE RTEs, so we can save a little bit of space and code by
unifying the representation for those two RTE types. They both now share
coltypes/coltypmods/colcollations fields. (At some point it might seem
desirable to populate those fields for all RTE types; but right now it
looks like constructing them for other RTE types would add more code and
cycles than it would save.)

The RTE change requires a catversion bump, so this fix is only usable
in HEAD. If we fix this at all in the back branches, the patch will
need to look quite different.

Report: https://postgr.es/m/20161205143037.4377.60754@wrigleys.postgresql.org
Discussion: https://postgr.es/m/27429.1480968538@sss.pgh.pa.us

0b78106c

Fix quoting and a compiler warning in dumping partitions. · 2560d244

Heikki Linnakangas authored Dec 08, 2016

Partition name needs to be quoted in the ATTACH PARTITION command
constructed in binary-upgrade mode.

Silence compiler warning about set but unused variable, without
--enable-cassert.

2560d244

Clean up password authentication code a bit. · fe7bdf0b

Heikki Linnakangas authored Dec 08, 2016

Commit fe0a0b59, which moved code to do MD5 authentication to a separate
CheckMD5Auth() function, left behind a comment that really belongs inside
the function, too. Also move the check for db_user_namespace inside the
function, seems clearer that way.

Now that the md5 salt is passed as argument to md5_crypt_verify, it's a bit
silly that it peeks into the Port struct to see if MD5 authentication was
used. Seems more straightforward to treat it as an MD5 authentication, if
the md5 salt argument is given. And after that, md5_crypt_verify only used
the Port argument to look at port->user_name, but that is redundant,
because it is also passed as a separate 'role' argument. So remove the Port
argument altogether.

fe7bdf0b

Fix accounting of memory needed for merge heap. · f7d54f4f

Heikki Linnakangas authored Dec 08, 2016

We allegedly allocated all remaining memory for the read buffers of the
sort tapes, but we allocated the merge heap only after that. That means
that the allocation of the merge heap was guaranteed to go over the memory
limit. Fix by allocating the merge heap first. This makes little difference
in practice, because the merge heap is tiny, but let's tidy.

While we're at it, add a safeguard for the case that we are already over
the limit when allocating the read buffers. That shouldn't happen, but
better safe than sorry.

The memory accounting error was reported off-list by Peter Geoghegan.

f7d54f4f

07 Dec, 2016 10 commits

Replace references to COLLATE "en_CA" with COLLATE "POSIX". · cd5d3af4
Robert Haas authored Dec 07, 2016
```
Another attmempt to fix the tests which were added by commit
f0e44751.
```
cd5d3af4
Replace references to COLLATE "en_US" with COLLATE "C". · 71efd34f
Robert Haas authored Dec 07, 2016
```
Commit f0e44751 is turning the
buildfarm red; let's try something hopefully more portable.
```
71efd34f

Implement table partitioning. · f0e44751

Robert Haas authored Dec 07, 2016

Table partitioning is like table inheritance and reuses much of the
existing infrastructure, but there are some important differences.
The parent is called a partitioned table and is always empty; it may
not have indexes or non-inherited constraints, since those make no
sense for a relation with no data of its own. The children are called
partitions and contain all of the actual data. Each partition has an
implicit partitioning constraint. Multiple inheritance is not
allowed, and partitioning and inheritance can't be mixed. Partitions
can't have extra columns and may not allow nulls unless the parent
does. Tuples inserted into the parent are automatically routed to the
correct partition, so tuple-routing ON INSERT triggers are not needed.
Tuple routing isn't yet supported for partitions which are foreign
tables, and it doesn't handle updates that cross partition boundaries.

Currently, tables can be range-partitioned or list-partitioned. List
partitioning is limited to a single column, but range partitioning can
involve multiple columns. A partitioning "column" can be an
expression.

Because table partitioning is less general than table inheritance, it
is hoped that it will be easier to reason about properties of
partitions, and therefore that this will serve as a better foundation
for a variety of possible optimizations, including query planner
optimizations. The tuple routing based which this patch does based on
the implicit partitioning constraints is an example of this, but it
seems likely that many other useful optimizations are also possible.

Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat,
Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova,
Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.

f0e44751

Restore psql's SIGPIPE setting if popen() fails. · b7e1ae23

Tom Lane authored Dec 07, 2016

Ancient oversight in PageOutput(): if popen() fails, we'd better reset
the SIGPIPE handler before returning stdout, because ClosePager() won't.
Noticed while fixing the empty-PAGER issue.

b7e1ae23

Handle empty or all-blank PAGER setting more sanely in psql. · 18f8f784

Tom Lane authored Dec 07, 2016

If the PAGER environment variable is set but contains an empty string,
psql would pass it to "sh" which would silently exit, causing whatever
query output we were printing to vanish entirely. This is quite
mystifying; it took a long time for us to figure out that this was the
cause of Joseph Brenner's trouble report. Rather than allowing that
to happen, we should treat this as another way to specify "no pager".
(We could alternatively treat it as selecting the default pager, but
it seems more likely that the former is what the user meant to achieve
by setting PAGER this way.)

Nonempty, but all-white-space, PAGER values have the same behavior, and
it's pretty easy to test for that, so let's handle that case the same way.

Most other cases of faulty PAGER values will result in the shell printing
some kind of complaint to stderr, which should be enough to diagnose the
problem, so we don't need to work harder than this. (Note that there's
been an intentional decision not to be very chatty about apparent failure
returns from the pager process, since that may happen if, eg, the user
quits the pager with control-C or some such. I'd just as soon not start
splitting hairs about which exit codes might merit making our own report.)

libpq's old PQprint() function was already on board with ignoring empty
PAGER values, but for consistency, make it ignore all-white-space values
as well.

It's been like this a long time, so back-patch to all supported branches.

Discussion: https://postgr.es/m/CAFfgvXWLOE2novHzYjmQK8-J6TmHz42G8f3X0SORM44+stUGmw@mail.gmail.com

18f8f784

Fix query cancellation. · 81f2e514

Heikki Linnakangas authored Dec 07, 2016

In commit fe0a0b59, the datatype used for MyCancelKey and other variables
that store cancel keys were changed from long to uint32, but I missed this
one. That broke query cancellation on platforms where long is wider than 32
bits.

Report by Andres Freund, fix by Michael Paquier.

81f2e514

Fix whitespace. · 9790b87f
Heikki Linnakangas authored Dec 07, 2016
```
Thomas Munro
```
9790b87f

Silence compiler warnings · d97b14dd

Stephen Frost authored Dec 06, 2016

Rearrange a bit of code to ensure that 'mode' in LWLockRelease is
obviously always set, which seems a bit cleaner and avoids a compiler
warning (thanks to Robert for the suggestion!).

In GetCachedPlan(), initialize 'plan' to silence a compiler warning, but
also add an Assert() to make sure we don't ever actually fall through
with 'plan' still being set to NULL, since we are about to dereference
it.

Neither of these appear to be live bugs but at least gcc
5.4.0-6ubuntu1~16.04.4 doesn't quite have the smarts to realize that.

Discussion: https://www.postgresql.org/message-id/20161129152102.GR13284%40tamriel.snowman.net

d97b14dd

Fix unsafe assumption that struct timeval.tv_sec is a "long". · 0645dacc

Tom Lane authored Dec 06, 2016

It typically is a "long", but it seems possible that on some platforms
it wouldn't be.  In any case, this silences a compiler warning on
OpenBSD (cf buildfarm member curculio).

While at it, use snprintf not sprintf.  This format string couldn't
possibly overrun the supplied buffer, but that doesn't seem like
a good reason not to use the safer style.

Oversight in commit f828654e.  Back-patch to 9.6 where that came in.

0645dacc

Put AC_MSG_RESULT() call in the right place. · c648f058

Tom Lane authored Dec 06, 2016

Thinko in ecb0d20a --- this needs to go one level further out in
the "if" nest. As it stood, nothing got printed in the case of
selecting named POSIX semaphores. Cosmetic issue only, but a bug.

c648f058

06 Dec, 2016 4 commits

Fix interaction of parallel query with prepared statements. · 4212cb73

Robert Haas authored Dec 06, 2016

Previously, a prepared statement created via a Parse message could get
a parallel plan, but one created with a PREPARE statement could not.
This state of affairs was due to confusion on my (rhaas) part: I
erroneously believed that a CREATE TABLE .. AS EXECUTE statement could
only be performed with a prepared statement by PREPARE, but in fact
one created by a Prepare message works just as well.  Therefore, it
makes no sense to allow parallel query in one case but not the other.

To fix, allow parallel query with all prepared statements, but run
the parallel plan serially (i.e. without workers) in the case of
CREATE TABLE .. AS EXECUTE.  Also, document this.

Amit Kapila and Tobias Bussman, plus an extra sentence of
documentation by me.

4212cb73

Bump catversion for restrictive RLS changes · cb9dcbc1
Stephen Frost authored Dec 06, 2016
```
Mea culpa.

Pointed out by Andres.
```
cb9dcbc1

Improve documentation about pg_stat_replication view. · dfe530a0

Fujii Masao authored Dec 06, 2016

Add the descriptions of possible values in "state" and "sync_state" columns
of pg_stat_replication view.

Author: Michael Paquier, slightly modified by me
Discussion: <CAB7nPqT7APWrvPFZrcjKEHoq4=g3z2ErxtTdojSf+sDALzuemA@mail.gmail.com>

dfe530a0

Remove extraneous semicolon from uses of relptr_declare(). · 3ebf2b45

Tom Lane authored Dec 05, 2016

If we're going to write a semicolon after calls of relptr_declare(),
then we don't need one inside the macro, and removing it suppresses
"empty declaration" warnings from pickier compilers (eg pademelon).

While at it, we might as well use relptr() inside relptr_declare(),
because otherwise that macro would likely go unused altogether.

Also improve the comment, which I for one found unclear,
and provide a specific example of intended usage.

3ebf2b45

05 Dec, 2016 2 commits

Fix typo in new message in configure. · 44a977f5
Heikki Linnakangas authored Dec 06, 2016
```
Remove spurious "of", and reformat to fit on a 80 chars wide line.
```
44a977f5

Ensure gatherstate->nextreader is properly initialized. · 53c7cff7

Robert Haas authored Dec 05, 2016

The previously code worked OK as long as a Gather node was never
rescanned, or if it was rescanned, as long as it got at least as
many workers on rescan as it had originally.  But if the number
of workers ever decreased on a rescan, then it could crash.

Andreas Seltenreich

53c7cff7