- 16 Dec, 2016 9 commits
-
-
Robert Haas authored
Commit 3761fe3c should have made this change, but didn't. Reported by Álvaro Herrera.
-
Fujii Masao authored
Previously num_sync could be set to zero, and this setting caused an assertion failure. This means that the multiple-synchronous-standbys code should assume that num_sync is greater than zero. Also, setting num_sync to zero is nonsensical because it's basically the configuration for synchronous replication. If users don't want transaction commits to wait for any standbys, synchronous_standby_names should be emptied to disable synchronous replication instead of setting num_sync to zero.

This patch forbids users from setting num_sync to zero in synchronous_standby_names. If zero is specified, an error is raised while processing the parameter settings.

Back-patch to 9.6 where the multiple synchronous standbys feature was added. Patch by me. Reviewed by Tom Lane. Discussion: <CAHGQGwHWB3izc6cXuFLh5kOcAbFXaRhhgwd-X5PeN9TEjxqXwg@mail.gmail.com>
-
Tom Lane authored
I got frustrated by the lack of commentary in this area, so here is some reverse-engineered documentation, along with minor stylistic cleanup. No code changes more significant than removal of unused variables. Back-patch to 9.6, not because that's useful in itself, but because we have some bugs to fix in phrase search and this would cause merge failures if it's only in HEAD.
-
Robert Haas authored
array_base and array_stride were added so that we could identify the offset of an LWLock within a tranche, but this facility is only very marginally used apart from the main tranche. So, give every lock in the main tranche its own tranche ID and get rid of array_base, array_stride, and everything attached to them. For the debugging facilities (Trace_lwlocks and LWLOCK_STATS), print the pointer address of the LWLock using %p instead of the offset. This is arguably more useful, and certainly a lot cheaper. Drop the offset-within-tranche from the information reported to dtrace and from one can't-happen message inside lwlock.c.

The main user-visible impact of this change is that pg_stat_activity will now report all waits for LWLocks as "LWLock" rather than reporting some as "LWLockTranche" and others as "LWLockNamed".

The main motivation for this change is that the need to specify an array_base and an array_stride is awkward for parallel query. There is only a very limited supply of tranche IDs, so we can't just keep allocating new ones, and if we try to use the same tranche IDs every time then we run into trouble when multiple parallel contexts are in use simultaneously. So if we didn't get rid of this mechanism we'd have to make it even more complicated. By simplifying it in this way, we instead reduce the size of the generated code for lwlock.c by about 5%.

Discussion: http://postgr.es/m/CA+TgmoYsFn6NUW1x0AZtupJGUAs1UDY4dJtCN47_Q6D0sP80PA@mail.gmail.com
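To illustrate the debugging-output change, here is a minimal standalone C sketch (not PostgreSQL code; DemoLock and main_locks are hypothetical names) contrasting an offset derived from an array base and stride with simply printing the lock's address via %p:

```c
#include <stdio.h>
#include <stddef.h>

typedef struct DemoLock { int state; int tranche_id; } DemoLock;

#define NUM_LOCKS 4
static DemoLock main_locks[NUM_LOCKS];

int main(void)
{
    DemoLock *lock = &main_locks[2];

    /* old style: identify the lock by its offset within the tranche,
     * which requires knowing an array base and a stride */
    size_t offset = (size_t) (lock - main_locks);
    printf("offset-based id: %zu\n", offset);

    /* new style: just report the pointer itself with %p --
     * no base/stride bookkeeping, and it works for any allocation */
    printf("pointer id: %p\n", (void *) lock);
    return 0;
}
```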
-
Fujii Masao authored
The description of the effective_io_concurrency option was missing from the ALTER TABLESPACE docs, though it's included in the CREATE TABLESPACE ones. Back-patch to 9.6 where the effective_io_concurrency tablespace option was added. Michael Paquier, reported by Marc-Olaf Jaschke
-
Robert Haas authored
Commit 5dfc1981 introduced the use of a new type of hash table with linear reprobing for hash aggregates. Such a hash table behaves very poorly if keys are inserted in hash order, which does in fact happen in the case where a query uses a Finalize HashAggregate node fed (via Gather) by a Partial HashAggregate node. In fact, queries with this type of plan tend to run effectively forever. Fix that by seeding the hash value differently in each worker (and in the leader, if it participates). Andres Freund and Robert Haas
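A minimal sketch of the seeding idea, using hypothetical names (hash_key, worker_seed) rather than the actual executor code: mixing a per-participant seed into the hash means the leader and each worker compute different hash values for the same keys, so grouped keys no longer stream into the next hash table in that table's own hash order.

```c
#include <stdint.h>
#include <stdio.h>

/* boost-style combiner: folds a seed into an existing hash value */
static uint32_t hash_combine(uint32_t seed, uint32_t h)
{
    seed ^= h + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    return seed;
}

/* each participant perturbs the base hash with its own seed */
static uint32_t hash_key(uint32_t key, uint32_t worker_seed)
{
    uint32_t base = key * 2654435761u;      /* any reasonable base hash */
    return hash_combine(worker_seed, base);
}

int main(void)
{
    /* same key, different workers => different buckets in the worker tables */
    printf("%u %u %u\n", hash_key(42, 0), hash_key(42, 1), hash_key(42, 2));
    return 0;
}
```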
-
Robert Haas authored
In _hash_freeovflpage(), if we're freeing the overflow page that immediately follows the page to which tuples are being moved (the confusingly-named "write buffer"), don't forget to mark that page dirty after updating its hasho_nextblkno. In _hash_squeezebucket(), it's not necessary to mark the primary bucket page dirty if there are no overflow pages, because there's nothing to squeeze in that case. Amit Kapila, with help from Kuntal Ghosh and Dilip Kumar, after an initial trouble report by Jeff Janes.
-
Robert Haas authored
The whole concept of _hash_wrtbuf() is that we need to know at the time we're releasing the buffer lock (and pin) whether we dirtied the buffer, but this is easy to get wrong. This patch actually fixes one non-obvious bug of that form: hashbucketcleanup forgot to signal _hash_squeezebucket, which gets the primary bucket page already locked, as to whether it had already dirtied the page. Calling MarkBufferDirty() at the places where we dirty the buffer is more intuitive and lets us simplify the code in various places as well.

On top of all that, the ultimate goal here is to make hash indexes WAL-logged, and as the comments to _hash_wrtbuf() note, it should go away when that happens. Making it go away a little earlier than that seems like a good preparatory step.

Report by Jeff Janes. Diagnosis by Amit Kapila, Kuntal Ghosh, and Dilip Kumar. Patch by me, after studying an alternative patch submitted by Amit Kapila.

Discussion: http://postgr.es/m/CAA4eK1Kf6tOY0oVz_SEdngiNFkeXrA3xUSDPPORQvsWVPdKqnA@mail.gmail.com
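The design point — marking the buffer dirty at the point of modification instead of threading a "did we dirty it" flag back to whoever releases the lock — can be sketched generically like this (hypothetical DemoBuffer type, not the real buffer manager API):

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct DemoBuffer { int data; bool dirty; } DemoBuffer;

static void mark_dirty(DemoBuffer *buf) { buf->dirty = true; }

/* error-prone style: whoever releases the buffer must be told, correctly,
 * whether anyone dirtied it along the way */
static void release_buffer(DemoBuffer *buf, bool we_dirtied_it)
{
    if (we_dirtied_it)
        mark_dirty(buf);
    /* ... drop lock and pin ... */
}

/* more intuitive style: dirty the buffer where it is modified,
 * so there is no flag to forget to pass along */
static void update_page(DemoBuffer *buf, int v)
{
    buf->data = v;
    mark_dirty(buf);
}

int main(void)
{
    DemoBuffer b = {0, false};
    update_page(&b, 42);
    release_buffer(&b, false);   /* the flag no longer matters */
    printf("data=%d dirty=%d\n", b.data, (int) b.dirty);
    return 0;
}
```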
-
Heikki Linnakangas authored
The calculation didn't take into account the NULL terminator. That led to overwriting the palloc'd buffer by one byte, if the input consists entirely of backslashes. For example "format('%L', E'\\')". Fixes bug #14468. Backpatch to all supported versions. Report: https://www.postgresql.org/message-id/20161216105001.13334.42819%40wrigleys.postgresql.org
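The bug class is the familiar off-by-one in sizing an output buffer for escaping; a minimal generic C sketch (not the actual format('%L') code) of why the terminator byte matters:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Doubling every backslash can double the string length, so the output
 * needs 2*len + 1 bytes; forgetting the +1 overruns by exactly one byte
 * when the input is nothing but backslashes. */
static char *escape_backslashes(const char *in)
{
    size_t len = strlen(in);
    char *out = malloc(2 * len + 1);    /* the +1 is the byte that was missing */
    char *p = out;

    for (const char *s = in; *s; s++)
    {
        if (*s == '\\')
            *p++ = '\\';
        *p++ = *s;
    }
    *p = '\0';                          /* this write is what overran the buffer */
    return out;
}

int main(void)
{
    char *e = escape_backslashes("\\\\");   /* two backslashes in, four out */
    printf("%s\n", e);
    free(e);
    return 0;
}
```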
-
- 15 Dec, 2016 3 commits
-
-
Magnus Hagander authored
-
Peter Eisentraut authored
Fix the tests on slow machines (per buildfarm). Add test for dropping on error. And also try to consume real changes from temporary slots. From: Petr Jelinek <petr.jelinek@2ndquadrant.com>
- 13 Dec, 2016 7 commits
-
-
Tom Lane authored
There's no good reason why plpgsql's GET DIAGNOSTICS statement can't support an array element as target variable, since the execution code already uses the generic exec_assign_value() function to assign to it. Hence, refactor the grammar to allow that, by making getdiag_target depend on the assign_var production.

Ideally we'd also let a cursor_variable expand to an element of a refcursor[] array, but that's substantially harder since those statements also have to handle bound-cursor-variable cases. For now, just make sure the reported error is sensible, ie "cursor variable must be a simple variable" not "variable must be of type cursor or refcursor". The latter was quite confusing from the user's viewpoint, since what he wrote satisfies the claimed restriction.

Per bug #14463 from Zhou Digoal. Given the lack of previous complaints, I see no need for a back-patch.

Discussion: https://postgr.es/m/20161213152548.14897.81245@wrigleys.postgresql.org
-
Tom Lane authored
The existing tests in preprocess_minmax_aggregates() usually prevent it from trying to do anything with queries containing CTEs, but there's an exception: a CTE could be present as a member of an appendrel, if we flattened a UNION ALL that contains CTE references. If it did try to generate an optimized path for a query using a CTE, it failed with "could not find plan for CTE", as reported by Torsten Förtsch.

The proximate cause is an unwise decision in commit 3fc6e2d7 to clear subroot->cte_plan_ids in build_minmax_path(). That left the subroot's cte_plan_ids list out of step with its parse->cteList.

Removing the "subroot->cte_plan_ids = NIL;" assignment is enough to let the case work again, but really it's pretty silly to be expending any cycles at all in this module when there are CTEs: we always treat their outputs as unordered so there's no way for the optimization to win. Hence, also add an early-exit test so we don't waste time like that.

Back-patch to 9.6 where the misbehavior was introduced.

Report: https://postgr.es/m/CAKkG4_=gjY5QiHtqSZyWMwDuTd_CftKoTaCqxjJ7uUz1-Gw=qw@mail.gmail.com
-
Robert Haas authored
Commit 6d46f478 failed to account for the possibility that hashbulkdelete() might encounter a bucket that has been split since it began scanning the bucket array. Repair. Extracted from a larger patch by Amit Kapila; I rewrote the comment.
-
Robert Haas authored
The previous coding was not quite right for cases involving multiple levels of partitioning. Amit Langote
-
Robert Haas authored
Amit Langote, plus pgindent-ing by me. Inspired in part by review comments from Tomas Vondra.
-
Robert Haas authored
So developers can more easily run pgindent locally
-
Robert Haas authored
Commit f0e44751 implemented table partitioning, but failed to mention the "no row movement" restriction in the documentation. Fix that and a few other issues. Amit Langote, with some additional wordsmithing by me.
-
- 12 Dec, 2016 13 commits
-
-
Robert Haas authored
Since commit e94568ec, the answer is always "false", and we do not need to complicate the API by arranging to return a constant value. Peter Geoghegan Discussion: http://postgr.es/m/CAM3SWZQWZZ_N=DmmL7tKy_OUjGH_5mN=N=A6h7kHyyDvEhg2DA@mail.gmail.com
-
Tom Lane authored
This test, just added in commit a924c327, sometimes fails because the old backend hasn't finished dropping the temporary replication slot when the new backend looks. Borrow the previously-invented methodology for waiting for the old process to disappear from pg_stat_activity. Petr Jelinek Discussion: https://postgr.es/m/62935e6f-4f1b-c433-e0fa-7f936a38b3e5@2ndquadrant.com
-
Robert Haas authored
Joel Jacobson
-
Tom Lane authored
Previously, the "sem" field of PGPROC varied in size depending on which kernel semaphore API we were using. That was okay as long as there was only one likely choice per platform, but in the wake of commit ecb0d20a, that assumption seems rather shaky. It doesn't seem out of the question anymore that an extension compiled against one API choice might be loaded into a postmaster built with another choice. Moreover, this prevents any possibility of selecting the semaphore API at postmaster startup, which might be something we want to do in future. Hence, change PGPROC.sem to be PGSemaphore (i.e. a pointer) for all Unix semaphore APIs, and turn the pointed-to data into an opaque struct whose contents are only known within the responsible modules. For the SysV and unnamed-POSIX APIs, the pointed-to data has to be allocated elsewhere in shared memory, which takes a little bit of rejiggering of the InitShmemAllocation code sequence. (I invented a ShmemAllocUnlocked() function to make that a little cleaner than it used to be. That function is not meant for any uses other than the ones it has now, but it beats having InitShmemAllocation() know explicitly about allocation of space for semaphores and spinlocks.) This change means an extra indirection to access the semaphore data, but since we only touch that when blocking or awakening a process, there shouldn't be any meaningful performance penalty. Moreover, at least for the unnamed-POSIX case on Linux, the sem_t type is quite a bit wider than a pointer, so this reduces sizeof(PGPROC) which seems like a good thing. For the named-POSIX API, there's effectively no change: the PGPROC.sem field was and still is a pointer to something returned by sem_open() in the postmaster's memory space. Document and check the pre-existing limitation that this case can't work in EXEC_BACKEND mode. It did not seem worth unifying the Windows semaphore ABI with the Unix cases, since there's no likelihood of needing ABI compatibility much less runtime switching across those cases. However, we can simplify the Windows code a bit if we define PGSemaphore as being directly a HANDLE, rather than pointer to HANDLE, so let's do that while we're here. (This also ends up being no change in what's physically stored in PGPROC.sem. We're just moving the HANDLE fetch from callees to callers.) It would take a bunch of additional code shuffling to get to the point of actually choosing a semaphore API at postmaster start, but the effects of that would now be localized in the port/XXX_sema.c files, so it seems like fit material for a separate patch. The need for it is unproven as yet, anyhow, whereas the ABI risk to extensions seems real enough. Discussion: https://postgr.es/m/4029.1481413370@sss.pgh.pa.us
-
Robert Haas authored
Table partitioning was added in 10, not 9.6. Fabrízio de Royes Mello, per report from Jeff Janes
-
Tom Lane authored
Or at least I suppose that's what was really meant here. But even aside from the not-per-project-style use of "0" to mean "NULL", I doubt it's safe to assume that all valid pointers are > NULL. Per buildfarm member pademelon.
-
Peter Eisentraut authored
This allows creating temporary replication slots that are removed automatically at the end of the session or on error. From: Petr Jelinek <petr.jelinek@2ndquadrant.com>
-
Heikki Linnakangas authored
Split md5_crypt_verify() into three functions:

* get_role_password() to fetch user's password from pg_authid, and check its expiration.
* md5_crypt_verify() to check an MD5 authentication challenge
* plain_crypt_verify() to check a plaintext password.

get_role_password() will be needed as a separate function by the upcoming SCRAM authentication patch set. Most of the remaining functionality in md5_crypt_verify() was different for MD5 and plaintext authentication, so split that for readability.

While we're at it, simplify the *_crypt_verify functions by using stack-allocated buffers to hold the temporary MD5 hashes, instead of pallocing.

Reviewed by Michael Paquier. Discussion: https://www.postgresql.org/message-id/3029e460-d47c-710e-507e-d8ba759d7cbb@iki.fi
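The stack-buffer point is simply that a hex digest of known, fixed length needs no heap allocation; a standalone sketch with a toy stand-in digest (not real MD5, and verify_password is a hypothetical name, not the backend's API):

```c
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

#define HEX_DIGEST_LEN 32

/* toy stand-in for a real 32-character hex digest such as MD5 */
static void toy_digest(const char *input, char out[HEX_DIGEST_LEN + 1])
{
    unsigned int h = 2166136261u;
    for (const char *p = input; *p; p++)
        h = (h ^ (unsigned char) *p) * 16777619u;
    snprintf(out, HEX_DIGEST_LEN + 1, "%032x", h);
}

static bool verify_password(const char *stored_digest, const char *password)
{
    char buf[HEX_DIGEST_LEN + 1];   /* stack buffer instead of a heap allocation */

    toy_digest(password, buf);
    return strcmp(buf, stored_digest) == 0;
}

int main(void)
{
    char stored[HEX_DIGEST_LEN + 1];
    toy_digest("secret", stored);
    printf("%d\n", verify_password(stored, "secret"));
    return 0;
}
```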
-
Heikki Linnakangas authored
Also use the new facility for generating RADIUS authenticator requests, and salt in chkpass extension. Reword the error messages to be nicer. Fix bogus error code used in the message in BackendStartup.
-
Heikki Linnakangas authored
Was broken by the switch to using OpenSSL's EVP interface for ciphers, in commit 5ff4a67f. Reported by Andres Freund. Fix by Michael Paquier with some kibitzing by me. Discussion: https://www.postgresql.org/message-id/20161201014826.ic72tfkahmevpwz7@alap3.anarazel.de
-
Heikki Linnakangas authored
pg_backend_random() is used for MD5 salt generation, but it can fail, and no checks were done on its status code. Also fix a memory leak when generating a random number for a cancel key fails. Both issues were spotted by Coverity. Fix by Michael Paquier.
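A generic sketch of the two fixes — checking the random source's status and releasing partial state on the error path (get_random_bytes and make_cancel_key are hypothetical stand-ins, not the backend's functions):

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define KEY_LEN 4

/* hypothetical random source; may legitimately fail */
static bool get_random_bytes(unsigned char *buf, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return false;
    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return got == len;
}

static unsigned char *make_cancel_key(void)
{
    unsigned char *key = malloc(KEY_LEN);
    if (key == NULL)
        return NULL;
    if (!get_random_bytes(key, KEY_LEN))   /* check the status code */
    {
        free(key);                         /* don't leak on the error path */
        return NULL;
    }
    return key;
}

int main(void)
{
    unsigned char *k = make_cancel_key();
    printf("%s\n", k ? "ok" : "failed");
    free(k);
    return 0;
}
```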
-
Heikki Linnakangas authored
Hopefully this fixes buildfarm member jacana. Discussion: https://www.postgresql.org/message-id/be25aa16-2f06-b7d1-8810-c69489a0e70b@dunslane.net
-
- 11 Dec, 2016 2 commits
-
-
Tom Lane authored
Clean up some technical debt left behind by commit 72b1e3a2: instead of quickly hacking the name of base_yylex() with a #define, set it properly with "%option prefix". This causes the names of pgc.l's other exported symbols to change as well, so run around and modify the outside references to them as needed. Similarly, make pgc.l's external references to base_yylval use that variable's true name instead of a macro. The reason for doing this now is that the quick-hack solution will fail with future versions of flex, as reported by Дилян Палаузов. Hence, back-patch into 9.6 where the previous commit appeared, since it's likely people will build 9.6 with newer flex versions during its lifetime. Discussion: https://postgr.es/m/d845c1af-e18d-6651-178f-9f08cdf37e10@aegee.org
-
Tom Lane authored
When ts_rewrite()'s replacement argument is an empty tsquery, it's supposed to simplify any operator nodes whose operand(s) become NULL; but it failed to do that reliably, because dropvoidsubtree() only examined the top level of the result tree. Rather than make a second recursive pass, let's just give the responsibility to dofindsubquery() to simplify while it's doing the main replacement pass. Per report from Andreas Seltenreich. Artur Zakirov, with some cosmetic changes by me. Back-patch to all supported branches. Discussion: https://postgr.es/m/8737i01dew.fsf@credativ.de
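The shape of the fix — simplifying operator nodes whose operands vanish while still inside the recursive replacement pass, rather than with a separate top-level-only sweep — looks roughly like this generic tree sketch (hypothetical Node type, not the tsquery structures):

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct Node
{
    char op;                /* '&' or '|' for operators, 0 for a leaf */
    struct Node *left;
    struct Node *right;
    int value;
} Node;

static Node *leaf(int v)
{
    Node *n = calloc(1, sizeof(Node));
    n->value = v;
    return n;
}

static Node *opnode(char op, Node *l, Node *r)
{
    Node *n = calloc(1, sizeof(Node));
    n->op = op;
    n->left = l;
    n->right = r;
    return n;
}

/* Returns NULL when a subtree simplifies away entirely.  The key point is
 * that collapsing happens at every level of the recursion, not just the top. */
static Node *simplify(Node *n)
{
    if (n == NULL || n->op == 0)
        return n;

    n->left = simplify(n->left);
    n->right = simplify(n->right);

    if (n->left == NULL && n->right == NULL)
        return NULL;            /* both operands gone: drop the operator */
    if (n->left == NULL)
        return n->right;        /* one operand gone: collapse to the other */
    if (n->right == NULL)
        return n->left;
    return n;
}

int main(void)
{
    /* (NULL & 1) | NULL should simplify to the single leaf 1 */
    Node *tree = opnode('|', opnode('&', NULL, leaf(1)), NULL);
    Node *result = simplify(tree);
    printf("%d\n", result ? result->value : -1);
    return 0;
}
```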
-
- 09 Dec, 2016 2 commits
-
-
Tom Lane authored
PLy_generate_spi_exceptions neglected to do Py_INCREF on the new exception objects, evidently supposing that PyModule_AddObject would do that --- but it doesn't. This left us in a situation where a Python garbage collection cycle could result in deletion of exception object(s), causing server crashes or wrong answers if the exception objects are used later in the session.

In addition, PLy_generate_spi_exceptions didn't bother to test for a null result from PyErr_NewException, which at best is inconsistent with the code in PLy_add_exceptions. And PLy_add_exceptions, while it did do Py_INCREF on the exceptions it makes, waited to do that till after some PyModule_AddObject calls, creating a similar risk for failure if garbage collection happened within those calls.

To fix, refactor to have just one piece of code that creates an exception object and adds it to the spiexceptions module, bumping the refcount first. Also, let's add an additional refcount to represent the pointer we're going to store in a C global variable or hash table. This should only matter if the user does something weird like delete the spiexceptions Python module, but lack of paranoia has caused us enough problems in PL/Python already.

The fact that PyModule_AddObject doesn't do a Py_INCREF of its own explains the need for the Py_INCREF added in commit 4c966d92, so we can improve the comment about that; also, this means we really want to do that before not after the PyModule_AddObject call.

The missing Py_INCREF in PLy_generate_spi_exceptions was reported and diagnosed by Rafa de la Torre; the other fixes by me. Back-patch to all supported branches.

Discussion: https://postgr.es/m/CA+Fz15kR1OXZv43mDrJb3XY+1MuQYWhx5kx3ea6BRKQp6ezGkg@mail.gmail.com
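A minimal C fragment sketching the corrected pattern against the CPython API (the module-relative names and the saved_exc global are made up for this sketch, not PL/Python's code): create the exception, check for NULL, take the extra reference for the C-side pointer before PyModule_AddObject (which steals a reference only on success), and clean up on failure.

```c
/* A fragment for a hypothetical extension module; requires Python.h. */
#include <Python.h>

static PyObject *saved_exc;         /* long-lived C-side pointer to the exception */

static int
add_exception_to_module(PyObject *module, const char *name)
{
    /* "demo.demo_error" is a made-up dotted name for this sketch */
    PyObject *exc = PyErr_NewException("demo.demo_error", NULL, NULL);

    if (exc == NULL)
        return -1;                  /* don't assume creation succeeded */

    Py_INCREF(exc);                 /* extra reference for our C global */
    saved_exc = exc;

    /* PyModule_AddObject steals one reference, but only on success */
    if (PyModule_AddObject(module, name, exc) < 0)
    {
        Py_DECREF(exc);             /* the reference the module didn't take */
        Py_DECREF(exc);             /* give up the C-global reference too */
        saved_exc = NULL;
        return -1;
    }
    return 0;
}
```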
-
Alvaro Herrera authored
array_position and its cousin array_positions were caching the element type equality function's FmgrInfo without being careful enough to put it in a long-lived context. This is obviously broken, but it didn't matter in most cases; only when using arrays of records (involving record_eq) does it become a problem. The fix is to ensure that the type's equality function's FmgrInfo is cached in array_position's flinfo->fn_mcxt rather than the current memory context.

Apart from record types, the only other case that seems complex enough to possibly cause the same problem is range types. I didn't find a way to reproduce the problem with those, so I only include the test case submitted with the bug report as a regression test.

Bug report and patch: Junseok Yang Discussion: https://postgr.es/m/CAE+byMupUURYiZ6bKYgMZb9pgV1CYAijJGqWj-90W=nS7uEOeA@mail.gmail.com Backpatch to 9.5, where array_position appeared.
-
- 08 Dec, 2016 4 commits
-
-
Heikki Linnakangas authored
Also, pass read_buffer_size * numInputTapes rather than just availMem to USEMEM, to be neat. Peter Geoghegan.
-
Robert Haas authored
Commit 4212cb73 rendered a comment in execMain.c incorrect. Per complaint from Tom Lane, repair. Patch from Amit Kapila, per wording suggested by Tom Lane and me.
-
Robert Haas authored
Per report from Stephen Frost.
-
Robert Haas authored
Previously, it was thought that this only needed to be done for the benefit of possible standbys, so wal_level = minimal skipped it. But that's not safe, because during crash recovery we might replay an XLOG_DBASE_CREATE or XLOG_TBLSPC_CREATE record which recursively removes the directory that contains the new init fork. So log it always.

The user-visible effect of this bug is that if you create a database or tablespace, then create an unlogged table, then crash without checkpointing, then restart, accessing the table will fail, because it won't have been properly reset. This commit fixes that.

Michael Paquier, per a report from Konstantin Knizhnik. Wording of the comments per a suggestion from me.
-