Commits · aced5a92bf46532466417ab485bc94006cf60d91 · Abuhujair Javed / Postgres FD Implementation

06 Jan, 2018 1 commit

Rewrite ConditionVariableBroadcast() to avoid live-lock. · aced5a92

Tom Lane authored Jan 05, 2018

The original implementation of ConditionVariableBroadcast was, per its
self-description, "the dumbest way possible". Thomas Munro found out
it was a bit too dumb. An awakened process may immediately re-queue
itself, if the specific condition it's waiting for is not yet satisfied.
If this happens before ConditionVariableBroadcast is able to see the wait
queue as empty, then ConditionVariableBroadcast will re-awaken the same
process, repeating the cycle. Given unlucky timing this back-and-forth
can repeat indefinitely; loops lasting thousands of seconds have been
seen in testing.

To fix, add our own process to the end of the wait queue to serve as a
sentinel, and exit the broadcast loop once our process is not there
anymore. There are various special considerations described in the
comments, the principal disadvantage being that wakers can no longer
be sure whether they awakened a real waiter or just a sentinel. But in
practice nobody pays attention to the result of ConditionVariableSignal
or ConditionVariableBroadcast anyway, so that problem seems hypothetical.

Back-patch to v10 where condition_variable.c was introduced.

Tom Lane and Thomas Munro

Discussion: https://postgr.es/m/CAEepm=0NWKehYw7NDoUSf8juuKOPRnCyY3vuaSvhrEWsOTAa3w@mail.gmail.com

aced5a92

05 Jan, 2018 6 commits

Factor error generation out of ExecPartitionCheck. · 19c47e7c

Robert Haas authored Jan 05, 2018

At present, we always raise an ERROR if the partition constraint
is violated, but a pending patch for UPDATE tuple routing will
consider instead moving the tuple to the correct partition.
Refactor to make that simpler.

Amit Khandekar, reviewed by Amit Langote, David Rowley, and me.

Discussion: http://postgr.es/m/CAJ3gD9cue54GbEzfV-61nyGpijvjZgCcghvLsB0_nL8Nm8HzCA@mail.gmail.com

19c47e7c

pg_upgrade: remove C comment · 84a6f63e
Bruce Momjian authored Jan 05, 2018
```
Revert another part of 959ee6d2 .

Backpatch-through: 10
```
84a6f63e
pg_upgrade: revert part of patch for ease of translation · 3e6f01fd
Bruce Momjian authored Jan 05, 2018
```
Revert part of 959ee6d2 .

Backpatch-through: 10
```
3e6f01fd
pg_upgrade: simplify code layout in a few places · 959ee6d2
Bruce Momjian authored Jan 05, 2018
```
Backpatch-through: 9.4 (9.3 didn't need improving)
```
959ee6d2

Fix failure to delete spill files of aborted transactions · df9f682c

Alvaro Herrera authored Jan 05, 2018

Logical decoding's reorderbuffer.c may spill transaction files to disk
when transactions are large. These are supposed to be removed when they
become "too old" by xid; but file removal requires the boundary LSNs of
the transaction to be known. The final_lsn is only set when we see the
commit or abort record for the transaction, but nothing sets the value
for transactions that crash, so the removal code misbehaves -- in
assertion-enabled builds, it crashes by a failed assertion.

To fix, modify the final_lsn of transactions that don't have a value
set, to the LSN of the very latest change in the transaction. This
causes the spilled files to be removed appropriately.

Author: Atsushi Torikoshi
Reviewed-by: Kyotaro HORIGUCHI, Craig Ringer, Masahiko Sawada
Discussion: https://postgr.es/m/54e4e488-186b-a056-6628-50628e4e4ebc@lab.ntt.co.jp

df9f682c

Another attempt at fixing build with various OpenSSL versions · 054e8c6c

Peter Eisentraut authored Jan 04, 2018

It seems we can't easily work around the lack of
X509_get_signature_nid(), so revert the previous attempts and just
disable the tls-server-end-point feature if we don't have it.

054e8c6c

04 Jan, 2018 11 commits

Add missing includes · 1834c1e4

Peter Eisentraut authored Jan 04, 2018

<openssl/x509.h> is necessary to look into the X509 struct, used by
ac3ff8b1.

1834c1e4

Minor preparatory refactoring for UPDATE row movement. · ef6087ee

Robert Haas authored Jan 04, 2018

Generalize is_partition_attr to has_partition_attrs and make it
accessible from outside tablecmds.c. Change map_partition_varattnos
to clarify that it can be used for mapping between any two relations
in a partitioning hierarchy, not just parent -> child.

Amit Khandekar, reviewed by Amit Langote, David Rowley, and me.
Some comment changes by me.

Discussion: http://postgr.es/m/CAJ3gD9fWfxgKC+PfJZF3hkgAcNOy-LpfPxVYitDEXKHjeieWQQ@mail.gmail.com

ef6087ee

Fix build with older OpenSSL versions · ac3ff8b1

Peter Eisentraut authored Jan 04, 2018

Apparently, X509_get_signature_nid() is only in fairly new OpenSSL
versions, so use the lower-level interface it is built on instead.

ac3ff8b1

Fix new test case to not be endian-dependent. · 18869e20

Tom Lane authored Jan 04, 2018

Per buildfarm.

Discussion: https://postgr.es/m/ec295792-a69f-350f-6287-25a20e8f31d5@gmail.com

18869e20

Simplify and encapsulate tuple routing support code. · cc6337d2

Robert Haas authored Jan 04, 2018

Instead of having ExecSetupPartitionTupleRouting return multiple out
parameters, have it return a pointer to a structure containing all of
those different things. Also, provide and use a cleanup function,
ExecCleanupTupleRouting, instead of cleaning up all of the resources
allocated by ExecSetupPartitionTupleRouting individually.

Amit Khandekar, reviewed by Amit Langote, David Rowley, and me

Discussion: http://postgr.es/m/CAJ3gD9fWfxgKC+PfJZF3hkgAcNOy-LpfPxVYitDEXKHjeieWQQ@mail.gmail.com

cc6337d2

Implement channel binding tls-server-end-point for SCRAM · d3fb72ea

Peter Eisentraut authored Jan 04, 2018

This adds a second standard channel binding type for SCRAM.  It is
mainly intended for third-party clients that cannot implement
tls-unique, for example JDBC.

Author: Michael Paquier <michael.paquier@gmail.com>

d3fb72ea

Fix incorrect computations of length of null bitmap in pageinspect. · 39cfe861

Tom Lane authored Jan 04, 2018

Instead of using our standard macro for this calculation, this code
did it itself ... and got it wrong, leading to incorrect display of
the null bitmap in some cases.  Noted and fixed by Maksim Milyutin.

In passing, remove a uselessly duplicative error check.

Errors were introduced in commit d6061f83; back-patch to 9.6
where that came in.

Maksim Milyutin, reviewed by Andrey Borodin

Discussion: https://postgr.es/m/ec295792-a69f-350f-6287-25a20e8f31d5@gmail.com

39cfe861

Refactor channel binding code to fetch cbind_data only when necessary · f3049a60

Peter Eisentraut authored Jan 04, 2018

As things stand now, channel binding data is fetched from OpenSSL and
saved into the SCRAM exchange context for any SSL connection attempted
for a SCRAM authentication, resulting in data fetched but not used if no
channel binding is used or if a different channel binding type is used
than what the data is here for.

Refactor the code in such a way that binding data is fetched from the
SSL stack only when a specific channel binding is used for both the
frontend and the backend.  In order to achieve that, save the libpq
connection context directly in the SCRAM exchange state, and add a
dependency to SSL in the low-level SCRAM routines.

This makes the interface in charge of initializing the SCRAM context
cleaner as all its data comes from either PGconn* (for frontend) or
Port* (for the backend).

Author: Michael Paquier <michael.paquier@gmail.com>

f3049a60

Define LDAPS_PORT if it's missing and disable implicit LDAPS on Windows · 3ad2afc2

Peter Eisentraut authored Jan 04, 2018

Some versions of Windows don't define LDAPS_PORT.

Also, Windows' ldap_sslinit() is documented to use LDAPS even if you
said secure=0 when the port number happens to be 636 or 3269. Let's
avoid using the port number to imply that you want LDAPS, so that
connection strings have the same meaning on Windows and Unix.

Author: Thomas Munro
Discussion: https://postgr.es/m/CAEepm%3D23B7GV4AUz3MYH1TKpTv030VHxD2Sn%2BLYWDv8d-qWxww%40mail.gmail.com

3ad2afc2

Code review for Parallel Append. · c7593956

Robert Haas authored Jan 04, 2018

- Remove unnecessary #include mistakenly added in execnodes.h.
- Fix mistake in comment in choose_next_subplan_for_leader.
- Adjust row estimates in cost_append for a possibly-different
  parallel divisor.
- Clamp row estimates in cost_append after operations that may
  not produce integers.

Amit Kapila, with cosmetic adjustments by me.

Discussion: http://postgr.es/m/CAA4eK1+qcbeai3coPpRW=GFCzFeLUsuY4T-AKHqMjxpEGZBPQg@mail.gmail.com

c7593956

Tweak parallel hash join test case in hopes of improving stability. · 934c7986

Tom Lane authored Jan 04, 2018

This seems to make things better on gaur, let's see what the rest
of the buildfarm thinks.

Thomas Munro

Discussion: https://postgr.es/m/CAEepm=1uuT8iJxMEsR=jL+3zEi87DB2v0+0H9o_rUXXCZPZT3A@mail.gmail.com

934c7986

03 Jan, 2018 13 commits

Clean up tupdesc.c for recent changes. · 47c6772e

Tom Lane authored Jan 03, 2018

TupleDescCopy needs to have the same effects as CreateTupleDescCopy in
that, since it doesn't copy constraints, it should clear the per-attribute
fields associated with them.  Oversight in commit cc5f8136.

Since TupleDescCopy has already established the presumption that it
can just flat-copy the entire attribute array in one go, propagate
that approach into CreateTupleDescCopy and CreateTupleDescCopyConstr.
(I'm suspicious that this would lead to valgrind complaints if we
had any trailing padding in the struct, but we do not, and anyway
fixing that seems like a job for a separate commit.)

Add some better comments.

Thomas Munro, reviewed by Vik Fearing, some additional hacking by me

Discussion: https://postgr.es/m/CAEepm=0NvOGZ8B6GbQyQe2C_c2m3LKJ9w=8OMBaYRLgZ_Gw6Nw@mail.gmail.com

47c6772e

Fix typo · bab29698

Alvaro Herrera authored Jan 03, 2018

Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/d8jefpk4jtd.fsf@dalvik.ping.uio.no

bab29698

Revert "Fix isolation test to be less timing-dependent" · 6c8be596

Alvaro Herrera authored Jan 03, 2018

This reverts commit 2268e6af. It turned out that inconsistency in
the report is still possible, so go back to the simpler formulation of
the test and instead add an alternate expected output.

Discussion: https://postgr.es/m/20180103193728.ysqpcp2xjnqpiep7@alvherre.pgsql

6c8be596

Rename pg_rewind's copy_file_range() to avoid conflict with new linux syscall. · 3e68686e

Andres Freund authored Jan 03, 2018

Upcoming versions of glibc will contain copy_file_range(2), a wrapper
around a new linux syscall for in-kernel copying of data ranges. This
conflicts with pg_rewinds function of the same name.

Therefore rename pg_rewinds version. As our version isn't a generic
copying facility we decided to choose a rewind specific function name.

Per buildfarm animal caiman and subsequent discussion with Tom Lane.

Author: Andres Freund
Discussion:
    https://postgr.es/m/20180103033425.w7jkljth3e26sduc@alap3.anarazel.de
    https://postgr.es/m/31122.1514951044@sss.pgh.pa.us
Backpatch: 9.5-, where pg_rewind was introduced

3e68686e

Fix use of config-specific libraries for Windows OpenSSL · 99d5a3ff

Andrew Dunstan authored Jan 03, 2018

Commit 614350a3 allowed for an different builds of OpenSSL libraries on
Windows, but ignored the fact that the alternative builds don't have
config-specific libraries. This patch fixes the Solution file to ask for
the correct libraries.

per offline discussions with Leonardo Cecchi and Marco Nenciarini,

Backpatch to all live branches.

99d5a3ff

Make XactLockTableWait work for transactions that are not yet self-locked · 3c27944f

Alvaro Herrera authored Jan 03, 2018

XactLockTableWait assumed that its xid argument has already added itself
to the lock table.  That assumption led to another assumption that if
locking the xid has succeeded but the xid is reported as still in
progress, then the input xid must have been a subtransaction.

These assumptions hold true for the original uses of this code in
locking related to on-disk tuples, but they break down in logical
replication slot snapshot building -- in particular, when a standby
snapshot logged contains an xid that's already in ProcArray but not yet
in the lock table.  This leads to assertion failures that can be
reproduced all the way back to 9.4, when logical decoding was
introduced.

To fix, change SubTransGetParent to SubTransGetTopmostTransaction which
has a slightly different API: it returns the argument Xid if there is no
parent, and it goes all the way to the top instead of moving up the
levels one by one.  Also, to avoid busy-waiting, add a 1ms sleep to give
the other process time to register itself in the lock table.

For consistency, change ConditionalXactLockTableWait the same way.

Author: Petr Jelínek
Discussion: https://postgr.es/m/1B3E32D8-FCF4-40B4-AEF9-5C0E3AC57969@postgrespro.ru
Reported-by: Konstantin Knizhnik
Diagnosed-by: Stas Kelvich, Petr Jelínek
Reviewed-by: Andres Freund, Robert Haas

3c27944f

Fix some minor errors in new PHJ code. · 6fcde240

Tom Lane authored Jan 03, 2018

Correct ExecParallelHashTuplePrealloc's estimate of whether the
space_allowed limit is exceeded.  Be more consistent about tuples that
are exactly HASH_CHUNK_THRESHOLD in size (they're "small", not "large").
Neither of these things explain the current buildfarm unhappiness, but
they're still bugs.

Thomas Munro, per gripe by me

Discussion: https://postgr.es/m/CAEepm=34PDuR69kfYVhmZPgMdy8pSA-MYbpesEN1SR+2oj3Y+w@mail.gmail.com

6fcde240

Teach eval_const_expressions() to handle some more cases. · 3decd150

Tom Lane authored Jan 03, 2018

Add some infrastructure (mostly macros) to make it easier to write
typical cases for constant-expression simplification. Add simplification
processing for ArrayRef, RowExpr, and ScalarArrayOpExpr node types,
which formerly went unsimplified even if all their inputs were constants.
Also teach it to simplify FieldSelect from a composite constant.
Make use of the new infrastructure to reduce the amount of code needed
for the existing ArrayExpr and ArrayCoerceExpr cases.

One existing test case changes output as a result of the fact that
RowExpr can now be folded to a constant. All the new code is exercised
by existing test cases according to gcov, so I feel no need to add
additional tests.

Tom Lane, reviewed by Dmitry Dolgov

Discussion: https://postgr.es/m/3be3b82c-e29c-b674-2163-bf47d98817b1@iki.fi

3decd150

Allow ldaps when using ldap authentication · 35c0754f

Peter Eisentraut authored Jan 03, 2018

While ldaptls=1 provides an RFC 4513 conforming way to do LDAP
authentication with TLS encryption, there was an earlier de facto
standard way to do LDAP over SSL called LDAPS.  Even though it's not
enshrined in a standard, it's still widely used and sometimes required
by organizations' network policies.  There seems to be no reason not to
support it when available in the client library.  Therefore, add support
when using OpenLDAP 2.4+ or Windows.  It can be configured with
ldapscheme=ldaps or ldapurl=ldaps://...

Add tests for both ways of requesting LDAPS and a test for the
pre-existing ldaptls=1.  Modify the 001_auth.pl test for "diagnostic
messages", which was previously relying on the server rejecting
ldaptls=1.

Author: Thomas Munro
Reviewed-By: Peter Eisentraut
Discussion: https://postgr.es/m/CAEepm=1s+pA-LZUjQ-9GQz0Z4rX_eK=DFXAF1nBQ+ROPimuOYQ@mail.gmail.com

35c0754f

Fix isolation test to be less timing-dependent · 2268e6af

Alvaro Herrera authored Jan 03, 2018

I did this by adding another locking process, which makes the other two
wait.  This way the output should be stable enough.

Per buildfarm and Andres Freund
Discussion: https://postgr.es/m/20180103034445.t3utrtrnrevfsghm@alap3.anarazel.de

2268e6af

Update copyright for 2018 · 9d4649ca
Bruce Momjian authored Jan 02, 2018
```
Backpatch-through: certain files through 9.3
```
9d4649ca

Simplify representation of aggregate transition values a bit. · f9ccf92e

Andres Freund authored Jan 02, 2018

Previously aggregate transition values for hash and other forms of
aggregation (i.e. sort and no group by) were represented
differently. Hash based aggregation used a grouping set indexed array
pointing to an array of transition values, whereas other forms of
aggregation used one flattened array with the index being computed out
of grouping set and transition offsets.

That made upcoming changes hard, so represent both as grouping set
indexed array of per-group data.

As a nice side-effect this also makes aggregation slightly faster,
because computing offsets with `transno + (setno * numTrans)` turns
out not to be that cheap (too big for x86 lea for example).

Author: Andres Freund
Discussion: https://postgr.es/m/20171128003121.nmxbm2ounxzb6n2t@alap3.anarazel.de

f9ccf92e

Ensure proper alignment of tuples in HashMemoryChunkData buffers. · 5dc692f7

Tom Lane authored Jan 02, 2018

The previous coding relied (without any documentation) on the data[]
member of HashMemoryChunkData being at a MAXALIGN'ed offset. If it
was not, the tuples would not be maxaligned either, leading to failures
on alignment-picky machines. While there seems to be no live bug on any
platform we support, this is clearly pretty fragile: any addition to or
rearrangement of the fields in HashMemoryChunkData could break it.
Let's remove the hazard by getting rid of the data[] member and instead
using pointer arithmetic with an explicitly maxalign'ed offset.

Discussion: https://postgr.es/m/14483.1514938129@sss.pgh.pa.us

5dc692f7

02 Jan, 2018 2 commits

Fix deadlock hazard in CREATE INDEX CONCURRENTLY · 54eff531

Alvaro Herrera authored Jan 02, 2018

Multiple sessions doing CREATE INDEX CONCURRENTLY simultaneously are
supposed to be able to work in parallel, as evidenced by fixes in commit
c3d09b3b specifically to support this case.  In reality, one of the
sessions would be aborted by a misterious "deadlock detected" error.

Jeff Janes diagnosed that this is because of leftover snapshots used for
system catalog scans -- this was broken by 8aa3e47510b9 keeping track of
(registering) the catalog snapshot.  To fix the deadlocks, it's enough
to de-register that snapshot prior to waiting.

Backpatch to 9.4, which introduced MVCC catalog scans.

Include an isolationtester spec that 8 out of 10 times reproduces the
deadlock with the unpatched code for me (Álvaro).

Author: Jeff Janes
Diagnosed-by: Jeff Janes
Reported-by: Jeremy Finzel
Discussion: https://postgr.es/m/CAMa1XUhHjCv8Qkx0WOr1Mpm_R4qxN26EibwCrj0Oor2YBUFUTg%40mail.gmail.com

54eff531

Don't cast between GinNullCategory and bool · 43803626

Peter Eisentraut authored Dec 26, 2017

The original idea was that we could use an isNull-style bool array
directly as a GinNullCategory array. However, the existing code already
acknowledges that that doesn't really work, because of the possibility
that bool as currently defined can have arbitrary bit patterns for true
values. So it has to loop through the nullFlags array to set each bool
value to an acceptable value. But if we are looping through the whole
array anyway, we might as well build a proper GinNullCategory array
instead and abandon the type casting. That makes the code much safer in
case bool is ever changed to something else.
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>

43803626

01 Jan, 2018 2 commits

Fix EXPLAIN ANALYZE output for Parallel Hash. · 93ea78b1

Andres Freund authored Jan 01, 2018

In a race case, EXPLAIN ANALYZE could fail to display correct nbatch
and size information.  Refactor so that participants report only on
batches they worked on rather than trying to report on all of them,
and teach explain.c to consider the HashInstrumentation object from
all participants instead of picking the first one it can find.  This
should fix an occasional build farm failure in the "join" regression
test.

Author: Thomas Munro
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/30219.1514428346%40sss.pgh.pa.us

93ea78b1

In tests, await an LSN no later than the recovery target. · 6078770c

Noah Misch authored Dec 31, 2017

Otherwise, the test fails with "Timed out while waiting for standby to
catch up". This happened rarely, perhaps only when autovacuum wrote WAL
between our choosing the recovery target and choosing the LSN to await.
Commit b26f7fa6 fixed one case of this.
Fix two more. Back-patch to 9.6, which introduced the affected test.

Discussion: https://postgr.es/m/20180101055227.GA2952815@rfd.leadboat.com

6078770c

31 Dec, 2017 2 commits

Merge coding of return/exit/continue cases in plpgsql's loop statements. · 3e724aac

Tom Lane authored Dec 31, 2017

plpgsql's five different loop control statements contained three distinct
implementations of the same (or what ought to be the same, at least)
logic for handling return/exit/continue result codes from their child
statements. At best, that's trouble waiting to happen, and there seems
no very good reason for the coding to be so different. Refactor so that
all the common logic is expressed in a single macro.

Discussion: https://postgr.es/m/26314.1514670401@sss.pgh.pa.us

3e724aac

Improve regression tests' code coverage for plpgsql control structures. · dd2243f2

Tom Lane authored Dec 31, 2017

I noticed that our code coverage report showed considerable deficiency
in test coverage for PL/pgSQL control statements. Notably, both
exec_stmt_block and most of the loop control statements had very poor
coverage of handling of return/exit/continue result codes from their
child statements; and exec_stmt_fori was seriously lacking in feature
coverage, having no test that exercised its BY or REVERSE features,
nor verification that its overflow defenses work.

Now that we have some infrastructure for plpgsql-specific test scripts,
the natural thing to do is make a new script rather than further extend
plpgsql.sql. So I created a new script plpgsql_control.sql with the
charter to test plpgsql control structures, and moved a few existing
tests there because they fell entirely under that charter. I then
added new test cases that exercise the bits of code complained of above.

Of the five kinds of loop statements, only exec_stmt_while's result code
handling is fully exercised by these tests. That would be a deficiency
as things stand, but a follow-on commit will merge the loop statements'
result code handling into one implementation. So testing each usage of
that implementation separately seems redundant.

In passing, also add a couple test cases to plpgsql.sql to more fully
exercise plpgsql's code related to expanded arrays --- I had thought
that area was sufficiently covered already, but the coverage report
showed a couple of un-executed code paths.

Discussion: https://postgr.es/m/26314.1514670401@sss.pgh.pa.us

dd2243f2

29 Dec, 2017 3 commits

Fix typo · 5303ffe7
Alvaro Herrera authored Dec 29, 2017

5303ffe7

Perform slot validity checks in a separate pass over expression. · b4093310

Andres Freund authored Dec 29, 2017

This reduces code duplication a bit, but the primary benefit that it
makes JITing expression evaluation easier. When doing so we can't, as
previously done in the interpreted case, really change opcode without
recompiling. Nor dow we just carry around unnecessary branches to
avoid re-checking over and over.

As a minor side-effect this makes ExecEvalStepOp() O(log(N)) rather
than O(N).

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de

b4093310

Rely on executor utils to build targetlist for DML RETURNING. · 4717fdb1

Andres Freund authored Dec 29, 2017

This is useful because it gets rid of the sole direct user of
ExecAssignResultType(). A future commit will likely make use of that
and combine creating the targetlist with the initialization of the
result slot. But it seems like good code hygiene anyway.

Author: Andres Freund
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de

4717fdb1