1. 01 Sep, 2017 9 commits
  2. 31 Aug, 2017 5 commits
    • Avoid memory leaks when a GatherMerge node is rescanned. · 2d44c58c
      Tom Lane authored
      Rescanning a GatherMerge led to leaking some memory in the executor's
      query-lifespan context, because most of the node's working data structures
      were simply abandoned and rebuilt from scratch.  In practice, this might
      never amount to much, given the cost of relaunching worker processes ---
      but it's still pretty messy, so let's fix it.
      
      We can rearrange things so that the tuple arrays are simply cleared and
      reused, and we don't need to rebuild the TupleTableSlots either, just
      clear them.  One small complication is that because we might get a
      different number of workers on each iteration, we can't keep the old
      convention that the leader's gm_slots[] entry is the last one; the leader
      might clobber a TupleTableSlot that we need for a worker in a future
      iteration.  Hence, adjust the logic so that the leader has slot 0 always,
      while the active workers have slots 1..n.
      
      Back-patch to v10 to keep all the existing versions of nodeGatherMerge.c
      in sync --- because of the renumbering of the slots, there would otherwise
      be a very large risk that any future backpatches in this module would
      introduce bugs.
      
      Discussion: https://postgr.es/m/8670.1504192177@sss.pgh.pa.us
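      Below is a minimal, self-contained C sketch of the slot-numbering
      convention described above; the struct and function names are
      hypothetical stand-ins, not the actual nodeGatherMerge.c code.  The
      point is simply that with the leader pinned to index 0, a worker
      count that changes between rescans can never clobber the leader's
      slot, and the slot array can be cleared and reused rather than
      rebuilt.

          #include <stdio.h>
          #include <string.h>

          #define MAX_WORKERS 8

          /* Hypothetical stand-in for the node's slot array. */
          typedef struct DemoGatherMergeState
          {
              int  nworkers;                    /* active workers this scan */
              char slots[MAX_WORKERS + 1][32];  /* 0 = leader, 1..n = workers */
          } DemoGatherMergeState;

          /* On (re)initialization, just clear and reuse the existing slots. */
          static void
          demo_init_slots(DemoGatherMergeState *state, int nworkers)
          {
              state->nworkers = nworkers;
              memset(state->slots, 0, sizeof(state->slots));

              snprintf(state->slots[0], sizeof(state->slots[0]), "leader");
              for (int i = 1; i <= nworkers; i++)
                  snprintf(state->slots[i], sizeof(state->slots[i]),
                           "worker %d", i);
          }

          int
          main(void)
          {
              DemoGatherMergeState state;

              /* First scan: 4 workers; rescan: only 2 workers.  The leader's
               * slot stays at index 0 both times, so nothing is clobbered. */
              demo_init_slots(&state, 4);
              printf("scan 1: slot 0 = %s, last = %s\n",
                     state.slots[0], state.slots[4]);

              demo_init_slots(&state, 2);
              printf("rescan: slot 0 = %s, last = %s\n",
                     state.slots[0], state.slots[2]);
              return 0;
          }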
    • Expand partitioned tables in PartDesc order. · 30833ba1
      Robert Haas authored
      Previously, we expanded the inheritance hierarchy in the order in
      which find_all_inheritors had locked the tables, but that turns out
      to block quite a bit of useful optimization.  For example, a
      partition-wise join can't count on two tables with matching bounds
      to get expanded in the same order.
      
      Where possible, this change results in expanding partitioned tables in
      *bound* order.  Bound order isn't well-defined for a list-partitioned
      table with a null-accepting partition or for a list-partitioned table
      where the bounds for a single partition are interleaved with other
      partitions.  However, when expansion in bound order is possible, it
      opens up further opportunities for optimization, such as
      strength-reducing MergeAppend to Append when the expansion order
      matches the desired sort order.
      
      Patch by me, with cosmetic revisions by Ashutosh Bapat.
      
      Discussion: http://postgr.es/m/CA+TgmoZrKj7kEzcMSum3aXV4eyvvbh9WD=c6m=002WMheDyE3A@mail.gmail.com
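      As a hedged illustration (toy code, not planner code) of why
      bound-order expansion enables that strength reduction: if each
      partition is internally sorted and the partitions are visited in
      bound order, plain concatenation already yields globally sorted
      output, so no per-row merge step is needed.

          #include <stdio.h>

          int
          main(void)
          {
              /* Three range partitions with non-overlapping bounds, each
               * internally sorted: [0,10), [10,20), [20,30). */
              int  part_a[] = {1, 4, 9};
              int  part_b[] = {10, 15, 19};
              int  part_c[] = {21, 25, 29};
              int *parts[]  = {part_a, part_b, part_c};   /* bound order */
              int  lens[]   = {3, 3, 3};

              /* Appending the partitions in bound order produces sorted
               * output without any per-row merge comparisons. */
              for (int p = 0; p < 3; p++)
                  for (int i = 0; i < lens[p]; i++)
                      printf("%d ", parts[p][i]);
              printf("\n");
              return 0;
          }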
    • Clean up shm_mq cleanup. · 6708e447
      Tom Lane authored
      The logic around shm_mq_detach was a few bricks shy of a load, because
      (contrary to the comments for shm_mq_attach) all it did was update the
      shared shm_mq state.  That left us leaking a bit of process-local
      memory, but much worse, the on_dsm_detach callback for shm_mq_detach
      was still armed.  That means that whenever we ultimately detach from
      the DSM segment, we'd run shm_mq_detach again for already-detached,
      possibly long-dead queues.  This accidentally fails to fail today,
      because we only ever re-use a shm_mq's memory for another shm_mq, and
      multiple detach attempts on the last such shm_mq are fairly harmless.
      But it's gonna bite us someday, so let's clean it up.
      
      To do that, change shm_mq_detach's API so it takes a shm_mq_handle
      not the underlying shm_mq.  This makes the callers simpler in most
      cases anyway.  Also fix a few places in parallel.c that were just
      pfree'ing the handle structs rather than doing proper cleanup.
      
      Back-patch to v10 because of the risk that the revenant shm_mq_detach
      callbacks would cause a live bug sometime.  Since this is an API
      change, it's too late to do it in 9.6.  (We could make a variant
      patch that preserves API, but I'm not excited enough to do that.)
      
      Discussion: https://postgr.es/m/8670.1504192177@sss.pgh.pa.us
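      Here is a hedged, standalone sketch of the cleanup pattern described
      above, using invented names rather than the real shm_mq/dsm API: a
      detach that receives the handle can update the shared state, disarm
      the previously registered on-detach callback, and free the
      process-local memory all in one place.

          #include <stdio.h>
          #include <stdlib.h>
          #include <stdbool.h>

          /* Hypothetical stand-ins for a shared queue and its handle. */
          typedef struct DemoQueue  { bool detached; } DemoQueue;
          typedef struct DemoHandle { DemoQueue *queue; } DemoHandle;

          /* Stand-in for the "on segment detach" callback registry. */
          static bool demo_callback_armed = false;

          /* Detach via the handle: update shared state, disarm the callback
           * that would otherwise run again at segment detach, and free the
           * process-local handle. */
          static void
          demo_detach(DemoHandle *handle)
          {
              handle->queue->detached = true;   /* shared-state side */
              demo_callback_armed = false;      /* no stale callback left */
              free(handle);                     /* process-local side */
          }

          int
          main(void)
          {
              DemoQueue   queue = {false};
              DemoHandle *handle = malloc(sizeof(DemoHandle));

              handle->queue = &queue;
              demo_callback_armed = true;   /* armed when queue was attached */

              demo_detach(handle);
              printf("detached=%d, callback still armed=%d\n",
                     queue.detached, demo_callback_armed);
              return 0;
          }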
    • Improve code coverage of select_parallel test. · 4b1dd62a
      Tom Lane authored
      Make sure that rescans of parallel indexscans are tested.
      Per code coverage report.
    • Remove the pre-8.2 coding convention for PG_MODULE_MAGIC · b5c75fec
      Peter Eisentraut authored
      PG_MODULE_MAGIC has been around since 8.2, and 8.1 has long since
      reached EOL, so remove the documentation's mention of #ifdef guards
      for compiling against pre-8.2 sources.
      
      Author: Daniel Gustafsson <daniel@yesql.se>
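      For reference, a minimal sketch of the post-8.2 convention the
      documentation now describes: a loadable module declares
      PG_MODULE_MAGIC unconditionally, with no #ifdef guard for pre-8.2
      servers.  The function name is made up, and the file builds against
      the server headers (e.g. via PGXS) rather than standalone.

          #include "postgres.h"
          #include "fmgr.h"

          PG_MODULE_MAGIC;            /* no #ifdef guard needed since 8.2 */

          PG_FUNCTION_INFO_V1(demo_add_one);

          /*
           * Matching (hypothetical) SQL declaration:
           *   CREATE FUNCTION demo_add_one(integer) RETURNS integer
           *     AS 'MODULE_PATHNAME' LANGUAGE C STRICT;
           */
          Datum
          demo_add_one(PG_FUNCTION_ARGS)
          {
              int32   arg = PG_GETARG_INT32(0);

              PG_RETURN_INT32(arg + 1);
          }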
  3. 30 Aug, 2017 4 commits
    • Code review for nodeGatherMerge.c. · 04e96786
      Tom Lane authored
      Comment the fields of GatherMergeState, and organize them a bit more
      sensibly.  Comment GMReaderTupleBuffer more usefully too.  Improve
      assorted other comments that were obsolete or just not very good English.
      
      Get rid of the use of a GMReaderTupleBuffer for the leader process;
      that was confusing, since only the "done" field was used, and that
      in a way redundant with need_to_scan_locally.
      
      In gather_merge_init, avoid calling load_tuple_array for
      already-known-exhausted workers.  I'm not sure if there's a live bug there,
      but the case is unlikely to be well tested due to timing considerations.
      
      Remove some useless code, such as duplicating the tts_isempty test done by
      TupIsNull.
      
      Remove useless initialization of ps.qual, replacing that with an assertion
      that we have no qual to check.  (If we did, the code would fail to check
      it.)
      
      Avoid applying heap_copytuple to a null tuple.  While that fails to
      crash, it's confusing, and IMO it makes the code less legible, not
      more so.
      
      Propagate a couple of these changes into nodeGather.c, as well.
      
      Back-patch to v10, partly because of the possibility that the
      gather_merge_init change is fixing a live bug, but mostly to keep
      the branches in sync to ease future bug fixes.
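      A small self-contained sketch of the "duplicating the tts_isempty
      test" point, with the executor structs simplified to made-up
      stand-ins: a TupIsNull-style macro already covers both the
      NULL-pointer case and the empty-slot case, so an extra emptiness
      test in front of it is redundant.

          #include <stdio.h>
          #include <stdbool.h>

          /* Simplified stand-in for TupleTableSlot. */
          typedef struct DemoSlot { bool tts_isempty; } DemoSlot;

          /* TupIsNull-style check: true for a NULL pointer or an empty slot,
           * so callers need no separate tts_isempty test beforehand. */
          #define DemoTupIsNull(slot)  ((slot) == NULL || (slot)->tts_isempty)

          int
          main(void)
          {
              DemoSlot empty = {true};
              DemoSlot full  = {false};

              printf("%d %d %d\n",
                     DemoTupIsNull((DemoSlot *) NULL),
                     DemoTupIsNull(&empty),
                     DemoTupIsNull(&full));
              return 0;
          }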
    • Separate reinitialization of shared parallel-scan state from ExecReScan. · 41b0dd98
      Tom Lane authored
      Previously, the parallel executor logic did reinitialization of shared
      state within the ExecReScan code for parallel-aware scan nodes.  This is
      problematic, because it means that the ExecReScan call has to occur
      synchronously (ie, during the parent Gather node's ReScan call).  That is
      swimming very much against the tide so far as the ExecReScan machinery is
      concerned; the fact that it works at all today depends on a lot of fragile
      assumptions, such as that no plan node between Gather and a parallel-aware
      scan node is parameterized.  Another objection is that because ExecReScan
      might be called in workers as well as the leader, hacky extra tests are
      needed in some places to prevent unwanted shared-state resets.
      
      Hence, let's separate this code into two functions, a ReInitializeDSM
      call and the ReScan call proper.  ReInitializeDSM is called only in
      the leader and is guaranteed to run before we start new workers.
      ReScan is returned to its traditional function of resetting only local
      state, which means that ExecReScan's usual habits of delaying or
      eliminating child rescan calls are safe again.
      
      As with the preceding commit 7df2c1f8, it doesn't seem to be necessary
      to make these changes in 9.6, which is a good thing because the FDW and
      CustomScan APIs are impacted.
      
      Discussion: https://postgr.es/m/CAA4eK1JkByysFJNh9M349u_nNjqETuEnY_y1VUc_kJiU0bxtaQ@mail.gmail.com
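      A hedged sketch of the split described above, using invented
      callback names rather than the real executor API: the shared-state
      reset runs once in the leader before new workers are launched, while
      the rescan callback proper touches only process-local state and so
      can safely be delayed or skipped by the usual ExecReScan
      optimizations.

          #include <stdio.h>

          /* Hypothetical shared and local scan state. */
          typedef struct DemoSharedScan { int next_block; } DemoSharedScan;
          typedef struct DemoScanState
          {
              DemoSharedScan *shared;     /* lives in the DSM segment */
              int             local_pos;  /* per-process position */
          } DemoScanState;

          /* Leader-only: reset shared state before launching new workers. */
          static void
          demo_reinitialize_dsm(DemoScanState *node)
          {
              node->shared->next_block = 0;
          }

          /* ReScan proper: reset only process-local state. */
          static void
          demo_rescan(DemoScanState *node)
          {
              node->local_pos = 0;
          }

          int
          main(void)
          {
              DemoSharedScan shared = {42};
              DemoScanState  node = {&shared, 7};

              demo_reinitialize_dsm(&node);  /* leader, before workers start */
              demo_rescan(&node);            /* each participant, maybe later */

              printf("shared=%d local=%d\n",
                     node.shared->next_block, node.local_pos);
              return 0;
          }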
    • Restore test case from a2b70c89. · 6c2c5bea
      Tom Lane authored
      Revert the reversion commits a20aac89 and 9b644745c.  In the wake of
      commit 7df2c1f8, we should get stable buildfarm results from this test;
      if not, I'd like to know sooner not later.
      
      Discussion: https://postgr.es/m/CAA4eK1JkByysFJNh9M349u_nNjqETuEnY_y1VUc_kJiU0bxtaQ@mail.gmail.com
    • Force rescanning of parallel-aware scan nodes below a Gather[Merge]. · 7df2c1f8
      Tom Lane authored
      The ExecReScan machinery contains various optimizations for postponing
      or skipping rescans of plan subtrees; for example a HashAgg node may
      conclude that it can re-use the table it built before, instead of
      re-reading its input subtree.  But that is wrong if the input contains
      a parallel-aware table scan node, since the portion of the table scanned
      by the leader process is likely to vary from one rescan to the next.
      This explains the timing-dependent buildfarm failures we saw after
      commit a2b70c89.
      
      The established mechanism for showing that a plan node's output is
      potentially variable is to mark it as depending on some runtime Param.
      Hence, to fix this, invent a dummy Param (one that has a PARAM_EXEC
      parameter number, but carries no actual value) associated with each Gather
      or GatherMerge node, mark parallel-aware nodes below that node as dependent
      on that Param, and arrange for ExecReScanGather[Merge] to flag that Param
      as changed whenever the Gather[Merge] node is rescanned.
      
      This solution breaks an undocumented assumption made by the parallel
      executor logic, namely that all rescans of nodes below a Gather[Merge]
      will happen synchronously during the ReScan of the top node itself.
      But that's fundamentally contrary to the design of the ExecReScan code,
      and so was doomed to fail someday anyway (even if you want to argue
      that the bug being fixed here wasn't a failure of that assumption).
      A follow-on patch will address that issue.  In the meantime, the
      worst that's expected to happen is that, given very bad timing luck,
      the leader might have to do all the work during a rescan: workers
      that manage to start up before the eventual ReScan of the leader's
      parallel-aware table scan node has reset the shared scan state will
      think they have nothing to do.
      
      Although this problem exists in 9.6, there does not seem to be any way
      for it to manifest there.  Without GatherMerge, it seems that a plan tree
      that has a rescan-short-circuiting node below Gather will always also
      have one above it that will short-circuit in the same cases, preventing
      the Gather from being rescanned.  Hence we won't take the risk of
      back-patching this change into 9.6.  But v10 needs it.
      
      Discussion: https://postgr.es/m/CAA4eK1JkByysFJNh9M349u_nNjqETuEnY_y1VUc_kJiU0bxtaQ@mail.gmail.com
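      Finally, a hedged, self-contained sketch of the mechanism (a
      simplified stand-in, not the real ExecReScan code): the Gather-like
      node owns a dummy parameter number, its parallel-aware children
      record that number among the params they depend on, and rescanning
      the Gather marks the param changed, which defeats any "nothing
      changed, skip the rescan" shortcut below it.

          #include <stdio.h>
          #include <stdbool.h>

          #define MAX_PARAMS 8

          /* Simplified stand-in for the "changed params" bookkeeping. */
          typedef struct DemoChild
          {
              bool depends_on[MAX_PARAMS];  /* params this node depends on */
          } DemoChild;

          /* Must this child really rescan, given the changed-param set? */
          static bool
          demo_child_must_rescan(const DemoChild *child,
                                 const bool changed[MAX_PARAMS])
          {
              for (int p = 0; p < MAX_PARAMS; p++)
                  if (child->depends_on[p] && changed[p])
                      return true;
              return false;
          }

          int
          main(void)
          {
              bool      changed[MAX_PARAMS] = {false};
              DemoChild scan = {{false}};
              int       dummy_param = 3;   /* allocated for the Gather node */

              /* Mark the parallel-aware scan as depending on the dummy param. */
              scan.depends_on[dummy_param] = true;

              /* ExecReScanGather-style step: flag the dummy param as changed. */
              changed[dummy_param] = true;

              printf("child must really rescan: %s\n",
                     demo_child_must_rescan(&scan, changed) ? "yes" : "no");
              return 0;
          }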
  4. 29 Aug, 2017 6 commits
  5. 28 Aug, 2017 3 commits
  6. 27 Aug, 2017 1 commit
  7. 26 Aug, 2017 5 commits
  8. 25 Aug, 2017 7 commits