Commits · 978b2f65aa1262eb4ecbf8b3785cb1b9cf4db78e · Abuhujair Javed / Postgres FD Implementation

21 Jan, 2016 3 commits

Speedup 2PC by skipping two phase state files in normal path · 978b2f65

Simon Riggs authored Jan 20, 2016

2PC state info is written only to WAL at PREPARE, then read back from WAL at
COMMIT PREPARED/ABORT PREPARED. Prepared transactions that live past one bufmgr
checkpoint cycle will be written to disk in the same form as previously. Crash
recovery path is not altered. Measured performance gains of 50-100% for short
2PC transactions by completely avoiding writing files and fsyncing. Other
optimizations still available, further patches in related areas expected.
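
As a rough illustration of the flow described above, here is a toy model in Python. It is not PostgreSQL code: the class and method names are hypothetical, and a plain list stands in for WAL. Prepared state is written only to the log, and it moves to a state file only if a checkpoint happens before the transaction is finished.

```python
# Toy model of the idea in this commit message; all names are hypothetical.
class TwoPhaseStore:
    def __init__(self):
        self.wal = []            # append-only log of (gid, state) records
        self.wal_pos = {}        # gid -> index of its PREPARE record in the log
        self.state_files = {}    # gid -> state, only for long-lived transactions

    def prepare(self, gid, state):
        # PREPARE: write the state to the log only; no state file is created.
        self.wal.append((gid, state))
        self.wal_pos[gid] = len(self.wal) - 1

    def checkpoint(self):
        # Transactions still prepared at a checkpoint get a state file.
        for gid, pos in list(self.wal_pos.items()):
            self.state_files[gid] = self.wal[pos][1]
            del self.wal_pos[gid]

    def commit_prepared(self, gid):
        # COMMIT PREPARED: read the state back from the log in the normal
        # path, or from the state file if a checkpoint intervened.
        if gid in self.wal_pos:
            return self.wal[self.wal_pos.pop(gid)][1], "wal"
        return self.state_files.pop(gid), "file"
```

In this sketch a short transaction never touches `state_files`, which corresponds to the file write and fsync that the commit message says are skipped.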

Stas Kelvich and heavily edited by Simon Riggs

Based upon earlier ideas and patches by Michael Paquier and Heikki Linnakangas,
a concrete example of how Postgres-XC has fed back ideas into PostgreSQL.

Reviewed by Michael Paquier, Jeff Janes and Andres Freund
Performance testing by Jesper Pedersen

psql: Add tab completion for COPY with query · d0f2f53c

Peter Eisentraut authored Jan 20, 2016

From: Andreas Karlsson <andreas@proxel.se>

Refactor to create generic WAL page read callback · 422a55a6

Simon Riggs authored Jan 20, 2016

Previously we didn’t have a generic WAL page read callback function,
surprisingly. Logical decoding has logical_read_local_xlog_page(), which was
actually generic, so move that to xlogfunc.c and rename to
read_local_xlog_page().
Maintain logical_read_local_xlog_page() so existing callers still work.

As requested by Michael Paquier, Alvaro Herrera and Andres Freund

20 Jan, 2016 5 commits

Support parallel joins, and make related improvements. · 45be99f8

Robert Haas authored Jan 20, 2016

The core innovation of this patch is the introduction of the concept
of a partial path; that is, a path which if executed in parallel will
generate a subset of the output rows in each process. Gathering a
partial path produces an ordinary (complete) path. This allows us to
generate paths for parallel joins by joining a partial path for one
side (which at the baserel level is currently always a Partial Seq
Scan) to an ordinary path on the other side. This is subject to
various restrictions at present, especially that this strategy seems
unlikely to be sensible for merge joins, so only nested loop and
hash join paths are generated.
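
The partial-path idea can be sketched with a toy model in Python. This is only an illustration under simplifying assumptions: the function names are hypothetical, workers are simulated sequentially, and the inner side is assumed to be a complete row set available to every worker.

```python
# Toy model: a "partial path" yields a subset of the rows in each worker,
# and gathering the per-worker outputs gives the complete result.
def partial_seq_scan(rows, worker, nworkers):
    # Each worker sees a disjoint slice of the outer relation.
    return rows[worker::nworkers]

def hash_join_partial(outer_part, inner_rows):
    # The inner side is an ordinary (complete) path: each worker builds the
    # full hash table and probes it with its share of the outer rows.
    table = {}
    for key, payload in inner_rows:
        table.setdefault(key, []).append(payload)
    return [(key, o, i) for key, o in outer_part for i in table.get(key, [])]

def gather(partial_results):
    # Gather concatenates the workers' outputs into one complete row set.
    return [row for part in partial_results for row in part]

outer = [(1, "a"), (2, "b"), (1, "c"), (3, "d")]
inner = [(1, "x"), (2, "y"), (2, "z")]
parallel = gather(
    hash_join_partial(partial_seq_scan(outer, w, 2), inner) for w in range(2)
)
serial = hash_join_partial(outer, inner)
```

Because every outer row is handled by exactly one worker, `parallel` contains the same rows as `serial`, possibly in a different order.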

This also allows an Append node to be pushed below a Gather node in
the case of a partitioned table.

Testing revealed that early versions of this patch made poor decisions
in some cases, which turned out to be caused by the fact that the
original cost model for Parallel Seq Scan wasn't very good. So this
patch tries to make some modest improvements in that area.

There is much more to be done in the area of generating good parallel
plans in all cases, but this seems like a useful step forward.

Patch by me, reviewed by Dilip Kumar and Amit Kapila.

Support multi-stage aggregation. · a7de3dc5

Robert Haas authored Jan 20, 2016

Aggregate nodes now have two new modes: a "partial" mode where they
output the unfinalized transition state, and a "finalize" mode where
they accept unfinalized transition states rather than individual
values as input.
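
The two modes can be illustrated with a small Python sketch of a two-stage average. The function names are hypothetical and the (sum, count) transition state is an assumption chosen for the example, not a description of nodeAgg.c.

```python
# Toy model of two-stage avg(): the transition state is (sum, count).
def agg_partial(values):
    # "Partial" mode: consume individual values, emit the unfinalized state.
    total, count = 0, 0
    for v in values:
        total, count = total + v, count + 1
    return (total, count)

def agg_finalize(states):
    # "Finalize" mode: consume transition states instead of values, combine
    # them, then apply the final function.
    total = sum(s[0] for s in states)
    count = sum(s[1] for s in states)
    return total / count if count else None

chunks = [[1, 2, 3], [4, 5], [6]]
result = agg_finalize([agg_partial(c) for c in chunks])
```

Each chunk could come from a different worker or a remote server; only the small transition states need to travel to the finalizing step.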

These new modes are not used anywhere yet, but they will be necessary
for parallel aggregation.  The infrastructure also figures to be
useful for cases where we want to aggregate local data and remote
data via the FDW interface, and want to bring back partial aggregates
from the remote side that can then be combined with locally generated
partial aggregates to produce the final value.  It may also be useful
even when neither FDWs nor parallelism are in play, as explained in
the comments in nodeAgg.c.

David Rowley and Simon Riggs, reviewed by KaiGai Kohei, Heikki
Linnakangas, Haribabu Kommi, and me.

PostgresNode: Add names to nodes · c8642d90

Alvaro Herrera authored Jan 20, 2016

This makes the log files easier to follow when investigating a test
failure.

Author: Michael Paquier
Review: Noah Misch

Properly install dynloader.h on MSVC builds · 216d5684

Bruce Momjian authored Jan 19, 2016

This will enable PL/Java to be cleanly compiled, as dynloader.h is a
requirement.

Report by Chapman Flack

Patch by Michael Paquier

Backpatch through 9.1

Fix assorted inconsistencies in GIN opclass support function declarations. · dbe23289

Tom Lane authored Jan 19, 2016

GIN had some minor issues too, mostly using "internal" where something
else would be more appropriate.  I went with the same approach as in
9ff60273, namely preferring the opclass' indexed datatype for
arguments that receive an operator RHS value, even if that's not
necessarily what they really are.

Again, this is with an eye to having a uniform rule for ginvalidate()
to check support function signatures.

19 Jan, 2016 3 commits

Add two HyperLogLog functions · 948c9795

Alvaro Herrera authored Jan 19, 2016

New functions initHyperLogLogError() and freeHyperLogLog() simplify
using this module from elsewhere.
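
The commit message does not say how an error-based initializer picks its parameters. One plausible reading, sketched below in Python, is to choose the register index width from a target error using the standard HyperLogLog estimate of about 1.04 / sqrt(m) for m registers. The function name and the 4 to 16 bit range are assumptions of this sketch.

```python
import math

def register_width_for_error(error, min_bits=4, max_bits=16):
    # Smallest width b such that the standard HyperLogLog error estimate
    # 1.04 / sqrt(2**b) falls below the requested error, capped at max_bits.
    bits = min_bits
    while bits < max_bits:
        if 1.04 / math.sqrt(2 ** bits) < error:
            break
        bits += 1
    return bits
```

A caller that only knows the accuracy it wants can then size the estimator without reasoning about register counts directly.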

Author: Tomáš Vondra
Review: Peter Geoghegan

Fix assorted inconsistencies in GiST opclass support function declarations. · 9ff60273

Tom Lane authored Jan 19, 2016

The conventions specified by the GiST SGML documentation were widely
ignored. For example, the strategy-number argument for "consistent" and
"distance" functions is specified to be a smallint, but most of the
built-in support functions declared it as an integer, and for that matter
the core code passed it using Int32GetDatum not Int16GetDatum. None of
that makes any real difference at runtime, but it's quite confusing for
newcomers to the code, and it makes it very hard to write an amvalidate()
function that checks support function signatures. So let's try to instill
some consistency here.

Another similar issue is that the "query" argument is not of a single
well-defined type, but could have different types depending on the strategy
(corresponding to search operators with different righthand-side argument
types). Some of the functions threw up their hands and declared the query
argument as being of "internal" type, which surely isn't right ("any" would
have been more appropriate); but the majority position seemed to be to
declare it as being of the indexed data type, corresponding to a search
operator with both input types the same. So I've specified a convention
that that's what to do always.

Also, the result of the "union" support function actually must be of the
index's storage type, but the documentation suggested declaring it to
return "internal", and some of the functions followed that. Standardize
on telling the truth, instead.

Similarly, standardize on declaring the "same" function's inputs as
being of the storage type, not "internal".

Also, somebody had forgotten to add the "recheck" argument to both
the documentation of the "distance" support function and all of their
SQL declarations, even though the C code was happily using that argument.
Clean that up too.

Fix up some other omissions in the docs too, such as documenting that
union's second input argument is vestigial.

So far as the errors in core function declarations go, we can just fix
pg_proc.h and bump catversion. Adjusting the erroneous declarations in
contrib modules is more debatable: in principle any change in those
scripts should involve an extension version bump, which is a pain.
However, since these changes are purely cosmetic and make no functional
difference, I think we can get away without doing that.

Remove Cygwin-specific code from pg_ctl · 53c949c1

Andrew Dunstan authored Jan 19, 2016

This code has been there for a long time, but it's never really been
needed. Cygwin has its own utility for registering, unregistering,
stopping and starting Windows services, and that's what's used in the
Cygwin postgres packages. So now pg_ctl for Cygwin looks like it is for
any Unix platform.

Michael Paquier and me

18 Jan, 2016 4 commits

Fix typo. · 85f22281

Tatsuo Ishii authored Jan 18, 2016

Reported by KOIZUMI Satoru.

Add explicit cast to amcostestimate call. · 49b49506

Tom Lane authored Jan 17, 2016

My compiler doesn't complain here, but David Rowley's does ...

Restructure index access method API to hide most of it at the C level. · 65c5fcd3

Tom Lane authored Jan 17, 2016

This patch reduces pg_am to just two columns, a name and a handler
function.  All the data formerly obtained from pg_am is now provided
in a C struct returned by the handler function.  This is similar to
the designs we've adopted for FDWs and tablesample methods.  There
are multiple advantages.  For one, the index AM's support functions
are now simple C functions, making them faster to call and much less
error-prone, since the C compiler can now check function signatures.
For another, this will make it far more practical to define index access
methods in installable extensions.
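
The shape of this design can be sketched in Python. This is an analogy only: the real interface is a C struct returned by a handler function, and the class, field, and function names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass(frozen=True)
class IndexAmRoutine:
    # Hypothetical stand-in for the struct: properties and support
    # functions live here instead of in catalog columns.
    can_order: bool
    insert: Callable[[dict, int], None]
    validate: Callable[[], bool]

def toy_handler() -> IndexAmRoutine:
    def insert(index, key):
        index.setdefault("keys", []).append(key)
    return IndexAmRoutine(can_order=False, insert=insert, validate=lambda: True)

# The catalog keeps only two things per access method: a name and a handler.
catalog: Dict[str, Callable[[], IndexAmRoutine]] = {"toy": toy_handler}

routine = catalog["toy"]()
index = {}
routine.insert(index, 42)
```

An extension would add a method by registering one more name and handler pair, without the catalog needing a column for each property or support function.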

A disadvantage is that SQL-level code can no longer see attributes
of index AMs; in particular, some of the crosschecks in the opr_sanity
regression test are no longer possible from SQL.  We've addressed that
by adding a facility for the index AM to perform such checks instead.
(Much more could be done in that line, but for now we're content if the
amvalidate functions more or less replace what opr_sanity used to do.)
We might also want to expose some sort of reporting functionality, but
this patch doesn't do that.

Alexander Korotkov, reviewed by Petr Jelínek, and rather heavily
editorialized on by me.

Re-pgindent a few files. · 8d290c8e

Tom Lane authored Jan 17, 2016

In preparation for landing index AM interface changes.

17 Jan, 2016 2 commits

Remove dead code in pg_dump. · 57ce9acc

Tom Lane authored Jan 17, 2016

Coverity quite reasonably complained that this check for fout==NULL
occurred after we'd already dereferenced fout.  However, the check
is just dead code since there is no code path by which CreateArchive
can return a null pointer.  Errors such as can't-open-that-file are
reported down inside CreateArchive, and control doesn't return.
So let's silence the warning by removing the dead code, rather than
continuing to pretend it does something.

Coverity didn't complain about this before 5b5fea2a, so back-patch
to 9.5 like that patch.

psql: Add completion support for DROP INDEX CONCURRENTLY · 4189e3d6

Peter Eisentraut authored Jan 16, 2016

based on patch by Kyotaro Horiguchi

15 Jan, 2016 2 commits

Fix minor typo in comment · cf7dfbf2

Magnus Hagander authored Jan 15, 2016

Tatsuro Yamada

Fix spelling mistakes. · 23c2dd03

Robert Haas authored Jan 14, 2016

Same patch submitted independently by David Rowley and Peter Geoghegan.

14 Jan, 2016 2 commits

Fix build_grouping_chain() to not clobber its input lists. · a923af38

Tom Lane authored Jan 14, 2016

There's no good reason for stomping on the input data; it makes the logic
in this function no simpler, in fact probably the reverse.  And it makes
it impossible to separate path generation from plan generation, as I'm
working towards doing; that will require more than one traversal of these
lists.

Properly close token in sspi authentication · 6a61d1ff

Magnus Hagander authored Jan 14, 2016

We can never leak more than one token, but we shouldn't do that. We
don't bother closing it in the error paths since the process will
exit shortly anyway.

Christian Ullrich

13 Jan, 2016 6 commits

Handle extension members when first setting object dump flags in pg_dump. · e72d7d85

Tom Lane authored Jan 13, 2016

pg_dump's original approach to handling extension member objects was to
run around and clear (or set) their dump flags rather late in its data
collection process. Unfortunately, quite a lot of code expects those flags
to be valid before that; which was an entirely reasonable expectation
before we added extensions. In particular, this explains Karsten Hilbert's
recent report of pg_upgrade failing on a database in which an extension
has been installed into the pg_catalog schema. Its objects are initially
marked as not-to-be-dumped on the strength of their schema, and later we
change them to must-dump because we're doing a binary upgrade of their
extension; but we've already skipped essential tasks like making associated
DO_SHELL_TYPE objects.

To fix, collect extension membership data first, and incorporate it in the
initial setting of the dump flags, so that those are once again correct
from the get-go. This has the undesirable side effect of slightly
lengthening the time taken before pg_dump acquires table locks, but testing
suggests that the increase in that window is not very much.
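
A minimal Python sketch of the ordering change follows. It assumes a simplified rule set taken from the description above: extension members are dumped only in binary-upgrade mode, and other objects are dumped unless they live in pg_catalog. The function name and data shapes are hypothetical.

```python
def assign_dump_flags(objects, extension_members, binary_upgrade=False):
    # objects: name -> schema; extension_members: names owned by an extension.
    # Membership is known before any flag is set, so each flag is computed
    # once and is correct from the start.
    flags = {}
    for name, schema in objects.items():
        if name in extension_members:
            flags[name] = binary_upgrade
        else:
            flags[name] = schema != "pg_catalog"
    return flags

objects = {"t": "public", "e": "pg_catalog", "f": "public"}
members = {"e", "f"}
normal = assign_dump_flags(objects, members)
upgrade = assign_dump_flags(objects, members, binary_upgrade=True)
```

In the binary-upgrade case the extension object in pg_catalog is marked for dumping immediately, so any later step that consults the flag sees the final value.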

Along the way, get rid of ugly special-case logic for deciding whether
to dump procedural languages, FDWs, and foreign servers; dump decisions
for those are now correct up-front, too.

In 9.3 and up, this also fixes erroneous logic about when to dump event
triggers (basically, they were *always* dumped before). In 9.5 and up,
transform objects had that problem too.

Since this problem came in with extensions, back-patch to all supported
versions.

Access pg_dump's options structs through Archive struct, not directly. · 5b5fea2a

Tom Lane authored Jan 13, 2016

Rather than passing around DumpOptions and RestoreOptions as separate
arguments, add fields to struct Archive to carry pointers to these objects,
and access them through those fields when needed. There already was a
RestoreOptions pointer in Archive, though for no obvious reason it was part
of the "private" struct rather than out where pg_dump.c could see it.

Doing this allows reversion of quite a lot of parameter-addition changes
made in commit 0eea8047, which is a good thing IMO because this will
reduce the code delta between 9.4 and 9.5, probably easing a few future
back-patch efforts. Moreover, the previous commit only added a DumpOptions
argument to functions that had to have it at the time, which means we could
anticipate still more code churn (and more back-patch hazard) as the
requirement spread further. I'd hit exactly that problem in my upcoming
patch to fix extension membership marking, which is what motivated me to
do this.

Run pgindent on src/bin/pg_dump/* · 26905e00

Tom Lane authored Jan 13, 2016

To ease doing indent fixups on a couple of patches I have in progress.

psql: Improve CREATE INDEX CONCURRENTLY tab completion · b1bfb28b

Peter Eisentraut authored Jan 12, 2016

The completion of CREATE INDEX CONCURRENTLY was lacking in several ways
compared to a plain CREATE INDEX command:

- CREATE INDEX <name> ON completes table names, but didn't with
  CONCURRENTLY.

- CREATE INDEX completes ON and existing index names, but with
  CONCURRENTLY it only completed ON.

- CREATE INDEX <name> completes ON, but didn't with CONCURRENTLY.

These are now all fixed.

psql: Fix CREATE INDEX tab completion · bc56d589

Peter Eisentraut authored Jan 10, 2016

The previous code supported a syntax like CREATE INDEX name
CONCURRENTLY, which never existed.  Mistake introduced in commit
37ec19a1.  Remove the addition of
CONCURRENTLY at that point.

psql: Update tab completion comment · 70327030

Peter Eisentraut authored Jan 10, 2016

This just updates a comment to match the code.

from Michael Paquier

12 Jan, 2016 5 commits

Add new user fn pg_current_xlog_flush_location() · e63bb454

Simon Riggs authored Jan 12, 2016

Tomas Vondra, reviewed by Michael Paquier and Amit Kapila
Minor edits by me

Maintain local LogwrtResult consistently · 1e29e632

Simon Riggs authored Jan 12, 2016

Teach GetFlushRecPtr() to update the LogwrtResult cache, as all other
functions in xlog.c do.

Remove no-longer-needed old-style check for incompatible plpythons. · 796d1e88

Tom Lane authored Jan 11, 2016

Commit 866566a6 introduced a new mechanism for incompatible
plpythons to detect each other. I left the old mechanism in place,
because it seems possible that a plpython predating that commit might be
used with one postdating it. (This would require updating plpython3 but
not plpython2 or vice versa, but that seems well within the realm of
possibility.) However, surely it will not be able to happen in 9.6 or
later, so we can delete the old mechanism in HEAD.

Use LOAD not actual code execution to pull in plpython library. · fb6fcbd3

Tom Lane authored Jan 11, 2016

Commit 866566a6 is insufficient to prevent dump/reload failures
when using transform modules in a database with both plpython2 and
plpython3 installed.  The reason is that the transform extension scripts
use DO blocks as a mechanism to pull in the libpython library before
creating the transform function.  It's necessary to preload the library
because the dynamic loader won't do it for us on every platform, leading
to "unresolved symbol" failures when the transform library is loaded.
But it's *not* necessary to execute Python code, and doing so will
provoke a multiple-Pythons-are-loaded error even after the preceding
commit.

To fix, use LOAD instead of a DO block.  That requires superuser privilege,
but creation of a C function does anyway.  It also embeds knowledge of
the underlying library name for each PL language; but that's wired into
the initdb-time contents of pg_pltemplate too, so that doesn't seem like
a large problem either.  Note that CREATE TRANSFORM as such doesn't call
the language module at all.

Per a report from Paul Jones.  Back-patch to 9.5 where transform modules
were introduced.

Avoid dump/reload problems when using both plpython2 and plpython3. · 866566a6

Tom Lane authored Jan 11, 2016

Commit 80371601 installed a safeguard against loading plpython2
and plpython3 at the same time, but asserted that both could still be
used in the same database, just not in the same session. However, that's
not actually all that practical because dumping and reloading will fail
(since both libraries necessarily get loaded into the restoring session).
pg_upgrade is even worse, because it checks for missing libraries by
loading every .so library mentioned in the entire installation into one
session, so that you can have only one across the whole cluster.

We can improve matters by not throwing the error immediately in _PG_init,
but only when and if we're asked to do something that requires calling
into libpython. This ameliorates both of the above situations, since
while execution of CREATE LANGUAGE, CREATE FUNCTION, etc will result in
loading plpython, it isn't asked to do anything interesting (at least
not if check_function_bodies is off, as it will be during a restore).
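
The deferred check can be sketched as follows. This is a toy Python model with hypothetical names, not the plpython source: the conflict is recorded at load time and reported only when a call actually needs the interpreter.

```python
# Toy model: loading two incompatible libraries is tolerated, but the first
# call that needs the interpreter fails. All names are hypothetical.
loaded_versions = set()

class FatalError(Exception):
    pass

def pg_init(version):
    # Load-time hook: only record which library was loaded, never raise.
    loaded_versions.add(version)

def call_into_python(version):
    # Deferred check: raise only when real work is requested.
    if len(loaded_versions) > 1:
        raise FatalError("multiple Python libraries are present in session")
    return f"ran code with python{version}"
```

A restore session that loads both libraries but never runs a function body would pass through `pg_init` twice and never reach the check.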

It's possible that this opens some corner-case holes in which a crash
could be provoked with sufficient effort. However, since plpython
only exists as an untrusted language, any such crash would require
superuser privileges, making it "don't do that" not a security issue.
To reduce the hazards in this area, the error is still FATAL when it
does get thrown.

Per a report from Paul Jones. Back-patch to 9.2, which is as far back
as the patch applies without work. (It could be made to work in 9.1,
but given the lack of previous complaints, I'm disinclined to expend
effort so far back. We've been pretty desultory about support for
Python 3 in 9.1 anyway.)

11 Jan, 2016 2 commits

Remove obsolete comment. · 950ab82c

Robert Haas authored Jan 10, 2016

Noted while reviewing a question from Dickson S. Guedes.

doc: Fix typo in logical decoding documentation · c618e1b5

Peter Eisentraut authored Jan 10, 2016

From: Petr Jelinek <petr@2ndquadrant.com>

09 Jan, 2016 6 commits

Remove a useless PG_GETARG_DATUM() call from jsonb_build_array. · 820bdccc