Commits · ad4a7ed0996ee044ee7291559deddf9842d8bbf7 · Abuhujair Javed / Postgres FD Implementation

12 Oct, 2017 9 commits

Robert Haas authored Oct 12, 2017

Commits 6476b261
and 14f67a8e didn't use quite the
same error message for what is basically the same situation.

Amit Langote, pared back a bit by me.

Discussion: http://postgr.es/m/54dc76d0-3b5b-ba5a-27dc-fb31a3975b61@lab.ntt.co.jp

ad4a7ed0

Doc: fix typo in release notes. · 0a047a1e

Tom Lane authored Oct 12, 2017

Ioseph Kim

Discussion: https://postgr.es/m/e7a79f91-8244-5bcb-afcc-96c817e86f4e@postgresql.kr

0a047a1e

Infer functional dependency past RelabelType · e9ef11ac

Alvaro Herrera authored Oct 12, 2017

Vars hidden within a RelabelType would not be detected as compatible
with some functional dependency.  Repair by properly ignoring the
RelabelType.

Author: David Rowley
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CAKJS1f-y-UEy=rsBXynBOgiW1fKMr_LVoYSGL9QOc36mLEC-ww@mail.gmail.com

e9ef11ac

Fix logical replication to fire BEFORE ROW DELETE triggers. · 360fd1a7

Robert Haas authored Oct 12, 2017

Before, that would fail to happen unless a BEFORE ROW UPDATE trigger
was also present.

Noted by me while reviewing a patch from Masahiko Sawada, who also
wrote this patch.  Reviewed by Petr Jelinek.

Discussion: http://postgr.es/m/CA+TgmobAZvCxduG8y_mQKBK7nz-vhbdLvjM354KEFozpuzMN5A@mail.gmail.com

360fd1a7

Replace remaining uses of pq_sendint with pq_sendint{8,16,32}. · 31079a4a

Andres Freund authored Oct 11, 2017

pq_sendint() remains, so extension code doesn't unnecessarily break.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de

31079a4a

Prevent sharing transition states between ordered-set aggregates. · 52328727

Tom Lane authored Oct 11, 2017

This ought to work, but the built-in OSAs are not capable of coping,
because their final-functions destructively modify their transition
state (specifically, the contained tuplesort object). That was fine
when those functions were written, but commit 804163bc moved the
goalposts without telling orderedsetaggs.c.

We should fix the built-in OSAs to support this, but it will take
a little work, especially if we don't want to sacrifice performance
in the normal non-shared-state case. Given that it took a year after
9.6 release for anyone to notice this bug, we should not prioritize
sharable-state over nonsharable-state performance. And a proper fix
is likely to be more complicated than we'd want to back-patch, too.

Therefore, let's just put in this stop-gap patch to prevent nodeAgg.c
from choosing to use shared state for OSAs. We can revert it in HEAD
when we get a better fix.

Report from Lukas Eder, diagnosis by me, patch by David Rowley.
Back-patch to 9.6 where the problem was introduced.

Discussion: https://postgr.es/m/CAB4ELO5RZhOamuT9Xsf72ozbenDLLXZKSk07FiSVsuJNZB861A@mail.gmail.com

52328727

Temporary attempt at a workaround for further MSVC restrict build failures. · 36b4b91b

Andres Freund authored Oct 11, 2017

It appears some versions of msvc use __declspec(restrict) in stdlib.h
and subsidiary headers. Including those after defining 'restrict' to
'__restrict' doesn't work.  Try to get the buildfarm green to see
whether there's further problems, by including stdlib.h just before
said define.

36b4b91b

Work around overly strict restrict checks by MSVC. · 060b0699

Andres Freund authored Oct 11, 2017

Apparently MSVC requires a * before a restrict in a variable
declaration, even if the adorned type already is a pointer, just via
typedef.

As reported by buildfarm animal woodlouse.

Author: Andres Freund
Discussion: https://postgr.es/m/20171012001320.4putagiruuehtvb6@alap3.anarazel.de

060b0699

Improve performance of SendRowDescriptionMessage. · 4c119fbc

Andres Freund authored Oct 11, 2017

There's three categories of changes leading to better performance:
- Splitting the per-attribute part of SendRowDescriptionMessage into a
  v2 and a v3 version allows avoiding branches for every attribute.
- Preallocating the size of the buffer to be big enough for all
  attributes and then using pq_write* avoids unnecessary buffer
  size checks & resizing.
- Reusing a persistently allocated StringInfo for all
  SendRowDescriptionMessage() invocations avoids repeated allocations
  & reallocations.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de

4c119fbc

11 Oct, 2017 12 commits

pg_stat_statements: Widen query IDs from 32 bits to 64 bits. · cff440d3

Robert Haas authored Oct 11, 2017

This takes advantage of the infrastructure introduced by commit
81c5e46c to greatly reduce the
likelihood that two different queries will end up with the same query
ID.  It's still possible, of course, but whereas before it the chances
of a collision reached 25% around 50,000 queries, it will now take
more than 3 billion queries.

Backward incompatibility: Because the type exposed at the SQL level is
int8, users may now see negative query IDs in the pg_stat_statements
view (and also, query IDs more than 4 billion, which was the old
limit).

Patch by me, reviewed by Michael Paquier and Peter Geoghegan.

Discussion: http://postgr.es/m/CA+TgmobG_Kp4cBKFmsznUAaM1GWW6hhRNiZC0KjRMOOeYnz5Yw@mail.gmail.com

cff440d3

Use one stringbuffer for all rows printed in printtup.c. · f2dec34e

Andres Freund authored Oct 11, 2017

This avoids newly allocating, and then possibly growing, the
stringbuffer for every row. For wide rows this can substantially
reduce memory allocator overhead, at the price of not immediately
reducing memory usage after outputting an especially wide row.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de

f2dec34e

Add more efficient functions to pqformat API. · 1de09ad8

Andres Freund authored Oct 11, 2017

There's three prongs to achieve greater efficiency here:

1) Allow reusing a stringbuffer across pq_beginmessage/endmessage,
   with the new pq_beginmessage_reuse/endmessage_reuse. This can be
   beneficial both because it avoids allocating the initial buffer,
   and because it's more likely to already have an correctly sized
   buffer.

2) Replacing pq_sendint() with pq_sendint$width() inline
   functions. Previously unnecessary and unpredictable branches in
   pq_sendint() were needed. Additionally the replacement functions
   are implemented more efficiently.  pq_sendint is now deprecated, a
   separate commit will convert all in-tree callers.

3) Add pq_writeint$width(), pq_writestring(). These rely on sufficient
   space in the StringInfo's buffer, avoiding individual space checks
   & potential individual resizing.  To allow this to be used for
   strings, expose mbutil.c's MAX_CONVERSION_GROWTH.

Followup commits will make use of these facilities.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de

1de09ad8

Allow to avoid NUL-byte management for stringinfos and use in format.c. · 70c2d1be

Andres Freund authored Oct 11, 2017

In a lot of the places having appendBinaryStringInfo() maintain a
trailing NUL byte wasn't actually meaningful, e.g. when appending an
integer which can contain 0 in one of its bytes.

Removing this yields some small speedup, but more importantly will be
more consistent when providing faster variants of pq_sendint etc.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de

70c2d1be

Add configure infrastructure to detect support for C99's restrict. · 0b974dba

Andres Freund authored Oct 11, 2017

Will be used in later commits improving performance for a few key
routines where information about aliasing allows for significantly
better code generation.

This allows to use the C99 'restrict' keyword without breaking C89, or
for that matter C++, compilers. If not supported it's defined to be
empty.

Author: Andres Freund
Discussion: https://postgr.es/m/20170914063418.sckdzgjfrsbekae4@alap3.anarazel.de

0b974dba

Remove unnecessary PG_TRY overhead for CurrentResourceOwner changes. · 5fa6b0d1

Tom Lane authored Oct 11, 2017

resowner/README contained advice to use a PG_TRY block to restore the
old CurrentResourceOwner value anywhere that that variable is transiently
changed. That advice was only inconsistently followed, however, and
on reflection it seems like unnecessary overhead. We don't bother
with such a convention for transient CurrentMemoryContext changes,
on the grounds that any (sub)transaction abort will start out by
resetting CurrentMemoryContext to what it wants. But the same is
true of CurrentResourceOwner, so there seems no need to treat it
differently.

Hence, remove PG_TRY blocks that exist only to restore CurrentResourceOwner
before re-throwing the error. There are a couple of places that restore
it along with some other actions, and I left those alone; the restore is
probably unnecessary but no noticeable gain will result from removing it.

Discussion: https://postgr.es/m/5236.1507583529@sss.pgh.pa.us

5fa6b0d1

Prevent idle in transaction session timeout from sometimes being ignored. · f6766166

Andres Freund authored Oct 11, 2017

The previous coding in ProcessInterrupts() could lead to
idle_in_transaction_session_timeout being ignored, when
statement_timeout occurred earlier.

The problem was that ProcessInterrupts() would return before
processing the transaction timeout if QueryCancelPending was set while
QueryCancelHoldoffCount != 0 - which is the case when reading new
commands from the client. Ergo when the idle transaction timeout would
hit.

Fix that by removing the early return. Alternatively the transaction
timeout code could have been moved up, but that early return seems
like an issue that could hit other cases too.

Author: Lukas Fittl
Bug: #14821
Discussion:
    https://www.postgresql.org/message-id/20170921010956.17345.61461%40wrigleys.postgresql.org
    https://www.postgresql.org/message-id/CAP53PkxQnv3OWJpyNPGJYT62uY=n1=2CF_Lpc6gVOFnc0-gazw@mail.gmail.com
Backpatch: 9.6-, where idle_in_transaction_session_timeout was introduced.

f6766166

Doc: fix missing explanation of default object privileges. · 28605968

Tom Lane authored Oct 11, 2017

The GRANT reference page, which lists the default privileges for new
objects, failed to mention that USAGE is granted by default for data
types and domains. As a lesser sin, it also did not specify anything
about the initial privileges for sequences, FDWs, foreign servers,
or large objects. Fix that, and add a comment to acldefault() in the
probably vain hope of getting people to maintain this list in future.

Noted by Laurenz Albe, though I editorialized on the wording a bit.
Back-patch to all supported branches, since they all have this behavior.

Discussion: https://postgr.es/m/1507620895.4152.1.camel@cybertec.at

28605968

Fix mistakes in comments. · 20d210bf

Robert Haas authored Oct 11, 2017

Masahiko Sawada

Discussion: http://postgr.es/m/CAD21AoBsfYsMHD6_SL9iN3n_Foaa+oPbL5jG55DxU1ChaujqwQ@mail.gmail.com

20d210bf

Fix low-probability loss of NOTIFY messages due to XID wraparound. · 118e99c3

Tom Lane authored Oct 11, 2017

Up to now async.c has used TransactionIdIsInProgress() to detect whether
a notify message's source transaction is still running. However, that
function has a quick-exit path that reports that XIDs before RecentXmin
are no longer running. If a listening backend is doing nothing but
listening, and not running any queries, there is nothing that will advance
its value of RecentXmin. Once 2 billion transactions elapse, the
RecentXmin check causes active transactions to be reported as not running.
If they aren't committed yet according to CLOG, async.c decides they
aborted and discards their messages. The timing for that is a bit tight
but it can happen when multiple backends are sending notifies concurrently.
The net symptom therefore is that a sufficiently-long-surviving
listen-only backend starts to miss some fraction of NOTIFY traffic,
but only under heavy load.

The only function that updates RecentXmin is GetSnapshotData().
A brute-force fix would therefore be to take a snapshot before
processing incoming notify messages. But that would add cycles,
as well as contention for the ProcArrayLock. We can be smarter:
having taken the snapshot, let's use that to check for running
XIDs, and not call TransactionIdIsInProgress() at all. In this
way we reduce the number of ProcArrayLock acquisitions from one
per message to one per notify interrupt; that's the same under
light load but should be a benefit under heavy load. Light testing
says that this change is a wash performance-wise for normal loads.

I looked around for other callers of TransactionIdIsInProgress()
that might be at similar risk, and didn't find any; all of them
are inside transactions that presumably have already taken a
snapshot.

Problem report and diagnosis by Marko Tiikkaja, patch by me.
Back-patch to all supported branches, since it's been like this
since 9.0.

Discussion: https://postgr.es/m/20170926182935.14128.65278@wrigleys.postgresql.org

118e99c3

Add port/strnlen support to libpq and ecpg Makefiles. · 46912d9b

Tom Lane authored Oct 11, 2017

In the wake of fffd651e, any makefile that pulls in snprintf.c
from src/port/ needs to be prepared to pull in strnlen.c as well.
Per buildfarm.

46912d9b

Fix whitespace · e9e0f78b
Peter Eisentraut authored Oct 11, 2017

e9e0f78b

10 Oct, 2017 4 commits

Regenerate configure script. · f4128ab4

Tom Lane authored Oct 10, 2017

Not sure how fffd651e ended up
probing for "strnlenfrak" rather than "strnlen".
My autoconf doesn't do that ...

f4128ab4

Rewrite strnlen replacement implementation from . · fffd651e

Andres Freund authored Oct 10, 2017

The previous placement of the fallback implementation in libpgcommon
was problematic, because libpqport functions need strnlen
functionality.

Move replacement into libpgport. Provide strnlen() under its posix
name, instead of pg_strnlen(). Fix stupid configure bug, executing the
test only when compiled with threading support.

Author: Andres Freund
Discussion: https://postgr.es/m/E1e1gR2-0005fB-SI@gemulon.postgresql.org

fffd651e

Add missing clean step to src/test/modules/brin/Makefile. · fa5e119d

Tom Lane authored Oct 10, 2017

I noticed the tmp_check subdirectory wasn't getting cleaned up
after a check-world run.  Apparently pgxs.mk will only do this
for you if you've defined REGRESS.  The only other src/test/modules
Makefile that does not set that is snapshot_too_old, and it
does it like this.

fa5e119d

Use lower-case SGML attribute values · 44b3230e
Peter Eisentraut authored Oct 08, 2017
```
for DocBook XML compatibility
```
44b3230e

09 Oct, 2017 3 commits

Fix pnstrdup() to not memcpy() the maximum allowed length. · 82c117cb

Andres Freund authored Oct 09, 2017

The previous behaviour was dangerous if the length passed wasn't the
size of the underlying buffer, but the maximum size of the underlying
buffer.

Author: Andres Freund
Discussion: https://postgr.es/m/20161003215524.mwz5p45pcverrkyk@alap3.anarazel.de

82c117cb

Add pg_strnlen() a portable implementation of strlen. · 8a241792

Andres Freund authored Oct 09, 2017

As the OS version is likely going to be more optimized, fall back to
it if available, as detected by configure.

8a241792

Remove unused documentation file · 71c75ddf
Peter Eisentraut authored Oct 08, 2017

71c75ddf

08 Oct, 2017 3 commits

Reduce memory usage of targetlist SRFs. · 84ad4b03

Andres Freund authored Oct 08, 2017

Previously nodeProjectSet only released memory once per input tuple,
rather than once per returned tuple. If the computation of an
individual returned tuple requires a lot of memory, that can lead to
problems.

Instead change things so that the expression context can be reset once
per output tuple, which requires a new memory context to store SRF
arguments in.

This is a longstanding issue, but was hard to fix before 9.6, due to
the way tSRFs where evaluated. But it's fairly easy to fix now. We
could backpatch this into 10, but given there've been fewc omplaints
that doesn't seem worth the risk so far.

Reported-By: Lucas Fairchild
Author: Andres Freund, per discussion with Tom Lane
Discussion: https://postgr.es/m/4514.1507318623@sss.pgh.pa.us

84ad4b03

Increase distance between flush requests during bulk file copies. · 643c27e3

Tom Lane authored Oct 08, 2017

copy_file() reads and writes data 64KB at a time (with default BLCKSZ),
and historically has issued a pg_flush_data request after each write.
This turns out to interact really badly with macOS's new APFS file
system: a large file copy takes over 100X longer than it ought to on
APFS, as reported by Brent Dearth. While that's arguably a macOS bug,
it's not clear whether Apple will do anything about it in the near
future, and in any case experimentation suggests that issuing flushes
a bit less often can be helpful on other platforms too.

Hence, rearrange the logic in copy_file() so that flush requests are
issued once per N writes rather than every time through the loop.
I set the FLUSH_DISTANCE to 32MB on macOS (any less than that still
results in a noticeable speed degradation on APFS), but 1MB elsewhere.
In limited testing on Linux and FreeBSD, this seems slightly faster
than the previous code, and certainly no worse. It helps noticeably
on macOS even with the older HFS filesystem.

A simpler change would have been to just increase the size of the
copy buffer without changing the loop logic, but that seems likely
to trash the processor cache without really helping much.

Back-patch to 9.6 where we introduced msync() as an implementation
option for pg_flush_data(). The problem seems specific to APFS's
mmap/msync support, so I don't think we need to go further back.

Discussion: https://postgr.es/m/CADkxhTNv-j2jw2g8H57deMeAbfRgYBoLmVuXkC=YCFBXRuCOww@mail.gmail.com

643c27e3

Reduce "X = X" to "X IS NOT NULL", if it's easy to do so. · 8ec5429e

Tom Lane authored Oct 08, 2017

If the operator is a strict btree equality operator, and X isn't volatile,
then the clause must yield true for any non-null value of X, or null if X
is null. At top level of a WHERE clause, we can ignore the distinction
between false and null results, so it's valid to simplify the clause to
"X IS NOT NULL". This is a useful improvement mainly because we'll get
a far better selectivity estimate in most cases.

Because such cases seldom arise in well-written queries, it is unappetizing
to expend a lot of planner cycles looking for them ... but it turns out
that there's a place we can shoehorn this in practically for free, because
equivclass.c already has to detect and reject candidate equivalences of the
form X = X. That doesn't catch every place that it would be valid to
simplify to X IS NOT NULL, but it catches the typical case. Working harder
doesn't seem justified.

Patch by me, reviewed by Petr Jelinek

Discussion: https://postgr.es/m/CAMjNa7cC4X9YR-vAJS-jSYCajhRDvJQnN7m2sLH1wLh-_Z2bsw@mail.gmail.com

8ec5429e

07 Oct, 2017 3 commits

Improve pg_regress's error reporting for schedule-file problems. · b11f0d36

Tom Lane authored Oct 07, 2017

The previous coding here trashed the line buffer as it scanned it,
making it impossible to print the source line in subsequent error
messages.  With a few save/restore/strdup pushups we can improve
that situation.

In passing, move the free'ing of the various strings that are collected
while processing one set of tests down to the bottom of the loop.
That's simpler, less surprising, and should make valgrind less unhappy
about the strings that were previously leaked by the last iteration.

b11f0d36

Enforce our convention about max number of parallel regression tests. · ef73a816

Tom Lane authored Oct 07, 2017

We have a very old rule that parallel_schedule should have no more
than twenty tests in any one parallel group, so as to provide a
bound on the number of concurrently running processes needed to
pass the tests. But people keep forgetting the rule, so let's add
a few lines of code to check it.

Discussion: https://postgr.es/m/a37e9c57-22d4-1b82-1270-4501cd2e984e@2ndquadrant.com

ef73a816

Clean up sloppy maintenance of regression test schedule files. · 1fdab4d5

Tom Lane authored Oct 07, 2017

The partition_join test was added to a parallel group that was already
at the maximum of 20 concurrent tests. The hash_func test wasn't
added to serial_schedule at all. The identity and partition_join tests
were added to serial_schedule with the aid of a dartboard, rather than
maintaining consistency with parallel_schedule.

There are proposals afoot to make these sorts of errors harder to make,
but in the meantime let's fix the ones already in place.

Discussion: https://postgr.es/m/a37e9c57-22d4-1b82-1270-4501cd2e984e@2ndquadrant.com

1fdab4d5

06 Oct, 2017 6 commits

Fix crash when logical decoding is invoked from a PL function. · 1518d078

Tom Lane authored Oct 06, 2017

The logical decoding functions do BeginInternalSubTransaction and
RollbackAndReleaseCurrentSubTransaction to clean up after themselves.
It turns out that AtEOSubXact_SPI has an unrecognized assumption that
we always need to cancel the active SPI operation in the SPI context
that surrounds the subtransaction (if there is one).  That's true
when the RollbackAndReleaseCurrentSubTransaction call is coming from
the SPI-using function itself, but not when it's happening inside
some unrelated function invoked by a SPI query.  In practice the
affected callers are the various PLs.

To fix, record the current subtransaction ID when we begin a SPI
operation, and clean up only if that ID is the subtransaction being
canceled.

Also, remove AtEOSubXact_SPI's assertion that it must have cleaned
up the surrounding SPI context's active tuptable.  That's proven
wrong by the same test case.

Also clarify (or, if you prefer, reinterpret) the calling conventions
for _SPI_begin_call and _SPI_end_call.  The memory context cleanup
in the latter means that these have always had the flavor of a matched
resource-management pair, but they weren't documented that way before.

Per report from Ben Chobot.

Back-patch to 9.4 where logical decoding came in.  In principle,
the SPI changes should go all the way back, since the problem dates
back to commit 7ec1c5a8.  But given the lack of field complaints
it seems few people are using internal subtransactions in this way.
So I don't feel a need to take any risks in 9.2/9.3.

Discussion: https://postgr.es/m/73FBA179-C68C-4540-9473-71E865408B15@silentmedia.com

1518d078

Copy information from the relcache instead of pointing to it. · 45866c75

Robert Haas authored Oct 06, 2017

We have the relations continuously locked, but not open, so relcache
pointers are not guaranteed to be stable.  Per buildfarm member
prion.

Ashutosh Bapat.  I fixed a typo.

Discussion: http://postgr.es/m/CAFjFpRcRBqoKLZSNmRsjKr81uEP=ennvqSQaXVCCBTXvJ2rW+Q@mail.gmail.com

45866c75

Fix intra-query memory leakage in nodeProjectSet.c. · a1c2c430

Tom Lane authored Oct 06, 2017

Both ExecMakeFunctionResultSet() and evaluation of simple expressions
need to be done in the per-tuple memory context, not per-query, else
we leak data until end of query. This is a consideration that was
missed while refactoring code in the ProjectSet patch (note that in
pre-v10, ExecMakeFunctionResult is called in the per-tuple context).

Per bug #14843 from Ben M. Diagnosed independently by Andres and myself.

Discussion: https://postgr.es/m/20171005230321.28561.15927@wrigleys.postgresql.org

a1c2c430

Fix access-off-end-of-array in clog.c. · 6b87416c

Tom Lane authored Oct 06, 2017

Sloppy loop coding in set_status_by_pages() resulted in fetching one array
element more than it should from the subxids[] array. The odds of this
resulting in SIGSEGV are pretty small, but we've certainly seen that happen
with similar mistakes elsewhere. While at it, we can get rid of an extra
TransactionIdToPage() calculation per loop.

Per report from David Binderman. Back-patch to all supported branches,
since this code is quite old.

Discussion: https://postgr.es/m/HE1PR0802MB2331CBA919CBFFF0C465EB429C710@HE1PR0802MB2331.eurprd08.prod.outlook.com

6b87416c

Support coverage on vpath builds · c3d9a660

Peter Eisentraut authored Aug 10, 2017

A few paths needed to be tweaked so everything looks into the
appropriate directories.
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>

c3d9a660

Run coverage commands quietly · 52e1b1b0

Peter Eisentraut authored Aug 10, 2017

They are very chatty by default, but the output doesn't seem all that
useful for normal operation.
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>

52e1b1b0