Commits · 46d73e0d65eef19e25bb0d31f1e5c23ff40a3444 · Abuhujair Javed / Postgres FD Implementation

12 Apr, 2016 11 commits

Remove unused function GetOldestWALSendPointer from walsender code. · 46d73e0d

Fujii Masao authored Apr 13, 2016

That unused function was introduced as a sample because synchronous
replication or replication monitoring tools might need it in the future.
Recently commit 989be081 added the function SyncRepGetOldestSyncRecPtr
which provides almost the same functionality for multiple synchronous
standbys feature. So it's time to remove that unused sample function.
This commit does that.

46d73e0d

Redefine create_upper_paths_hook as being invoked once per upper relation. · f1f01de1

Tom Lane authored Apr 12, 2016

Per discussion, this gives potential users of the hook more flexibility,
because they can build custom Paths that implement only one stage of
upper processing atop core-provided Paths for earlier stages.

f1f01de1

Improve coding of column-name parsing in psql's new crosstabview.c. · 7a5f8b5c

Tom Lane authored Apr 12, 2016

Coverity complained about this code, not without reason because it was
rather messy.  Adjust it to not scribble on the passed string; that adds
one malloc/free cycle per column name, which is going to be insignificant
in context.  We can actually const-ify both the string argument and the
PGresult.

Daniel Verité, with some further cleanup by me

7a5f8b5c

Avoid extra locks in GetSnapshotData if old_snapshot_threshold < 0 · 2201d801

Kevin Grittner authored Apr 12, 2016

On a big NUMA machine with 1000 connections in saturation load
there was a performance regression due to spinlock contention, for
acquiring values which were never used.  Just fill with dummy
values if we're not going to use them.

This patch has not been benchmarked yet on a big NUMA machine, but
it seems like a good idea on general principle, and it seemed to
prevent an apparent 2.2% regression on a single-socket i7 box
running 200 connections at saturation load.

2201d801

Improve API of GenericXLogRegister(). · 5713f039

Tom Lane authored Apr 12, 2016

Rename this function to GenericXLogRegisterBuffer() to make it clearer
what it does, and leave room for other sorts of "register" actions in
future.  Also, replace its "bool isNew" argument with an integer flags
argument, so as to allow adding more flags in future without an API
break.

Alexander Korotkov, adjusted slightly by me

5713f039

In generic WAL application and replay, ensure page "hole" is always zero. · bdf7db81

Tom Lane authored Apr 12, 2016

The previous coding could allow the contents of the "hole" between pd_lower
and pd_upper to diverge during replay from what it had been when the update
was originally applied. This would pose a problem if checksums were in
use, and in any case would complicate forensic comparisons between master
and slave servers. So force the "hole" to contain zeroes, both at initial
application of a generically-logged action, and at replay.

Alexander Korotkov, adjusted slightly by me

bdf7db81

Add page id to bloom index · 813b456e

Teodor Sigaev authored Apr 12, 2016

Added to ensure that bloom index pages can be distinguished from other pages
by pg_filedump. Because there wasn't any public/production versions before,
it doesn't pay attention to any compatibility issues.

Per notice from Tom Lane

813b456e

Remove unnecessary definition of _WIN64 in libpq/win32.mak. · e7bcde8c

Tom Lane authored Apr 12, 2016

In commit b0e40d18, I should have just
removed the /D switch defining WIN64.  The reason the code worked before
is that all Windows64 compilers automatically predefine _WIN64.  Perhaps
at one time we had code that depended on WIN64 being defined, but it's
long gone, and we should not encourage any reappearance.  Per discussion
with Christian Ullrich.

e7bcde8c

Correct copyright for newly added genericdesc.c · cd13471f
Stephen Frost authored Apr 12, 2016
```
It's 2016 these days (no, not entirely sure how we got here either).

Pointed out by Amit Langote
```
cd13471f
Fix whitespace · 70715e6a
Peter Eisentraut authored Apr 11, 2016

70715e6a

Fix _SPI_execute_plan() for CREATE TABLE IF NOT EXISTS foo AS ... · 39c283e4

Tom Lane authored Apr 11, 2016

When IF NOT EXISTS was added to CREATE TABLE AS, this logic didn't get
the memo, possibly resulting in an Assert failure.  It looks like there
would have been no ill effects in a non-Assert build, though.  Back-patch
to 9.5 where the IF NOT EXISTS option was added.

Stas Kelvich

39c283e4

11 Apr, 2016 18 commits

Fix two places that thought Windows64 is indicated by WIN64 macro. · b0e40d18

Tom Lane authored Apr 11, 2016

Everyplace else thinks it's _WIN64, so make these places fall in line.

The pg_regress.c usage is not going to result in any change in behavior,
only suppressing (or not) a compiler warning about downcasting HANDLEs.
So there seems no need for back-patching there.

The libpq/win32.mak usage might represent an actual bug, if anyone were
using this script to build for Windows64, which perhaps nobody is.
Given the lack of field complaints, no back-patch here either.

pg_regress.c problem found by Christian Ullrich, the other by me.

b0e40d18

Fix freshly-introduced PL/Python portability bug. · 1d2f9de3

Tom Lane authored Apr 11, 2016

It turns out that those PyErr_Clear() calls I removed from plpy_elog.c
in 7e3bb080 et al were not quite as random as they appeared: they
mask a Python 2.3.x bug. (Specifically, it turns out that PyType_Ready()
can fail if the error indicator is set on entry, and PLy_traceback's fetch
of frame.f_code may be the first operation in a session that requires the
"frame" type to be readied. Ick.) Put back the clear call, but in a more
centralized place closer to what it's protecting, and this time with a
comment warning what it's really for.

Per buildfarm member prairiedog. Although prairiedog was only failing
on HEAD, it seems clearly possible for this to occur in older branches
as well, so back-patch to 9.2 the same as the previous patch.

1d2f9de3

Use static inline function for BufferGetPage() · a6f6b781

Kevin Grittner authored Apr 11, 2016

I was initially concerned that the some of the hundreds of
references to BufferGetPage() where the literal
BGP_NO_SNAPSHOT_TEST were passed might not optimize as well as a
macro, leading to some hard-to-find performance regressions in
corner cases.  Inspection of disassembled code has shown identical
code at all inspected locations, and the size difference doesn't
amount to even one byte per such call.  So make it readable.

Per gripes from Álvaro Herrera and Tom Lane

a6f6b781

Make oldSnapshotControl a pointer to a volatile structure · 80647bf6

Kevin Grittner authored Apr 11, 2016

It was incorrectly declared as a volatile pointer to a non-volatile
structure.  Eliminate the OldSnapshotControl struct definition; it
is really not needed.  Pointed out by Tom Lane.

While at it, add OldSnapshotControlData to pgindent's list of
structures.

80647bf6

Fix whitespace · d8ed83cd
Peter Eisentraut authored Apr 11, 2016

d8ed83cd

Prefix RLS regression test roles with 'regress_' · 6c7b0388

Stephen Frost authored Apr 11, 2016

To avoid any possible overlap with existing roles on a system when
doing a 'make installcheck', use role names which start with
'regress_'.

Pointed out by Tom.

6c7b0388

Add directory created during build to gitignore · 29ca231b
Peter Eisentraut authored Apr 11, 2016

29ca231b

Fix missing "volatile" in PLy_output(). · 81ba9348

Tom Lane authored Apr 11, 2016

Commit 5c3c3cd0 plastered "volatile" on a bunch of variables
in PLy_output(), but removed the one that actually mattered, ie the
one on "oldcontext". This allows some versions of clang to generate
code in which "oldcontext" has been trashed when control reaches the
PG_CATCH block. Per buildfarm member tick.

81ba9348

cpluspluscheck: Update include path · ee5dbc81

Peter Eisentraut authored Apr 11, 2016

Some things in src/include/fe_utils require libpq headers, so add
libpq's include path to the command line used here.

ee5dbc81

Fix documented return type of pg_logical_emit_message() in func.sgml. · cfe96ae2
Fujii Masao authored Apr 11, 2016

cfe96ae2

Use ereport(ERROR) instead of Assert() to emit syncrep_parser error. · 0038c1e2

Fujii Masao authored Apr 11, 2016

The existing code would either Assert or generate an invalid
SyncRepConfig variable, neither of which is desirable. A regular
error should be thrown instead.

This commit silences compiler warning in non assertion-enabled builds.

Per report from Jeff Janes.
Suggested fix by Tom Lane.

0038c1e2

Fix poorly thought-through code from commit . · f73b2bbb

Tom Lane authored Apr 11, 2016

It's not entirely clear to me whether PyString_AsString can return
null (looks like the answer might vary between Python 2 and 3).
But in any case, this code's attempt to cope with the possibility
was quite broken, because pstrdup() neither allows a null argument
nor ever returns a null.

Moreover, the code below this point assumes that "message" is a
palloc'd string, which would not be the case for a dgettext result.

Fix both problems by doing the pstrdup step separately.

f73b2bbb

pg_dump: add missing "destroyPQExpBuffer(query)" in dumpForeignServer(). · 074050f1

Tom Lane authored Apr 11, 2016

Coverity complained about this resource leak (why now, I don't know,
since it's been like that a long time). Our general policy in pg_dump
is that PQExpBuffers are worth cleaning up, so do it here too. But
don't bother with a back-patch, because it seems unlikely that very
many databases contain enough FOREIGN SERVER objects to notice.

074050f1

Add comment about intentional fallthrough in switch. · 1630f5b9

Tom Lane authored Apr 10, 2016

Coverity complained about an apparent missing "break" in a switch
added by bb140506.  The human-readable comments are pretty
clear that this is intentional, but add a standard /* FALL THRU */
comment to make it clear to tools too.

1630f5b9

Clean up foreign-key caching code in planner. · 5306df28

Tom Lane authored Apr 10, 2016

Coverity complained that the code added by 015e8894 lacked an
error check for SearchSysCache1 failures, which it should have. But
the code was pretty duff in other ways too, including failure to think
about whether it could really cope with arrays of different lengths.

5306df28

Fix access-to-already-freed-memory issue in plpython's error handling. · 7e3bb080

Tom Lane authored Apr 10, 2016

PLy_elog() could attempt to access strings that Python had already freed,
because the strings that PLy_get_spi_error_data() returns are simply
pointers into storage associated with the error "val" PyObject. That's
fine at the instant PLy_get_spi_error_data() returns them, but just after
that PLy_traceback() intentionally releases the only refcount on that
object, allowing it to be freed --- so that the strings we pass to
ereport() are dangling pointers.

In principle this could result in garbage output or a coredump. In
practice, I think the risk is pretty low, because there are no Python
operations between where we decrement that refcount and where we use the
strings (and copy them into PG storage), and thus no reason for Python
to recycle the storage. Still, it's clearly hazardous, and it leads to
Valgrind complaints when running under a Valgrind that hasn't been
lobotomized to ignore Python memory allocations.

The code was a mess anyway: we fetched the error data out of Python
(clearing Python's error indicator) with PyErr_Fetch, examined it, pushed
it back into Python with PyErr_Restore (re-setting the error indicator),
then immediately pulled it back out with another PyErr_Fetch. Just to
confuse matters even more, there were some gratuitous-and-yet-hazardous
PyErr_Clear calls in the "examine" step, and we didn't get around to doing
PyErr_NormalizeException until after the second PyErr_Fetch, making it even
less clear which object was being manipulated where and whether we still
had a refcount on it. (If PyErr_NormalizeException did substitute a
different "val" object, it's possible that the problem could manifest for
real, because then we'd be doing assorted Python stuff with no refcount
on the object we have string pointers into.)

So, rearrange all that into some semblance of sanity, and don't decrement
the refcount on the Python error objects until the end of PLy_elog().
In HEAD, I failed to resist the temptation to reformat some messy bits
from 5c3c3cd0 along the way.

Back-patch as far as 9.2, because the code is substantially the same
that far back. I believe that 9.1 has the bug as well; but the code
around it is rather different and I don't want to take a chance on
breaking something for what seems a low-probability problem.

7e3bb080

Avoid the use of a separate spinlock to protect a LWLock's wait queue. · 008608b9

Andres Freund authored Apr 10, 2016

Previously we used a spinlock, in adition to the atomically manipulated
->state field, to protect the wait queue. But it's pretty simple to
instead perform the locking using a flag in state.

Due to 6150a1b0 BufferDescs, on platforms (like PPC) with > 1 byte
spinlocks, increased their size above 64byte. As 64 bytes are the size
we pad allocated BufferDescs to, this can increase false sharing;
causing performance problems in turn. Together with the previous commit
this reduces the size to <= 64 bytes on all common platforms.

Author: Andres Freund
Discussion: CAA4eK1+ZeB8PMwwktf+3bRS0Pt4Ux6Rs6Aom0uip8c6shJWmyg@mail.gmail.com
    20160327121858.zrmrjegmji2ymnvr@alap3.anarazel.de

008608b9

Allow Pin/UnpinBuffer to operate in a lockfree manner. · 48354581

Andres Freund authored Apr 10, 2016

Pinning/Unpinning a buffer is a very frequent operation; especially in
read-mostly cache resident workloads. Benchmarking shows that in various
scenarios the spinlock protecting a buffer header's state becomes a
significant bottleneck. The problem can be reproduced with pgbench -S on
larger machines, but can be considerably worse for queries which touch
the same buffers over and over at a high frequency (e.g. nested loops
over a small inner table).

To allow atomic operations to be used, cram BufferDesc's flags,
usage_count, buf_hdr_lock, refcount into a single 32bit atomic variable;
that allows to manipulate them together using 32bit compare-and-swap
operations. This requires reducing MAX_BACKENDS to 2^18-1 (which could
be lifted by using a 64bit field, but it's not a realistic configuration
atm).

As not all operations can easily implemented in a lockfree manner,
implement the previous buf_hdr_lock via a flag bit in the atomic
variable. That way we can continue to lock the header in places where
it's needed, but can get away without acquiring it in the more frequent
hot-paths.  There's some additional operations which can be done without
the lock, but aren't in this patch; but the most important places are
covered.

As bufmgr.c now essentially re-implements spinlocks, abstract the delay
logic from s_lock.c into something more generic. It now has already two
users, and more are coming up; there's a follupw patch for lwlock.c at
least.

This patch is based on a proof-of-concept written by me, which Alexander
Korotkov made into a fully working patch; the committed version is again
revised by me.  Benchmarking and testing has, amongst others, been
provided by Dilip Kumar, Alexander Korotkov, Robert Haas.

On a large x86 system improvements for readonly pgbench, with a high
client count, of a factor of 8 have been observed.

Author: Alexander Korotkov and Andres Freund
Discussion: 2400449.GjM57CE0Yg@dinodell

48354581

10 Apr, 2016 3 commits

Improve contrib/bloom regression test using code coverage info. · cf223c3b

Tom Lane authored Apr 10, 2016

Originally, this test created a 100000-row test table, which made it
run rather slowly compared to other contrib tests. Investigation with
gcov showed that we got no further improvement in code coverage after
the first 700 or so rows, making the large table 99% a waste of time.
Cut it back to 2000 rows to fix the runtime problem and still leave
some headroom for testing behaviors that may appear later.

A closer look at the gcov results showed that the main coverage
omissions in contrib/bloom occurred because the test never filled more
than one entry in the notFullPage array; which is unsurprising because
it exercised index cleanup only in the scenario of complete table
deletion, allowing every page in the index to become deleted rather
than not-full. Add testing that allows the not-full path to be
exercised as well.

Also, test the amvalidate function, because blvalidate.c had zero
coverage without that, and besides it's a good idea to check for
mistakes in the bloom opclass definitions.

cf223c3b

Fix possible NULL dereference in ExecAlterObjectDependsStmt · bd905a0d

Alvaro Herrera authored Apr 10, 2016

I used the wrong variable here.  Doesn't make a difference today because
the only plausible caller passes a non-NULL variable, but someday it
will be wrong, and even today's correctness is subtle: the caller that
does pass a NULL is never invoked because of object type constraints.
Surely not a condition to rely on.

Noted by Coverity

bd905a0d

Further minor improvement in generic_xlog.c: always say REGBUF_STANDARD. · 660d5fb8

Tom Lane authored Apr 10, 2016

Since we're requiring pages handled by generic_xlog.c to be standard
format, specify REGBUF_STANDARD when doing a full-page image, so that
xloginsert.c can compress out the "hole" between pd_lower and pd_upper.
Given the current API in which this path will be taken only for a newly
initialized page, the hole is likely to be particularly large in such
cases, so that this oversight could easily be performance-significant.
I don't notice any particular change in the runtime of contrib/bloom's
regression test, though.

660d5fb8

09 Apr, 2016 8 commits

Micro-optimize GenericXLogFinish(). · 68689c66

Tom Lane authored Apr 09, 2016

Make the inner comparison loops of computeDelta() as tight as possible by
pulling considerations of valid and invalid ranges out of the inner loops,
and extending a match or non-match detection as far as possible before
deciding what to do next.  To keep this tractable, give up the possibility
of merging fragments across the pd_lower to pd_upper gap.  The fraction of
pages where that could happen (ie, there are 4 or fewer bytes in the gap,
*and* data changes immediately adjacent to it on both sides) is too small
to be worth spending cycles on.

Also, avoid two BLCKSZ-length memcpy()s by computing the delta before
moving data into the target buffer, instead of after.  This doesn't save
nearly as many cycles as being tenser about computeDelta(), but it still
seems worth doing.

On my machine, this patch cuts a full 40% off the runtime of
contrib/bloom's regression test.

68689c66

Fix PL/Python ereport() test to work on Python 2.3. · c7a141a9
Tom Lane authored Apr 09, 2016
```
Per buildfarm.

Pavel Stehule
```
c7a141a9

Get rid of GenericXLogUnregister(). · 08e78543

Tom Lane authored Apr 09, 2016

This routine is unsafe as implemented, because it invalidates the page
image pointers returned by previous GenericXLogRegister() calls.

Rather than complicate the API or the implementation to avoid that,
let's just get rid of it; the use-case for having it seems much
too thin to justify a lot of work here.

While at it, do some wordsmithing on the SGML docs for generic WAL.

08e78543

Get rid of blinsert()'s use of GenericXLogUnregister(). · 80cf1891

Tom Lane authored Apr 09, 2016

That routine is dangerous, and unnecessary once we get rid of this
one caller.

In passing, fix failure to clean up temp memory context, or switch
back to caller's context, during slowest exit path.

80cf1891

Code review/prettification for generic_xlog.c. · db03cf37

Tom Lane authored Apr 09, 2016

Improve commentary, use more specific names for the delta fields,
const-ify pointer arguments where possible, avoid assuming that
initializing only the first element of a local array will guarantee
that the remaining elements end up as we need them. (I think that
code in generic_redo actually worked, but only because InvalidBuffer
is zero; this is a particularly ugly way of depending on that ...)

db03cf37

Run pgindent on generic_xlog.c. · 2dd318d2

Tom Lane authored Apr 09, 2016

This code desperately needs some micro-optimization, and I'd like it
to be formatted a bit more nicely while I work on it.

2dd318d2

Fix typo in C comment. · 381200be
Kevin Grittner authored Apr 09, 2016

381200be
Turn special page pointer validation to static inline function · 56dffb5a
Kevin Grittner authored Apr 09, 2016
```
Inclusion of multiple macros inside another macro was pushing MSVC
past its size liimit.  Reported by buildfarm.
```
56dffb5a