Commits · 11e178d0dc4bc2328ae4759090b3c48b07023fab · Abuhujair Javed / Postgres FD Implementation

21 Apr, 2016 4 commits

Inline initial comparisons in TestForOldSnapshot() · 11e178d0

Kevin Grittner authored Apr 21, 2016

Even with old_snapshot_threshold = -1 (which disables the "snapshot
too old" feature), performance regressions were seen at moderate to
high concurrency. For example, a one-socket, four-core system
running 200 connections at saturation could see up to a 2.3%
regression, with larger regressions possible on NUMA machines.
By inlining the early (smaller, faster) tests in the
TestForOldSnapshot() function, the i7 case dropped to a 0.2%
regression, which could easily just be noise, and is clearly an
improvement. Further testing will show whether more is needed.
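
The pattern described here, cheap tests inlined at the call site with the expensive work left out of line, can be sketched as follows. This is only an illustration: `SnapshotSketch`, `test_for_old_snapshot_slow`, and the LSN comparison are hypothetical stand-ins, not the actual PostgreSQL definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not PostgreSQL's real definitions. */
typedef struct SnapshotSketch
{
    uint64_t    lsn;            /* LSN at which the snapshot was taken */
} SnapshotSketch;

static int  old_snapshot_threshold = -1;    /* -1 disables the feature */
static int  slow_path_calls = 0;            /* counts out-of-line calls */

/* Out-of-line slow path: reached only when every cheap test passes. */
static void
test_for_old_snapshot_slow(const SnapshotSketch *snap)
{
    (void) snap;
    slow_path_calls++;          /* real code would do the expensive check */
}

/* Inline wrapper: the cheap comparisons run without a function call. */
static inline void
test_for_old_snapshot(const SnapshotSketch *snap, uint64_t page_lsn)
{
    if (old_snapshot_threshold >= 0 &&
        snap != NULL &&
        page_lsn > snap->lsn)
        test_for_old_snapshot_slow(snap);
}
```

With the feature disabled, every call returns after the first integer comparison, which is the case the commit is trying to make cheap.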

11e178d0

postgres_fdw: Don't push down certain full joins. · 5b1f9ce1

Robert Haas authored Apr 20, 2016

If there's a filter condition on either side of a full outer join,
it is neither correct to attach it to the join's ON clause nor to
throw it into the toplevel WHERE clause.  Just don't push down the
join in that case.

To maximize the number of cases where we can still push down full
joins, push inner join conditions into the ON clause at the first
opportunity rather than postponing them to the top-level WHERE
clause.  This produces nicer SQL, anyway.

This bug was introduced in e4106b25.

Ashutosh Bapat, per report from Rajkumar Raghuwanshi.

5b1f9ce1

Honor PGCTLTIMEOUT environment variable for pg_regress' startup wait. · cbabb70f

Tom Lane authored Apr 20, 2016

In commit 2ffa8696 we made pg_ctl recognize an environment variable
PGCTLTIMEOUT to set the default timeout for starting and stopping the
postmaster. However, pg_regress uses pg_ctl only for the "stop" end of
that; it has bespoke code for starting the postmaster, and that code has
historically had a hard-wired 60-second timeout. Further buildfarm
experience says it'd be a good idea if that timeout were also controlled
by PGCTLTIMEOUT, so let's make it so. Like the previous patch, back-patch
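
A minimal sketch of the idea, assuming the variable holds a positive number of seconds and that anything else falls back to the historical 60-second value. The helper names `parse_wait_seconds` and `startup_wait_seconds` are hypothetical and not taken from pg_regress.

```c
#include <stdlib.h>

#define DEFAULT_WAIT_SECONDS 60     /* the historical hard-wired timeout */

/*
 * Interpret the raw environment value. NULL, an empty string, or anything
 * that does not parse to a positive integer yields the default.
 */
static int
parse_wait_seconds(const char *env)
{
    if (env != NULL && *env != '\0')
    {
        int         val = atoi(env);

        if (val > 0)
            return val;
    }
    return DEFAULT_WAIT_SECONDS;
}

/* Look up PGCTLTIMEOUT in the process environment. */
static int
startup_wait_seconds(void)
{
    return parse_wait_seconds(getenv("PGCTLTIMEOUT"));
}
```

Keeping the parsing separate from the `getenv()` call makes the fallback behavior easy to check without touching the environment.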
to all active branches.

Discussion: <13969.1461191936@sss.pgh.pa.us>

cbabb70f

Add pg_dump support for the new PARALLEL option for aggregates. · b4e0f183

Robert Haas authored Apr 20, 2016

This was an oversight in commit 41ea0c23.

Fabrízio de Royes Mello, per a report from Tushar Ahuja

b4e0f183

20 Apr, 2016 4 commits

Forbid parallel Hash Right Join or Hash Full Join. · 9c75e1a3

Robert Haas authored Apr 20, 2016

That won't work.  You'll get bogus null-extended rows.

Mithun Cy

9c75e1a3

Update backup documentation for new APIs · cfb863f2

Magnus Hagander authored Apr 20, 2016

This includes the rest of the documentation that was not included
in 71176854. A larger restructure would still be wanted, but with
this commit the documentation of the new features is complete.

cfb863f2

Fix memory leak and other bugs in ginPlaceToPage() & subroutines. · bde361fe

Tom Lane authored Apr 20, 2016

Commit 36a35c55 turned the interface between ginPlaceToPage and
its subroutines in gindatapage.c and ginentrypage.c into a royal mess:
page-update critical sections were started in one place and finished in
another place not even in the same file, and the very same subroutine
might return having started a critical section or not. Subsequent patches
band-aided over some of the problems with this design by making things
even messier.

One user-visible resulting problem is memory leaks caused by the need for
the subroutines to allocate storage that would survive until ginPlaceToPage
calls XLogInsert (as reported by Julien Rouhaud). This would not typically
be noticeable during retail index updates. It could be visible in a GIN
index build, in the form of memory consumption swelling to several times
the commanded maintenance_work_mem.

Another rather nasty problem is that in the internal-page-splitting code
path, we would clear the child page's GIN_INCOMPLETE_SPLIT flag well before
entering the critical section that it's supposed to be cleared in; a
failure in between would leave the index in a corrupt state. There were
also assorted coding-rule violations with little immediate consequence but
possible long-term hazards, such as beginning an XLogInsert sequence before
entering a critical section, or calling elog(DEBUG) inside a critical
section.

To fix, redefine the API between ginPlaceToPage() and its subroutines
by splitting the subroutines into two parts. The "beginPlaceToPage"
subroutine does what can be done outside a critical section, including
full computation of the result pages into temporary storage when we're
going to split the target page. The "execPlaceToPage" subroutine is called
within a critical section established by ginPlaceToPage(), and it handles
the actual page update in the non-split code path. The critical section,
as well as the XLOG insertion call sequence, are both now always started
and finished in ginPlaceToPage(). Also, make ginPlaceToPage() create and
work in a short-lived memory context to eliminate the leakage problem.
(Since a short-lived memory context had been getting created in the most
common code path in the subroutines, this shouldn't cause any noticeable
performance penalty; we're just moving the overhead up one call level.)
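
The two-phase shape described above can be sketched with a toy page type. Everything here is hypothetical (`PageSketch`, `begin_place`, `exec_place`, the counter standing in for a critical section, and `free()` standing in for resetting a short-lived memory context); it shows only the division of labor, not the real GIN code.

```c
#include <stdlib.h>
#include <string.h>

#define PAGE_CAPACITY 4

typedef struct PageSketch
{
    int         nitems;
    int         items[PAGE_CAPACITY];
} PageSketch;

typedef enum { PLACE_INSERT, PLACE_SPLIT } PlaceAction;

static int  in_critical_section = 0;    /* stand-in for a critical section */

/* Phase 1: runs outside the critical section, so it may allocate or fail. */
static PlaceAction
begin_place(const PageSketch *page, int item,
            PageSketch **left, PageSketch **right)
{
    int         all[PAGE_CAPACITY + 1];
    int         half = (PAGE_CAPACITY + 1) / 2;

    if (page->nitems < PAGE_CAPACITY)
        return PLACE_INSERT;

    /* Split: compute both result pages fully into temporary storage. */
    memcpy(all, page->items, sizeof(page->items));
    all[PAGE_CAPACITY] = item;
    *left = calloc(1, sizeof(PageSketch));
    *right = calloc(1, sizeof(PageSketch));
    if (*left == NULL || *right == NULL)
        abort();
    (*left)->nitems = half;
    memcpy((*left)->items, all, half * sizeof(int));
    (*right)->nitems = PAGE_CAPACITY + 1 - half;
    memcpy((*right)->items, all + half, (*right)->nitems * sizeof(int));
    return PLACE_SPLIT;
}

/* Phase 2: runs inside the critical section; no allocation, no failure. */
static void
exec_place(PageSketch *page, int item)
{
    page->items[page->nitems++] = item;
}

/* Driver: the critical section always starts and ends here. */
static void
place_to_page(PageSketch *page, PageSketch *rightpage, int item)
{
    PageSketch *left = NULL;
    PageSketch *right = NULL;
    PlaceAction action = begin_place(page, item, &left, &right);

    in_critical_section++;
    if (action == PLACE_INSERT)
        exec_place(page, item);
    else
    {
        *page = *left;          /* install the precomputed pages */
        *rightpage = *right;
    }
    in_critical_section--;

    /* Stand-in for resetting a short-lived memory context. */
    free(left);
    free(right);
}
```

Because all temporary storage is released in the driver, neither phase has to keep allocations alive for its caller, which is the leak the commit message describes.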

In passing, fix a bunch of comments that had gone unmaintained throughout
all this klugery.

Report: <571276DD.5050303@dalibo.com>

bde361fe

Revert no-op changes to BufferGetPage() · a343e223

Kevin Grittner authored Apr 20, 2016

The reverted changes were intended to force a choice of whether any
newly-added BufferGetPage() calls needed to be accompanied by a
test of the snapshot age, to support the "snapshot too old"
feature.  Such an accompanying test is needed in about 7% of the
cases, where the page is being used as part of a scan rather than
positioning for other purposes (such as DML or vacuuming).  The
additional effort required for back-patching, and the doubt whether
the intended benefit would really be there, have indicated it is
best just to rely on developers to do the right thing based on
comments and existing usage, as we do with many other conventions.

This change should have little or no effect on generated executable
code.

Motivated by the back-patching pain of Tom Lane and Robert Haas

a343e223

19 Apr, 2016 1 commit

Improve regression tests for degree-based trigonometric functions. · 4db0d2d2

Tom Lane authored Apr 19, 2016

Print the actual value of each function result that's expected to be exact,
rather than merely emitting a NULL if it's not right. Although we print
these with extra_float_digits = 3, we should not trust that the platform
will produce a result visibly different from the expected value if it's off
only in the last place; hence, also include comparisons against the exact
values as before. This is a bit bulkier and uglier than the previous
printout, but it will provide more information and be easier to interpret
if there's a test failure.

Discussion: <18241.1461073100@sss.pgh.pa.us>

4db0d2d2

18 Apr, 2016 4 commits

Make partition-lock-release coding more transparent in BufferAlloc(). · a0382e2d

Tom Lane authored Apr 18, 2016

Coverity complained that oldPartitionLock was possibly dereferenced after
having been set to NULL. That actually can't happen, because we'd only use
it if (oldFlags & BM_TAG_VALID) is true. But nonetheless Coverity is
justified in complaining, because at line 1275 we actually overwrite
oldFlags, and then still expect its BM_TAG_VALID bit to be a safe guide to
whether to release the oldPartitionLock. Thus, the code would be incorrect
if someone else had changed the buffer's BM_TAG_VALID flag meanwhile.
That should not happen, since we hold pin on the buffer throughout this
sequence, but it's starting to look like a rather shaky chain of logic.
And there's no need for such assumptions, because we can simply replace
the (oldFlags & BM_TAG_VALID) tests with (oldPartitionLock != NULL),
which has identical results and makes it plain to all comers that we don't
dereference a null pointer. A small side benefit is that the range of
liveness of oldFlags is greatly reduced, possibly allowing the compiler
to save a register.
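
A reduced sketch of the change in reasoning, using hypothetical names (`LockSketch`, `TAG_VALID`, `replace_buffer`) rather than the real bufmgr.c code: the release decision depends on the pointer that would be dereferenced, not on a flag word that has been overwritten in between.

```c
#include <stddef.h>

#define TAG_VALID 0x01          /* hypothetical flag bit */

typedef struct LockSketch
{
    int         held;
} LockSketch;

static int  release_count = 0;

static void
lock_release(LockSketch *lock)
{
    lock->held = 0;
    release_count++;
}

static void
replace_buffer(unsigned flags, LockSketch *old_partition_lock)
{
    LockSketch *held_lock = NULL;

    if (flags & TAG_VALID)
    {
        held_lock = old_partition_lock;
        held_lock->held = 1;    /* acquire */
    }

    /* The flags word is reused for other purposes from here on. */
    flags = 0;
    (void) flags;

    /* Test the pointer itself, not the possibly stale flag bit. */
    if (held_lock != NULL)
        lock_release(held_lock);
}
```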

This is just cleanup, not an actual bug fix, so there seems no need
for a back-patch.

a0382e2d

Further reduce the number of semaphores used under --disable-spinlocks. · 75c24d0f

Tom Lane authored Apr 18, 2016

Per discussion, there doesn't seem to be much value in having
NUM_SPINLOCK_SEMAPHORES set to 1024: under any scenario where you are
running more than a few backends concurrently, you really had better have a
real spinlock implementation if you want tolerable performance. And 1024
semaphores is a sizable fraction of the system-wide SysV semaphore limit
on many platforms. Therefore, reduce this setting's default value to 128
to make it less likely to cause out-of-semaphores problems.

75c24d0f

Fix typo in docs. · 8ce8307b

Fujii Masao authored Apr 18, 2016

Artur Zakirov

8ce8307b

doc: Document that sequences can also be extension configuration tables · d460c7cc

Peter Eisentraut authored Apr 17, 2016

From: Michael Paquier <michael.paquier@gmail.com>

d460c7cc

17 Apr, 2016 1 commit

Avoid code duplication in \crosstabview. · 9603a325

Tom Lane authored Apr 17, 2016

In commit 6f0d6a50 I added a duplicate copy of psqlscanslash's identifier
downcasing code, but actually it's not hard to split that out as a callable
subroutine and avoid the duplication.

9603a325

16 Apr, 2016 7 commits

Adjust spin.c's spinlock emulation so that 0 is not a valid spinlock value. · 4039c736

Tom Lane authored Apr 16, 2016

We've had repeated troubles over the years with failures to initialize
spinlocks correctly; see 6b93fcd1 for a recent example. Most of the time,
on most platforms, such oversights can escape notice because all-zeroes is
the expected initial content of an slock_t variable. The only platform
we have where the initialized state of an slock_t isn't zeroes is HPPA,
and that's practically gone in the wild. To make it easier to catch such
errors without needing one of those, adjust the --disable-spinlocks code
so that zero is not a valid value for an slock_t for it.
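
One way to get this effect, sketched under the assumption that an emulated lock is just an index into a small pool: store the index plus one, so an all-zeroes variable is recognizably uninitialized. The names below are hypothetical, and the sketch is single-threaded for clarity.

```c
#include <stdbool.h>

#define NUM_EMULATED_LOCKS 8    /* hypothetical pool size */

typedef int emu_slock_t;        /* 0 means "never initialized" */

static bool pool_busy[NUM_EMULATED_LOCKS];
static int  init_counter = 0;

/* Assign a pool slot, stored as 1..NUM_EMULATED_LOCKS (never 0). */
static void
emu_lock_init(emu_slock_t *lock)
{
    *lock = (init_counter++ % NUM_EMULATED_LOCKS) + 1;
}

/* Returns 1 if acquired, 0 if busy, -1 if the lock was never initialized. */
static int
emu_try_lock(emu_slock_t *lock)
{
    int         idx = *lock;

    if (idx <= 0 || idx > NUM_EMULATED_LOCKS)
        return -1;              /* real code would raise an error here */
    if (pool_busy[idx - 1])
        return 0;
    pool_busy[idx - 1] = true;
    return 1;
}

static void
emu_unlock(emu_slock_t *lock)
{
    pool_busy[*lock - 1] = false;
}
```

A lock living in zero-filled memory that was never passed to `emu_lock_init()` is now rejected on first use instead of silently behaving like a valid lock.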

In passing, remove a bunch of unnecessary #include's from spin.c;
commit daa7527a removed all the intermodule coupling that
made them necessary.

4039c736

doc: Change some "user" to "role" for consistency in the section · 5fdda1ce

Peter Eisentraut authored Apr 16, 2016

suggested by Johannes Choo

5fdda1ce

doc: Markup improvement · efb25e56

Peter Eisentraut authored Apr 16, 2016

efb25e56

Disallow creation of indexes on system columns (except for OID). · c34df8a0

Tom Lane authored Apr 16, 2016

Although OID acts pretty much like user data, the other system columns do
not, so an index on one would likely misbehave.  And it's pretty hard to
see a use-case for one, anyway.  Let's just forbid the case rather than
worry about whether it should be supported.

David Rowley

c34df8a0

In recordExtensionInitPriv(), keep the scan til we're done with it · 99f2f3c1

Stephen Frost authored Apr 15, 2016

For reasons of sheer brain fade, we (I) was calling systable_endscan()
immediately after systable_getnext() and expecting the tuple returned
by systable_getnext() to still be valid.

That's clearly wrong.  Move the systable_endscan() down below the tuple
usage.
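
The lifetime rule at issue can be illustrated with a toy scan whose returned row is owned by the scan object. The `ScanSketch` API below is hypothetical and only mimics the ownership relationship; it is not the systable_* interface.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical scan object that owns the storage of the row it returns. */
typedef struct ScanSketch
{
    char       *current;
} ScanSketch;

static ScanSketch *
scan_begin(void)
{
    ScanSketch *scan = malloc(sizeof(ScanSketch));

    if (scan == NULL)
        abort();
    scan->current = NULL;
    return scan;
}

/* The returned pointer is valid only until scan_end() is called. */
static const char *
scan_getnext(ScanSketch *scan, const char *value)
{
    free(scan->current);
    scan->current = malloc(strlen(value) + 1);
    if (scan->current == NULL)
        abort();
    strcpy(scan->current, value);
    return scan->current;
}

static void
scan_end(ScanSketch *scan)
{
    free(scan->current);
    free(scan);
}

/* Correct ordering: finish using the row, then end the scan. */
static size_t
row_length(const char *value)
{
    ScanSketch *scan = scan_begin();
    const char *row = scan_getnext(scan, value);
    size_t      len = strlen(row);  /* row is still valid here */

    scan_end(scan);                 /* only now release the scan */
    return len;
}
```

Calling `scan_end()` before `strlen(row)` would read freed memory, which is the shape of the bug described above.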

Discovered initially by Pavel Stehule and then also by Alvaro.

Add a regression test based on Alvaro's testing.

99f2f3c1

doc: Add missing parentheses · d2de44c2

Peter Eisentraut authored Apr 15, 2016

From: Alexander Law <exclusion@gmail.com>

d2de44c2

psql: Add new gettext trigger · c3136876

Peter Eisentraut authored Apr 15, 2016

c3136876

15 Apr, 2016 13 commits

Use less-generic names in matview.sql. · 4447f0bc

Tom Lane authored Apr 15, 2016

The original coding of this test used table and view names like "t",
"tv", "foo", etc. This tended to interfere with doing simple manual
tests in the regression database; not to mention that it posed a
considerable risk of conflict with other regression test scripts.
Prefix these names with "mvtest_" to avoid such conflicts.

Also, change transiently-created role name to be "regress_xxx" per
discussions about being careful with regression-test role creation.

4447f0bc

Fix possible crash in ALTER TABLE ... REPLICA IDENTITY USING INDEX. · 8f1911d5

Tom Lane authored Apr 15, 2016

Careless coding added by commit 07cacba9 could result in a crash
or a bizarre error message if someone tried to select an index on the
OID column as the replica identity index for a table.  Back-patch to 9.4
where the feature was introduced.

Discussion: CAKJS1f8TQYgTRDyF1_u9PVCKWRWz+DkieH=U7954HeHVPJKaKg@mail.gmail.com

David Rowley

8f1911d5

postgres_fdw: Clean up handling of system columns. · da7d44b6

Robert Haas authored Apr 15, 2016

Previously, querying the xmin column of a single postgres_fdw foreign
table fetched the tuple length, xmax the typmod, and cmin or cmax the
composite type OID of the tuple. However, when you queried several
such tables and the join got shipped to the remote side, these columns
ended up containing the remote values of the corresponding columns.
Both behaviors are rather unprincipled, the former for obvious reasons
and the latter because the remote values of these columns don't have
any local significance; our transaction IDs are in a different space
than those of the remote machine. Clean this up by setting all of
these fields to 0 in both cases. Also fix the handling of tableoid
to be sane.

Robert Haas and Ashutosh Bapat, reviewed by Etsuro Fujita.

da7d44b6

Tweak EXPLAIN for parallel query to show workers launched. · 5702277c

Robert Haas authored Apr 15, 2016

The previous display was sort of confusing, because it didn't
distinguish between the number of workers that we planned to launch
and the number that actually got launched.  This has already confused
several people, so display both numbers and label them clearly.

Julien Rouhaud, reviewed by me.

5702277c

Fix portability problem induced by commit . · 6b85d4ba

Tom Lane authored Apr 15, 2016

pg_xlogdump includes bufmgr.h. With a compiler that emits code for
static inline functions even when they're unreferenced, that leads
to unresolved external references in the new static-inline version
of BufferGetPage(). So hide it with #ifndef FRONTEND, as we've done
for similar issues elsewhere. Per buildfarm member pademelon.

6b85d4ba

Fix typo in comment · ba8fe38f

Magnus Hagander authored Apr 15, 2016

ba8fe38f

Update helptext for vcregress.pl · cf086b1c

Magnus Hagander authored Apr 15, 2016

This has clearly not been tracking the code changes for quite some time.

Michael Paquier, problem spotted by Kyotaro HORIGUCHI

cf086b1c

Make regression test for multiple synchronous standbys more stable. · 36c1c916

Fujii Masao authored Apr 15, 2016

The regression test checks whether the output of pg_stat_replication is
as expected after changing synchronous_standby_names and reloading
the configuration file. Previously this logic had a timing issue that
made the test result unstable: pg_stat_replication could return an
unexpected result during the small window after the configuration file
was reloaded but before the new setting took effect, which made the
test fail.

This commit changes the test logic so that it uses a loop with a timeout
to give some room for the test to pass. Now the test fails only when
pg_stat_replication keeps returning an unexpected result for 30 seconds.
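
The retry-with-timeout idea can be sketched generically. The actual test is not written in C; the helper below, its name `poll_until`, and the attempt/delay parameters are assumptions made purely for illustration (for example, 30 attempts one second apart would give a 30-second budget).

```c
#include <stdbool.h>
#include <unistd.h>

typedef bool (*check_fn) (void *arg);

/*
 * Call check() until it succeeds or max_attempts is exhausted, sleeping
 * delay_seconds between attempts. Returns true on success.
 */
static bool
poll_until(check_fn check, void *arg, int max_attempts,
           unsigned delay_seconds)
{
    int         i;

    for (i = 0; i < max_attempts; i++)
    {
        if (check(arg))
            return true;
        if (delay_seconds > 0)
            sleep(delay_seconds);
    }
    return false;
}

/* Example check: reports success on its third call. */
static bool
ready_on_third_call(void *arg)
{
    int        *calls = arg;

    return ++(*calls) >= 3;
}
```

A transient mismatch right after the reload no longer fails the test; only a result that stays wrong for the whole budget does.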

Michael Paquier

36c1c916

Fix memory leak in GIN index scans. · f0e766bd

Tom Lane authored Apr 15, 2016

The code had a query-lifespan memory leak when encountering GIN entries
that have posting lists (rather than posting trees, ie, there are a
relatively small number of heap tuples containing this index key value).
With a suitable data distribution this could add up to a lot of leakage.
Problem seems to have been introduced by commit 36a35c55, so back-patch
to 9.4.

Julien Rouhaud

f0e766bd

Rethink \crosstabview's argument parsing logic. · 6f0d6a50

Tom Lane authored Apr 14, 2016

\crosstabview interpreted its arguments in an unusual way, including
doing case-insensitive matching of unquoted column names, which is
surely not the right thing.  Rip that out in favor of doing something
equivalent to the dequoting/case-folding rules used by other psql
commands.  To keep it simple, change the syntax so that the optional
sort column is specified as a separate argument, instead of the
also-quite-unusual syntax that attached it to the colH argument with
a colon.

Also, rework the error messages to be closer to project style.

6f0d6a50

Make init_spin_delay() C89 compliant #2. · 4b74c6a4

Andres Freund authored Apr 14, 2016

My previous attempt at doing so, in 80abbeba, was not sufficient. While that
fixed the problem for bufmgr.c and lwlock.c, s_lock.c still has non-constant
expressions in the struct initializer, because the file/line/function
information comes from the caller of s_lock().

Give up on using a macro, and use a static inline instead.
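
The underlying C89 restriction is that a brace-enclosed initializer for a local struct may contain only constant expressions, so something like `{0, file, line, func}` built from function parameters is not allowed. A sketch of the function-based alternative, using a hypothetical struct layout and names:

```c
/* Hypothetical status struct; the field set is an assumption. */
typedef struct DelayStatusSketch
{
    int         spins;
    int         delays;
    const char *file;
    int         line;
    const char *func;
} DelayStatusSketch;

/*
 * Plain assignments are valid with non-constant values, unlike a C89
 * aggregate initializer.
 */
static inline void
init_delay_status(DelayStatusSketch *status,
                  const char *file, int line, const char *func)
{
    status->spins = 0;
    status->delays = 0;
    status->file = file;
    status->line = line;
    status->func = func;
}
```

A caller declares the struct without an initializer and then calls `init_delay_status(&status, file, line, func)`, passing along whatever location information it received.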

Discussion: 4369.1460435533@sss.pgh.pa.us

4b74c6a4

Remove trailing commas in enums. · 533cd230

Andres Freund authored Apr 14, 2016

These aren't valid C89. Found thanks to gcc's -Wc90-c99-compat. These
exist in differing places in most supported branches.

533cd230

Fix trivial typo. · 7b167812

Andres Freund authored Apr 13, 2016

7b167812

14 Apr, 2016 6 commits

Fix core dump in ReorderBufferRestoreChange on alignment-picky platforms. · 6a3d3965

Tom Lane authored Apr 14, 2016

When re-reading an update involving both an old tuple and a new tuple from
disk, reorderbuffer.c was careless about whether the new tuple is suitably
aligned for direct access --- in general, it isn't. We'd missed seeing
this in the buildfarm because the contrib/test_decoding tests exercise this
code path only a few times, and by chance all of those cases have old
tuples with length a multiple of 4, which is usually enough to make the
access to the new tuple's t_len safe. For some still-not-entirely-clear
reason, however, Debian's sparc build gets a bus error, as reported by
Christoph Berg; perhaps it's assuming 8-byte alignment of the pointer?

The lack of previous field reports is probably because you need all of
these conditions to trigger a crash: an alignment-picky platform (not
Intel), a transaction large enough to spill to disk, an update within
that xact that changes a primary-key field and has an odd-length old tuple,
and of course logical decoding tracing the transaction.

Avoid the alignment assumption by using memcpy instead of fetching t_len
directly, and add a test case that exposes the crash on picky platforms.
Back-patch to 9.4 where the bug was introduced.
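
The general technique, copying a possibly misaligned field into a properly aligned local instead of dereferencing a cast pointer, looks like this. The function name and the use of a bare 32-bit length are illustrative assumptions, not the reorderbuffer.c code.

```c
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit length stored at an arbitrary byte offset. memcpy makes no
 * alignment assumption about the source, so this is safe on platforms that
 * fault on misaligned loads.
 */
static uint32_t
read_len_unaligned(const char *data)
{
    uint32_t    len;

    memcpy(&len, data, sizeof(len));
    return len;
}
```

By contrast, `*(const uint32_t *) data` is undefined behavior when `data` is not suitably aligned, and on strict-alignment hardware it can raise a bus error like the one reported.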

Discussion: <20160413094117.GC21485@msg.credativ.de>

6a3d3965

Adjust signature of walrcv_receive hook. · c2dc194b

Tom Lane authored Apr 14, 2016

Commit 314cbfc5 redefined the signature of this hook as
typedef int (*walrcv_receive_type) (char **buffer, int *wait_fd);

But in fact the type of the "wait_fd" variable ought to be pgsocket,
which is what WaitLatchOrSocket expects, and which is necessary if
we want to be able to assign PGINVALID_SOCKET to it on Windows.
So fix that.

c2dc194b

Adjust datatype of ReplicationState.acquired_by. · 994f1125

Tom Lane authored Apr 14, 2016

It was declared as "pid_t", which would be fine except that none of
the places that printed it in error messages took any thought for the
possibility that it's not equivalent to "int". This leads to warnings
on some buildfarm members, and could possibly lead to actually wrong
error messages on those platforms. There doesn't seem to be any very
good reason not to just make it "int"; it's only ever assigned from
MyProcPid, which is int. If we want to cope with PIDs that are wider
than int, this is not the place to start.

Also, fix the comment, which seems to perhaps be a leftover from a time
when the field was only a bool?

Per buildfarm. Back-patch to 9.5 which has same issue.

994f1125

Docs: clarify description of LIMIT/OFFSET behavior. · fda21aa0

Tom Lane authored Apr 14, 2016

Section 7.6 was a tad confusing because it specified what LIMIT NULL
does, but neglected to do the same for OFFSET NULL, making this look
like perhaps a special case or a wrong restatement of the bit about
LIMIT ALL. Wordsmith a bit while at it. Per bug #14084.

fda21aa0

Fix prototype of pgwin32_bind(). · 22989a8e

Tom Lane authored Apr 14, 2016

I (tgl) had copied-and-pasted this from pgwin32_accept(), failing to
notice that the third parameter should be "int" not "int *".

David Rowley

22989a8e

Fix broken dependency-mongering for index operator classes/families. · 92a30a7e

Tom Lane authored Apr 13, 2016

For a long time, opclasscmds.c explained that "we do not create a
dependency link to the AM [for an opclass or opfamily], because we don't
currently support DROP ACCESS METHOD". Commit 473b9328 invented
DROP ACCESS METHOD, but it batted only 1 for 2 on adding the dependency
links, and 0 for 2 on updating the comments about the topic.

In passing, undo the same commit's entirely inappropriate decision to
blow away an existing index as a side-effect of create_am.sql.

92a30a7e