Commits · 9a51698bae86f748279ecedcae018925b5af5b2d · Abuhujair Javed / Postgres FD Implementation

18 Dec, 2015 6 commits

Fix typo in comment. · 9a51698b
Robert Haas authored Dec 18, 2015
```
Amit Langote
```
9a51698b

Allow to omit boundaries in array subscript · 9246af67

Teodor Sigaev authored Dec 18, 2015

Allow to omiy lower or upper or both boundaries in array subscript
for selecting slice of array.

Author: YUriy Zhuravlev

9246af67

Cube extension kNN support · 33bd250f

Teodor Sigaev authored Dec 18, 2015

Introduce distance operators over cubes:
<#> taxicab distance
<->  euclidean distance
<=> chebyshev distance

Also add kNN support of those distances in GiST opclass.

Author: Stas Kelvich

33bd250f

Remove unreferenced function declarations. · 3d0c50ff

Tom Lane authored Dec 17, 2015

datapagemap_create() and datapagemap_destroy() were declared extern,
but they don't actually exist anywhere.  Per YUriy Zhuravlev and
Michael Paquier.

3d0c50ff

Use just one standalone-backend session for initdb's post-bootstrap steps. · c4a8812c

Tom Lane authored Dec 17, 2015

Previously, each subroutine in initdb fired up its own standalone backend
session. Over time we'd grown as many as fifteen of these sessions,
and the cumulative startup and shutdown work for them was getting pretty
noticeable. Combining things so that all these steps share a single
backend session cuts a good 10% off the total runtime of initdb, more
if you're not fsync'ing.

The main stumbling block to doing this before was that some of the sessions
were run with -j and some not. The improved definition of -j mode
implemented by my previous commit makes it possible to fix that by running
all the post-bootstrap steps with -j; we just have to use double instead of
single newlines to end command strings. (This is only absolutely necessary
around the VACUUM and CREATE DATABASE steps, since those can't be run in a
transaction block. But it seems best to make them all use double newlines
so that the commands remain separate for error-reporting purposes.)

A minor disadvantage is that since initdb can't tell how much of its
output the backend has executed, we can no longer have the per-step
progress reporting initdb used to print. But things are fast enough
nowadays that that's not really all that useful anyway.

In passing, add more const decoration to some of the static arrays in
initdb.c.

c4a8812c

Adjust behavior of single-user -j mode for better initdb error reporting. · 66d947b9

Tom Lane authored Dec 17, 2015

Previously, -j caused the entire input file to be read in and executed as
a single command string. That's undesirable, not least because any error
causes the entire file to be regurgitated as the "failing query". Some
experimentation suggests a better rule: end the command string when we see
a semicolon immediately followed by two newlines, ie, an empty line after
a query. This serves nicely to break up the existing examples such as
information_schema.sql and system_views.sql. A limitation is that it's
no longer possible to write such a sequence within a string literal or
multiline comment in a file meant to be read with -j; but there are no
instances of such a problem within the data currently used by initdb.
(If someone does make such a mistake in future, it'll be obvious because
they'll get an unterminated-literal or unterminated-comment syntax error.)
Other than that, there shouldn't be any negative consequences; you're not
forced to end statements that way, it's just a better idea in most cases.

In passing, remove src/include/tcop/tcopdebug.h, which is dead code
because it's not included anywhere, and hasn't been for more than
ten years. One of the debug-support symbols it purported to describe
has been unreferenced for at least the same amount of time, and the
other is removed by this commit on the grounds that it was useless:
forcing -j mode all the time would have broken initdb. The lack of
complaints about that, or about the missing inclusion, shows that
no one has tried to use TCOP_DONTUSENEWLINE in many years.

66d947b9

17 Dec, 2015 2 commits

Fix improper initialization order for readline. · aee7705b

Tom Lane authored Dec 17, 2015

Turns out we must set rl_basic_word_break_characters *before* we call
rl_initialize() the first time, because it will quietly copy that value
elsewhere --- but only on the first call.  (Love these undocumented
dependencies.)  I broke this yesterday in commit 2ec477dc;
like that commit, back-patch to all active branches.  Per report from
Pavel Stehule.

aee7705b

Rework internals of changing a type's ownership · 756e7b4c

Alvaro Herrera authored Dec 17, 2015

This is necessary so that REASSIGN OWNED does the right thing with
composite types, to wit, that it also alters ownership of the type's
pg_class entry -- previously, the pg_class entry remained owned by the
original user, which caused later other failures such as the new owner's
inability to use ALTER TYPE to rename an attribute of the affected
composite.  Also, if the original owner is later dropped, the pg_class
entry becomes owned by a non-existant user which is bogus.

To fix, create a new routine AlterTypeOwner_oid which knows whether to
pass the request to ATExecChangeOwner or deal with it directly, and use
that in shdepReassignOwner rather than calling AlterTypeOwnerInternal
directly.  AlterTypeOwnerInternal is now simpler in that it only
modifies the pg_type entry and recurses to handle a possible array type;
higher-level tasks are handled by either AlterTypeOwner directly or
AlterTypeOwner_oid.

I took the opportunity to add a few more objects to the test rig for
REASSIGN OWNED, so that more cases are exercised.  Additional ones could
be added for superuser-only-ownable objects (such as FDWs and event
triggers) but I didn't want to push my luck by adding a new superuser to
the tests on a backpatchable bug fix.

Per bug #13666 reported by Chris Pacejo.

Backpatch to 9.5.

(I would back-patch this all the way back, except that it doesn't apply
cleanly in 9.4 and earlier because 59367fdf wasn't backpatched.  If we
decide that we need this in earlier branches too, we should backpatch
both.)

756e7b4c

16 Dec, 2015 3 commits

Cope with Readline's failure to track SIGWINCH events outside of input. · 2ec477dc

Tom Lane authored Dec 16, 2015

It emerges that libreadline doesn't notice terminal window size change
events unless they occur while collecting input. This is easy to stumble
over if you resize the window while using a pager to look at query output,
but it can be demonstrated without any pager involvement. The symptom is
that queries exceeding one line are misdisplayed during subsequent input
cycles, because libreadline has the wrong idea of the screen dimensions.

The safest, simplest way to fix this is to call rl_reset_screen_size()
just before calling readline(). That causes an extra ioctl(TIOCGWINSZ)
for every command; but since it only happens when reading from a tty, the
performance impact should be negligible. A more valid objection is that
this still leaves a tiny window during entry to readline() wherein delivery
of SIGWINCH will be missed; but the practical consequences of that are
probably negligible. In any case, there doesn't seem to be any good way to
avoid the race, since readline exposes no functions that seem safe to call
from a generic signal handler --- rl_reset_screen_size() certainly isn't.

It turns out that we also need an explicit rl_initialize() call, else
rl_reset_screen_size() dumps core when called before the first readline()
call.

rl_reset_screen_size() is not present in old versions of libreadline,
so we need a configure test for that. (rl_initialize() is present at
least back to readline 4.0, so we won't bother with a test for it.)
We would need a configure test anyway since libedit's emulation of
libreadline doesn't currently include such a function. Fortunately,
libedit seems not to have any corresponding bug.

Merlin Moncure, adjusted a bit by me

2ec477dc

Speed up CREATE INDEX CONCURRENTLY's TID sort. · b648b703

Robert Haas authored Dec 16, 2015

Encode TIDs as 64-bit integers to speed up comparisons.  This seems to
speed things up on all platforms, but is even more beneficial when
8-byte integers are passed by value.

Peter Geoghegan.  Design suggestions and review by Tom Lane.  Review
also by Simon Riggs and by me.

b648b703

Mark CHECK constraints declared NOT VALID valid if created with table. · f27a6b15

Robert Haas authored Dec 16, 2015

FOREIGN KEY constraints have behaved this way for a long time, but for
some reason the behavior of CHECK constraints has been inconsistent up
until now.

Amit Langote and Amul Sul, with assorted tweaks by me.

f27a6b15

15 Dec, 2015 8 commits

Document use of Subject Alternative Names in SSL server certificates. · 0625dbb0
Tom Lane authored Dec 15, 2015
```
Commit acd08d76 did not bother with updating the documentation.
```
0625dbb0
Update 9.5 release notes through today. · bfc7f5dd
Tom Lane authored Dec 15, 2015
```
Also do another round of copy-editing, and fix up remaining FIXME items.
```
bfc7f5dd

Teach mdnblocks() not to create zero-length files. · 049469e7

Robert Haas authored Dec 15, 2015

It's entirely surprising that mdnblocks() has the side effect of
creating new files on disk, so let's make it not do that.  One
consequence of the old behavior is that, if running on a damaged
cluster that is missing a file, mdnblocks() can recreate the file
and allow a subsequent _mdfd_getseg() for a higher segment to succeed.
This happens because, while mdnblocks() stops when it finds a segment
that is shorter than 1GB, _mdfd_getseg() has no such check, and thus
the empty file created by mdnblocks() can allow it to continue its
traversal and find higher-numbered segments which remain.

It might be a good idea for _mdfd_getseg() to actually verify that
each segment it finds is exactly 1GB before proceeding to the next
one, but that would involve some additional system calls, so for
now I'm just doing this much.

Patch by me, per off-list analysis by Kevin Grittner and Rahila Syed.
Review by Andres Freund.

049469e7

Move buffer I/O and content LWLocks out of the main tranche. · 6150a1b0

Robert Haas authored Dec 15, 2015

Move the content lock directly into the BufferDesc, so that locking and
pinning a buffer touches only one cache line rather than two.  Adjust
the definition of BufferDesc slightly so that this doesn't make the
BufferDesc any larger than one cache line (at least on platforms where
a spinlock is only 1 or 2 bytes).

We can't fit the I/O locks into the BufferDesc and stay within one
cache line, so move those to a completely separate tranche.  This
leaves a relatively limited number of LWLocks in the main tranche, so
increase the padding of those remaining locks to a full cache line,
rather than allowing adjacent locks to share a cache line, hopefully
reducing false sharing.

Performance testing shows that these changes make little difference
on laptop-class machines, but help significantly on larger servers,
especially those with more than 2 sockets.

Andres Freund, originally based on an earlier patch by Simon Riggs.
Review and cosmetic adjustments (including heavy rewriting of the
comments) by me.

6150a1b0

Provide a way to predefine LWLock tranche IDs. · 3fed4174

Robert Haas authored Dec 15, 2015

It's a bit cumbersome to use LWLockNewTrancheId(), because the returned
value needs to be shared between backends so that each backend can call
LWLockRegisterTranche() with the correct ID.  So, for built-in tranches,
use a hard-coded value instead.

This is motivated by an upcoming patch adding further built-in tranches.

Andres Freund and Robert Haas

3fed4174

Improve CREATE POLICY documentation · 43cd468c

Stephen Frost authored Dec 15, 2015

Clarify that SELECT policies are now applied when SELECT rights
are required for a given query, even if the query is an UPDATE or
DELETE query.  Pointed out by Noah.

Additionally, note the risk regarding concurrently open transactions
where a relation which controls access to the rows of another relation
are updated and the rows of the primary relation are also being
modified.  Pointed out by Peter Geoghegan.

Back-patch to 9.5.

43cd468c

Collect the global OR of hasRowSecurity flags for plancache · e5e11c8c

Stephen Frost authored Dec 14, 2015

We carry around information about if a given query has row security or
not to allow the plancache to use that information to invalidate a
planned query in the event that the environment changes.

Previously, the flag of one of the subqueries was simply being copied
into place to indicate if the query overall included RLS components.
That's wrong as we need the global OR of all subqueries.  Fix by
changing the code to match how fireRIRules works, which is results
in OR'ing all of the flags.

Noted by Tom.

Back-patch to 9.5 where RLS was introduced.

e5e11c8c

Add missing cleanup logic in pg_rewind/t/005_same_timeline.pl test. · db81329e
Tom Lane authored Dec 14, 2015
```
Per Michael Paquier
```
db81329e

14 Dec, 2015 6 commits

Add missing CHECK_FOR_INTERRUPTS in lseg_inside_poly · 0d8f3d5d

Alvaro Herrera authored Dec 14, 2015

Apparently, there are bugs in this code that cause it to loop endlessly.
That bug still needs more research, but in the meantime it's clear that
the loop is missing a check for interrupts so that it can be cancelled
timely.

Backpatch to 9.1 -- this has been missing since 49475aab.

0d8f3d5d

Remove xmlparse(document '') test · e2f1765c

Kevin Grittner authored Dec 14, 2015

This one test was behaving differently between the ubuntu fix for
CVE-2015-7499 and the base "expected" file.  It's not worth having
yet another version of the expected file for this test, so drop it.
Perhaps at some point when all distros have settled down to the
same behavior on this test, it can be restored.

Problem found by me on libxml2 (2.9.1+dfsg1-3ubuntu4.6).
Solution suggested by Tom Lane.
Backpatch to 9.5, where the test was added.

e2f1765c

Fix out-of-memory error handling in ParameterDescription message processing. · 7b96bf44

Heikki Linnakangas authored Dec 14, 2015

If libpq ran out of memory while constructing the result set, it would hang,
waiting for more data from the server, which might never arrive. To fix,
distinguish between out-of-memory error and not-enough-data cases, and give
a proper error message back to the client on OOM.

There are still similar issues in handling COPY start messages, but let's
handle that as a separate patch.

Michael Paquier, Amit Kapila and me. Backpatch to all supported versions.

7b96bf44

Fix bug in SetOffsetVacuumLimit() triggered by find_multixact_start() failure. · cca705a5

Andres Freund authored Dec 14, 2015

Previously, if find_multixact_start() failed, SetOffsetVacuumLimit() would
install 0 into MultiXactState->offsetStopLimit if it previously succeeded.
Luckily, there are no known cases where find_multixact_start() will return
an error in 9.5 and above. But if it were to happen, for example due to
filesystem permission issues, it'd be somewhat bad: GetNewMultiXactId()
could continue allocating mxids even if close to a wraparound, or it could
erroneously stop allocating mxids, even if no wraparound is looming. The
wrong value would be corrected the next time SetOffsetVacuumLimit() is
called, or by a restart.

Reported-By: Noah Misch, although this is not his preferred fix
Discussion: 20151210140450.GA22278@alap3.anarazel.de
Backpatch: 9.5, where the bug was introduced as part of 4f627f

cca705a5

Correct statement to actually be the intended assert statement. · 2a354496

Andres Freund authored Dec 14, 2015

e3f4cfc7 introduced a LWLockHeldByMe() call, without the corresponding
Assert() surrounding it.

Spotted by Coverity.

Backpatch: 9.1+, like the previous commit

2a354496

Docs: document that psql's "\i -" means read from stdin. · 7bd149ce

Tom Lane authored Dec 13, 2015

This has worked that way for a long time, maybe always, but you would
not have known it from the documentation. Also back-patch the notes
I added to HEAD earlier today about behavior of the "-f -" switch,
which likewise have been valid for many releases.

7bd149ce

13 Dec, 2015 4 commits

Code and docs review for multiple -c and -f options in psql. · fcbbf82d

Tom Lane authored Dec 13, 2015

Commit d5563d7d drew complaints from Coverity, which quite
correctly complained that one copy of each -c or -f string was being
leaked. What's more, simple_action_list_append was allocating enough space
for still a third copy of each string as part of the SimpleActionListCell,
even though that coding method had been superseded by a separate strdup
operation. There were some other minor coding infelicities too. The
documentation needed more work as well, eg it forgot to explain that -c
causes psql not to accept any interactive input.

fcbbf82d

Consistently set all fields in pg_stat_replication to null instead of 0 · a91bdf67

Magnus Hagander authored Dec 13, 2015

Previously the "sent" field would be set to 0 and all other xlog
pointers be set to NULL if there were no valid values (such as when
in a backup sending walsender).

a91bdf67

Properly initialize write, flush and replay locations in walsender slots · 263c1957

Magnus Hagander authored Dec 13, 2015

These would leak random xlog positions if a walsender used for backup would
a walsender slot previously used by a replication walsender.

In passing also fix a couple of cases where the xlog pointer is directly
compared to zero instead of using XLogRecPtrIsInvalid, noted by
Michael Paquier.

263c1957

Doc: update external URLs for PostGIS project. · 6d96cd07
Tom Lane authored Dec 12, 2015
```
Paul Ramsey
```
6d96cd07

12 Dec, 2015 3 commits

doc: Add some markup · 19e7ca89
Peter Eisentraut authored Dec 12, 2015

19e7ca89

Fix ALTER TABLE ... SET TABLESPACE for unlogged relations. · f54d0629

Andres Freund authored Dec 12, 2015

Changing the tablespace of an unlogged relation did not WAL log the
creation and content of the init fork. Thus, after a standby is
promoted, unlogged relation cannot be accessed anymore, with errors
like:
ERROR:  58P01: could not open file "pg_tblspc/...": No such file or directory
Additionally the init fork was not synced to disk, independent of the
configured wal_level, a relatively small durability risk.

Investigation of that problem also brought to light that, even for
permanent relations, the creation of !main forks was not WAL logged,
i.e. no XLOG_SMGR_CREATE record were emitted. That mostly turns out not
to be a problem, because these files were created when the actual
relation data is copied; nonexistent files are not treated as an error
condition during replay. But that doesn't work for empty files, and
generally feels a bit haphazard. Luckily, outside init and main forks,
empty forks don't occur often or are not a problem.

Add the required WAL logging and syncing to disk.

Reported-By: Michael Paquier
Author: Michael Paquier and Andres Freund
Discussion: 20151210163230.GA11331@alap3.anarazel.de
Backpatch: 9.1, where unlogged relations were introduced

f54d0629

Add an expected-file to match behavior of latest libxml2. · 085423e3

Tom Lane authored Dec 11, 2015

Recent releases of libxml2 do not provide error context reports for errors
detected at the very end of the input string. This appears to be a bug, or
at least an infelicity, introduced by the fix for libxml2's CVE-2015-7499.
We can hope that this behavioral change will get undone before too long;
but the security patch is likely to spread a lot faster/further than any
follow-on cleanup, which means this behavior is likely to be present in the
wild for some time to come. As a stopgap, add a variant regression test
expected-file that matches what you get with a libxml2 that acts this way.

085423e3

11 Dec, 2015 8 commits

pg_rewind: Don't error if the two clusters are already on the same timeline · 6b34e556

Peter Eisentraut authored Dec 03, 2015

This previously resulted in an error and a nonzero exit status, but
after discussion this should rather be a noop with a zero exit status.

6b34e556

For REASSIGN OWNED for foreign user mappings · 8c161553

Alvaro Herrera authored Dec 11, 2015

As reported in bug #13809 by Alexander Ashurkov, the code for REASSIGN
OWNED hadn't gotten word about user mappings. Deal with them in the
same way default ACLs do, which is to ignore them altogether; they are
handled just fine by DROP OWNED. The other foreign object cases are
already handled correctly by both commands.

Also add a REASSIGN OWNED statement to foreign_data test to exercise the
foreign data objects. (The changes are just before the "cleanup" phase,
so it shouldn't remove any existing live test.)

Reported by Alexander Ashurkov, then independently by Jaime Casanova.

8c161553

Install our "missing" script where PGXS builds can find it. · dccf8e9e

Tom Lane authored Dec 11, 2015

This allows sane behavior in a PGXS build done on a machine where build
tools such as bison are missing.

Jim Nasby

dccf8e9e

Handle policies during DROP OWNED BY · 833728d4

Stephen Frost authored Dec 11, 2015

DROP OWNED BY handled GRANT-based ACLs but was not removing roles from
policies. Fix that by having DROP OWNED BY remove the role specified
from the list of roles the policy (or policies) apply to, or the entire
policy (or policies) if it only applied to the role specified.

As with ACLs, the DROP OWNED BY caller must have permission to modify
the policy or a WARNING is thrown and no change is made to the policy.

833728d4

Get rid of the planner's LateralJoinInfo data structure. · 4fcf4845

Tom Lane authored Dec 11, 2015

I originally modeled this data structure on SpecialJoinInfo, but after
commit acfcd45c that looks like a pretty poor decision.
All we really need is relid sets identifying laterally-referenced rels;
and most of the time, what we want to know about includes indirect lateral
references, a case the LateralJoinInfo data was unsuited to compute with
any efficiency. The previous commit redefined RelOptInfo.lateral_relids
as the transitive closure of lateral references, so that it easily supports
checking indirect references. For the places where we really do want just
direct references, add a new RelOptInfo field direct_lateral_relids, which
is easily set up as a copy of lateral_relids before we perform the
transitive closure calculation. Then we can just drop lateral_info_list
and LateralJoinInfo and the supporting code. This makes the planner's
handling of lateral references noticeably more efficient, and shorter too.

Such a change can't be back-patched into stable branches for fear of
breaking extensions that might be looking at the planner's data structures;
but it seems not too late to push it into 9.5, so I've done so.

4fcf4845

Handle dependencies properly in ALTER POLICY · ed8bec91

Stephen Frost authored Dec 11, 2015

ALTER POLICY hadn't fully considered partial policy alternation
(eg: change just the roles on the policy, or just change one of
the expressions) when rebuilding the dependencies.  Instead, it
would happily remove all dependencies which existed for the
policy and then only recreate the dependencies for the objects
referred to in the specific ALTER POLICY command.

Correct that by extracting and building the dependencies for all
objects referenced by the policy, regardless of if they were
provided as part of the ALTER POLICY command or were already in
place as part of the pre-existing policy.

ed8bec91

Still more fixes for planner's handling of LATERAL references. · acfcd45c

Tom Lane authored Dec 11, 2015

More fuzz testing by Andreas Seltenreich exposed that the planner did not
cope well with chains of lateral references. If relation X references Y
laterally, and Y references Z laterally, then we will have to scan X on the
inside of a nestloop with Z, so for all intents and purposes X is laterally
dependent on Z too. The planner did not understand this and would generate
intermediate joins that could not be used. While that was usually harmless
except for wasting some planning cycles, under the right circumstances it
would lead to "failed to build any N-way joins" or "could not devise a
query plan" planner failures.

To fix that, convert the existing per-relation lateral_relids and
lateral_referencers relid sets into their transitive closures; that is,
they now show all relations on which a rel is directly or indirectly
laterally dependent. This not only fixes the chained-reference problem
but allows some of the relevant tests to be made substantially simpler
and faster, since they can be reduced to simple bitmap manipulations
instead of searches of the LateralJoinInfo list.

Also, when a PlaceHolderVar that is due to be evaluated at a join contains
lateral references, we should treat those references as indirect lateral
dependencies of each of the join's base relations. This prevents us from
trying to join any individual base relations to the lateral reference
source before the join is formed, which again cannot work.

Andreas' testing also exposed another oversight in the "dangerous
PlaceHolderVar" test added in commit 85e5e222. Simply rejecting
unsafe join paths in joinpath.c is insufficient, because in some cases
we will end up rejecting *all* possible paths for a particular join, again
leading to "could not devise a query plan" failures. The restriction has
to be known also to join_is_legal and its cohort functions, so that they
will not select a join for which that will happen. I chose to move the
supporting logic into joinrels.c where the latter functions are.

Back-patch to 9.3 where LATERAL support was introduced.

acfcd45c

Fix commit timestamp initialization · 69e7235c

Alvaro Herrera authored Dec 11, 2015

This module needs explicit initialization in order to replay WAL records
in recovery, but we had broken this recently following changes to make
other (stranger) scenarios work correctly.  To fix, rework the
initialization sequence so that it always takes place before WAL replay
commences for both master and standby.

I could have gone for a more localized fix that just added a "startup"
call for the master server, but it seemed better to restructure the
existing callers as well so that the whole thing made more sense.  As a
drawback, there is more control logic in xlog.c now than previously, but
doing otherwise meant passing down the ControlFile flag, which seemed
uglier as a whole.

This also meant adding a check to not re-execute ActivateCommitTs if it
had already been called.

Reported by Fujii Masao.

Backpatch to 9.5.

69e7235c