Commits · a76ef15d9fc9207a0758e8d6f6700dc8c931a934 · Abuhujair Javed / Postgres FD Implementation

06 Nov, 2015 4 commits

Add sort support routine for the UUID data type. · a76ef15d

Robert Haas authored Nov 06, 2015

This introduces a simple encoding scheme to produce abbreviated keys:
pack as many bytes of each UUID as will fit into a Datum.  On
little-endian machines, a byteswap is also performed; the abbreviated
comparator can therefore just consist of a simple 3-way unsigned integer
comparison.
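
As a rough illustration of the encoding described above (not the committed code), the sketch below packs the leading bytes of a UUID into an integer with the most significant byte first, which has the same effect as a raw copy followed by a byteswap on a little-endian machine. The names `abbrev_key`, `uuid_abbrev`, `abbrev_cmp` and `uuid_cmp` are hypothetical, and an 8-byte key is assumed in place of a real Datum.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define UUID_LEN 16

/* Hypothetical stand-in for a Datum; assumed to be 8 bytes wide here. */
typedef uint64_t abbrev_key;

/*
 * Pack as many leading UUID bytes as fit into the key, most significant
 * byte first, so that unsigned integer order matches memcmp() order of
 * those bytes on any platform.
 */
static abbrev_key
uuid_abbrev(const unsigned char *uuid)
{
    abbrev_key  key = 0;
    size_t      i;

    assert(uuid != NULL);
    for (i = 0; i < sizeof(abbrev_key); i++)
        key = (key << 8) | uuid[i];
    return key;
}

/* Simple 3-way unsigned comparison of two abbreviated keys. */
static int
abbrev_cmp(abbrev_key a, abbrev_key b)
{
    return (a > b) - (a < b);
}

/* Compare abbreviated keys first; fall back to all 16 bytes on a tie. */
static int
uuid_cmp(const unsigned char *a, const unsigned char *b)
{
    int         c = abbrev_cmp(uuid_abbrev(a), uuid_abbrev(b));

    if (c != 0)
        return c;
    return memcmp(a, b, UUID_LEN);
}
```

A tie on the abbreviated keys only says the leading bytes are equal, so the full comparison is still needed in that case.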

The purpose of this change is to speed up sorting data on a column
of type UUID.

Peter Geoghegan

a76ef15d

Set include_realm=1 default in parse_hba_line · 5644419b

Stephen Frost authored Nov 06, 2015

With include_realm=1 being set down in parse_hba_auth_opt, if multiple
options are passed on the pg_hba line, such as:

host all     all    0.0.0.0/0    gss include_realm=0 krb_realm=XYZ.COM

We would mistakenly reset include_realm back to 1.  Instead, we need to
set include_realm=1 up in parse_hba_line, prior to parsing any of the
additional options.
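
The ordering issue can be sketched as follows. This is a simplified model, not the pg_hba parser: `HbaSketch`, `parse_opt_sketch` and `parse_line_sketch` are hypothetical names, and options are assumed to arrive already split into name/value pairs. The point is only that the default is assigned once per line, before any option is examined, so a later option cannot undo an explicit include_realm=0.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for one parsed pg_hba line. */
typedef struct
{
    bool        include_realm;
    const char *krb_realm;
} HbaSketch;

/* Per-option parser: touches only the field the option names. */
static void
parse_opt_sketch(HbaSketch *hba, const char *name, const char *val)
{
    if (strcmp(name, "include_realm") == 0)
        hba->include_realm = (strcmp(val, "1") == 0);
    else if (strcmp(name, "krb_realm") == 0)
        hba->krb_realm = val;
}

/* Per-line parser: set defaults first, then apply each option in turn. */
static void
parse_line_sketch(HbaSketch *hba, const char **names,
                  const char **vals, int nopts)
{
    int         i;

    assert(hba != NULL);
    hba->include_realm = true;  /* default, assigned exactly once */
    hba->krb_realm = NULL;
    for (i = 0; i < nopts; i++)
        parse_opt_sketch(hba, names[i], vals[i]);
}
```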

Discovered by Jeff McCormick during testing.

Bug introduced by 9a088417.

Back-patch to 9.5

5644419b

pg_size_pretty: Format negative values similar to positive ones. · 8a1fab36

Robert Haas authored Nov 06, 2015

Previously, negative values were always displayed in bytes, regardless
of how large they were.
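
A minimal sketch of the idea, assuming a simplified rounding rule that is not necessarily the one pg_size_pretty uses: the unit is chosen from the magnitude of the value, and the sign is added back at the end, so negative and positive sizes are formatted alike. The function name `size_pretty_sketch` and the 10 * 1024 threshold are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format a byte count with a unit; negative values get the same units. */
static void
size_pretty_sketch(int64_t size, char *buf, size_t buflen)
{
    static const char *const units[] = {"bytes", "kB", "MB", "GB", "TB"};
    /* Work on the magnitude; unsigned negation is safe even for INT64_MIN. */
    uint64_t    mag = size < 0 ? 0 - (uint64_t) size : (uint64_t) size;
    int         u = 0;

    assert(buf != NULL && buflen > 0);
    /* Move to the next unit while the number is large (simplified rounding). */
    while (mag >= 10 * 1024 && u < 4)
    {
        mag = (mag + 512) / 1024;
        u++;
    }
    snprintf(buf, buflen, "%s%llu %s",
             size < 0 ? "-" : "", (unsigned long long) mag, units[u]);
}
```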

Adrian Vondendriesch, reviewed by Julien Rouhaud and myself

8a1fab36

Document interaction of bgworkers with LISTEN/NOTIFY. · dde5f09f

Robert Haas authored Nov 06, 2015

Thomas Munro and Robert Haas, reviewed by Haribabu Kommi

dde5f09f

05 Nov, 2015 5 commits

Fix erroneous hash calculations in gin_extract_jsonb_path(). · b23af458

Tom Lane authored Nov 05, 2015

The jsonb_path_ops code calculated hash values inconsistently in some cases
involving nested arrays and objects.  This would result in queries possibly
not finding entries that they should find, when using a jsonb_path_ops GIN
index for the search.  The problem cases involve JSONB values that contain
both scalars and sub-objects at the same nesting level, for example an
array containing both scalars and sub-arrays.  To fix, reset the current
stack->hash after processing each value or sub-object, not before; and
don't try to be cute about the outermost level's initial hash.

Correcting this means that existing jsonb_path_ops indexes may now be
inconsistent with the new hash calculation code.  The symptom is the same
--- searches not finding entries they should find --- but the specific
rows affected are likely to be different.  Users will need to REINDEX
jsonb_path_ops indexes to make sure that all searches work as expected.

Per bug #13756 from Daniel Cheng.  Back-patch to 9.4 where the faulty
logic was introduced.

b23af458

Fix memory leaks in PL/Python. · 8c75ad43

Tom Lane authored Nov 05, 2015

Previously, plpython was in the habit of allocating a lot of stuff in
TopMemoryContext, and it was very slipshod about making sure that stuff
got cleaned up; in particular, use of TopMemoryContext as fn_mcxt for
function calls represents an unfixable leak, since we generally don't
know what the called function might have allocated in fn_mcxt. This
results in session-lifespan leakage in certain usage scenarios, as for
example in a case reported by Ed Behn back in July.

To fix, get rid of all the retail allocations in TopMemoryContext.
All long-lived allocations are now made in sub-contexts that are
associated with specific objects (either pl/python procedures, or
Python-visible objects such as cursors and plans). We can clean these
up when the associated object is deleted.

I went so far as to get rid of PLy_malloc completely. There were a
couple of places where it could still have been used safely, but on
the whole it was just an invitation to bad coding.

Haribabu Kommi, based on a draft patch by Heikki Linnakangas;
some further work by me

8c75ad43

Pass extra data to bgworkers, and use this to fix parallel contexts. · 64b2e7ad

Robert Haas authored Nov 05, 2015

Up until now, the total amount of data that could be passed to a
background worker at startup was one datum, which can be as small as
4 bytes on some systems.  That's enough to pass a dsm_handle or an
array index, but not much else.  Add a bgw_extra field to the
BackgroundWorker struct, allowing up to 128 bytes to be passed to
a new worker on any platform.

Use this to fix a problem I recently discovered with the parallel
context machinery added in 9.5: the master assigns each worker an
array index, and each worker subsequently assigns itself an array
index, and there's nothing to guarantee that the two sets of indexes
match, leading to chaos.
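
To illustrate the mechanism in isolation, the sketch below models a fixed 128-byte extra area next to a single pointer-sized argument, and copies a small launch record in and out of it. `WorkerSketch`, `LaunchInfo`, `set_extra` and `get_extra` are hypothetical names and do not reflect the actual BackgroundWorker API. Under these assumptions the launcher can hand each worker its array index explicitly instead of letting the worker pick one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXTRA_LEN 128           /* size of the extra area, per the text above */

/* Hypothetical, reduced stand-in for a worker registration record. */
typedef struct
{
    uintptr_t   main_arg;       /* the single datum-sized argument */
    char        extra[EXTRA_LEN];   /* additional opaque startup data */
} WorkerSketch;

/* Hypothetical payload: which segment to attach to, and which slot to use. */
typedef struct
{
    uint32_t    dsm_handle;
    int         worker_index;
} LaunchInfo;

/* Launcher side: copy the payload into the extra area. */
static void
set_extra(WorkerSketch *w, const LaunchInfo *info)
{
    assert(sizeof(LaunchInfo) <= EXTRA_LEN);
    memset(w->extra, 0, EXTRA_LEN);
    memcpy(w->extra, info, sizeof(*info));
}

/* Worker side: read back the index the launcher assigned. */
static LaunchInfo
get_extra(const WorkerSketch *w)
{
    LaunchInfo  info;

    memcpy(&info, w->extra, sizeof(info));
    return info;
}
```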

Normally, I would not back-patch the change to add bgw_extra, since it
is basically a feature addition.  However, since 9.5 is still in beta
and there seems to be no other sensible way to repair the broken
parallel context machinery, back-patch to 9.5.  Existing background
worker code can ignore the bgw_extra field without a problem, but
might need to be recompiled since the structure size has changed.

Report and patch by me.  Review by Amit Kapila.

64b2e7ad

Improve implementation of GEQO's init_tour() function. · 59464bd6

Tom Lane authored Nov 05, 2015

Rather than filling a temporary array and then copying values to the
output array, we can generate the required random permutation in-place
using the Fisher-Yates shuffle algorithm.  This is shorter as well as
more efficient than before.  It's pretty unlikely that anyone would
notice a speed improvement, but shorter code is better.
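
For reference, the "inside-out" form of the Fisher-Yates shuffle can build a random permutation directly in the output array, with no temporary copy. The sketch below is an illustration rather than the committed function: `random_tour` is a hypothetical name, and `rand()` with a modulo is assumed as the random source, which is slightly biased but adequate for a demonstration.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill tour[0..n-1] with a random permutation of 1..n, in place. */
static void
random_tour(int *tour, int n)
{
    int         i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
    {
        /* Pick a slot j in [0, i]; rand() % k is slightly biased. */
        int         j = rand() % (i + 1);

        /* The i != j test avoids reading the not-yet-written slot i. */
        if (j != i)
            tour[i] = tour[j];  /* move the displaced value to the new slot */
        tour[j] = i + 1;        /* put the new value at the chosen slot */
    }
}
```

After each iteration, the first i + 1 slots hold a permutation of 1..i + 1, so the array is complete as soon as the loop ends.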

Nathan Wagner, edited a bit by me

59464bd6

Update spelling of COPY options · 7bd099d5

Peter Eisentraut authored Nov 04, 2015

The preferred spelling was changed from FORCE QUOTE to FORCE_QUOTE and
the like, but some code was still referring to the old spellings.

7bd099d5

04 Nov, 2015 1 commit

Add regression tests for remote execution of extension operators/functions. · b9f117d6

Tom Lane authored Nov 04, 2015

Rather than relying on other extensions to be available for installation,
let's just add some test objects to the postgres_fdw extension itself
within the regression script.

b9f117d6

03 Nov, 2015 8 commits

Allow postgres_fdw to ship extension funcs/operators for remote execution. · d8949416

Tom Lane authored Nov 03, 2015

The user can whitelist specified extension(s) in the foreign server's
options, whereupon we will treat immutable functions and operators of those
extensions as candidates to be sent for remote execution.

Whitelisting an extension in this way basically promises that the extension
exists on the remote server and behaves compatibly with the local instance.
We have no way to prove that formally, so we have to rely on the user to
get it right. But this seems like something that people can usually get
right in practice.

We might in future allow functions and operators to be whitelisted
individually, but extension granularity is a very convenient special case,
so it got done first.

The patch as-committed lacks any regression tests, which is unfortunate,
but introducing dependencies on other extensions for testing purposes
would break "make installcheck" scenarios, which is worse. I have some
ideas about klugy ways around that, but it seems like material for a
separate patch. For the moment, leave the problem open.

Paul Ramsey, hacked up a bit more by me

d8949416

Improve comments about abbreviation abort. · ee44cb75

Robert Haas authored Nov 03, 2015

Peter Geoghegan

ee44cb75

postgres_fdw: Add ORDER BY to some remote SQL queries. · f18c944b

Robert Haas authored Nov 03, 2015

If the join problem's entire ORDER BY clause can be pushed to the
remote server, consider a path that adds this ORDER BY clause.  If
use_remote_estimate is on, we cost this path using an additional
remote EXPLAIN.  If not, we just estimate that the path costs 20%
more, which is intended to be large enough that we won't request a
remote sort when it's not helpful, but small enough that we'll have
the remote side do the sort when in doubt.  In some cases, the remote
sort might actually be free, because the remote query plan might
happen to produce output that is ordered the way we need, but without
remote estimates we have no way of knowing that.

It might also be useful to request sorted output from the remote side
if it enables an efficient merge join, but this patch doesn't attempt
to handle that case.

Ashutosh Bapat with revisions by me.  Also reviewed by Fabrízio de Royes
Mello and Jeevan Chalke.

f18c944b

Remove obsolete advice about doubling backslashes in regex escapes. · fc0b8935

Tom Lane authored Nov 03, 2015

Standard-conforming literals have been the default for long enough that
it no longer seems necessary to go out of our way to tell people to write
regex escapes illegibly.

fc0b8935

Code + docs review for unicode linestyle patch. · a69b0b2c

Tom Lane authored Nov 03, 2015

Fix some brain fade in commit a2dabf0e: erroneous variable names
in docs, rearrangements that made sentences less clear not more so,
undocumented and poorly-chosen-anyway API behaviors of subroutines,
bad grammar in error messages, copy-and-paste faults.

Albe Laurenz and Tom Lane

a69b0b2c

shm_mq: Third attempt at fixing nowait behavior in shm_mq_receive. · 4efe26cb

Robert Haas authored Nov 03, 2015

Commit a1480ec1 purported to fix the
problems with commit b2ccb5f4, but it
didn't completely fix them.  The problem is that the checks were
performed in the wrong order, leading to a race condition.  If the
sender attached, sent a message, and detached after the receiver
called shm_mq_get_sender and before the receiver called
shm_mq_counterparty_gone, we'd incorrectly return SHM_MQ_DETACHED
before all messages were read.  Repair by reversing the order of
operations, and add a long comment explaining why this new logic is
(hopefully) correct.

4efe26cb

Correct tiny inaccuracy in strxfrm cache comment. · 0279f62f

Robert Haas authored Nov 03, 2015

Peter Geoghegan

0279f62f

Remove some more dead Alpha-specific code. · 620ac88d

Tom Lane authored Nov 02, 2015

620ac88d

02 Nov, 2015 2 commits

Fix problems with ParamListInfo serialization mechanism. · 1efc7e53

Robert Haas authored Nov 02, 2015

Commit d1b7c1ff introduced a mechanism
for serializing a ParamListInfo structure to be passed to a parallel
worker.  However, this mechanism failed to handle external expanded
values, as pointed out by Noah Misch.  Repair.

Moreover, plpgsql_param_fetch requires adjustment because the
serialization mechanism needs it to skip evaluating unused parameters
just as we would do when it is called from copyParamList, but params
== estate->paramLI in that case.  To fix, make the bms_is_member test
in that function unconditional.

Finally, have setup_param_list set a new ParamListInfo field,
paramMask, to the parameters actually used in the expression, so that
we don't try to fetch those that are not needed when serializing a
parameter list.  This isn't necessary for correctness, but it makes
the performance of the parallel executor code comparable to what we
do for cases involving cursors.

Design suggestions and extensive review by Noah Misch.  Patch by me.

1efc7e53

Add RMV to list of commands taking AE lock. · bf25fb2f

Kevin Grittner authored Nov 02, 2015

Backpatch to 9.3, where it was initially omitted.

Craig Ringer, with minor adjustment by Kevin Grittner

bf25fb2f

31 Oct, 2015 1 commit

Fix serialization anomalies due to race conditions on INSERT. · 585e2a3b

Kevin Grittner authored Oct 31, 2015

On insert the CheckForSerializableConflictIn() test was performed
before the page(s) which were going to be modified had been locked
(with an exclusive buffer content lock).  If another process
acquired a relation SIReadLock on the heap and scanned to a page on
which an insert was going to occur before the page was so locked,
a rw-conflict would be missed, which could allow a serialization
anomaly to be missed.  The window between the check and the page
lock was small, so the bug was generally not noticed unless there
was high concurrency with multiple processes inserting into the
same table.

This was reported by Peter Bailis as bug #11732, by Sean Chittenden
as bug #13667, and by others.

The race condition was eliminated in heap_insert() by moving the
check down below the acquisition of the buffer lock, which had been
the very next statement.  Because of the loop locking and unlocking
multiple buffers in heap_multi_insert() a check was added after all
inserts were completed.  The check before the start of the inserts
was left because it might avoid a large amount of work to detect a
serialization anomaly before performing all of the inserts and
the related WAL logging.

While investigating this bug, other SSI bugs which were even harder
to hit in practice were noticed and fixed, an unnecessary check
(covered by another check, so redundant) was removed from
heap_update(), and comments were improved.

Back-patch to all supported branches.

Kevin Grittner and Thomas Munro

585e2a3b

30 Oct, 2015 4 commits

Implement lookbehind constraints in our regular-expression engine. · 12c9a040

Tom Lane authored Oct 30, 2015

A lookbehind constraint is like a lookahead constraint in that it consumes
no text; but it checks for existence (or nonexistence) of a match *ending*
at the current point in the string, rather than one *starting* at the
current point. This is a long-requested feature since it exists in many
other regex libraries, but Henry Spencer had never got around to
implementing it in the code we use.

Just making it work is actually pretty trivial; but naive copying of the
logic for lookahead constraints leads to code that often spends O(N^2) time
to scan an N-character string, because we have to run the match engine
from string start to the current probe point each time the constraint is
checked. In typical use-cases a lookbehind constraint will be written at
the start of the regex and hence will need to be checked at every character
--- so O(N^2) work overall. To fix that, I introduced a third copy of the
core DFA matching loop, paralleling the existing longest() and shortest()
loops. This version, matchuntil(), can suspend and resume matching given
a couple of pointers' worth of storage space. So we need only run it
across the string once, stopping at each interesting probe point and then
resuming to advance to the next one.
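
The suspend-and-resume idea can be shown with a toy matcher. This is not matchuntil() itself: it is a hand-written three-state DFA for the fixed lookbehind "(?<=ab)", with hypothetical names `ScanState`, `step` and `ends_with_ab_at`. The saved state is one pointer plus one DFA state, so checking the constraint at successive probe points scans the string only once overall.

```c
#include <assert.h>
#include <stddef.h>

/* Saved scan position and DFA state: enough to suspend and resume. */
typedef struct
{
    const char *pos;            /* next character to consume */
    int         state;          /* 0 = nothing, 1 = seen "a", 2 = seen "ab" */
} ScanState;

/* One DFA transition for "the text consumed so far ends with ab". */
static int
step(int state, char c)
{
    if (c == 'a')
        return 1;
    if (c == 'b' && state == 1)
        return 2;
    return 0;
}

/*
 * Resume scanning from the saved position up to the probe point, and
 * report whether a match of "ab" ends exactly there.  Probe points must
 * be visited in increasing order.
 */
static int
ends_with_ab_at(ScanState *s, const char *probe)
{
    assert(s != NULL && s->pos <= probe);
    while (s->pos < probe)
    {
        s->state = step(s->state, *s->pos);
        s->pos++;
    }
    return s->state == 2;
}
```

Restarting from the beginning of the string for every probe point would instead cost time proportional to the probe's offset each time, which is where the quadratic behavior comes from.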

I also put in an optimization that simplifies one-character lookahead and
lookbehind constraints, such as "(?=x)" or "(?<!\w)", into AHEAD and BEHIND
constraints, which already existed in the engine. This avoids the overhead
of the LACON machinery entirely for these rather common cases.

The net result is that lookbehind constraints run a factor of three or so
slower than Perl's for multi-character constraints, but faster than Perl's
for one-character constraints ... and they work fine for variable-length
constraints, which Perl gives up on entirely. So that's not bad from a
competitive perspective, and there's room for further optimization if
anyone cares. (In reality, raw scan rate across a large input string is
probably not that big a deal for Postgres usage anyway; so I'm happy if
it's linear.)

12c9a040

doc: security_barrier option is a Boolean, not a string. · c5057b2b

Robert Haas authored Oct 30, 2015

Mistake introduced by commit 5bd91e3a.

Hari Babu

c5057b2b

Update parallel executor support to reuse the same DSM. · 3a1f8611

Robert Haas authored Oct 30, 2015

Commit b0b0d84b purported to make it
possible to relaunch workers using the same parallel context, but it had
an unpleasant race condition: we might reinitialize after the workers
have sent their last control message but before they have detached the
DSM, leading to crashes.  Repair by introducing a new ParallelContext
operation, ReinitializeParallelDSM.

Adjust execParallel.c to use this new support, so that we can rescan a
Gather node by relaunching workers but without needing to recreate the
DSM.

Amit Kapila, with some adjustments by me.  Extracted from latest parallel
sequential scan patch.

3a1f8611

Fix typo in bgworker.c · c6baec92

Robert Haas authored Oct 30, 2015

c6baec92

29 Oct, 2015 3 commits

Docs: add example clarifying use of nested JSON containment. · 23937a42

Tom Lane authored Oct 29, 2015

Show how this can be used in practice to make queries simpler and more
flexible.  Also, draw an explicit contrast to the existence operator,
which doesn't work that way.

Peter Geoghegan and Tom Lane

23937a42

Remove some remains from Alpha support removal · c5130e8e

Peter Eisentraut authored Oct 29, 2015

c5130e8e

Message style improvements · a8d585c0

Peter Eisentraut authored Oct 28, 2015

Message style, plurals, quoting, spelling, consistency with similar
messages

a8d585c0

28 Oct, 2015 3 commits

Add missing serial comma, for consistency. · d4556516

Robert Haas authored Oct 28, 2015

Amit Langote, per Etsuro Fujita

d4556516

Fix incorrect message in ATWrongRelkindError. · 9dcce712

Robert Haas authored Oct 28, 2015

Mistake introduced by commit 3bf3ab8c.

Etsuro Fujita

9dcce712

Fix secondary expected output for commit_ts test · dbe6f434

Alvaro Herrera authored Oct 27, 2015

Per red wall in buildfarm

dbe6f434

27 Oct, 2015 5 commits

Make Gather node projection-capable. · 8538a630

Robert Haas authored Oct 28, 2015

The original Gather code failed to mark a Gather node as not able to
do projection, but it couldn't, even though it did initialize its
projection info via ExecAssignProjectionInfo. There doesn't seem to
be any good reason for this node not to have projection capability,
so clean things up so that it does. Without this, plans using Gather
nodes might need to carry extra Result nodes to do projection.

8538a630

Document BRIN's inclusion opclass framework · c15898c1

Alvaro Herrera authored Oct 27, 2015

Backpatch to 9.5 -- this should have been part of b0b7be61, but we
didn't have 38b03caebc5de either at the time.

Author: Emre Hasegeli
Revised by: Ian Barwick
Discussion:
 http://www.postgresql.org/message-id/CAE2gYzyB39Q9up_-TO6FKhH44pcAM1x6n_Cuj15qKoLoFihUVg@mail.gmail.com
 http://www.postgresql.org/message-id/562DA711.3020305@2ndquadrant.com

c15898c1

Fix BRIN free space computations · 21a4e4a4

Alvaro Herrera authored Oct 27, 2015

A bug in the original free space computation made it possible to
return a page which wasn't actually able to fit the item.  Since the
insertion code isn't prepared to deal with PageAddItem failing, a PANIC
resulted ("failed to add BRIN tuple [to new page]").  Add a macro to
encapsulate the correct computation, and use it in
brin_getinsertbuffer's callers before calling that routine, to raise an
early error.

I became aware of the possibility of a problem in this area while working
on ccc4c074.  There's no archived discussion about it, but it's
easy to reproduce a problem in the unpatched code with something like

CREATE TABLE t (a text);
CREATE INDEX ti ON t USING brin (a) WITH (pages_per_range=1);

for length in `seq 8000 8196`
do
	psql -f - <<EOF
TRUNCATE TABLE t;
INSERT INTO t VALUES ('z'), (repeat('a', $length));
EOF
done

Backpatch to 9.5, where BRIN was introduced.

21a4e4a4

Cleanup commit timestamp module activation, again · 531d21b7

Alvaro Herrera authored Oct 27, 2015

Further tweak commit_ts.c so that on a standby the state is completely
consistent with that in the master, rather than behaving
differently in the cases that the settings differ. Now in standby and
master the module should always be active or inactive in lockstep.

Author: Petr Jelínek, with some further tweaks by Álvaro Herrera.

Backpatch to 9.5, where commit timestamps were introduced.

Discussion: http://www.postgresql.org/message-id/5622BF9D.2010409@2ndquadrant.com

531d21b7

Measure string lengths only once · 0cd836a4

Alvaro Herrera authored Oct 27, 2015

Bernd Helmle complained that CreateReplicationSlot() was assigning the
same value to the same variable twice, so we could remove one of them.
Code inspection reveals that we can actually remove both assignments:
according to the author the assignment was there for beauty of the
strlen line only, and another possible fix to that is to put the strlen
in its own line, so do that.

To be consistent within the file, refactor all duplicated strlen()
calls, which is what we do elsewhere in the backend anyway. In
basebackup.c, snprintf already returns the right length; no need for
strlen afterwards.
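
The two patterns mentioned above can be sketched as follows, using hypothetical helper names (`format_label`, `copy_name`) rather than anything from the patch. The first relies on the standard behavior of snprintf(), which returns the number of characters the formatted string needs, excluding the terminator; the second measures a string once and reuses the result.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "<name>_<id>" into buf and return its length, taken from
 * snprintf's return value instead of a following strlen() call.
 */
static size_t
format_label(char *buf, size_t bufsize, const char *name, int id)
{
    int         len = snprintf(buf, bufsize, "%s_%d", name, id);

    /* Negative means an output error; >= bufsize means truncation. */
    assert(len >= 0 && (size_t) len < bufsize);
    return (size_t) len;
}

/* Measure the source string once, then reuse that length for the copy. */
static size_t
copy_name(char *dst, size_t dstsize, const char *src)
{
    size_t      len = strlen(src);

    assert(len < dstsize);
    memcpy(dst, src, len + 1);  /* + 1 copies the terminating NUL */
    return len;
}
```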

Backpatch to 9.4, where replication slots were introduced, to keep code
identical. Some of this is older, but the patch doesn't apply cleanly
and it's only of cosmetic value anyway.

Discussion: http://www.postgresql.org/message-id/BE2FD71DEA35A2287EA5F018@eje.credativ.lan

0cd836a4

23 Oct, 2015 1 commit

shm_mq: Repair breakage from previous commit. · a1480ec1

Robert Haas authored Oct 22, 2015

If the counterparty writes some data into the queue and then detaches,
it's wrong to return SHM_MQ_DETACHED right away.  If we do that, we
fail to read whatever was written.

a1480ec1

22 Oct, 2015 3 commits

Add two missing cases to ATWrongRelkindError. · 872101be

Robert Haas authored Oct 22, 2015

This way, we produce a better error message if someone tries to do
something like ALTER INDEX .. ALTER COLUMN .. SET STORAGE.

Amit Langote

872101be

shm_mq: Fix failure to notice a dead counterparty when nowait is used. · b2ccb5f4

Robert Haas authored Oct 22, 2015

The shm_mq mechanism was intended to optionally notice when the process
on the other end of the queue fails to attach to the queue. It does
this by allowing the user to pass a BackgroundWorkerHandle; if the
background worker in question is launched and dies without attaching
to the queue, then we know it never will. This logic works OK in
blocking mode, but when called with nowait = true we fail to notice
that this has happened due to an asymmetry in the logic. Repair.

Reported off-list by Rushabh Lathia. Patch by me.

b2ccb5f4

Fix typos in comments. · 31ba62ce

Robert Haas authored Oct 22, 2015

CharSyam

31ba62ce