1. 06 Aug, 2016 1 commit
    • In B-tree page deletion, clean up properly after page deletion failure. · e89526d4
      Tom Lane authored
      In _bt_unlink_halfdead_page(), we might fail to find an immediate left
      sibling of the target page, perhaps because of corruption of the page
      sibling links.  The code intends to cope with this by just abandoning
      the deletion attempt; but what actually happens is that it fails outright
      due to releasing the same buffer lock twice.  (And error recovery masks
      a second problem, which is possible leakage of a pin on another page.)
      Seems to have been introduced by careless refactoring in commit efada2b8.
      Since there are multiple cases to consider, let's make releasing the buffer
      lock in the failure case the responsibility of _bt_unlink_halfdead_page()
      not its caller.
      
      Also, avoid fetching the leaf page's left-link again after we've dropped
      lock on the page.  This is probably harmless, but it's not exactly good
      coding practice.
      
      Per report from Kyotaro Horiguchi.  Back-patch to 9.4 where the faulty code
      was introduced.
      
      Discussion: <20160803.173116.111915228.horiguchi.kyotaro@lab.ntt.co.jp>
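      
      For illustration, the failure path now amounts to something like the
      following sketch (the condition and variable names here are
      hypothetical, not the committed code):
      
          if (leftsib == P_NONE)      /* hypothetical: no usable left sibling */
          {
              elog(LOG, "no left sibling (concurrent deletion?) in \"%s\"",
                   RelationGetRelationName(rel));
              _bt_relbuf(rel, leafbuf);   /* release lock and pin exactly once */
              return false;               /* abandon the deletion attempt */
          }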
  2. 05 Aug, 2016 9 commits
    • Teach libpq to decode server version correctly from future servers. · 69dc5ae4
      Tom Lane authored
      Beginning with the next development cycle, PG servers will report
      two-part, not three-part, version numbers.  Fix libpq so that it will
      compute the
      correct numeric representation of such server versions for reporting by
      PQserverVersion().  It's desirable to get this into the field and
      back-patched ASAP, so that older clients are more likely to understand the
      new server version numbering by the time any such servers are in the wild.
      
      (The results with an old client would probably not be catastrophic anyway
      for a released server; for example "10.1" would be interpreted as 100100
      which would be wrong in detail but would not likely cause an old client to
      misbehave badly.  But "10devel" or "10beta1" would result in sversion==0
      which at best would result in disabling all use of modern features.)
      
      Extracted from a patch by Peter Eisentraut; comments added by me.
      
      Patch: <802ec140-635d-ad86-5fdf-d3af0e260c22@2ndquadrant.com>
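      
      The decoding rule can be sketched as a small standalone function (a
      paraphrase of the logic described above, not necessarily the committed
      code):
      
          #include <stdio.h>
          
          /* value is the server_version string reported by the server */
          static int
          decode_server_version(const char *value)
          {
              int     vmaj, vmin, vrev;
              int     cnt = sscanf(value, "%d.%d.%d", &vmaj, &vmin, &vrev);
          
              if (cnt == 3)               /* old style: "9.6.3" -> 90603 */
                  return (100 * vmaj + vmin) * 100 + vrev;
              if (cnt == 2 && vmaj >= 10) /* new style: "10.1" -> 100001 */
                  return 100 * 100 * vmaj + vmin;
              if (cnt == 2)               /* old style: "9.6" -> 90600 */
                  return (100 * vmaj + vmin) * 100;
              if (cnt == 1)               /* new style: "10devel" -> 100000 */
                  return 100 * 100 * vmaj;
              return 0;                   /* unintelligible */
          }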
    • Fix copy-and-pasteo in 81c766b3. · fc509cd8
      Tom Lane authored
      Report: <57A4E6DF.8070209@dunslane.net>
    • Make array_to_tsvector() sort and de-duplicate the given strings. · f10eab73
      Tom Lane authored
      This is required for the result to be a legal tsvector value.
      Noted while fooling with Andreas Seltenreich's ts_delete() crash.
      
      Discussion: <87invhoj6e.fsf@credativ.de>
    • Fix ts_delete(tsvector, text[]) to cope with duplicate array entries. · c50d192c
      Tom Lane authored
      Such cases either failed an Assert, or produced a corrupt tsvector in
      non-Assert builds, as reported by Andreas Seltenreich.  The reason is
      that tsvector_delete_by_indices() just assumed that its input array had
      no duplicates.  Fix by explicitly de-duping.
      
      In passing, improve some comments, and fix a number of tests for null
      values to use ERRCODE_NULL_VALUE_NOT_ALLOWED not
      ERRCODE_INVALID_PARAMETER_VALUE.
      
      Discussion: <87invhoj6e.fsf@credativ.de>
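      
      The de-duplication step is conceptually simple; a minimal sketch (with
      hypothetical helper names, not the committed code) is:
      
          #include <stdlib.h>
          
          static int
          compare_int(const void *a, const void *b)
          {
              int     ia = *(const int *) a;
              int     ib = *(const int *) b;
          
              return (ia > ib) - (ia < ib);
          }
          
          /* Sort indx[] ascending and squeeze out duplicates; return new count. */
          static int
          dedup_indexes(int *indx, int n)
          {
              int     i, j = 0;
          
              if (n <= 1)
                  return n;
              qsort(indx, n, sizeof(int), compare_int);
              for (i = 1; i < n; i++)
              {
                  if (indx[i] != indx[j])
                      indx[++j] = indx[i];
              }
              return j + 1;
          }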
    • Re-pgindent tsvector_op.c. · 33fe7360
      Tom Lane authored
      Messed up by recent commits --- this is annoying me while trying to fix
      some bugs here.
    • docs: re-add spaces before units removed · 5ebad9a5
      Bruce Momjian authored
      This restores the spaces before k/M/G/TB units that commit ca0c37b5
      had removed for consistency.
      
      Discussion: 20160802165116.GC32575@momjian.us
    • Update time zone data files to tzdata release 2016f. · a629330b
      Tom Lane authored
      DST law changes in Kemerovo and Novosibirsk.  Historical corrections for
      Azerbaijan, Belarus, and Morocco.  Asia/Novokuznetsk and Asia/Novosibirsk
      now use numeric time zone abbreviations instead of invented ones.  Zones
      for Antarctic bases and other locations that have been uninhabited for
      portions of the time span known to the tzdata database now report "-00"
      rather than "zzz" as the zone abbreviation for those time spans.
      
      Also, I decided to remove some of the timezone/data/ files that we don't
      use.  At one time that subdirectory was a complete copy of what IANA
      distributes in the tzdata tarballs, but that hasn't been true for a long
      time.  There seems no good reason to keep shipping those specific files
      but not others; they're just bloating our tarballs.
    • Change InitToastSnapshot to a macro. · 81c766b3
      Robert Haas authored
      tqual.h is included in some front-end compiles, and a static inline
      breaks on buildfarm member castoroides.  Since the macro is never
      referenced, it should dodge that problem, although this doesn't
      seem like the cleanest way of hiding things from front-end compiles.
      
      Report and review by Tom Lane; patch by me.
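      
      The macro ends up looking roughly like this (paraphrased from tqual.h;
      unlike a static inline, an unreferenced macro generates no code at all
      in frontend translation units):
      
          #define InitToastSnapshot(snapshot, l, w) \
              ((snapshot).satisfies = HeapTupleSatisfiesToast, \
               (snapshot).lsn = (l), \
               (snapshot).whenTaken = (w))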
    • Fix hard-to-hit race condition in heapam's tuple locking code. · e7caacf7
      Andres Freund authored
      As mentioned in its commit message, eca0f1db left open a race condition
      where a page could be marked all-visible after the code checked
      PageIsAllVisible() to pin the VM, but before the page was locked.  Plug
      that hole.
      
      Reviewed-By: Robert Haas, Andres Freund
      Author: Amit Kapila
      Discussion: CAEepm=3fWAbWryVW9swHyLTY4sXVf0xbLvXqOwUoDiNCx9mBjQ@mail.gmail.com
      Backpatch: -
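      
      The pattern at stake, simplified (a sketch given buf, block, relation,
      and an initially-invalid vmbuffer; not the committed hunk): pin the
      visibility-map page before taking the buffer lock, then recheck once
      the lock is held, because the bit can be set in between:
      
          if (PageIsAllVisible(BufferGetPage(buf)))
              visibilitymap_pin(relation, block, &vmbuffer);
          
          LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
          
          /* The all-visible bit may have been set after the unlocked check. */
          if (vmbuffer == InvalidBuffer &&
              PageIsAllVisible(BufferGetPage(buf)))
          {
              LockBuffer(buf, BUFFER_LOCK_UNLOCK); /* can't pin VM while locked */
              visibilitymap_pin(relation, block, &vmbuffer);
              LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
          }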
  3. 04 Aug, 2016 2 commits
    • docs: mention rsync of temp and unlogged tables · 4eb4b3f2
      Bruce Momjian authored
      This happens when rsync is used to upgrade slaves with pg_upgrade.
      
      Reported-by: Jerry Sievers
      
      Discussion: 20160726161946.GA3511@momjian.us
    • Fix bogus coding in WaitForBackgroundWorkerShutdown(). · 8d498a5c
      Tom Lane authored
      Some conditions resulted in "return" directly out of a PG_TRY block,
      which left the exception stack dangling, and to add insult to injury
      failed to restore the state of set_latch_on_sigusr1.
      
      This is a bug only in 9.5; in HEAD it was accidentally fixed by commit
      db0f6cad, which removed the surrounding PG_TRY block.  However, I (tgl)
      chose to apply the patch to HEAD as well, because the old coding was
      gratuitously different from WaitForBackgroundWorkerStartup(), and there
      would indeed have been no bug if it were done like that to start with.
      
      Dmitry Ivanov
      
      Discussion: <1637882.WfYN5gPf1A@abook>
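      
      In miniature, the anti-pattern looks like this (illustrative only): a
      "return" inside PG_TRY skips PG_END_TRY, leaving PG_exception_stack
      pointing at dead stack space:
      
          BgwHandleStatus status;
          pid_t       pid;
          
          PG_TRY();
          {
              for (;;)
              {
                  status = GetBackgroundWorkerPid(handle, &pid);
                  if (status == BGWH_STOPPED)
                      return status;  /* WRONG: exception stack left dangling */
                  WaitLatch(MyLatch, WL_LATCH_SET, 0);
                  ResetLatch(MyLatch);
              }
          }
          PG_CATCH();
          {
              /* cleanup, e.g. restore signal state */
              PG_RE_THROW();
          }
          PG_END_TRY();
          /* correct: break out of the loop and return only after PG_END_TRY */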
  4. 03 Aug, 2016 12 commits
    • Prevent "snapshot too old" from trying to return pruned TOAST tuples. · 3e2f3c2e
      Robert Haas authored
      Previously, we tested for MVCC snapshots to see whether they were too
      old, but not TOAST snapshots, which can lead to complaints about missing
      TOAST chunks if those chunks are subject to early pruning.  Ideally,
      the threshold lsn and timestamp for a TOAST snapshot would be that of
      the corresponding MVCC snapshot, but since we have no way of deciding
      which MVCC snapshot was used to fetch the TOAST pointer, use the oldest
      active or registered snapshot instead.
      
      Reported by Andres Freund, who also sketched out what the fix should
      look like.  Patch by me, reviewed by Amit Kapila.
    • Make INSERT-from-multiple-VALUES-rows handle targetlist indirection better. · a3c7a993
      Tom Lane authored
      Previously, if an INSERT with multiple rows of VALUES had indirection
      (array subscripting or field selection) in its target-columns list, the
      parser handled that by applying transformAssignedExpr() to each element
      of each VALUES row independently.  This led to having ArrayRef assignment
      nodes or FieldStore nodes in each row of the VALUES RTE.  That works for
      simple cases, but in bug #14265 Nuri Boardman points out that it fails
      if there are multiple assignments to elements/fields of the same target
      column (for example, a single INSERT assigning to both a[1] and a[2] of
      an array column a).  For such cases to work, rewriteTargetListIU() has
      to nest the
      ArrayRefs or FieldStores together to produce a single expression to be
      assigned to the column.  But it failed to find them in the top-level
      targetlist and issued an error about "multiple assignments to same column".
      
      We could possibly fix this by teaching the rewriter to apply
      rewriteTargetListIU to each VALUES row separately, but that would be messy
      (it would change the output rowtype of the VALUES RTE, for example) and
      inefficient.  Instead, let's fix the parser so that the VALUES RTE outputs
      are just the user-specified values, cast to the right type if necessary,
      and then the ArrayRefs or FieldStores are applied in the top-level
      targetlist to Vars representing the RTE's outputs.  This is the same
      parsetree representation already used for similar cases with INSERT/SELECT
      syntax, so it allows simplifications in ruleutils.c, which no longer needs
      to treat INSERT-from-multiple-VALUES as its own special case.
      
      This implementation works by applying transformAssignedExpr to the VALUES
      entries as before, and then stripping off any ArrayRefs or FieldStores it
      adds.  With lots of VALUES rows it would be noticeably more efficient to
      not add those nodes in the first place.  But that's just an optimization
      not a bug fix, and there doesn't seem to be any good way to do it without
      significant refactoring.  (A non-invasive answer would be to apply
      transformAssignedExpr + stripping to just the first VALUES row, and then
      just forcibly cast remaining rows to the same data types exposed in the
      first row.  But this way would lead to different, not-INSERT-specific
      errors being reported in casting failure cases, so it doesn't seem very
      nice.)  So leave that for later; this patch at least isn't making the
      per-row parsing work worse, and it does make the finished parsetree
      smaller, saving rewriter and planner work.
      
      Catversion bump because stored rules containing such INSERTs would need
      to change.  Because of that, no back-patch, even though this is a very
      long-standing bug.
      
      Report: <20160727005725.7438.26021@wrigleys.postgresql.org>
      Discussion: <9578.1469645245@sss.pgh.pa.us>
    • Do not let PostmasterContext survive into background workers. · ef1b5af8
      Tom Lane authored
      We don't want postmaster child processes to contain a copy of the
      postmaster's PostmasterContext.  That would be a waste of memory at least,
      and at worst a security issue, since there are copies of the semi-sensitive
      pg_hba and pg_ident data in there.  All other child process types delete
      the PostmasterContext after forking, but the original coding of the
      background worker patch (commit da07a1e8) did not do so.  It appears
      that the only reason for that was to avoid copying the bgworker's
      MyBgworkerEntry out of that context; but the couple of additional
      statements needed to do so are hardly good justification for it.  Hence,
      copy that data and then clear the context as other child processes do.
      
      Because this patch changes the memory context in which a bgworker function
      gains control, back-patching it would be a bit risky, so we won't fix this
      in back branches.  The "security" complaint is pretty thin anyway for
      generic bgworkers; only with the introduction of parallel query is there
      any question of running untrusted code in a bgworker process.
      
      Discussion: <14111.1470082717@sss.pgh.pa.us>
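      
      In outline, the change amounts to the following (paraphrased; details
      may differ from the committed code):
      
          BackgroundWorker *worker;
          
          /* Save a worker-private copy before the context goes away. */
          worker = MemoryContextAlloc(TopMemoryContext, sizeof(BackgroundWorker));
          memcpy(worker, MyBgworkerEntry, sizeof(BackgroundWorker));
          MyBgworkerEntry = worker;
          
          /* Drop the inherited context, as other child processes already do. */
          MemoryContextDelete(PostmasterContext);
          PostmasterContext = NULL;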
    • Add missing casts in information schema · 6a9e09c4
      Peter Eisentraut authored
      From: Clément Prévost <prevostclement@gmail.com>
    • doc: Remove documentation of nonexistent information schema columns · 2b8fd4fa
      Peter Eisentraut authored
      These were probably copied in by accident.
      
      From: Clément Prévost <prevostclement@gmail.com>
    • Fix assorted problems in recovery tests · b26f7fa6
      Alvaro Herrera authored
      In test 001_stream_rep we're using pg_stat_replication.write_location to
      determine catch-up status, but we care about xlog having been applied,
      not just received, so change that to apply_location.
      
      In test 003_recovery_targets, we query the database for a recovery
      target specification and later for the xlog position supposedly
      corresponding to that recovery specification.  If for whatever reason
      more WAL is written between the two queries, the recovery specification
      is earlier than the xlog position used by the query in the test harness,
      so we wait forever, leading to test failures.  Deal with this by using a
      single query to extract both items.  In 2a0f89cd we tried to deal
      with it by giving them more tests to run, but in hindsight that was
      obviously doomed to failure (no revert of that, though).
      
      Per hamster buildfarm failures.
      
      Author: Michaël Paquier
    • doc: Change recommendation to put NOTIFY into a rule · 69bdfc40
      Peter Eisentraut authored
      Suggest a statement trigger instead.
    • Add OldSnapshotTimeMapLock to wait_event table in docs. · c93d8737
      Kevin Grittner authored
      Ashutosh Sharma with minor fixes by me.
    • C comment: fix typo · 6eb5b05d
      Bruce Momjian authored
      Author: Amit Langote
    • doc: Remove slightly confusing xreflabels · 0a4d67b1
      Peter Eisentraut authored
      It seems clearer to refer to these tables in the normal way.
    • Small wording tweaks · 07104991
      Peter Eisentraut authored
      Dmitry Igrishin
  5. 02 Aug, 2016 7 commits
    • Remove duplicate InitPostmasterChild() call while starting a bgworker. · c6ea616f
      Tom Lane authored
      This is apparently harmless on Windows, but on Unix it results in an
      assertion failure.  We'd not noticed because this code doesn't get
      used on Unix unless you build with -DEXEC_BACKEND.  Bug was evidently
      introduced by sloppy refactoring in commit 31c45316.
      
      Thomas Munro
      
      Discussion: <CAEepm=1VOnbVx4wsgQFvj94hu9jVt2nVabCr7QiooUSvPJXkgQ@mail.gmail.com>
    • doc: OS collation changes can break indexes · a253a885
      Bruce Momjian authored
      Discussion: 20160702155517.GD18610@momjian.us
      
      Reviewed-by: Christoph Berg
      
      Backpatch-through: 9.1
    • Block interrupts during HandleParallelMessages(). · b6a97b91
      Tom Lane authored
      As noted by Alvaro, there are CHECK_FOR_INTERRUPTS() calls in the shm_mq.c
      functions called by HandleParallelMessages().  I believe they're all
      unreachable since we always pass nowait = true, but it doesn't seem like
      a great idea to assume that no such call will ever be reachable from
      HandleParallelMessages().  If that did happen, there would be a risk of a
      recursive call to HandleParallelMessages(), which it does not appear to be
      designed for --- for example, there's nothing that would prevent
      out-of-order processing of received messages.  And certainly such cases
      cannot easily be tested.  So let's prevent it by holding off interrupts for
      the duration of the function.  Back-patch to 9.5 which contains identical
      code.
      
      Discussion: <14869.1470083848@sss.pgh.pa.us>
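      
      The shape of the fix is simply to bracket the function body (a sketch;
      the elided middle is the existing message-draining loop):
      
          void
          HandleParallelMessages(void)
          {
              /*
               * We're called from ProcessInterrupts(); keep any nested
               * CHECK_FOR_INTERRUPTS() in shm_mq code from re-entering us.
               */
              HOLD_INTERRUPTS();
          
              /* ... drain and dispatch each worker's error/notice queue ... */
          
              RESUME_INTERRUPTS();
          }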
    • Change minimum max_worker_processes from 1 to 0 · c4d3a039
      Peter Eisentraut authored
      Setting it to 0 is probably not useful in practice, but it allows
      testing of situations without available background worker slots.
    • Fix pg_dump's handling of public schema with both -c and -C options. · e2e95f5e
      Tom Lane authored
      Since -c plus -C requests dropping and recreating the target database
      as a whole, not dropping individual objects in it, we should assume that
      the public schema already exists and need not be created.  The previous
      coding considered only the state of the -c option, so it would emit
      "CREATE SCHEMA public" anyway, leading to an unexpected error in restore.
      
      Back-patch to 9.2.  Older versions did not accept -c with -C so the
      issue doesn't arise there.  (The logic being patched here dates to 8.0,
      cf commit 2193121f, so it's not really wrong that it didn't consider
      the case at the time.)
      
      Note that versions before 9.6 will still attempt to emit REVOKE/GRANT
      on the public schema; but that happens without -c/-C too, and doesn't
      seem to be the focus of this complaint.  I considered extending this
      stanza to also skip the public schema's ACL, but that would be a
      misfeature, as it'd break cases where users intentionally changed that
      ACL.  The real fix for this aspect is Stephen Frost's work to not dump
      built-in ACLs, and that's not going to get back-ported.
      
      Per bugs #13804 and #14271.  Solution found by David Johnston and later
      rediscovered by me.
      
      Report: <20151207163520.2628.95990@wrigleys.postgresql.org>
      Report: <20160801021955.1430.47434@wrigleys.postgresql.org>
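      
      In outline, the fix is a one-condition change (treat this as a sketch:
      the DumpOptions field names are approximate, and q stands in for the
      output buffer):
      
          if (dopt->outputClean && !dopt->outputCreateDB)
          {
              /*
               * -c without -C drops objects individually, including the
               * public schema, so recreate it.  With -C the recreated
               * database already contains a public schema.
               */
              appendPQExpBufferStr(q, "CREATE SCHEMA public;\n");
          }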
    • doc: Whitespace fixes in man pages · e9888c2a
      Peter Eisentraut authored
    • Peter Eisentraut authored · f6ced51f
  6. 01 Aug, 2016 6 commits
    • Minor cleanup for access/transam/parallel.c. · a5fe473a
      Tom Lane authored
      ParallelMessagePending *must* be marked volatile, because it's set
      by a signal handler.  On the other hand, it's pointless for
      HandleParallelMessageInterrupt to save/restore errno; that must be,
      and is, done at the outer level of the SIGUSR1 signal handler.
      
      Calling CHECK_FOR_INTERRUPTS() inside HandleParallelMessages, which itself
      is called from CHECK_FOR_INTERRUPTS(), seems both useless and hazardous.
      The comment claiming that this is needed to handle the error queue going
      away is certainly misguided, in any case.
      
      Improve a couple of error message texts, and use
      ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE to report loss of parallel worker
      connection, since that's what's used in e.g. tqueue.c.  (Maybe it would be
      worth inventing a dedicated ERRCODE for this type of failure?  But I do not
      think ERRCODE_INTERNAL_ERROR is appropriate.)
      
      Minor stylistic cleanups.
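      
      For illustration, the relevant pieces end up looking like this
      (paraphrased from access/transam/parallel.c):
      
          /* set from a signal handler, hence must be volatile */
          volatile bool ParallelMessagePending = false;
          
          /* SIGUSR1 sub-handler: no errno save/restore here; that is the
           * outer signal handler's job. */
          void
          HandleParallelMessageInterrupt(void)
          {
              InterruptPending = true;
              ParallelMessagePending = true;
              SetLatch(MyLatch);
          }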
    • Don't CHECK_FOR_INTERRUPTS between WaitLatch and ResetLatch. · 887feefe
      Tom Lane authored
      This coding pattern creates a race condition, because if an interesting
      interrupt happens after we've checked InterruptPending but before we reset
      our latch, the latch-setting done by the signal handler would get lost,
      and then we might block at WaitLatch in the next iteration without ever
      noticing the interrupt condition.  You can put the CHECK_FOR_INTERRUPTS
      before WaitLatch or after ResetLatch, but not between them.
      
      Aside from fixing the bugs, add some explanatory comments to latch.h
      to perhaps forestall the next person from making the same mistake.
      
      In HEAD, also replace gather_readnext's direct call of
      HandleParallelMessages with CHECK_FOR_INTERRUPTS.  It does not seem clean
      or useful for this one caller to bypass ProcessInterrupts and go straight
      to HandleParallelMessages; not least because that fails to consider the
      InterruptPending flag, resulting in useless work both here
      (if InterruptPending isn't set) and in the next CHECK_FOR_INTERRUPTS call
      (if it is).
      
      This thinko seems to have been introduced in the initial coding of
      storage/ipc/shm_mq.c (commit ec9037df), and then blindly copied into all
      the subsequent parallel-query support logic.  Back-patch relevant hunks
      to 9.4 to extirpate the error everywhere.
      
      Discussion: <1661.1469996911@sss.pgh.pa.us>
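      
      As a sketch, the safe orderings in a typical wait loop are:
      
          for (;;)
          {
              CHECK_FOR_INTERRUPTS();         /* OK: before WaitLatch */
          
              if (work_is_done())             /* hypothetical exit test */
                  break;
          
              WaitLatch(MyLatch, WL_LATCH_SET, 0);
              ResetLatch(MyLatch);
              CHECK_FOR_INTERRUPTS();         /* also OK: after ResetLatch */
          
              /*
               * Never between WaitLatch and ResetLatch: a latch set by a
               * signal handler there is wiped out by ResetLatch, and the
               * next WaitLatch can block with the interrupt unserviced.
               */
          }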
    • Remove unused arguments from pg_replication_origin_xact_reset function. · dd5eb805
      Fujii Masao authored
      The documentation specifies that the pg_replication_origin_xact_reset
      function doesn't take any arguments, but it was actually defined with
      two argument variables that were never used; that is, the pg_proc entry
      for the function was incorrect.  This patch fixes the pg_proc entry and
      removes the two arguments from the function definition.
      
      No back-patch because this change needs a catalog version bump, although
      the issue exists in 9.5 as well.  Instead, a note about the unused
      argument variables will be added to the 9.5 documentation later.
      
      Catalog version bumped due to the change of pg_proc.
    • Bruce Momjian authored · 878bd9ac
    • Fix pg_basebackup so that it accepts 0 as a valid compression level. · 74d8c95b
      Fujii Masao authored
      The help message for pg_basebackup specifies that the numbers 0 through 9
      are accepted as valid values of the -Z option.  But previously, -Z 0 was
      rejected as an invalid compression level.
      
      Per discussion, it's better to make pg_basebackup treat 0 as valid
      compression level meaning no compression, like pg_dump.
      
      Back-patch to all supported versions.
      
      Reported-By: Jeff Janes
      Reviewed-By: Amit Kapila
      Discussion: CAMkU=1x+GwjSayc57v6w87ij6iRGFWt=hVfM0B64b1_bPVKRqg@mail.gmail.com
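      
      The fix is essentially a one-character change in the option check
      (a sketch of the getopt handling, paraphrased):
      
          case 'Z':
              compresslevel = atoi(optarg);
              /* was "compresslevel <= 0", which wrongly rejected -Z 0 */
              if (compresslevel < 0 || compresslevel > 9)
              {
                  fprintf(stderr, _("%s: invalid compression level \"%s\"\n"),
                          progname, optarg);
                  exit(1);
              }
              break;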
  7. 31 Jul, 2016 3 commits
    • Doc: remove claim that hash index creation depends on effective_cache_size. · 11653cd8
      Tom Lane authored
      This text was added by commit ff213239, and not long thereafter obsoleted
      by commit 4adc2f72 (which made the test depend on NBuffers instead); but
      nobody noticed the need for an update.  Commit 9563d5b5 adds some further
      dependency on maintenance_work_mem, but the existing verbiage seems to
      cover that with about as much precision as we really want here.  Let's
      just take it all out rather than leaving ourselves open to more errors of
      omission in future.  (That solution makes this change back-patchable, too.)
      
      Noted by Peter Geoghegan.
      
      Discussion: <CAM3SWZRVANbj9GA9j40fAwheQCZQtSwqTN1GBTVwRrRbmSf7cg@mail.gmail.com>
    • Code review for tqueue.c: fix memory leaks, speed it up, other fixes. · a9ed875f
      Tom Lane authored
      When doing record typmod remapping, tqueue.c did fresh catalog lookups
      for each tuple it processed, which was pretty horrible performance-wise
      (it seemed to about halve the already none-too-quick speed of bulk reads
      in parallel mode).  Worse, it insisted on putting bits of that data into
      TopMemoryContext, from where it never freed them, causing a
      session-lifespan memory leak.  (I suppose this was coded with the idea
      that the sender process would quit after finishing the query ---
      but the receiver uses the same code.)
      
      Restructure to avoid repetitive catalog lookups and to keep that data
      in a query-lifespan context, in or below the context where the
      TQueueDestReceiver or TupleQueueReader itself lives.
      
      Fix some other bugs such as continuing to use a tupledesc after
      releasing our refcount on it.  Clean up cavalier datatype choices
      (typmods are int32, please, not int, and certainly not Oid).  Improve
      comments and error message wording.
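      
      The caching idea can be pictured as a small query-lifespan hash table
      (the names here are hypothetical, not the committed ones): one catalog
      lookup per distinct record typmod, instead of one per tuple:
      
          /* Hypothetical cache entry, keyed by record typmod. */
          typedef struct RecordTypmodEntry
          {
              int32       typmod;     /* hash key: int32, not int (or Oid!) */
              TupleDesc   tupledesc;  /* result of the one-time lookup */
          } RecordTypmodEntry;
          
          static HTAB *
          make_typmod_cache(MemoryContext queryctx)
          {
              HASHCTL     ctl;
          
              MemSet(&ctl, 0, sizeof(ctl));
              ctl.keysize = sizeof(int32);
              ctl.entrysize = sizeof(RecordTypmodEntry);
              ctl.hcxt = queryctx;    /* query-lifespan, not TopMemoryContext */
              return hash_create("tqueue record typmod cache", 64, &ctl,
                                 HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
          }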
    • Correctly handle owned sequences with extensions · f9e439b1
      Stephen Frost authored
      With the refactoring of pg_dump to handle components, getOwnedSeqs needs
      to be a bit more intelligent regarding which components to dump when.
      Specifically, we can't simply use the owning table's components as the
      set of components to dump, since the table might include only certain
      components while all components of the sequence should be dumped; for
      example, when the table is a member of an extension while the sequence
      is not.
      
      Handle this by combining the set of components to be dumped for the
      sequence explicitly and those to be dumped for the table when setting
      the components to be dumped for the sequence.
      
      Also add a number of regression tests around this to, hopefully, catch
      any future changes which break the expected behavior.
      
      Discovered by: Philippe BEAUDOIN
      Reviewed by: Michael Paquier
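      
      In outline (paraphrased from the description above), the sequence's set
      of components to dump becomes the union of its own and the owning
      table's:
      
          /* owning_tab is hypothetical; dump is a bitmask of components */
          seqinfo->dobj.dump = seqinfo->dobj.dump | owning_tab->dobj.dump;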