Commits · 9e6b1bf258170e62dac555fc82ff0536dfe01d29 · Abuhujair Javed / Postgres FD Implementation

14 Jun, 2014 2 commits

Noah Misch authored Jun 14, 2014

This function is pervasive on free software operating systems; import
NetBSD's implementation.  Back-patch to 8.4, like the commit that will
harness it.

9e6b1bf2

Change the signature of rm_desc so that it's passed a XLogRecord. · 0ef0b678
Heikki Linnakangas authored Jun 14, 2014
```
Just feels more natural, and is more consistent with rm_redo.
```

13 Jun, 2014 5 commits

Harden pg_filenode_relation test against concurrent DROP TABLE. · f3fdd257
Noah Misch authored Jun 13, 2014
```
Per buildfarm member prairiedog.  Back-patch to 9.4, where the test was
introduced.

Reviewed by Tom Lane.
```
Adjust 9.4 release notes. · a7205d81
Noah Misch authored Jun 13, 2014
```
Back-patch to 9.4.
```
emacs.samples: Reliably override ".dir-locals.el". · 81300ea4
Noah Misch authored Jun 13, 2014
```
Back-patch to 9.4, where .dir-locals.el was introduced.
```

Improve predtest.c's ability to reason about operator expressions. · 3f8c23c4

Tom Lane authored Jun 13, 2014

We have for a long time been able to prove implications and refutations
between clauses structured like "expr op const" with the same subexpression
and btree-related operators; for example that "x < 4" implies "x <= 5".
The implication machinery is needed to detect usability of partial indexes,
and the refutation machinery is needed to implement constraint exclusion.

This patch extends that machinery to make proofs for operator expressions
involving the same two immutable-but-not-necessarily-just-Const input
expressions, ie does "expr1 op1 expr2" prove or refute "expr1 op2 expr2" or
"expr2 op2 expr1"? An important example is that we can now prove "x = y"
given "y = x", which formerly the code could not deduce unless x or y was a
constant. We can make use of the system's knowledge of operator commutator
and negator pairs, and can also make use of btree opclass relationships,
for example "x < y" implies "x <= y" and refutes "x > y" (notice that
neither of these could be proven just from commutator or negator links).
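
As a rough illustration of the deductions described above (not predtest.c's actual code), the sketch below models clauses as `(left, op, right)` tuples. The `COMMUTATOR`, `IMPLIES`, and `REFUTES` tables are hypothetical stand-ins for the commutator links and btree opclass relationships the system really consults.

```python
# Hypothetical stand-ins for catalog knowledge; these tables are
# assumptions made for this sketch, not PostgreSQL data structures.
COMMUTATOR = {"<": ">", "<=": ">=", "=": "=", ">=": "<=", ">": "<"}

# For "a OP b" known true: operators OP2 such that "a OP2 b" must also be true.
IMPLIES = {
    "<": {"<", "<="},
    "<=": {"<="},
    "=": {"=", "<=", ">="},
    ">=": {">="},
    ">": {">", ">="},
}

# For "a OP b" known true: operators OP2 such that "a OP2 b" must be false.
REFUTES = {
    "<": {"=", ">=", ">"},
    "<=": {">"},
    "=": {"<", ">"},
    ">=": {"<"},
    ">": {"=", "<=", "<"},
}


def _op_in_given_order(given, goal):
    """Return goal's operator rewritten to use given's operand order, or None."""
    left, _, right = given
    a, op, b = goal
    if (a, b) == (left, right):
        return op
    if (a, b) == (right, left):
        return COMMUTATOR[op]  # "b OP a" is the same as "a COMMUTATOR(OP) b"
    return None  # different subexpressions: nothing can be proven


def implies(given, goal):
    """True if 'given' being true proves 'goal' true."""
    op = _op_in_given_order(given, goal)
    return op is not None and op in IMPLIES[given[1]]


def refutes(given, goal):
    """True if 'given' being true proves 'goal' false."""
    op = _op_in_given_order(given, goal)
    return op is not None and op in REFUTES[given[1]]
```

Here `implies(("y", "=", "x"), ("x", "=", "y"))` succeeds only through the commutator table, while `implies(("x", "<", "y"), ("x", "<=", "y"))` relies on the ordering table alone, mirroring the two kinds of knowledge the message mentions.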

Inspired by a gripe from Brian Dunavant. This seems more like a new
feature than a bug fix, though, so no back-patch.


Fix pg_restore's processing of old-style BLOB COMMENTS data. · c81e63d8

Tom Lane authored Jun 12, 2014

Prior to 9.0, pg_dump handled comments on large objects by dumping a bunch
of COMMENT commands into a single BLOB COMMENTS archive object. With
sufficiently many such comments, some of the commands would likely get
split across bufferloads when restoring, causing failures in
direct-to-database restores (though no problem would be evident in text
output). This is the same type of issue we have with table data dumped as
INSERT commands, and it can be fixed in the same way, by using a mini SQL
lexer to figure out where the command boundaries are. Fortunately, the
COMMENT commands are no more complex to lex than INSERTs, so we can just
re-use the existing lexer for INSERTs.
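
A minimal sketch of the boundary-finding idea, assuming only single-quoted literals need tracking; it is not pg_restore's lexer, and the class name `StatementSplitter` is hypothetical. It ignores backslash escapes and other quoting forms.

```python
class StatementSplitter:
    """Accumulate arbitrary chunks and return only complete statements."""

    def __init__(self):
        self._pending = []      # characters of the statement seen so far
        self._in_quote = False  # True while inside a '...' literal

    def feed(self, chunk):
        """Consume one bufferload; return the statements it completed."""
        complete = []
        for ch in chunk:
            self._pending.append(ch)
            if ch == "'":
                # A doubled '' inside a literal toggles twice, so the
                # quote state is still correct afterwards.
                self._in_quote = not self._in_quote
            elif ch == ";" and not self._in_quote:
                complete.append("".join(self._pending).strip())
                self._pending = []
        return complete
```

Because the quote state and the partial statement survive between `feed()` calls, a command split across two bufferloads is emitted once, whole, after its terminating semicolon arrives.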

Per bug #10611 from Jacek Zalewski. Back-patch to all active branches.


12 Jun, 2014 9 commits

Improve tuplestore's error messages for I/O failures. · 6554656e

Tom Lane authored Jun 12, 2014

We should report the errno when we get a failure from functions like
BufFileWrite.  "ERROR: write failed" is unreasonably taciturn for a
case that's well within the realm of possibility; I've seen it a
couple times in the buildfarm recently, in situations that were
probably out-of-disk-space, but it'd be good to see the errno
to confirm it.

I think this code was originally written without assuming that
the buffile.c functions would return useful errno; but most other
callers *are* assuming that, and a quick look at the buffile code
gives no reason to suppose otherwise.

Also, a couple of the old messages were phrased on the assumption
that a short read might indicate a logic bug in tuplestore itself;
but that code's pretty well tested by now, so a filesystem-level
problem seems much more likely.
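
To illustrate the reporting style argued for here, the following sketch attaches the OS-level reason to a write failure. It is a Python analogy rather than the tuplestore code, and the message wording and function names are hypothetical.

```python
import errno
import os


def describe_write_failure(exc):
    """Build an error message that includes the OS-level reason."""
    # The message wording here is hypothetical, not PostgreSQL's actual text.
    return f"could not write to temporary file: {os.strerror(exc.errno)}"


def write_all(fileobj, data):
    """Write data, converting an OSError into a descriptive RuntimeError."""
    try:
        fileobj.write(data)
    except OSError as exc:
        raise RuntimeError(describe_write_failure(exc)) from exc
```

With the errno text included, an out-of-disk-space condition identifies itself in the log instead of appearing as a bare "write failed".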


Adjust largeobject regression test to leave a couple of LOs behind. · 70ad7ed4

Tom Lane authored Jun 12, 2014

Since we commonly test pg_dump/pg_restore by seeing whether they can dump
and restore the regression test database, it behooves us to include some
large objects in that test scenario.

I tried to include a comment on one of these large objects to improve
the test scenario further ... but it turns out that pg_upgrade fails to
preserve comments on large objects, and its regression test notices
the discrepancy. So uncommenting that COMMENT is a TODO for later.


Preserve exposed type of subquery outputs when substituting NULLs. · 9d4444a6
Tom Lane authored Jun 12, 2014
```
I thought I could get away with hardcoded int4 here, but the buildfarm
says differently.
```

Remove inadvertent copyright violation in largeobject regression test. · d2783bee

Tom Lane authored Jun 12, 2014

Robert Frost is no longer with us, but his copyrights still are, so
let's stop using "Stopping by Woods on a Snowy Evening" as test data
before somebody decides to sue us. Wordsworth is more safely dead.


Add regression test to prevent future breakage of legacy query in libpq. · 2dd352d4

Tom Lane authored Jun 12, 2014

Memorialize the expected output of the query that libpq has been using for
many years to get the OIDs of large-object support functions. Although
we really ought to change the way libpq does this, we must expect that
this query will remain in use in the field for the foreseeable future,
so until we're ready to break compatibility with old libpq versions
we'd better check the results stay the same. See the recent lo_create()
fiasco.


Rename lo_create(oid, bytea) to lo_from_bytea(). · 154146d2

Tom Lane authored Jun 12, 2014

The previous naming broke the query that libpq's lo_initialize() uses
to collect the OIDs of the server-side functions it requires, because
that query effectively assumes that there is only one function named
lo_create in the pg_catalog schema (and likewise only one lo_open, etc).

While we should certainly make libpq more robust about this, the naive
query will remain in use in the field for the foreseeable future, so it
seems the only workable choice is to use a different name for the new
function. lo_from_bytea() won a small straw poll.
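
The fragility can be sketched as a name-keyed lookup that assumes each name appears once. This is an illustration of the failure mode, not libpq's code; the function names and OID values are made up.

```python
def collect_function_oids(rows):
    """Map function name -> OID, assuming each name appears exactly once."""
    oids = {}
    for name, oid in rows:
        # A second row with the same name silently replaces the first.
        oids[name] = oid
    return oids


def ambiguous_names(rows):
    """Return the names that occur more than once in the result rows."""
    seen, dupes = set(), set()
    for name, _ in rows:
        if name in seen:
            dupes.add(name)
        seen.add(name)
    return dupes


# Hypothetical OIDs, for illustration only.
before_rename = [("lo_open", 1001), ("lo_create", 1002), ("lo_create", 1003)]
after_rename = [("lo_open", 1001), ("lo_create", 1002), ("lo_from_bytea", 1003)]
```

With `before_rename`, the lookup ends up with whichever `lo_create` row came last; with `after_rename`, every name is unique again and the naive lookup is safe.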

Back-patch into 9.4 where the new function was introduced.


Fix typos · 79379107
Alvaro Herrera authored Feb 07, 2014


Remove unnecessary output expressions from unflattened subqueries. · 55d5b3c0

Tom Lane authored Jun 12, 2014

If a sub-select-in-FROM gets flattened into the upper query, then we
naturally get rid of any output columns that are defined in the sub-select
text but not actually used in the upper query. However, this doesn't
happen when it's not possible to flatten the subquery, for example because
it contains GROUP BY, LIMIT, etc. Allowing the subquery to compute useless
output columns is often fairly harmless, but sometimes it has significant
performance cost: the unused output might be an expensive expression,
or it might be a Var from a relation that we could remove entirely (via
the join-removal logic) if only we realized that we didn't really need
that Var. Situations like this are common when expanding views, so it
seems worth taking the trouble to detect and remove unused outputs.

Because the upper query's Var numbering for subquery references depends on
positions in the subquery targetlist, we don't want to renumber the items
we leave behind. Instead, we can implement "removal" by replacing the
unwanted expressions with simple NULL constants. This wastes a few cycles
at runtime, but not enough to justify more work in the planner.
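
A minimal sketch of the "replace, don't renumber" idea, assuming a target list is just a Python list and output positions are 1-based. The function name is hypothetical and `None` stands in for a NULL constant.

```python
def null_out_unused(targetlist, used_positions):
    """Replace unused outputs with None while keeping every position intact."""
    return [
        expr if pos in used_positions else None
        for pos, expr in enumerate(targetlist, start=1)
    ]
```

Since the list keeps its length, a reference to output 3 from the upper query still points at the same slot after output 2 has been nulled out.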


Consistency improvements for slot and decoding code. · e04a9ccd

Andres Freund authored Jun 12, 2014

Change the order of checks in similar functions to be the same; remove
a parameter that's not needed anymore; rename a memory context and
expand a couple of comments.

Per review comments from Amit Kapila.


11 Jun, 2014 7 commits

Have configuration templates augment, not replace, LDFLAGS. · 4d92b158

Noah Misch authored Jun 11, 2014

This preserves user-specified LDFLAGS; we already kept user-specified
CFLAGS and CPPFLAGS. Given the shortage of complaints and the fact that
any problem caused is likely to appear at build time, no back-patch.

Dag-Erling Smørgrav and Noah Misch


Consistently define BUILDING_DLL during builds of src/port for Windows. · bd31794d

Noah Misch authored Jun 11, 2014

The MSVC build process already did so; this fixes the principal build
process to match. Both processes already did likewise for src/common.
This lets server builds of src/port reference postgres.exe data symbols.


Fix typos in comments. · d098b236
Noah Misch authored Jun 11, 2014

Fix typos in comments. · a26ae56f
Fujii Masao authored Jun 11, 2014


Fix ancient encoding error in hungarian.stop. · fd90b5d5

Tom Lane authored Jun 10, 2014

When we grabbed this file off the Snowball project's website, we mistakenly
supposed that it was in LATIN1 encoding, but evidently it was actually in
LATIN2. This resulted in ő (o-double-acute, U+0151, which is code 0xF5 in
LATIN2) being misconverted into õ (o-tilde, U+00F5), as complained of in
bug #10589 from Zoltán Sörös. We'd have messed up u-double-acute too,
but there aren't any of those in the file. Other characters used in the
file have the same codes in LATIN1 and LATIN2, which no doubt helped hide
the problem for so long.
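
The misconversion can be reproduced with Python's standard codecs. This is only a demonstration of the two decodings, not the conversion procedure actually used for the stopword file; the 0xFB byte for u-double-acute is not stated in the message and is added here as a known property of LATIN2.

```python
raw = b"\xf5"  # the byte used for o-double-acute in a LATIN2 file

as_latin1 = raw.decode("latin-1")    # the mistaken assumption
as_latin2 = raw.decode("iso8859_2")  # the file's real encoding

print(f"LATIN1 reading: U+{ord(as_latin1):04X}")
print(f"LATIN2 reading: U+{ord(as_latin2):04X}")

# u-double-acute would have been hit the same way (0xFB in LATIN2).
u_latin1 = b"\xfb".decode("latin-1")
u_latin2 = b"\xfb".decode("iso8859_2")

# A plain ASCII letter decodes identically under both encodings.
same = b"a".decode("latin-1") == b"a".decode("iso8859_2")
```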

The error is not only ours: the Snowball project also was confused about
which encoding is required for Hungarian. But dealing with that will
require source-code changes that I'm not at all sure we'll wish to
back-patch. Fixing the stopword file seems reasonably safe to back-patch
however.


Stamp shared-library minor version numbers for 9.5. · 3bd82dd3
Tom Lane authored Jun 10, 2014

Stamp HEAD as 9.5devel. · a24c104b
Tom Lane authored Jun 10, 2014
```
Let the hacking begin ...
```

10 Jun, 2014 1 commit

Forward-port regression test for bug #10587 into 9.3 and HEAD. · ab76208e

Tom Lane authored Jun 09, 2014

Although this bug is already fixed in post-9.2 branches, the case
triggering it is quite different from what was under consideration
at the time. It seems worth memorializing this example in HEAD
just to make sure it doesn't get broken again in future.

Extracted from commit 187ae17300776f48b2bd9d0737923b1bf70f606e.


09 Jun, 2014 2 commits

Fix infinite loop when splitting inner tuples in SPGiST text indexes. · c170655c

Tom Lane authored Jun 09, 2014

Previously, the code used a node label of zero both for strings that
contain no bytes beyond the inner tuple's prefix, and for cases where an
"allTheSame" inner tuple has to be split to allow a string with a different
next byte to be inserted into it. Failing to distinguish these cases meant
that if a string ending with the current prefix needed to be inserted into
an allTheSame tuple, we got into an infinite loop, because after splitting
the tuple we'd descend into the child allTheSame tuple and then find we
need to split again.

To fix, instead use -1 and -2 as the node labels for these two cases.
This requires widening the node label type from "char" to int2, but
fortunately SPGiST stores all pass-by-value node label types in their
Datum representation, which means that this change is transparently upward
compatible so far as the on-disk representation goes. We continue to
recognize zero as a dummy node label for reading purposes, but will not
attempt to push new index entries down into such a label, so that the loop
won't occur even when dealing with an existing index.

Per report from Teodor Sigaev. Back-patch to 9.2 where the faulty
code was introduced.


Wrap multixact/members correctly during extension, take 2 · b0b263ba

Alvaro Herrera authored Jun 09, 2014

In a50d9762 I already changed this, but got it wrong for the case
where the number of members is larger than the number of entries that
fit in the last page of the last segment.

As reported by Serge Negodyuck in a followup to bug #8673.


05 Jun, 2014 7 commits

Fix off-by-one in decoding causing one-record events to be skipped. · fe7337f2

Andres Freund authored Jun 05, 2014

A ReorderBufferTransaction's end_lsn, the sentPtr advertised by
walsender keepalive messages, and the end location remembered by the
decoding get_*changes* SQL functions all use the location of the last
read record + 1. I.e. the LSN points to the beginning of the next
record. That cannot realistically be changed without changing the
replication protocol because that's how keepalive messages have worked
since 9.0.

The bug is that the logic inside the snapshot builder, which decides
whether a transaction's contents should be decoded, assumed the start
location would point towards the last byte of the last record. The
reason this didn't actually cause visible problems is that currently
that decision is only made for commit records. Since interesting
transactions always have at least one additional record - containing
actual data - we'd never skip a transaction.

But if there ever were transactions, or other events, with just one
record containing important information, we'd skip them after stopping
and restarting logical decoding.
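
The off-by-one can be pictured with a toy model in which each record is identified by its start LSN and the restart point is the start of the next unread record. The function name, the flag, and the LSN values are hypothetical; this is not the snapshot builder's code.

```python
def records_to_decode(record_starts, restart_lsn, assume_last_byte):
    """Select the records to decode after restarting at restart_lsn."""
    if assume_last_byte:
        # Buggy reading: restart_lsn is taken to lie inside the last consumed
        # record, so a record starting exactly there is wrongly skipped.
        return [lsn for lsn in record_starts if lsn > restart_lsn]
    # Correct reading: restart_lsn is the start of the next record to decode.
    return [lsn for lsn in record_starts if lsn >= restart_lsn]


# Hypothetical LSNs: the record at 100 was consumed, so the restart point is 140.
starts = [100, 140, 180]
skipped_one = records_to_decode(starts, 140, assume_last_byte=True)
decoded_all = records_to_decode(starts, 140, assume_last_byte=False)
```

In this model the record at 140 is lost under the buggy reading, which is harmless only as long as every interesting event spans more than one record.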


Add defenses against running with a wrong selection of LOBLKSIZE. · 5f93c378

Tom Lane authored Jun 05, 2014

It's critical that the backend's idea of LOBLKSIZE match the way data has
actually been divided up in pg_largeobject. While we don't provide any
direct way to adjust that value, doing so is a one-line source code change
and various people have expressed interest recently in changing it. So,
just as with TOAST_MAX_CHUNK_SIZE, it seems prudent to record the value in
pg_control and cross-check that the backend's compiled-in setting matches
the on-disk data.

Also tweak the code in inv_api.c so that fetches from pg_largeobject
explicitly verify that the length of the data field is not more than
LOBLKSIZE. Formerly we just had Asserts() for that, which is no protection
at all in production builds. In some of the call sites an overlength data
value would translate directly to a security-relevant stack clobber, so it
seems worth one extra runtime comparison to be sure.

In the back branches, we can't change the contents of pg_control; but we
can still make the extra checks in inv_api.c, which will offer some amount
of protection against running with the wrong value of LOBLKSIZE.
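
Both defenses can be sketched in a few lines, assuming a dictionary stands in for pg_control and 2048 is merely an illustrative compiled-in value. The exception and function names are hypothetical. The explicit `if` mirrors the point about Asserts: Python's `assert` statements are likewise removed when running with `-O`.

```python
LOBLKSIZE = 2048  # illustrative compiled-in value for this sketch


class ConfigMismatch(Exception):
    """Compiled-in setting disagrees with what the data directory recorded."""


class CorruptData(Exception):
    """Stored chunk is larger than the compiled-in block size allows."""


def check_control_data(control):
    """Cross-check the recorded value against the compiled-in one at startup."""
    recorded = control["loblksize"]
    if recorded != LOBLKSIZE:
        raise ConfigMismatch(
            f"data was initialized with LOBLKSIZE {recorded}, "
            f"but this build uses {LOBLKSIZE}"
        )


def fetch_chunk(data):
    """Return a chunk only after verifying its length at runtime."""
    # An explicit check, unlike an assert, stays active in optimized runs.
    if len(data) > LOBLKSIZE:
        raise CorruptData(f"chunk has invalid size {len(data)}")
    return data
```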


Consistently spell a replication slot's name as slot_name. · f0c10856

Andres Freund authored Jun 05, 2014

Previously there's been a mix between 'slotname' and 'slot_name'. It's
not nice to be unnecessarily inconsistent in a new feature. As a post
beta1 initdb now is required in the wake of eeca4cd3, fix the
inconsistencies.

Most of the changes won't affect usage of replication slots because the
majority of changes is around function parameter names. The prominent
exception to that is that the recovery.conf parameter
'primary_slotname' is now named 'primary_slot_name'.


Move regression test listing of builtin leakproof functions to opr_sanity.sql. · e0cb4aa8

Andres Freund authored Jun 05, 2014

The original location in create_function_3.sql didn't invite the close
scrutiny warranted for adding new leakproof functions. Add comments
to the test explaining that functions should only be added after
careful consideration and understanding what a leakproof function is.

Per complaint from Tom Lane after 5eebb8d9.


Adjust SP-GiST WAL record formats to reduce alignment padding. · 8776faa8

Heikki Linnakangas authored Jun 05, 2014

The way the code was written, the padding was copied from uninitialized
memory areas. Because the structs are local variables in the code where
the WAL records are constructed, making them larger and zeroing the padding
bytes would not make the code very pretty, so rather than fixing this
directly by zeroing out the padding bytes, it seems more clear to not try to
align the tuples in the WAL records. The redo functions are taught to copy
the tuple header to a local variable to avoid unaligned access.

Stable-branches have the same problem, but we can't change the WAL format
there, so fix in master only. Reading a few random extra bytes at the stack
is harmless in practice, so it's not worth crafting a different
back-patchable fix.

Per reports from Kevin Grittner and Andres Freund, using clang static
analyzer and Valgrind, respectively.


Tweak new regression test case for better portability. · d4d48a5e

Tom Lane authored Jun 04, 2014

Buildfarm says we get different plans on 32-bit and 64-bit platforms,
probably because of MAXALIGN-related differences in memory-consumption
calculations. Add some dummy WHERE clauses so that the planner estimates
different sizes for the three generate_series() relations; that should
stabilize the choice of join order.


Add btree and hash opclasses for pg_lsn. · 4c8ab1b9

Tom Lane authored Jun 04, 2014

This is needed to allow ORDER BY, DISTINCT, etc to work as expected for
pg_lsn values.

We had previously decided to put this off for 9.5, but in view of commit
eeca4cd3 there's no reason to avoid a
catversion bump for 9.4beta2, and this does make a pretty significant
usability difference for pg_lsn.
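
By way of analogy only: sorting needs a comparison and duplicate elimination needs equality plus hashing, which is roughly what the btree and hash opclasses supply. The `Lsn` class below is a hypothetical Python model of the `hex/hex` text form, not PostgreSQL code.

```python
from functools import total_ordering


@total_ordering
class Lsn:
    """Toy model of an LSN written as two hexadecimal numbers, 'hi/lo'."""

    def __init__(self, text):
        hi, lo = text.split("/")
        self.value = (int(hi, 16) << 32) | int(lo, 16)

    def __eq__(self, other):
        if not isinstance(other, Lsn):
            return NotImplemented
        return self.value == other.value

    def __lt__(self, other):  # ordering support, as needed for sorting
        if not isinstance(other, Lsn):
            return NotImplemented
        return self.value < other.value

    def __hash__(self):  # hashing support, as needed for sets and dicts
        return hash(self.value)


ordered = sorted([Lsn("1/0"), Lsn("0/FFFFFFFF")])
distinct = {Lsn("0/A"), Lsn("0/a")}
```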

Michael Paquier, with fixes from Andres Freund and Tom Lane


04 Jun, 2014 5 commits

Bump PG_CONTROL_VERSION for previous 9.4 changes. · eeca4cd3

Tom Lane authored Jun 04, 2014

This should have been done in 6bc8ef0b
and/or 50e54709, but better late than
never.  If we don't change this then we risk 9.3 pg_controldata or
pg_resetxlog being inappropriately used against a 9.4 pg_control file,
or vice versa.


Fix longstanding bug in HeapTupleSatisfiesVacuum(). · 621a99a6

Andres Freund authored Jun 04, 2014

HeapTupleSatisfiesVacuum() didn't properly discern between
DELETE_IN_PROGRESS and INSERT_IN_PROGRESS for rows that have been
inserted in the current transaction and deleted in an aborted
subtransaction of the current backend. At the very least that caused
problems for CLUSTER and CREATE INDEX in transactions that had
aborting subtransactions producing rows, leading to warnings like:
WARNING:  concurrent delete in progress within table "..."
possibly in an endless, uninterruptible, loop.

Instead of treating *InProgress xmins the same as *IsCurrent ones,
treat them as being distinct like the other visibility routines. As
implemented this separation can cause a behaviour change for rows
that have been inserted and deleted in another, still running,
transaction. HTSV will now return INSERT_IN_PROGRESS instead of
DELETE_IN_PROGRESS for those. That's both more in line with the other
visibility routines and arguably more correct. The latter because an
INSERT_IN_PROGRESS will make callers look at/wait for xmin, instead of
xmax.

The only current caller where that's possibly worse than the old
behaviour is heap_prune_chain() which now won't mark the page as
prunable if a row has concurrently been inserted and deleted. That's
harmless enough.

As a cautionary measure also insert an interrupt check before the gotos
in IndexBuildHeapScan() that lead to the uninterruptible loop. There
are other possible causes, like a row that several sessions try to
update and all fail, for repeated loops and the cost of doing so in
the retry case is low.

As this bug goes back all the way to the introduction of
subtransactions in 573a71a5 backpatch to all supported releases.

Reported-By: Sandro Santilli


Add description of pg_stat directory into doc. · c8c9c1f5
Fujii Masao authored Jun 05, 2014
```
Back-patch to 9.3 where pg_stat directory was introduced.
```

Save pg_stat_statements statistics file into $PGDATA/pg_stat directory at shutdown. · 654e8e44

Fujii Masao authored Jun 04, 2014

187492b6 changed pgstat.c so that
the stats files were saved into $PGDATA/pg_stat directory when the server
was shut down. But it accidentally forgot to change the location of
pg_stat_statements permanent stats file. This commit fixes pg_stat_statements
so that its stats file is also saved into $PGDATA/pg_stat at shutdown.

Since this fix changes the file layout, we don't back-patch it to 9.3
where this oversight was introduced.


Silence Bison deprecation warnings · 55fb759a

Peter Eisentraut authored Jun 03, 2014

Bison >=3.0 issues warnings about

    %name-prefix="base_yy"

instead of the now preferred

    %name-prefix "base_yy"

but the latter doesn't work with Bison 2.3 or less.  So for now we
silence the deprecation warnings.


03 Jun, 2014 2 commits

Use EncodeDateTime instead of to_char to render JSON timestamps. · ab14a73a

Andrew Dunstan authored Jun 03, 2014

Per gripe from Peter Eisentraut and Tom Lane.

The output is slightly different, but still ISO 8601 compliant: to_char
doesn't output the minutes when time zone offset is an integer number of
hours, while EncodeDateTime outputs ":00".

The code is slightly adapted from code in xml.c.
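
The output difference described above can be sketched as two ways of rendering a UTC offset. The function below is a hypothetical illustration in Python, not the to_char or EncodeDateTime implementation.

```python
def format_offset(offset_seconds, always_minutes):
    """Render a UTC offset as +HH or +HH:MM."""
    sign = "+" if offset_seconds >= 0 else "-"
    hours, remainder = divmod(abs(offset_seconds), 3600)
    minutes = remainder // 60
    if minutes == 0 and not always_minutes:
        return f"{sign}{hours:02d}"  # minutes omitted for whole-hour offsets
    return f"{sign}{hours:02d}:{minutes:02d}"


whole_hour_short = format_offset(-5 * 3600, always_minutes=False)
whole_hour_long = format_offset(-5 * 3600, always_minutes=True)
half_hour = format_offset(5 * 3600 + 1800, always_minutes=False)
```

Both renderings of a whole-hour offset denote the same instant, so consumers that parse ISO 8601 properly should accept either.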


Do not escape a unicode sequence when escaping JSON text. · 0ad1a816

Andrew Dunstan authored Jun 03, 2014

Previously, any backslash in text being escaped for JSON was doubled so
that the result was still valid JSON. However, this led to some perverse
results in the case of Unicode sequences. These are now detected and the
initial backslash is no longer escaped. All other backslashes are
still escaped. No validity check is performed; all that is looked for is
\uXXXX where X is a hexadecimal digit.

This is a change from the 9.2 and 9.3 behaviour as noted in the Release
notes.
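
The rule can be approximated with a regular expression that doubles every backslash except one that begins `\uXXXX`. This is a sketch of the described behaviour only, not the server's escaping routine, and the names are hypothetical.

```python
import re

# A backslash NOT followed by 'u' and four hexadecimal digits.
_LONE_BACKSLASH = re.compile(r"\\(?!u[0-9A-Fa-f]{4})")


def escape_backslashes(text):
    """Double backslashes, leaving the one that starts \\uXXXX untouched."""
    # A function replacement avoids re.sub's own backslash processing.
    return _LONE_BACKSLASH.sub(lambda match: "\\\\", text)
```

As in the commit, no validity check is attempted: the pattern only looks at the shape of the sequence, so `\u12` followed by non-hex text is still doubled.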

Per complaint from Teodor Sigaev.
