Commits · ec9037df2634ddcd6a3b036463722c8ee009b132 · Abuhujair Javed / Postgres FD Implementation

14 Jan, 2014 2 commits

Single-reader, single-writer, lightweight shared message queue. · ec9037df

Robert Haas authored Jan 14, 2014

This code provides infrastructure for user backends to communicate
relatively easily with background workers.  The message queue is
structured as a ring buffer and allows messages of arbitrary length
to be sent and received.

Patch by me.  Review by KaiGai Kohei and Andres Freund.

ec9037df
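
As a rough illustration of the ring-buffer idea in commit ec9037df, here is a toy single-reader, single-writer queue with length-prefixed messages. It is a sketch only, not PostgreSQL's C implementation: the class name `RingQueue`, the 4-byte length prefix, and the refusal of messages that do not currently fit are all assumptions made for the example.

```python
import struct


class RingQueue:
    """Toy byte ring buffer carrying length-prefixed messages (hypothetical)."""

    # Assumed framing: a 4-byte little-endian length before each payload.
    HEADER = struct.Struct("<I")

    def __init__(self, capacity):
        self.buf = bytearray(capacity)
        self.capacity = capacity
        self.read_pos = 0   # total bytes consumed by the single reader
        self.write_pos = 0  # total bytes produced by the single writer

    def _free(self):
        # Bytes that can be written without overwriting unread data.
        return self.capacity - (self.write_pos - self.read_pos)

    def _put(self, data):
        for byte in data:
            self.buf[self.write_pos % self.capacity] = byte  # wrap around
            self.write_pos += 1

    def _get(self, count):
        out = bytearray()
        for _ in range(count):
            out.append(self.buf[self.read_pos % self.capacity])
            self.read_pos += 1
        return bytes(out)

    def send(self, payload):
        """Queue one message; return False if it does not fit right now."""
        frame = self.HEADER.pack(len(payload)) + payload
        if len(frame) > self._free():
            return False
        self._put(frame)
        return True

    def receive(self):
        """Return the next whole message, or None if the queue is empty."""
        if self.read_pos == self.write_pos:
            return None
        (length,) = self.HEADER.unpack(self._get(self.HEADER.size))
        return self._get(length)
```

Because the positions only ever grow and are reduced modulo the capacity on access, a message may straddle the end of the buffer and still be read back intact.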

Simple table of contents for a shared memory segment. · 6ddd5137

Robert Haas authored Jan 14, 2014

This interface is intended to make it simple to divide a dynamic shared
memory segment into different regions with distinct purposes.  It
therefore serves much the same purpose that ShmemIndex accomplishes for
the main shared memory segment, but it is intended to be more
lightweight.

Patch by me.  Review by Andres Freund.

6ddd5137
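
The idea of dividing one segment into keyed regions, as described in commit 6ddd5137, can be sketched as follows. This is a simplified model under stated assumptions: `SegmentToc`, `allocate`, and `lookup` are hypothetical names for illustration and do not reproduce the actual C interface.

```python
class SegmentToc:
    """Toy table of contents mapping keys to regions of one segment."""

    def __init__(self, segment_size):
        self.segment_size = segment_size
        self.next_free = 0   # offset of the first unallocated byte
        self.entries = {}    # key -> (offset, size)

    def allocate(self, key, size):
        """Reserve `size` bytes for `key` and return the region's offset."""
        if key in self.entries:
            raise KeyError(f"key {key!r} already registered")
        if self.next_free + size > self.segment_size:
            raise MemoryError("segment is full")
        offset = self.next_free
        self.entries[key] = (offset, size)
        self.next_free += size
        return offset

    def lookup(self, key):
        """Return (offset, size) for `key`, or None if it is unknown."""
        return self.entries.get(key)
```

A process that attaches to the segment later only needs the agreed key to find its region, which is the role the commit message compares to ShmemIndex.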

13 Jan, 2014 7 commits

Code improvements for ALTER SYSTEM .. SET. · 05ff5062

Robert Haas authored Jan 13, 2014

Move FreeConfigVariables() later to make sure ErrorConfFile is valid
when we use it, and get rid of an unnecessary string copy operation.

Amit Kapila, kibitzed by me.

05ff5062

Make bitmap heap scans show exact/lossy block info in EXPLAIN ANALYZE. · 2bb1f14b

Robert Haas authored Jan 13, 2014

Etsuro Fujita

2bb1f14b

Fix possible buffer overrun in contrib/pg_trgm. · c3ccc9ee

Tom Lane authored Jan 13, 2014

Allow for the possibility that folding a string to lower case makes it
longer (due to replacing a character with a longer multibyte character).
This doesn't change the number of trigrams that will be extracted, but
it does affect the required size of an intermediate buffer in
generate_trgm(). Per bug #8821 from Ufuk Kayserilioglu.

Also install some checks that the input string length is not so large
as to cause overflow in the calculations of palloc request sizes.

Back-patch to all supported versions.

c3ccc9ee
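
The premise of commit c3ccc9ee, that lower-casing can lengthen a string in bytes, is easy to demonstrate. The sketch below uses Python's Unicode case mapping, which is an assumption of the example; PostgreSQL's own folding depends on the database locale and encoding. The helper name `lowercase_growth` is hypothetical.

```python
def lowercase_growth(text):
    """Return the UTF-8 byte length of `text` before and after lower-casing."""
    before = len(text.encode("utf-8"))
    after = len(text.lower().encode("utf-8"))
    return before, after


# U+0130 (LATIN CAPITAL LETTER I WITH DOT ABOVE) occupies 2 bytes in UTF-8,
# but its lower-case form is "i" followed by U+0307, which occupies 3 bytes.
growth = lowercase_growth("\u0130")
```

A buffer sized from the input length alone would therefore be one byte short for this character, which is the kind of overrun the commit guards against.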

Fix calculation of ISMN check digit. · 866a1f09

Heikki Linnakangas authored Jan 13, 2014

This has always been broken, so back-patch to all supported versions.

Fabien COELHO

866a1f09

Add OVERLAPS to index in the docs. · 04038148

Heikki Linnakangas authored Jan 13, 2014

Per report from Adam Mackler and Jonathan Katz

04038148

Always use the same way to address a descriptor in ecpg's regression tests. · 976a7d11

Michael Meskes authored Jan 13, 2014

976a7d11

Fix pg_dumpall on pre-8.1 servers · bb953ad1

Bruce Momjian authored Jan 12, 2014

rolname did not exist in pg_shadow.

Backpatch to 9.3

Report by Andrew Gierth via IRC

bb953ad1

12 Jan, 2014 1 commit

Disallow LATERAL references to the target table of an UPDATE/DELETE. · 158b7fa6

Tom Lane authored Jan 11, 2014

On second thought, commit 0c051c90 was
over-hasty: rather than allowing this case, we ought to reject it for now.
That leaves the field clear for a future feature that allows the target
table to be re-specified in the FROM (or USING) clause, which will enable
left-joining the target table to something else. We can then also allow
LATERAL references to such an explicitly re-specified target table.
But allowing them right now will create ambiguities or worse for such a
feature, and it isn't something we documented 9.3 as supporting.

While at it, add a convenience subroutine to avoid having several copies
of the ereport for disallowed-LATERAL-reference cases.

158b7fa6

11 Jan, 2014 7 commits

Fix possible crashes due to using elog/ereport too early in startup. · 910bac59

Tom Lane authored Jan 11, 2014

Per reports from Andres Freund and Luke Campbell, a server failure during
set_pglocale_pgservice results in a segfault rather than a useful error
message, because the infrastructure needed to use ereport hasn't been
initialized; specifically, MemoryContextInit hasn't been called.
One known cause of this is starting the server in a directory it
doesn't have permission to read.

We could try to prevent set_pglocale_pgservice from using anything that
depends on palloc or elog, but that would be messy, and the odds of future
breakage seem high. Moreover there are other things being called in main.c
that look likely to use palloc or elog too --- perhaps those things
shouldn't be there, but they are there today. The best solution seems to
be to move the call of MemoryContextInit to very early in the backend's
real main() function. I've verified that an elog or ereport occurring
immediately after that is now capable of sending something useful to
stderr.

I also added code to elog.c to print something intelligible rather than
just crashing if MemoryContextInit hasn't created the ErrorContext.
This could happen if MemoryContextInit itself fails (due to malloc
failure), and provides some future-proofing against someone trying to
sneak in new code even earlier in server startup.

Back-patch to all supported branches. Since we've only heard reports of
this type of failure recently, it may be that some recent change has made
it more likely to see a crash of this kind; but it sure looks like it's
broken all the way back.

910bac59

Revert fd2ace80 · d84c584e

Bruce Momjian authored Jan 11, 2014

Seems we want to document '=' plpgsql assignment instead.

d84c584e

Fix compute_scalar_stats() for case that all values exceed WIDTH_THRESHOLD. · 62865262

Tom Lane authored Jan 11, 2014

The standard typanalyze functions skip over values whose detoasted size
exceeds WIDTH_THRESHOLD (1024 bytes), so as to limit memory bloat during
ANALYZE.  However, we (I think I, actually :-() failed to consider the
possibility that *every* non-null value in a column is too wide.  While
compute_minimal_stats() seems to behave reasonably anyway in such a case,
compute_scalar_stats() just fell through and generated no pg_statistic
entry at all.  That's unnecessarily pessimistic: we can still produce
valid stanullfrac and stawidth values in such cases, since we do include
too-wide values in the average-width calculation.  Furthermore, since the
general assumption in this code is that too-wide values are probably all
distinct from each other, it seems reasonable to set stadistinct to -1
("all distinct").

Per complaint from Kadri Raudsepp.  This has been like this since roughly
neolithic times, so back-patch to all supported branches.

62865262
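
The fallback described in commit 62865262 can be modelled roughly as below. This is a simplified sketch, not the real ANALYZE code: the function name `summarize_column` is hypothetical, the normal-case distinct estimate is reduced to a plain count, and the all-NULL result is an assumption of the example.

```python
WIDTH_THRESHOLD = 1024  # bytes, as quoted in the commit message


def summarize_column(values):
    """Return toy stanullfrac/stawidth/stadistinct for a sample of bytes values."""
    total = len(values)
    if total == 0:
        return None
    non_null = [v for v in values if v is not None]
    stats = {"stanullfrac": (total - len(non_null)) / total}
    if not non_null:
        # Assumed all-NULL result for the sketch.
        stats.update(stawidth=0.0, stadistinct=0.0)
        return stats
    # Too-wide values still count toward the average width.
    stats["stawidth"] = sum(len(v) for v in non_null) / len(non_null)
    analyzable = [v for v in non_null if len(v) <= WIDTH_THRESHOLD]
    if not analyzable:
        # Every non-null value was too wide: assume they are all distinct.
        stats["stadistinct"] = -1.0
    else:
        # Simplification: the real estimator is more elaborate than a count.
        stats["stadistinct"] = float(len(set(analyzable)))
    return stats
```

The point of the fix is the `not analyzable` branch: the sample still yields a null fraction and an average width even when no value is narrow enough to be examined.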

docs: remove undocumented assign syntax in plpgsql examples · fd2ace80

Bruce Momjian authored Jan 11, 2014

Pavel Stehule

fd2ace80

Add another regression test cross-checking operator and function comments. · 28233ffa

Tom Lane authored Jan 11, 2014

Add a query that lists all the functions that are operator implementation
functions and have a SQL comment that doesn't just say "implementation of
XYZ operator". (Note that the preceding test checks that such functions'
comments exactly match the corresponding operators' comments.)

While it's not forbidden to add more functions to this list, that should
only be done when we're encouraging users to use either the function or
operator syntax for the functionality, which is a fairly rare situation.

28233ffa

Remove DESCR entries for json operator functions. · 11829ff8

Andrew Dunstan authored Jan 10, 2014

Per -hackers discussion.

11829ff8

Adjust pg_upgrade for move of username lookup functions to /common · 850ade3e

Bruce Momjian authored Jan 10, 2014

850ade3e

10 Jan, 2014 2 commits

Move username lookup functions from /port to /common · 111022ea

Bruce Momjian authored Jan 10, 2014

Per suggestion from Peter E and Alvaro

111022ea

Accept pg_upgraded tuples during multixact freezing · 423e1211

Alvaro Herrera authored Jan 10, 2014

The new MultiXact freezing routines introduced by commit 8e9a16ab8f7
neglected to consider tuples that came from a pg_upgrade'd database; a
vacuum run that tried to freeze such tuples would die with an error such
as
ERROR: MultiXactId 11415437 does no longer exist -- apparent wraparound

To fix, ensure that GetMultiXactIdMembers is allowed to return empty
multis when the infomask bits are right, as is done in other callsites.

Per trouble report from F-Secure.

In passing, fix a copy&paste bug reported by Andrey Karpov from VIVA64,
found with their PVS-Studio static checker: instead of setting relminmxid
to Invalid, we were setting relfrozenxid twice.  Not an important
mistake because that code branch is about relations for which we don't
use the frozenxid/minmxid values at all in the first place, but it seems to
warrant a fix nonetheless.

423e1211

09 Jan, 2014 7 commits

Remove unnecessary local variables to work around an icc optimization bug. · faab7a95

Tom Lane authored Jan 09, 2014

Buildfarm member dunlin has been crashing since commit 8b49a604, but other
machines seem fine with that code. It turns out that removing the local
variables in ordered_set_startup() that are copies of fields in "qstate"
dodges the problem. This might cost a few cycles on register-rich
machines, but it's probably a wash on others, and in any case this code
isn't performance-critical. Thanks to Jeremy Drake for off-list
investigation.

faab7a95

Changed regression test to ecpg test suite for alignment problem just with last · 192b4aac

Michael Meskes authored Jan 09, 2014

commit.

192b4aac

Fix descriptor output in ECPG. · d685e242

Michael Meskes authored Jan 09, 2014

While working on most platforms, the old way sometimes created alignment
problems. This should fix it. Also, the regression tests were updated to test for
the reported case.

Report and fix by MauMau <maumau307@gmail.com>

d685e242

Refactor checking whether we've reached the recovery target. · c945af80

Heikki Linnakangas authored Jan 09, 2014

Makes the replay loop slightly more readable, by separating the concerns of
whether to stop and whether to delay, and how to extract the timestamp from
a record.

This has the user-visible change that the timestamp of the last applied
record is now updated after actually applying it. Before, it was updated
just before applying it. That meant that pg_last_xact_replay_timestamp()
could return the timestamp of a commit record that is in process of being
replayed, but not yet applied. Normally the difference is small, but if
min_recovery_apply_delay is set, there could be a significant delay between
reading a record and applying it.

Another behavioral change is that if you recover to a restore point, we stop
after the restore point record, not before it. It makes no difference as far
as running queries on the server is concerned, as applying a restore point
record changes nothing, but if you examine the timeline history you will see
that the new timeline branched off just after the restore point record, not
before it. One practical consequence is that if you do PITR to the new
timeline, and set recovery target to the same named restore point again, it
will find and stop recovery at the same restore point. Conceptually, I think
it makes more sense to consider the restore point as part of the new
timeline's history than not.

In principle, setting the last-replayed timestamp before actually applying
the record was a bug all along, but it doesn't seem worth the risk to
backpatch, since min_recovery_apply_delay was only added in 9.4.

c945af80

pgcrypto: Make header files stand alone · 10a3b165

Peter Eisentraut authored Jan 09, 2014

pgp.h used to require including mbuf.h and px.h first.  Include those in
pgp.h, so that it can be used without prerequisites.  Remove mbuf.h
inclusions in .c files where mbuf.h features are not used
directly.  (px.h was always used.)

10a3b165

We don't need to include pg_sema.h in s_lock.h anymore. · 220b3433

Tom Lane authored Jan 08, 2014

Minor improvement to commit daa7527a:
s_lock.h no longer has any need to mention PGSemaphoreData, so we can
rip out the #include that supplies that. In a non-HAVE_SPINLOCKS
build, this doesn't really buy much since we still need the #include
in spin.h --- but everywhere else, this reduces #include footprint by
some trifle, and helps keep the different locking facilities separate.

220b3433

Fix "cannot accept a set" error when only some arms of a CASE return a set. · 080b7db7

Tom Lane authored Jan 08, 2014

In commit c1352052, I implemented an
optimization that assumed that a function's argument expressions would
either always return a set (ie multiple rows), or always not.  This is
wrong however: we allow CASE expressions in which some arms return a set
of some type and others just return a scalar of that type.  There may be
other examples as well.  To fix, replace the run-time test of whether an
argument returned a set with a static precheck (expression_returns_set).
This adds a little bit of query startup overhead, but it seems barely
measurable.

Per bug #8228 from David Johnston.  This has been broken since 8.0,
so patch all supported branches.

080b7db7

08 Jan, 2014 7 commits

Reduce the number of semaphores used under --disable-spinlocks. · daa7527a

Robert Haas authored Jan 08, 2014

Instead of allocating a semaphore from the operating system for every
spinlock, allocate a fixed number of semaphores (by default, 1024)
from the operating system and multiplex all the spinlocks that get
created onto them.  This could self-deadlock if a process attempted
to acquire more than one spinlock at a time, but since processes
aren't supposed to execute anything other than short stretches of
straight-line code while holding a spinlock, that shouldn't happen.

One motivation for this change is that, with the introduction of
dynamic shared memory, it may be desirable to create spinlocks that
last for less than the lifetime of the server.  Without this change,
attempting to use such facilities under --disable-spinlocks would
quickly exhaust any supply of available semaphores.  Quite apart
from that, it's desirable to contain the quantity of semaphores
needed to run the server simply on convenience grounds, since using
too many may make it harder to get PostgreSQL running on a new
platform, which is mostly the point of --disable-spinlocks in the
first place.

Patch by me; review by Tom Lane.

daa7527a
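
A minimal sketch of the multiplexing described in commit daa7527a: each new spinlock is assigned one slot in a fixed pool of semaphores. The round-robin assignment and the names `SpinlockPool`, `create_spinlock`, and `shares_semaphore` are assumptions for illustration, not the actual C code.

```python
NUM_SEMAPHORES = 1024  # default pool size quoted in the commit message


class SpinlockPool:
    """Toy model: hand out semaphore slots to spinlocks from a fixed pool."""

    def __init__(self, num_semaphores=NUM_SEMAPHORES):
        self.num_semaphores = num_semaphores
        self.created = 0  # number of spinlocks initialized so far

    def create_spinlock(self):
        """Return the semaphore slot backing a newly created spinlock."""
        slot = self.created % self.num_semaphores  # assumed round-robin
        self.created += 1
        return slot


def shares_semaphore(slot_a, slot_b):
    """True if two spinlocks are backed by the same semaphore."""
    return slot_a == slot_b
```

With a pool of 4, the first and fifth spinlocks land on the same slot. A process holding one and trying to take the other would block on a semaphore it already holds, which is the self-deadlock hazard the message mentions.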

Fix pause_at_recovery_target + recovery_target_inclusive combination. · 3739e5ab

Heikki Linnakangas authored Jan 08, 2014

If pause_at_recovery_target is set, recovery pauses *before* applying the
target record, even if recovery_target_inclusive is set. If you then
continue with pg_xlog_replay_resume(), it will apply the target record
before ending recovery. In other words, if you log in while it's paused
and verify that the database looks OK, ending recovery changes its state
again, possibly destroying data that you were trying to salvage with PITR.

Backpatch to 9.1, this has been broken since pause_at_recovery_target was
added.

3739e5ab

If multiple recovery_targets are specified, use the latest one. · 815d71de

Heikki Linnakangas authored Jan 08, 2014

The docs say that only one of recovery_target_xid, recovery_target_time, or
recovery_target_name can be specified. But the code actually did something
different, so that a name overrode time, and xid overrode both time and name.
Now the target specified last takes effect, whether it's an xid, time or
name.

With this patch, we still accept multiple recovery_target settings, even
though docs say that only one can be specified. It's a general property of
the recovery.conf file parser that if you specify the same option twice,
the last one takes effect, like with postgresql.conf.

815d71de
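
The behavior change in commit 815d71de can be contrasted in a small sketch. Settings are modelled as an ordered list of (name, value) pairs as they appear in the file; the function names and the sample values are hypothetical.

```python
TARGET_KEYS = (
    "recovery_target_xid",
    "recovery_target_time",
    "recovery_target_name",
)


def target_last_wins(settings):
    """New behavior: the recovery target specified last takes effect."""
    chosen = None
    for key, value in settings:
        if key in TARGET_KEYS:
            chosen = (key, value)
    return chosen


def target_old_priority(settings):
    """Old behavior: xid overrode name, and name overrode time."""
    found = {key: value for key, value in settings if key in TARGET_KEYS}
    for key in ("recovery_target_xid", "recovery_target_name",
                "recovery_target_time"):
        if key in found:
            return (key, found[key])
    return None


settings = [
    ("recovery_target_xid", "1234"),
    ("recovery_target_name", "before_migration"),
    ("recovery_target_time", "2014-01-08 12:00:00"),
]
```

For this input the old rule picks the xid regardless of position, while the new rule picks the time because it appears last.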

Avoid extra AggCheckCallContext() checks in ordered-set aggregates. · 847e46ab

Tom Lane authored Jan 08, 2014

In the transition functions, we don't really need to recheck this after the
first call. I had been feeling paranoid about possibly getting a non-null
argument value in some other context; but it's probably game over anyway
if we have a non-null "internal" value that's not what we are expecting.

In the final functions, the general convention in pre-existing final
functions seems to be that an Assert() is good enough, so do it like that
here too.

This seems to save a few tenths of a percent of overall query runtime,
which isn't much, but still it's just overhead if there's not a plausible
case where the checks would fire.

847e46ab

Save a few cycles in advance_transition_function(). · e6336b8b

Tom Lane authored Jan 08, 2014

Keep a pre-initialized FunctionCallInfoData in AggStatePerAggData, and
re-use that at each row instead of doing InitFunctionCallInfoData each
time. This saves only half a dozen assignments and maybe some stack
manipulation, and yet that seems to be good for a percent or two of the
overall query run time for simple aggregates such as count(*). The cost
is that the FunctionCallInfoData (which is about a kilobyte, on 64-bit
machines) stays allocated for the duration of the query instead of being
short-lived stack data. But we're already paying an equivalent space cost
for each regular FuncExpr or OpExpr node, so I don't feel bad about paying
it for aggregate functions. The code seems a little cleaner this way too,
since the number of things passed to advance_transition_function decreases.

e6336b8b

Fix bug in determining when recovery has reached consistency. · d59ff6c1

Heikki Linnakangas authored Jan 08, 2014

When starting WAL replay from an online checkpoint, the last replayed WAL
record variable was initialized using the checkpoint record's location, even
though the records between the REDO location and the checkpoint record had
not been replayed yet. That was noted as "slightly confusing" but harmless
in the comment, but in some cases, it fooled CheckRecoveryConsistency to
incorrectly conclude that we had already reached a consistent state
immediately at the beginning of WAL replay. That caused the system to accept
read-only connections in hot standby mode too early, and also PANICs with
message "WAL contains references to invalid pages".

Fix by initializing the variables to the REDO location instead.

In 9.2 and above, change CheckRecoveryConsistency() to use
lastReplayedEndRecPtr variable when checking if backup end location has
been reached. It was inconsistently using EndRecPtr for that check, but
lastReplayedEndRecPtr when checking min recovery point. It made no
difference before this patch, because in all the places where
CheckRecoveryConsistency was called the two variables were the same, but
it was always an accident waiting to happen, and would have been wrong
after this patch anyway.

Report and analysis by Tomonari Katsumata, bug #8686. Backpatch to 9.0,
where hot standby was introduced.

d59ff6c1
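
The effect of the initialization fix in commit d59ff6c1 can be modelled with plain integers standing in for WAL locations. This is a toy sketch: the function names, the `fixed` flag, and the single comparison used for the consistency check are simplifying assumptions, and the real check involves more conditions.

```python
def initial_last_replayed(redo_lsn, checkpoint_lsn, fixed=True):
    """Starting value for the last-replayed location when replay begins."""
    # Before the fix the checkpoint record's location was used, although
    # the records between REDO and the checkpoint had not been replayed.
    return redo_lsn if fixed else checkpoint_lsn


def reached_consistency(last_replayed_end, min_recovery_point):
    """Simplified check: replay has passed the minimum recovery point."""
    return last_replayed_end >= min_recovery_point


# Hypothetical locations: REDO < minimum recovery point < checkpoint record.
redo_lsn, min_recovery_point, checkpoint_lsn = 100, 150, 200
```

With these numbers the old initialization reports consistency before a single record has been replayed, while the fixed one correctly reports that replay still has to advance from 100 to 150.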

pg_upgrade: Fix fatal error handling · ca607b15

Peter Eisentraut authored Jan 08, 2014

Restore exiting when pg_log(PG_FATAL) is called directly instead of
calling pg_fatal().  Fault introduced in
264aa14a.

ca607b15

07 Jan, 2014 6 commits

Bruce Momjian authored Jan 07, 2014

Update all files in head, and files COPYRIGHT and legal.sgml in all back
branches.

7e04792a

Fix LATERAL references to target table of UPDATE/DELETE. · 0c051c90

Tom Lane authored Jan 07, 2014

I failed to think much about UPDATE/DELETE when implementing LATERAL :-(.
The implemented behavior ended up being that subqueries in the FROM or
USING clause (respectively) could access the update/delete target table as
though it were a lateral reference; which seems fine if they said LATERAL,
but certainly ought to draw an error if they didn't. Fix it so you get a
suitable error when you omit LATERAL. Per report from Emre Hasegeli.

0c051c90

Silence compiler warning on MSVC. · f68220df

Heikki Linnakangas authored Jan 07, 2014

MSVC doesn't know that elog(ERROR) doesn't return, and gives a warning about
missing return. Silence that.

Amit Kapila

f68220df

Move permissions check from do_pg_start_backup to pg_start_backup · 9544cc0d

Magnus Hagander authored Jan 07, 2014

And the same for do_pg_stop_backup. The code in do_pg_* is not allowed
to access the catalogs. For manual base backups, the permissions
check can be handled in the calling function, and for streaming
base backups only users with the required permissions can get past
the authentication step in the first place.

Reported by Antonin Houska, diagnosed by Andres Freund

9544cc0d

Avoid including tablespaces inside PGDATA twice in base backups · b168c5ef

Magnus Hagander authored Jan 07, 2014

If a tablespace was created inside PGDATA it was backed up both as part
of the PGDATA backup and as the backup of the tablespace. Avoid this
by skipping any directory inside PGDATA that contains one of the active
tablespaces.

Dimitri Fontaine and Magnus Hagander

b168c5ef
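
The skip rule from commit b168c5ef amounts to a path-containment test. A minimal sketch, assuming POSIX-style absolute paths and hypothetical helper names and directories:

```python
from pathlib import PurePosixPath


def is_inside(path, parent):
    """True if `path` lies strictly below `parent` in the directory tree."""
    return PurePosixPath(parent) in PurePosixPath(path).parents


def tablespaces_to_skip(pgdata, tablespace_dirs):
    """Tablespace directories that the PGDATA pass should not back up again."""
    return sorted(ts for ts in tablespace_dirs if is_inside(ts, pgdata))


skipped = tablespaces_to_skip(
    "/var/lib/pgsql/data",
    [
        "/var/lib/pgsql/data/ts1",   # inside PGDATA: would be archived twice
        "/mnt/fast/ts2",             # outside PGDATA
        "/var/lib/pgsql/database2",  # shares a string prefix, but is outside
    ],
)
```

Comparing path components rather than raw string prefixes keeps a sibling directory such as `/var/lib/pgsql/database2` from being mistaken for something inside PGDATA.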

Add more use of psprintf() · edc43458
Peter Eisentraut authored Jan 06, 2014

edc43458

06 Jan, 2014 1 commit
Remove bogus -K option from pg_dump. · 10a82cda
Heikki Linnakangas authored Jan 06, 2014
```
I added it to the getopt call by accident in commit
691e595d.

Amit Kapila
```
10a82cda