- 09 Jan, 2014 2 commits
-
-
Tom Lane authored
Minor improvement to commit daa7527a: s_lock.h no longer has any need to mention PGSemaphoreData, so we can rip out the #include that supplies that. In a non-HAVE_SPINLOCKS build, this doesn't really buy much since we still need the #include in spin.h --- but everywhere else, this reduces #include footprint by some trifle, and helps keep the different locking facilities separate.
-
Tom Lane authored
In commit c1352052, I implemented an optimization that assumed that a function's argument expressions would either always return a set (ie multiple rows), or always not. This is wrong however: we allow CASE expressions in which some arms return a set of some type and others just return a scalar of that type. There may be other examples as well.

To fix, replace the run-time test of whether an argument returned a set with a static precheck (expression_returns_set). This adds a little bit of query startup overhead, but it seems barely measurable.

Per bug #8228 from David Johnston. This has been broken since 8.0, so patch all supported branches.
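
To illustrate the idea, here is a minimal sketch of such a static precheck; the state struct and field names are invented for illustration, while expression_returns_set() is the real helper (declared in nodes/nodeFuncs.h):

    #include "postgres.h"
    #include "nodes/execnodes.h"
    #include "nodes/nodeFuncs.h"
    #include "nodes/pg_list.h"

    /* Illustrative state struct, not the actual executor node. */
    typedef struct ArgSetInfo
    {
        List *args;             /* list of ExprState nodes for the arguments */
        bool  args_may_set;     /* assumed flag: computed once at startup */
    } ArgSetInfo;

    static void
    classify_args(ArgSetInfo *info)
    {
        ListCell *lc;

        info->args_may_set = false;
        foreach(lc, info->args)
        {
            ExprState *argstate = (ExprState *) lfirst(lc);

            /* Static answer per argument; no per-row rechecking needed. */
            if (expression_returns_set((Node *) argstate->expr))
                info->args_may_set = true;
        }
    }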
-
- 08 Jan, 2014 7 commits
-
-
Robert Haas authored
Instead of allocating a semaphore from the operating system for every spinlock, allocate a fixed number of semaphores (by default, 1024) from the operating system and multiplex all the spinlocks that get created onto them. This could self-deadlock if a process attempted to acquire more than one spinlock at a time, but since processes aren't supposed to execute anything other than short stretches of straight-line code while holding a spinlock, that shouldn't happen.

One motivation for this change is that, with the introduction of dynamic shared memory, it may be desirable to create spinlocks that last for less than the lifetime of the server. Without this change, attempting to use such facilities under --disable-spinlocks would quickly exhaust any supply of available semaphores. Quite apart from that, it's desirable to contain the quantity of semaphores needed to run the server simply on convenience grounds, since using too many may make it harder to get PostgreSQL running on a new platform, which is mostly the point of --disable-spinlocks in the first place.

Patch by me; review by Tom Lane.
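
The multiplexing scheme can be sketched in isolation like this; the pointer hash and the use of raw POSIX semaphores are stand-ins (PostgreSQL goes through its own PGSemaphore layer), though the pool size matches the new default. Sharing a semaphore between two unrelated spinlocks can cause false contention but never deadlock, given the rule that a process holds at most one spinlock at a time:

    #include <semaphore.h>
    #include <stdint.h>

    #define NUM_SPINLOCK_SEMAPHORES 1024

    static sem_t sem_pool[NUM_SPINLOCK_SEMAPHORES];

    static void
    pool_init(void)
    {
        for (int i = 0; i < NUM_SPINLOCK_SEMAPHORES; i++)
            sem_init(&sem_pool[i], 1, 1);   /* process-shared, unlocked */
    }

    static sem_t *
    sem_for(volatile void *lock)
    {
        uintptr_t h = (uintptr_t) lock;

        h ^= h >> 7;                        /* cheap pointer hash */
        return &sem_pool[h % NUM_SPINLOCK_SEMAPHORES];
    }

    static void spin_acquire(volatile void *lock) { sem_wait(sem_for(lock)); }
    static void spin_release(volatile void *lock) { sem_post(sem_for(lock)); }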
-
Heikki Linnakangas authored
If pause_at_recovery_target is set, recovery pauses *before* applying the target record, even if recovery_target_inclusive is set. If you then continue with pg_xlog_replay_resume(), it will apply the target record before ending recovery. In other words, if you log in while it's paused and verify that the database looks OK, ending recovery changes its state again, possibly destroying data that you were trying to salvage with PITR.

Backpatch to 9.1; this has been broken since pause_at_recovery_target was added.
-
Heikki Linnakangas authored
The docs say that only one of recovery_target_xid, recovery_target_time, or recovery_target_name can be specified. But the code actually did something different, so that a name overrode time, and xid overrode both time and name. Now the target specified last takes effect, whether it's an xid, time or name.

With this patch, we still accept multiple recovery_target settings, even though the docs say that only one can be specified. It's a general property of the recovery.conf file parser that if you specify the same option twice, the last one takes effect, like with postgresql.conf.
-
Tom Lane authored
In the transition functions, we don't really need to recheck this after the first call. I had been feeling paranoid about possibly getting a non-null argument value in some other context; but it's probably game over anyway if we have a non-null "internal" value that's not what we are expecting. In the final functions, the general convention in pre-existing final functions seems to be that an Assert() is good enough, so do it like that here too. This seems to save a few tenths of a percent of overall query runtime, which isn't much, but still it's just overhead if there's not a plausible case where the checks would fire.
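
The resulting convention looks roughly like this hedged sketch (the aggregate and its state type are invented; the fmgr macros are the real ones):

    #include "postgres.h"
    #include "fmgr.h"

    typedef struct MyAggState { int64 count; } MyAggState;  /* illustrative */

    Datum
    my_agg_final(PG_FUNCTION_ARGS)
    {
        MyAggState *state;

        /* An Assert, not a run-time check: a wrong "internal" value here
         * means it's game over anyway. */
        Assert(!PG_ARGISNULL(0));
        state = (MyAggState *) PG_GETARG_POINTER(0);

        PG_RETURN_INT64(state->count);
    }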
-
Tom Lane authored
Keep a pre-initialized FunctionCallInfoData in AggStatePerAggData, and re-use that at each row instead of doing InitFunctionCallInfoData each time. This saves only half a dozen assignments and maybe some stack manipulation, and yet that seems to be good for a percent or two of the overall query run time for simple aggregates such as count(*).

The cost is that the FunctionCallInfoData (which is about a kilobyte, on 64-bit machines) stays allocated for the duration of the query instead of being short-lived stack data. But we're already paying an equivalent space cost for each regular FuncExpr or OpExpr node, so I don't feel bad about paying it for aggregate functions. The code seems a little cleaner this way too, since the number of things passed to advance_transition_function decreases.
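
The reuse pattern, reduced to a hedged sketch (the variable names and the driving loop are simplified stand-ins; the fmgr macros and FunctionCallInfoData fields are the pre-existing ones):

    #include "postgres.h"
    #include "fmgr.h"

    static Datum
    run_transition(FmgrInfo *transfn, Oid collation,
                   Datum transValue, bool *transIsNull,
                   Datum *newVals, bool *newNulls, int nrows)
    {
        FunctionCallInfoData fcinfo;
        int i;

        /* One-time setup instead of a per-row InitFunctionCallInfoData. */
        InitFunctionCallInfoData(fcinfo, transfn, 2, collation, NULL, NULL);

        for (i = 0; i < nrows; i++)
        {
            fcinfo.arg[0] = transValue;
            fcinfo.argnull[0] = *transIsNull;
            fcinfo.arg[1] = newVals[i];
            fcinfo.argnull[1] = newNulls[i];
            fcinfo.isnull = false;          /* reset; the callee may set it */

            transValue = FunctionCallInvoke(&fcinfo);
            *transIsNull = fcinfo.isnull;
        }
        return transValue;
    }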
-
Heikki Linnakangas authored
When starting WAL replay from an online checkpoint, the last replayed WAL record variable was initialized using the checkpoint record's location, even though the records between the REDO location and the checkpoint record had not been replayed yet. That was noted as "slightly confusing" but harmless in the comment; in some cases, however, it fooled CheckRecoveryConsistency into incorrectly concluding that we had already reached a consistent state immediately at the beginning of WAL replay. That caused the system to accept read-only connections in hot standby mode too early, and also PANICs with message "WAL contains references to invalid pages". Fix by initializing the variables to the REDO location instead.

In 9.2 and above, change CheckRecoveryConsistency() to use the lastReplayedEndRecPtr variable when checking if the backup end location has been reached. It was inconsistently using EndRecPtr for that check, but lastReplayedEndRecPtr when checking the min recovery point. It made no difference before this patch, because in all the places where CheckRecoveryConsistency was called the two variables were the same, but it was always an accident waiting to happen, and would have been wrong after this patch anyway.

Report and analysis by Tomonari Katsumata, bug #8686. Backpatch to 9.0, where hot standby was introduced.
-
Peter Eisentraut authored
Restore exiting when pg_log(PG_FATAL) is called directly instead of calling pg_fatal(). Fault introduced in 264aa14a.
-
- 07 Jan, 2014 6 commits
-
-
Bruce Momjian authored
Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
-
Tom Lane authored
I failed to think much about UPDATE/DELETE when implementing LATERAL :-(. The implemented behavior ended up being that subqueries in the FROM or USING clause (respectively) could access the update/delete target table as though it were a lateral reference; which seems fine if they said LATERAL, but certainly ought to draw an error if they didn't. Fix it so you get a suitable error when you omit LATERAL. Per report from Emre Hasegeli.
-
Heikki Linnakangas authored
MSVC doesn't know that elog(ERROR) doesn't return, and gives a warning about missing return. Silence that. Amit Kapila
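
The usual silencer is an unreachable return after the elog, as in this invented example:

    #include "postgres.h"   /* for elog() and ERROR */

    static int
    mode_to_flag(int mode)  /* illustrative function */
    {
        switch (mode)
        {
            case 0:
                return 1;
            case 1:
                return 2;
            default:
                elog(ERROR, "unrecognized mode: %d", mode);
                return 0;   /* not reached, but keeps MSVC quiet */
        }
    }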
-
Magnus Hagander authored
And the same for do_pg_stop_backup. The code in do_pg_* is not allowed to access the catalogs. For manual base backups, the permissions check can be handled in the calling function, and for streaming base backups only users with the required permissions can get past the authentication step in the first place. Reported by Antonin Houska, diagnosed by Andres Freund
-
Magnus Hagander authored
If a tablespace was created inside PGDATA it was backed up both as part of the PGDATA backup and as the backup of the tablespace. Avoid this by skipping any directory inside PGDATA that contains one of the active tablespaces. Dimitri Fontaine and Magnus Hagander
-
Peter Eisentraut authored
-
- 06 Jan, 2014 1 commit
-
-
Heikki Linnakangas authored
I added it to the getopt call by accident in commit 691e595d. Amit Kapila
-
- 05 Jan, 2014 1 commit
-
-
Tom Lane authored
The initial commit of ordered-set aggregates just did all the setup work afresh each time the aggregate function is started up. But in a GROUP BY query, the catalog lookups need not be repeated for each group, since the column datatypes and sort information won't change. When there are many small groups, this makes for a useful, though not huge, performance improvement. Per suggestion from Andrew Gierth. Profiling of these cases suggests that it might be profitable to avoid duplicate lookups within tuplesort startup as well; but changing the tuplesort APIs would have much broader impact, so I left that for another day.
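
The caching follows the familiar fn_extra idiom, sketched here with invented lookup fields (fn_extra and fn_mcxt are the real FmgrInfo members):

    #include "postgres.h"
    #include "fmgr.h"

    typedef struct OSALookups       /* illustrative cache contents */
    {
        Oid sortOperator;
        Oid eqOperator;
    } OSALookups;

    static OSALookups *
    get_lookups(FunctionCallInfo fcinfo)
    {
        OSALookups *lookups = (OSALookups *) fcinfo->flinfo->fn_extra;

        if (lookups == NULL)
        {
            /* First group in this query: do the catalog lookups once and
             * keep them in memory that lives as long as the FmgrInfo. */
            lookups = (OSALookups *)
                MemoryContextAllocZero(fcinfo->flinfo->fn_mcxt,
                                       sizeof(OSALookups));
            /* ... expensive catalog lookups would fill in the fields ... */
            fcinfo->flinfo->fn_extra = lookups;
        }
        return lookups;
    }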
-
- 04 Jan, 2014 3 commits
-
-
Tom Lane authored
Several previous commits have added columns to various \d queries without updating their translate_columns[] arrays, leading to potentially incorrect translations in NLS-enabled builds. Offenders include commit 89368676 (added prosecdef to \df+), c9ac00e6 (added description to \dc+) and 3b17efdf (added description to \dC+). Fix those cases back to 9.3 or 9.2 as appropriate.

Since this is evidently more easily missed than one would like, in HEAD also add an Assert that the supplied array is long enough. This requires an API change for printQuery(), so it seems inappropriate for back branches, but presumably all future changes will be tested in HEAD anyway.

In HEAD and 9.3, also clean up a whole lot of sloppiness in the emitted SQL for \dy (event triggers): lack of translatability due to failing to pass words-to-be-translated through gettext_noop(), inadequate schema qualification, and sloppy formatting resulting in unnecessarily ugly -E output.

Peter Eisentraut and Tom Lane, per bug #8702 from Sergey Burladyan
-
Tom Lane authored
The result is an int less than, equal to, or greater than zero, in the style of memcmp (and, in fact, exactly the output of memcmp in some cases). This comment previously said -1, 1, or 0, which was an overspecification, as noted by Emre Hasegeli. All of the existing callers appear to be fine with the actual behavior, so just fix the comment. In passing, improve infelicitous formatting of some call sites.
-
Tom Lane authored
classifyClauses was renamed to classifyConditions somewhere along the line, but this comment didn't get the memo. Ian Barwick
-
- 03 Jan, 2014 3 commits
-
-
Alvaro Herrera authored
Michael Paquier
-
Tom Lane authored
That's what I get for testing this on an older compiler.
-
Tom Lane authored
The PGSTAT_NUM_TABENTRIES macro should have been updated when new fields were added to struct PgStat_MsgTabstat in commit 64482890, but it wasn't. Fix that.

Also, add a static assertion that we didn't overrun the intended size limit on stats messages. This will not necessarily catch every mistake in computing the maximum array size for stats messages, but it will catch ones that have practical consequences. (The assertion in fact doesn't complain about the aforementioned error in PGSTAT_NUM_TABENTRIES, because that was not big enough to cause the array length to increase.)

No back-patch, as there's no actual bug in existing releases; this is just in the nature of future-proofing.

Mark Dilger and Tom Lane
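
The shape of the guard, shown self-contained with made-up sizes and field names (the real definitions live in pgstat.h): derive the array length from the payload budget, then statically check that the resulting struct still fits.

    #include <assert.h>

    #define MSG_PAYLOAD 952                     /* assumed size budget */

    typedef struct TabEntry { long counters[12]; } TabEntry;

    #define NUM_TABENTRIES \
        ((MSG_PAYLOAD - 4 * sizeof(int)) / sizeof(TabEntry))

    typedef struct MsgTabstat
    {
        int      hdr[4];                        /* stand-in header fields */
        TabEntry entries[NUM_TABENTRIES];
    } MsgTabstat;

    static_assert(sizeof(MsgTabstat) <= MSG_PAYLOAD,
                  "maximum stats message size exceeded");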
-
- 02 Jan, 2014 6 commits
-
-
Alvaro Herrera authored
Original users of slru.c were all producing 4-digit filenames, so that was all that code was prepared to handle. Changes to multixact.c in the course of commit 0ac5ad51 made pg_multixact/members create 5-digit filenames once a certain threshold was reached, which SlruScanDirectory wasn't prepared to deal with; in particular, 5-digit-name files were not removed during truncation. Change that routine to make it aware of those files, and have it process them just like any others.

Right now, some pg_multixact/members directories will contain a mixture of 4-char and 5-char filenames. A future commit is expected to fix things so that each slru.c user declares the correct maximum width for the files it produces, to avoid such unsightly mixtures.

Noticed while investigating bug #8673 reported by Serge Negodyuck.
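
The width change falls straight out of the hex formatting slru.c uses for segment file names (a minimum width of four digits, so larger segment numbers simply print wider), as this stand-alone illustration shows:

    #include <stdio.h>

    int
    main(void)
    {
        char name[16];

        snprintf(name, sizeof(name), "%04X", 0xFFFF);   /* "FFFF",  4 chars */
        puts(name);
        snprintf(name, sizeof(name), "%04X", 0x10000);  /* "10000", 5 chars */
        puts(name);
        return 0;
    }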
-
Alvaro Herrera authored
In the 9.2 code for extending multixact/members, the logic was very simple because the number of entries in a members page was a proper divisor of 2^32, and thus at 2^32 wraparound the logic for page switch was identical to that at any other page boundary. In commit 0ac5ad51 I failed to realize this and introduced code that was not able to go over the 2^32 boundary. Fix that by ensuring that when we reach the last page of the last segment we correctly zero the initial page of the initial segment, using correct uint32-wraparound-safe arithmetic.

Noticed while investigating bug #8673 reported by Serge Negodyuck, as diagnosed by Andres Freund.
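
The uint32-wraparound-safe idiom can be shown in isolation (a stand-alone sketch, not the multixact code itself): advance counters with plain unsigned arithmetic, and order them by the sign of their difference, which stays correct as long as the live range is under 2^31.

    #include <stdbool.h>
    #include <stdint.h>

    static inline uint32_t
    advance(uint32_t x, uint32_t n)
    {
        return x + n;                   /* wraps modulo 2^32 by definition */
    }

    static inline bool
    precedes(uint32_t a, uint32_t b)
    {
        return (int32_t) (a - b) < 0;   /* valid while live range < 2^31 */
    }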
-
Alvaro Herrera authored
In pg_multixact/members, relying on modulo-2^32 arithmetic for wraparound handling doesn't work all that well. Because we don't explicitly track wraparound of the allocation counter for members, it is possible that the "live" area exceeds 2^31 entries; trying to remove SLRU segments that are "old" according to the original logic might lead to removal of segments still in use. To fix, have the truncation routine use a tailored SlruScanDirectory callback that keeps track of the live area in actual use; that way, when the live range exceeds 2^31 entries, the oldest segments still live will not get removed prematurely.

This new SlruScanDir callback needs to take care not to remove segments that are "in the future": if new SLRU segments appear while the truncation is ongoing, make sure we don't remove them. This requires examination of shared memory state to recheck for false positives, but testing suggests that this doesn't cause a problem. The original coding didn't suffer from this pitfall because segments created while truncation is running are never considered to be removable.

Per Andres Freund's investigation of bug #8673 reported by Serge Negodyuck.
-
Robert Haas authored
We haven't wanted to do this in the past on the grounds that in rare cases the original xmin value will be needed for forensic purposes, but commit 37484ad2 removes that objection, so now we can. Per extensive discussion, among many people, on pgsql-hackers.
-
Tom Lane authored
Although these files get cleaned up if the test runs to completion, a failure partway through leaves trash all over the floor. The Makefile ought to be bright enough to get rid of it when you say "make clean".
-
Robert Haas authored
Michael Paquier
-
- 01 Jan, 2014 1 commit
-
-
Michael Meskes authored
When trying to connect to a given database, libecpg should not try using an empty hostname if no hostname was given.
-
- 30 Dec, 2013 4 commits
-
-
Tom Lane authored
CREATE EVENT TRIGGER forgot to mark the event trigger as a member of its extension, and pg_dump didn't pay any attention anyway when deciding whether to dump the event trigger. Per report from Moshe Jacobson. Given the obvious lack of testing here, it's rather astonishing that ALTER EXTENSION ADD/DROP EVENT TRIGGER work, but they seem to.
-
Tom Lane authored
Some recent patches seem not to have grasped the concept that the catalogs are described in alphabetical order.
-
Tom Lane authored
We don't need make_restrictinfo_from_bitmapqual() anymore at all. generate_bitmap_or_paths() doesn't need to be exported, and we can drop its rather klugy restriction_only flag.
-
Tom Lane authored
It's possible to extract a restriction OR clause from a join clause that has the form of an OR-of-ANDs, if each sub-AND includes a clause that mentions only one specific relation. While PG has been aware of that idea for many years, the code previously only did it if it could extract an indexable OR clause. On reflection, though, that seems a silly limitation: adding a restriction clause can be a win by reducing the number of rows that have to be filtered at the join step, even if we have to test the clause as a plain filter clause during the scan. This should be especially useful for foreign tables, where the change can cut the number of rows that have to be retrieved from the foreign server; but testing shows it can win even on local tables.

Per a suggestion from Robert Haas. As a heuristic, I made the code accept an extracted restriction clause if its estimated selectivity is less than 0.9, which will probably result in accepting extracted clauses just about always. We might need to tweak that later based on experience.

Since the code no longer has even a weak connection to Path creation, remove orindxpath.c and create a new file optimizer/util/orclauses.c.

There's some additional janitorial cleanup of now-dead code that needs to happen, but it seems like that's a fit subject for a separate commit.
-
- 29 Dec, 2013 3 commits
-
-
Kevin Grittner authored
There was an apparent attempt to limit the target database for pg_restore to version 7.1.0 or later. Due to a leading zero this was interpreted as an octal number, which allowed targets with version numbers down to 2.87.36. The lowest actual release above that was 6.0.0, so that was effectively the limit.

Since the success of the restore attempt will depend primarily on what statements were generated by the dump run, we don't want pg_restore trying to guess whether a given target should be allowed based on version number. Allow a connection to any version. Since it is very unlikely that anyone would be using a recent version of pg_restore to restore to a pre-6.0 database, this has little to no practical impact, but it makes the code less confusing to read.

Issue reported and initial patch suggestion from Joel Jacobson based on an article by Andrey Karpov reporting on issues found by PVS-Studio static code analyzer. Final patch based on analysis by Tom Lane. Back-patch to all supported branches.
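
The trap is easy to reproduce in isolation:

    #include <stdio.h>

    int
    main(void)
    {
        int intended = 70100;    /* version 7.1.0 encoded as decimal */
        int actual   = 070100;   /* same digits, leading zero: octal */

        printf("%d vs %d\n", intended, actual);   /* 70100 vs 28736 */
        return 0;
    }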
-
Tom Lane authored
Defining this symbol causes OS X 10.5 to use a buggy version of readdir(), which can sometimes fail with EINVAL if the previously-fetched directory entry has been deleted or renamed. In later OS X versions that bug has been repaired, but we still don't need the #define because it's on by default. So this is just an all-around bad idea, and we can do without it.
-
Peter Eisentraut authored
From: Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>
-
- 28 Dec, 2013 1 commit
-
-
Peter Eisentraut authored
-
- 27 Dec, 2013 2 commits
-
-
Andrew Dunstan authored
Instead of looking for characters that aren't valid in JSON numbers, we simply pass the output string through the JSON number parser, and if it fails, the string is quoted. This means among other things that money and domains over money will be quoted correctly and generate valid JSON. Fixes bug #8676 reported by Anderson Cristian da Silva. Backpatched to 9.2 where JSON generation was introduced.
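
A hedged sketch of such a validator (not the actual json.c code), checking against the RFC 4627 number grammar; anything that fails it gets quoted:

    #include <ctype.h>
    #include <stdbool.h>

    static bool
    is_valid_json_number(const char *s)
    {
        if (*s == '-')
            s++;
        if (!isdigit((unsigned char) *s))
            return false;
        if (*s == '0')
            s++;                        /* no leading zeros allowed */
        else
            while (isdigit((unsigned char) *s))
                s++;
        if (*s == '.')                  /* optional fraction */
        {
            s++;
            if (!isdigit((unsigned char) *s))
                return false;
            while (isdigit((unsigned char) *s))
                s++;
        }
        if (*s == 'e' || *s == 'E')     /* optional exponent */
        {
            s++;
            if (*s == '+' || *s == '-')
                s++;
            if (!isdigit((unsigned char) *s))
                return false;
            while (isdigit((unsigned char) *s))
                s++;
        }
        return *s == '\0';              /* must consume the whole string */
    }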
-
Kevin Grittner authored
The bug would only show up if the C sockaddr structure contained zero in the first byte for a valid address; otherwise it would fail to fail, which is probably why it went unnoticed for so long. Patch submitted by Joel Jacobson after seeing an article by Andrey Karpov in which he reports finding this through static code analysis using PVS-Studio. While I was at it I moved a definition of a local variable referenced in the buggy code to a more local context. Backpatch to all supported branches.
-