Commits · 882368e854b6f094f94aca292f390bbd9f44359b · Abuhujair Javed / Postgres FD Implementation

02 Nov, 2011 10 commits

Fix btree stop-at-nulls logic properly. · 882368e8

Tom Lane authored Nov 02, 2011

As pointed out by Naoya Anzai, my previous try at this was a few bricks
shy of a load, because I had forgotten that the initial-positioning logic
might not try to skip over nulls at the end of the index the scan will
start from. We ought to fix that, because it represents an unnecessary
inefficiency, but first let's get the scan-stop logic back to a safe
state. With this patch, we preserve the performance benefit requested
in bug #6278 for the case of scanning forward into NULLs (in a NULLS
LAST index), but the reverse case of scanning backward across NULLs
when there's no suitable initial-positioning qual is still inefficient.

882368e8

Update more comments about checkpoints being done by bgwriter · 750f70b0
Simon Riggs authored Nov 02, 2011

750f70b0

Reduce checkpoints and WAL traffic on low activity database server · 18fb9d8d

Simon Riggs authored Nov 02, 2011

Previously, we skipped a checkpoint if no WAL had been written since
last checkpoint, though this does not appear in user documentation.
As of now, we skip a checkpoint until we have written at least
enough WAL to switch to the next WAL file. This greatly reduces the
level of activity and number of WAL messages generated by a very
low activity server. This is safe because the purpose of a checkpoint
is to act as a starting place for a recovery, in case of crash.
This patch maintains minimal WAL volume for replay in case of crash,
thus maintaining very low crash recovery time.
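
For illustration only, the skip rule described above can be sketched in Python. This is not the xlog.c code: the function names are hypothetical, and the 16 MB segment size is an assumption made for the example.

```python
# Illustrative sketch only; names are hypothetical, not taken from xlog.c.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed WAL segment size in bytes


def segment_number(wal_position: int) -> int:
    """Return the index of the WAL segment file containing a byte position."""
    return wal_position // WAL_SEGMENT_SIZE


def should_skip_checkpoint(last_checkpoint_pos: int, current_insert_pos: int) -> bool:
    """Skip while the insert position is still in the last checkpoint's segment."""
    return segment_number(current_insert_pos) == segment_number(last_checkpoint_pos)
```

Under these assumptions, a nearly idle server that writes only a few bytes of WAL between checkpoint attempts stays in the same segment, so the checkpoint is skipped.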

18fb9d8d

Refactor xlog.c to create src/backend/postmaster/startup.c · 9aceb6ab

Simon Riggs authored Nov 02, 2011

Startup process now has its own dedicated file, just like all other
special/background processes. This reduces the role and size of xlog.c.

9aceb6ab

Derive oldestActiveXid at correct time for Hot Standby. · 86e33648

Simon Riggs authored Nov 02, 2011

There was a timing window between when oldestActiveXid was derived
and when it should have been derived that only shows itself under
heavy load. Move code around to ensure correct timing of derivation.
No change to StartupSUBTRANS() code, which is where this failed.

Bug report by Chris Redekop

86e33648

Start Hot Standby faster when initial snapshot is incomplete. · 10b7c686

Simon Riggs authored Nov 02, 2011

If the initial snapshot had overflowed, then we can start whenever
the latest snapshot is empty, not overflowed, or, as we did already,
when the xmin on the primary was higher than the xmax of our starting
snapshot, which proves we have full snapshot data.

Bug report by Chris Redekop

10b7c686

Remove spurious entry from missed catch while patch juggling · 2296e62a
Simon Riggs authored Nov 02, 2011

2296e62a
Fix timing of Startup CLOG and MultiXact during Hot Standby · f8409b39
Simon Riggs authored Nov 02, 2011
```
Patch by me, bug report by Chris Redekop, analysis by Florian Pflug
```
f8409b39

Initialize myProcLocks queues just once, at postmaster startup. · c2891b46

Robert Haas authored Nov 01, 2011

In assert-enabled builds, we assert during the shutdown sequence that
the queues have been properly emptied, and during process startup that
we are inheriting empty queues.  In non-assert enabled builds, we just
save a few cycles.

c2891b46

Preserve Var location information during flatten_join_alias_vars. · 391af9f7

Tom Lane authored Nov 01, 2011

This allows us to give correct syntax error pointers when complaining
about ungrouped variables in a join query with aggregates or GROUP BY.
It's pretty much irrelevant for the planner's use of the function, though
perhaps it might aid debugging sometimes.

391af9f7

01 Nov, 2011 9 commits

Fix race condition with toast table access from a stale syscache entry. · 08e261cb

Tom Lane authored Nov 01, 2011

If a tuple in a syscache contains an out-of-line toasted field, and we
try to fetch that field shortly after some other transaction has committed
an update or deletion of the tuple, there is a race condition: vacuum
could come along and remove the toast tuples before we can fetch them.
This leads to transient failures like "missing chunk number 0 for toast
value NNNNN in pg_toast_2619", as seen in recent reports from Andrew
Hammond and Tim Uckun.

The design idea of syscache is that access to stale syscache entries
should be prevented by relation-level locks, but that fails for at least
two cases where toasted fields are possible: ANALYZE updates pg_statistic
rows without locking out sessions that might want to plan queries on the
same table, and CREATE OR REPLACE FUNCTION updates pg_proc rows without
any meaningful lock at all.

The least risky fix seems to be an idea that Heikki suggested when we
were dealing with a related problem back in August: forcibly detoast any
out-of-line fields before putting a tuple into syscache in the first place.
This avoids the problem because at the time we fetch the parent tuple from
the catalog, we should be holding an MVCC snapshot that will prevent
removal of the toast tuples, even if the parent tuple is outdated
immediately after we fetch it. (Note: I'm not convinced that this
statement holds true at every instant where we could be fetching a syscache
entry at all, but it does appear to hold true at the times where we could
fetch an entry that could have a toasted field. We will need to be a bit
wary of adding toast tables to low-level catalogs that don't have them
already.) An additional benefit is that subsequent uses of the syscache
entry should be faster, since they won't have to detoast the field.

Back-patch to all supported versions. The problem is significantly harder
to reproduce in pre-9.0 releases, because of their willingness to flush
every entry in a syscache whenever the underlying catalog is vacuumed
(cf CatalogCacheFlushRelation); but there is still a window for trouble.

08e261cb

Clean up whitespace and indentation in parser and scanner files · 654e1f96
Peter Eisentraut authored Nov 01, 2011
```
These are not touched by pgindent, so clean them up a bit manually.
```
654e1f96
Comment changes to show bgwriter no longer performs checkpoints. · f3ebaad4
Simon Riggs authored Nov 01, 2011

f3ebaad4
Have checkpointer send stats once each processing loop. · 3ba18205
Simon Riggs authored Nov 01, 2011
```
Noted by Fujii Masao
```
3ba18205
Update pg_upgrade comment on missing 'postgres' database. · 09d1174e
Bruce Momjian authored Nov 01, 2011

09d1174e
Add new file for checkpointer.c · bf405ba8
Simon Riggs authored Nov 01, 2011

bf405ba8
Allow pg_upgrade to upgrade an old cluster that doesn't have a 'postgres' database. · a50d860a
Bruce Momjian authored Nov 01, 2011

a50d860a

Split work of bgwriter between 2 processes: bgwriter and checkpointer. · 806a2aee

Simon Riggs authored Nov 01, 2011

bgwriter is now a much less important process, responsible for page
cleaning duties only. checkpointer is now responsible for checkpoints
and so has a key role in shutdown. Later patches will correct doc
references to the now old idea that bgwriter performs checkpoints.
This has a beneficial effect on performance at high write rates, but it is
mainly a refactoring to more easily allow changes for power reduction, by
simplifying the previously tortuous code required to allow page
cleaning and checkpointing to time-slice in the same process.

Patch by me, Review by Dickson Guedes

806a2aee

Document that multiple LDAP servers can be specified · 589adb86
Magnus Hagander authored Nov 01, 2011

589adb86

31 Oct, 2011 1 commit

Stop btree indexscans upon reaching nulls in either direction. · 6980f817

Tom Lane authored Oct 31, 2011

The existing scan-direction-sensitive tests were overly complex, and
failed to stop the scan in cases where it's perfectly legitimate to do so.
Per bug #6278 from Maksym Boguk.

Back-patch to 8.3, which is as far back as the patch applies easily.
Doesn't seem worth sweating over a relatively minor performance issue in
8.2 at this late date. (But note that this was a performance regression
from 8.1 and before, so 8.2 is being left as an outlier.)
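
As a rough illustration of the stop-at-nulls idea (not the nbtree code), the sketch below scans a sorted list standing in for a NULLS LAST index. The function name is hypothetical, `None` stands in for SQL NULL, and the linear walk from the start is a simplification.

```python
def forward_scan(index_entries, lower_bound):
    """Scan a NULLS LAST index forward for entries >= lower_bound.

    Returns (matches, entries_visited). None stands in for SQL NULL.
    """
    matches = []
    visited = 0
    for value in index_entries:
        visited += 1
        if value is None:
            # Entries are sorted with NULLs last, so nothing after this point
            # can satisfy the comparison; stop the scan here.
            break
        if value >= lower_bound:
            matches.append(value)
    return matches, visited
```

With `[1, 5, 7, 9, None, None, None]` and a lower bound of 5, the scan visits five entries instead of all seven.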

6980f817

30 Oct, 2011 2 commits

Support more locale-specific formatting options in cash_out(). · 6743a878

Tom Lane authored Oct 30, 2011

The POSIX spec defines locale fields for controlling the ordering of the
value, sign, and currency symbol in monetary output, but cash_out only
supported a small subset of these options.  Fully implement p/n_sign_posn,
p/n_cs_precedes, and p/n_sep_by_space per spec.  Fix up cash_in so that
it will accept all these format variants.

Also, make sure that thousands_sep is only inserted to the left of the
decimal point, as required by spec.

Per bug #6144 from Eduard Kracmar and discussion of bug #6277.  This patch
includes some ideas from Alexander Lakhin's proposed patch, though it is
very different in detail.
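
To make the locale fields concrete, here is a simplified Python sketch of how sign position, symbol precedence, and spacing combine. It is not the cash.c code. The function and parameter names are hypothetical, only the negative-sign case is modeled, `sep_by_space` is treated as a plain on/off flag, and two fractional digits are assumed.

```python
def group_thousands(digits: str, sep: str) -> str:
    """Insert sep between groups of three digits, counting from the right."""
    groups = []
    while len(digits) > 3:
        groups.insert(0, digits[-3:])
        digits = digits[:-3]
    groups.insert(0, digits)
    return sep.join(groups)


def format_money(cents, symbol="$", thousands_sep=",", decimal_point=".",
                 cs_precedes=True, sep_by_space=False, sign_posn=1):
    """Format an integer number of cents using a subset of the locale fields."""
    negative = cents < 0
    whole, frac = divmod(abs(cents), 100)
    # Grouping applies only to the integer part, left of the decimal point.
    value = group_thousands(str(whole), thousands_sep) + decimal_point + f"{frac:02d}"
    sign = "-" if negative else ""
    if sign_posn == 3:      # sign immediately before the currency symbol
        sym = sign + symbol
    elif sign_posn == 4:    # sign immediately after the currency symbol
        sym = symbol + sign
    else:
        sym = symbol
    space = " " if sep_by_space else ""
    body = (sym + space + value) if cs_precedes else (value + space + sym)
    if not negative:
        return body
    if sign_posn == 0:      # parentheses around value and symbol
        return "(" + body + ")"
    if sign_posn == 1:      # sign before value and symbol
        return sign + body
    if sign_posn == 2:      # sign after value and symbol
        return body + sign
    return body
```

A trailing sign (`sign_posn=2`) is the kind of output cash_in needs to accept for dumped data to reload.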

6743a878

Further improvement of make_greater_string. · eb5834d5

Tom Lane authored Oct 30, 2011

Make sure that it considers all the possibilities that the old code did,
instead of trying only one possibility per character position. To keep the
runtime in bounds, instead tweak the character incrementers to not try
every possible multibyte character code. Remove unnecessary logic to
restore the old character value on failure. Additional comment and
formatting cleanup.

eb5834d5

29 Oct, 2011 4 commits

Update visibilitymap.c header comments. · fae54e4a
Robert Haas authored Oct 29, 2011
```
Recent work on index-only scans left this somewhat out of date.
```
fae54e4a

Fix assorted bogosities in cash_in() and cash_out(). · 7609239f

Tom Lane authored Oct 29, 2011

cash_out failed to handle multiple-byte thousands separators, as per bug
#6277 from Alexander Law.  In addition, cash_in didn't handle that either,
nor could it handle multiple-byte positive_sign.  Both routines failed to
support multiple-byte mon_decimal_point, which I did not think was worth
changing, but at least now they check for the possibility and fall back to
using '.' rather than emitting invalid output.  Also, make cash_in handle
trailing negative signs, which formerly it would reject.  Since cash_out
generates trailing negative signs whenever the locale tells it to, this
last omission represents a fail-to-reload-dumped-data bug.  IMO that
justifies patching this all the way back.

7609239f

Improve make_greater_string() with encoding-specific incrementers. · 78d523b6

Robert Haas authored Oct 29, 2011

This infrastructure doesn't in any way guarantee that the character
we produce will sort before the one we incremented; but it does at least
make it much more likely that we'll end up with something that is a valid
character, which improves our chances.

Kyotaro Horiguchi, with various adjustments by me.
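
The general idea of incrementing the last character and checking that the result is still valid can be sketched as follows. This is a minimal Python illustration, not the PostgreSQL implementation: the function name is hypothetical, UTF-8 is assumed as the encoding, and plain byte-wise comparison is assumed as the sort order.

```python
def make_greater_bytes(prefix: bytes):
    """Return a valid UTF-8 byte string that compares greater (byte-wise)
    than every string starting with prefix, or None if none is found."""
    work = bytearray(prefix)
    while work:
        last = work[-1]
        for candidate in range(last + 1, 256):
            work[-1] = candidate
            try:
                bytes(work).decode("utf-8")  # reject invalid sequences
            except UnicodeDecodeError:
                continue
            return bytes(work)
        # No valid increment at this position: drop the byte and retry
        # one position to the left.
        work.pop()
    return None
```

For example, `b"ab"` yields `b"ac"`, and `b"a\x7f"` falls back to `b"b"` because no valid UTF-8 string results from incrementing the final byte.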

78d523b6

Remove pg_upgrade dependency on the 'postgres' database existing in the new cluster. · 51eba98c
Bruce Momjian authored Oct 28, 2011
```
vacuumdb, used by pg_upgrade, still has this dependency.
```
51eba98c

28 Oct, 2011 9 commits

Allow hint bits to be set sooner for temporary and unlogged tables. · 53f1ca59

Robert Haas authored Oct 28, 2011

We need not wait until the commit record is durably on disk, because
in the event of a crash the page we're updating with hint bits will
be gone anyway. Per off-list report from Heikki Linnakangas, this
can significantly degrade the performance of unlogged tables; I was
able to show a 2x speedup from this patch on a pgbench run with scale
factor 15. In practice, this will mostly help small, heavily updated
tables, because on larger tables you're unlikely to run into the same
row again before the commit record makes it out to disk.

53f1ca59

Demote some sanity checks in BufferIsValid() to assertions. · b6335a3f
Robert Haas authored Oct 28, 2011
```
Testing reveals that this macro is a hot-spot for index-only-scans.
Per discussion with Tom Lane.
```
b6335a3f

Remove hard-coded "\connect postgres" from pg_dumpall. · deb15803

Robert Haas authored Oct 28, 2011

This doesn't appear to accomplish anything useful, and does make the
restore fail if the postgres database happens to have been dropped.

deb15803

De-parallelize ecpg build some more. · 74812624

Tom Lane authored Oct 28, 2011

Make sure ecpg/include/ is rebuilt before the other subdirectories,
so that ecpg_config.h is up to date. This is not likely to matter
during production builds, only development, so no back-patch.

74812624

Clarify that ORDER BY/FOR UPDATE can't malfunction at higher iso levels. · 9cf12dfd
Robert Haas authored Oct 28, 2011
```
Kevin Grittner
```
9cf12dfd
Change "and and" to "and". · 6c21105f
Robert Haas authored Oct 28, 2011
```
Report by Vik Reykja, patch by Kevin Grittner.
```
6c21105f
Clarify pg_upgrade error message that the 'postgres' database must exist in the old cluster. · 9846dcfb
Bruce Momjian authored Oct 28, 2011

9846dcfb
Update docs to point to the timezone library's new home at IANA. · ece12659
Tom Lane authored Oct 27, 2011
```
The recent unpleasantness with copyrights has accelerated a move that
was already in planning.
```
ece12659
Update pg_upgrade testing instructions. · 38f3c7c4
Bruce Momjian authored Oct 27, 2011

38f3c7c4

27 Oct, 2011 3 commits

Fix the number of lwlocks needed by the "fast path" lock patch. · cbf65509

Heikki Linnakangas authored Oct 27, 2011

It needs one lock per backend or auxiliary process; the need for a lock for
each aux process was not accounted for in NumLWLocks(). No-one noticed,
because the three locks needed for the three aux processes fit into the
few extra lwlocks we allocate for 3rd party modules that don't call
RequestAddinLWLocks() (NUM_USER_DEFINED_LWLOCKS, 4 by default).

cbf65509

Avoid recursion while processing ELSIF lists in plpgsql. · 051d1ba7

Tom Lane authored Oct 27, 2011

The original implementation of ELSIF in plpgsql converted the construct
into nested simple IF statements. This was prone to stack overflow with
long ELSIF lists, in two different ways. First, it's difficult to generate
the parsetree without using right-recursion in the bison grammar, and
that's prone to parser stack overflow since nothing can be reduced until
the whole list has been read. Second, we'd recurse during execution, thus
creating an unnecessary risk of execution-time stack overflow. Rewrite
so that the ELSIF list is represented as a flat list, scanned via iteration
not recursion, and generated through left-recursion in the grammar.
Per a gripe from Håvard Kongsgård.
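
The flat-list representation can be illustrated with a small Python sketch. The class and function names are hypothetical and do not correspond to plpgsql's data structures. The point is only that the arms are walked in a loop, so the depth of the call stack does not grow with the number of ELSIF arms.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class IfStmt:
    # The first (condition, body) pair is the IF arm; the rest are ELSIF arms.
    arms: List[Tuple[Callable[[], bool], str]]
    else_body: Optional[str] = None


def exec_if(stmt: IfStmt) -> Optional[str]:
    """Return the body of the first arm whose condition holds, else the ELSE body."""
    for condition, body in stmt.arms:  # iteration, not recursion
        if condition():
            return body
    return stmt.else_body
```

A statement with tens of thousands of arms is evaluated with the same stack depth as one with two arms.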

051d1ba7

Add simple script to check for right recursion in Bison grammars. · 756a4ed5

Tom Lane authored Oct 27, 2011

We should generally use left-recursion not right-recursion to parse lists.
Bison hasn't got any built-in way to check for this type of inefficiency,
and I didn't find anything on the net in a quick search, so I wrote a
little Perl script to do it. Add to src/tools/ so we don't have to
re-invent this wheel next time we wonder if we're doing anything stupid.

Currently, the only place that seems to need fixing is plpgsql's stmt_else
production, so the problem doesn't appear to be common enough to warrant
trying to include such a test in our standard build process. If we did
want to do that, we'd need a way to ignore some false positives, such as
a_expr := '-' a_expr
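
A check of this kind might look roughly like the Python sketch below. It is not the Perl script added by this commit. The grammar is a hypothetical dictionary rather than real Bison input, the token names are invented for the example, and only direct right recursion is detected.

```python
def find_right_recursion(grammar):
    """Report productions whose last symbol is their own left-hand side.

    grammar maps each nonterminal to a list of productions, where each
    production is a list of symbols. Productions that are also
    left-recursive are ignored.
    """
    findings = []
    for lhs, productions in grammar.items():
        for rhs in productions:
            if len(rhs) >= 2 and rhs[-1] == lhs and rhs[0] != lhs:
                findings.append((lhs, rhs))
    return findings


# Hypothetical toy grammar; token names are made up for the example.
toy_grammar = {
    "stmt_else": [[], ["K_ELSIF", "expr", "K_THEN", "body", "stmt_else"],
                  ["K_ELSE", "body"]],
    "item_list": [["item"], ["item_list", "','", "item"]],
    "a_expr": [["'-'", "a_expr"], ["a_expr", "'+'", "a_expr"]],
}
```

On the toy grammar this flags `stmt_else` and also the unary-minus rule for `a_expr`, which is the kind of false positive mentioned above.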

756a4ed5

26 Oct, 2011 2 commits

Typo fixes. · bf820136

Tom Lane authored Oct 26, 2011

expect -> except, noted by Andrew Dunstan.  Also, "cannot" seems more
readable here than "can not", per David Wheeler.

bf820136

Improve planner's ability to recognize cases where an IN's RHS is unique. · 3e4b3465

Tom Lane authored Oct 26, 2011

If the right-hand side of a semijoin is unique, then we can treat it like a
normal join (or another way to say that is: we don't need to explicitly
unique-ify the data before doing it as a normal join). We were recognizing
such cases when the RHS was a sub-query with appropriate DISTINCT or GROUP
BY decoration, but there's another way: if the RHS is a plain relation with
unique indexes, we can check if any of the indexes prove the output is
unique. Most of the infrastructure for that was there already in the join
removal code, though I had to rearrange it a bit. Per reflection about a
recent example in pgsql-performance.
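
The equivalence this relies on can be shown with a toy example. The Python sketch below uses hypothetical helper names and in-memory lists of rows and has nothing to do with the planner's code. It only demonstrates that a semijoin and a plain join return the same left-hand rows when the right-hand side is unique on the join key.

```python
def semijoin(left, right, key):
    """Return each left row at most once if any right row matches on key."""
    right_keys = {row[key] for row in right}
    return [row for row in left if row[key] in right_keys]


def inner_join_left_rows(left, right, key):
    """Return one copy of the left row per matching right row."""
    return [l for l in left for r in right if l[key] == r[key]]


left = [{"id": 1}, {"id": 2}, {"id": 3}]
right_unique = [{"id": 1}, {"id": 3}]
right_dupes = [{"id": 1}, {"id": 1}, {"id": 3}]
```

With `right_unique` both functions return the same rows. With `right_dupes` the plain join returns the row with `id` 1 twice, which is why the right-hand side would otherwise have to be made unique first.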

3e4b3465