Commits · 0563b4c0c3b7cc2323cfb63e11d723764e2d5f7d · Abuhujair Javed / Postgres FD Implementation

16 May, 2015 10 commits

First-draft release notes for 9.4.2 et al. · 0563b4c0

Tom Lane authored May 16, 2015

As usual, the release notes for older branches will be made by cutting
these down, but put them up for community review first.

0563b4c0

pg_upgrade: no need to check for matching float8_pass_by_value · 750ccaef
Bruce Momjian authored May 16, 2015
```
Report by Noah Misch
```
750ccaef
Fix docs typo · c65aa7a8
Tom Lane authored May 16, 2015
```
I don't think "respectfully" is what was meant here ...
```
c65aa7a8
More portability fixing for bipartite_match.c. · 26058bf0
Tom Lane authored May 16, 2015
```
<float.h> is required for isinf() on some platforms.  Per buildfarm.
```
26058bf0

pg_upgrade: force timeline 1 in the new cluster · 4c5e0600

Bruce Momjian authored May 16, 2015

Previously, this prevented promoted standby servers from being upgraded
because of a missing WAL history file.  (Timeline 1 doesn't need a
history file, and we don't copy WAL files anyway.)

Report by Christian Echerer(?), Alexey Klyukin

Backpatch through 9.0

4c5e0600

pg_upgrade: only allow template0 to be non-connectable · fb694d95

Bruce Momjian authored May 16, 2015

This patch causes pg_upgrade to error out during its check phase if:

(1) template0 is marked connectable
or
(2) any other database is marked non-connectable

This is done because, in the first case, the pg_dumpall --globals
restore would fail and take pg_upgrade down with it, and in the second
case, the database would not be restored, leading to data loss.
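
The two conditions above can be sketched as a small check. This is only an illustration in Python, not pg_upgrade's code: the function name is hypothetical, and the input is assumed to be (datname, datallowconn) pairs as they would be read from pg_database.

```python
def check_connectable(databases):
    """Return a list of error strings; an empty list means the check passes.

    `databases` is a list of (datname, datallowconn) pairs, a stand-in for
    rows read from pg_database (assumed input shape).
    """
    errors = []
    for name, allow_conn in databases:
        if name == "template0":
            # Case (1): template0 must stay non-connectable.
            if allow_conn:
                errors.append("template0 must not allow connections")
        elif not allow_conn:
            # Case (2): every other database must be connectable.
            errors.append(f"database {name!r} must allow connections")
    return errors
```

A cluster with a non-connectable template0 and connectable user databases yields an empty error list.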

Report by Matt Landry (1), Stephen Frost (2)

Backpatch through 9.0

fb694d95

Avoid direct use of INFINITY. · 12cc299c
Tom Lane authored May 15, 2015
```
It's not very portable.  Per buildfarm.
```
12cc299c
Add docs for tablesample system_time() · f941d033
Simon Riggs authored May 15, 2015

f941d033

Support GROUPING SETS, CUBE and ROLLUP. · f3d31185

Andres Freund authored May 16, 2015

This SQL-standard functionality allows aggregating data by several
different GROUP BY clauses at once. Each grouping set returns rows in
which the columns grouped by only in other sets are set to NULL.

This could previously be achieved by doing each grouping as a separate
query, conjoined by UNION ALLs. Besides being considerably more concise,
grouping sets will in many cases be faster, requiring only one scan over
the underlying data.
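
As a rough illustration of the semantics, the sketch below sums a value once per grouping set and fills columns outside the set with None (standing in for NULL). It mirrors the older UNION ALL formulation, with one pass over the rows per set, not the single-scan implementation. The function name and sample data are hypothetical, and an empty input produces no row for the empty set here, which differs from SQL.

```python
from collections import defaultdict

def grouping_sets(rows, columns, sets, value):
    """Sum `value` once per grouping set; columns outside the set become None."""
    out = []
    for gset in sets:
        totals = defaultdict(int)
        for row in rows:
            # Columns not in this grouping set are reported as None (NULL).
            key = tuple(row[c] if c in gset else None for c in columns)
            totals[key] += row[value]
        out.extend(key + (total,) for key, total in totals.items())
    return out

# Hypothetical sample data.
rows = [
    {"brand": "a", "size": "s", "qty": 1},
    {"brand": "a", "size": "m", "qty": 2},
    {"brand": "b", "size": "s", "qty": 4},
]
result = grouping_sets(rows, ["brand", "size"], [("brand",), ("size",), ()], "qty")
```

The result contains one row per distinct key in each of the three sets, including a grand-total row for the empty set.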

The current implementation of grouping sets only supports using sorting
for input. Individual sets that share a sort order are computed in one
pass. If there are sets that don't share a sort order, additional sort &
aggregation steps are performed. These additional passes are sourced by
the previous sort step, thus avoiding repeated scans of the source data.
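
To show why sets that share a sort order can be computed together, here is a minimal sketch of a single pass over input sorted by (a, b) that produces the totals for the sets (a, b), (a) and (), which is what ROLLUP(a, b) asks for. It is an illustration only, with a hypothetical function name and row layout, and does not reflect the executor's actual data structures.

```python
def rollup_one_pass(sorted_rows):
    """Compute (a, b), (a) and () totals in one pass.

    `sorted_rows` must be (a, b, value) tuples already sorted by (a, b).
    """
    out = []
    grand = 0
    cur_a = cur_b = None
    sum_a = sum_ab = 0
    started = False
    for a, b, v in sorted_rows:
        if started and (a, b) != (cur_a, cur_b):
            out.append((cur_a, cur_b, sum_ab))    # close the (a, b) group
            sum_ab = 0
            if a != cur_a:
                out.append((cur_a, None, sum_a))  # close the (a) group
                sum_a = 0
        cur_a, cur_b, started = a, b, True
        sum_ab += v
        sum_a += v
        grand += v
    if started:
        out.append((cur_a, cur_b, sum_ab))
        out.append((cur_a, None, sum_a))
    out.append((None, None, grand))               # the () group
    return out
```

A set such as (b) alone does not share this sort order, so it would need the additional sort and aggregation step described above.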

The code is structured in a way that adding support for purely using
hash aggregation or a mix of hashing and sorting is possible. Sorting
was chosen to be supported first, as it is the most generic method of
implementation.

Instead of, as in earlier versions of the patch, representing the
chain of sort and aggregation steps as full blown planner and executor
nodes, all but the first sort are performed inside the aggregation node
itself. This avoids the need to do some unusual gymnastics to handle
having to return aggregated and non-aggregated tuples from underlying
nodes, as well as having to shut down underlying nodes early to limit
memory usage. The optimizer still builds Sort/Agg nodes to describe each
phase; they're not part of the plan tree, but are instead additional
data for the aggregation node. They're a convenient and preexisting way
to describe aggregation and sorting. The first (and possibly only) sort
step is still performed as a separate execution step. That retains
similarity with existing group by plans, makes rescans fairly simple,
avoids very deep plans (leading to slow explains) and makes it easy to
skip the sorting step if the underlying data is sorted by other means.

A somewhat ugly side of this patch is having to deal with a grammar
ambiguity between the new CUBE keyword and the cube extension/functions
named cube (and rollup). To avoid breaking existing deployments of the
cube extension it has not been renamed, neither has cube been made a
reserved keyword. Instead precedence hacking is used to make GROUP BY
cube(..) refer to the CUBE grouping sets feature, and not the function
cube(). To actually group by a function cube(), unlikely as that might
be, the function name has to be quoted.

Needs a catversion bump because stored rules may change.

Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com

f3d31185

Add docs for tablesample system_rows() · 6e4415c6
Simon Riggs authored May 15, 2015

6e4415c6

15 May, 2015 30 commits

Update time zone data files to tzdata release 2015d. · 9d366c1f

Tom Lane authored May 15, 2015

DST law changes in Egypt, Mongolia, Palestine.
Historical corrections for Canada and Chile.
Revised zone abbreviation for America/Adak (HST/HDT not HAST/HADT).

9d366c1f

Add BRIN infrastructure for "inclusion" opclasses · b0b7be61

Alvaro Herrera authored May 15, 2015

This lets BRIN be used with R-Tree-like indexing strategies.

Also provided are operator classes for range types, box and inet/cidr.
The infrastructure provided here should be sufficient to create operator
classes for similar datatypes; for instance, opclasses for PostGIS
geometries should be doable, though we didn't try to implement one.

(A box/point opclass was also submitted, but we ripped it out before
commit because the handling of floating point comparisons in existing
code is inconsistent and would generate corrupt indexes.)

Author: Emre Hasegeli.  Cosmetic changes by me
Review: Andreas Karlsson

b0b7be61

Improve test for CONVERT() with GB18030 <-> UTF8. · 199f5973
Tom Lane authored May 15, 2015
```
Add a bit of coverage of high code points.

Arjen Nienhuis
```
199f5973

Move strategy numbers to include/access/stratnum.h · 26df7066

Alvaro Herrera authored May 15, 2015

For upcoming BRIN opclasses, it's convenient to have strategy numbers
defined in a single place.  Since there's nothing appropriate, create
it.  The StrategyNumber typedef now lives there, as well as existing
strategy numbers for B-trees (from skey.h) and R-tree-and-friends (from
gist.h).  skey.h is forced to include stratnum.h because of the
StrategyNumber typedef, but gist.h is not; extensions that currently
rely on gist.h for rtree strategy numbers might need to add a new
#include.
A few .c files can stop including skey.h and/or gist.h, which is a nice
side benefit.

Per discussion:
https://www.postgresql.org/message-id/20150514232132.GZ2523@alvh.no-ip.org

Authored by Emre Hasegeli and Álvaro.

(It's not clear to me why bootscanner.l has any #include lines at all.)

26df7066

SQLStandard feature T613 Sampling now Supported · 1e98fa0b
Simon Riggs authored May 15, 2015

1e98fa0b
Fix uninitialized variable. · 66493dd7
Tom Lane authored May 15, 2015
```
Per compiler warnings.
```
66493dd7
Tablesample method API docs · 910baf0a
Simon Riggs authored May 15, 2015
```
Petr Jelinek
```
910baf0a
Add to contrib/Makefile · df259759
Simon Riggs authored May 15, 2015

df259759
contrib/tsm_system_time · 56e121a5
Simon Riggs authored May 15, 2015

56e121a5
contrib/tsm_system_rows · 4d40494b
Simon Riggs authored May 15, 2015

4d40494b

TABLESAMPLE system_time(limit) · 149f6f15

Simon Riggs authored May 15, 2015

Contrib module implementing a tablesample method
that allows you to limit the sample by a hard time
limit.
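
A minimal sketch of the idea, assuming the limit is given in milliseconds and that whole blocks are read until the deadline passes. The function name, the `limit_ms` parameter and the injectable `clock` are hypothetical and exist only to make the sketch testable; the contrib module's actual block-selection strategy is not described here.

```python
import time

def system_time_sample(blocks, limit_ms, clock=time.monotonic):
    """Collect whole blocks until `limit_ms` milliseconds have elapsed."""
    deadline = clock() + limit_ms / 1000.0
    sample = []
    for block in blocks:
        # Stop before reading another block once the time budget is spent.
        if clock() >= deadline:
            break
        sample.extend(block)
    return sample
```

The sample size therefore depends on how fast blocks can be read, not on a fixed fraction of the table.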

Petr Jelinek

Reviewed by Michael Paquier, Amit Kapila and
Simon Riggs

149f6f15

TABLESAMPLE system_rows(limit) · 9689290f

Simon Riggs authored May 15, 2015

Contrib module implementing a tablesample method
that allows you to limit the sample by a hard row
limit.
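
A minimal sketch of a hard row limit on top of block-level sampling. It assumes blocks are visited in a random order and read whole until the limit is reached; that visiting order, the function name and the parameters are assumptions for illustration, not the module's documented behavior.

```python
import random

def system_rows_sample(blocks, limit, rng):
    """Read blocks in random order, stopping once `limit` rows are collected."""
    order = list(range(len(blocks)))
    rng.shuffle(order)
    sample = []
    for block_no in order:
        for row in blocks[block_no]:
            if len(sample) >= limit:
                return sample
            sample.append(row)
    # Fewer than `limit` rows exist: return everything.
    return sample
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable.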

Petr Jelinek

Reviewed by Michael Paquier, Amit Kapila and
Simon Riggs

9689290f

Extend GB18030 encoding conversion to cover full Unicode range. · 8d3e0906

Tom Lane authored May 15, 2015

Our previous code for GB18030 <-> UTF8 conversion only covered Unicode code
points up to U+FFFF, but the actual spec defines conversions for all code
points up to U+10FFFF. That would be rather impractical as a lookup table,
but fortunately there is a simple algorithmic conversion between the
additional code points and the equivalent GB18030 byte patterns. Make use
of the just-added callback facility in LocalToUtf/UtfToLocal to perform the
additional conversions.
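
The arithmetic for code points above U+FFFF can be sketched as follows. This is a Python illustration, not the C code in the patch: it relies on the fact that U+10000 maps to the four-byte sequence 0x90 0x30 0x81 0x30 and that subsequent code points count upward through the byte ranges 0x30..0x39 and 0x81..0xFE. The function name is hypothetical.

```python
def supplementary_to_gb18030(code_point):
    """Map a code point in U+10000..U+10FFFF to its four GB18030 bytes."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF are handled here")
    idx = code_point - 0x10000
    idx, b4 = divmod(idx, 10)    # fourth byte counts 0x30..0x39
    idx, b3 = divmod(idx, 126)   # third byte counts 0x81..0xFE
    b1, b2 = divmod(idx, 10)     # second byte counts 0x30..0x39
    return bytes([0x90 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])
```

The output can be cross-checked against Python's built-in `gb18030` codec, for example by comparing with `chr(0x10FFFF).encode("gb18030")`.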

Having created the infrastructure to do that, we can use the same code to
map certain linearly-related subranges of the Unicode space below U+FFFF,
allowing removal of the corresponding lookup table entries. This more
than halves the lookup table size, which is a substantial savings;
utf8_and_gb18030.so drops from nearly a megabyte to about half that.

In support of doing that, replace ISO10646-GB18030.TXT with the data file
gb-18030-2000.xml (retrieved from
http://source.icu-project.org/repos/icu/data/trunk/charset/data/xml/ )
in which these subranges have been deleted from the simple lookup entries.

Per bug #12845 from Arjen Nienhuis. The conversion code added here is
based on his proposed patch, though I whacked it around rather heavily.

8d3e0906

doc: CREATE FOREIGN TABLE now allows CHECK ( ... ) NO INHERIT · 92edba26
Robert Haas authored May 15, 2015
```
Etsuro Fujita
```
92edba26

TABLESAMPLE, SQL Standard and extensible · f6d208d6

Simon Riggs authored May 15, 2015

Add a TABLESAMPLE clause to SELECT statements that allows
the user to specify random BERNOULLI sampling or block-level
SYSTEM sampling. The implementation allows extensible
sampling functions to be written using a standard API.
The basic version follows the SQL Standard exactly. Usable
concrete use cases for the sampling API follow in later
commits.
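
The difference between the two methods can be sketched as follows, modelling a table as a list of blocks that each hold a list of rows. This is an illustration under that assumption, with hypothetical function names; it is not the sampling API added by the patch.

```python
import random

def bernoulli_sample(blocks, percent, rng):
    """Row-level sampling: each row is kept independently."""
    p = percent / 100.0
    return [row for block in blocks for row in block if rng.random() < p]

def system_sample(blocks, percent, rng):
    """Block-level sampling: each block is kept or skipped as a whole."""
    p = percent / 100.0
    return [row for block in blocks if rng.random() < p for row in block]
```

BERNOULLI has to look at every row, while SYSTEM can skip entire blocks, at the cost of a sample whose rows are clustered by block.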

Petr Jelinek

Reviewed by Michael Paquier and Simon Riggs

f6d208d6

Silence another create_index regression test failure. · 11a83bbe

Heikki Linnakangas authored May 15, 2015

More platform differences in the less-significant digits in output.

Per buildfarm member rover_firefly, still.

11a83bbe

Fix outdated src/test/mb/ tests, and add a GB18030 test. · 07af5238

Tom Lane authored May 15, 2015

The expected-output files for these tests were broken by the recent
addition of a warning for hash indexes. Update them.

Also add a test case for GB18030 encoding, similar to the other ones.
This is a pretty weak test, but it's better than nothing.

07af5238

Fix docs build. Oops. · 8b0f105d
Heikki Linnakangas authored May 15, 2015

8b0f105d

Add archive_mode='always' option. · ffd37740

Heikki Linnakangas authored May 15, 2015

In 'always' mode, the standby independently archives all files it receives
from the primary.

Original patch by Fujii Masao, docs and review by me.

ffd37740

docs: consistently uppercase index method and add spacing · f6d65f0c

Bruce Momjian authored May 15, 2015

Consistently uppercase index method names, e.g. GIN, and add a space
between the index method name and the parentheses enclosing the column names.

f6d65f0c

Silence create_index regression test failure. · 9feaba28

Heikki Linnakangas authored May 15, 2015

The expected output contained some floating point values which might get
rounded slightly differently on different platforms. The exact output isn't
very interesting in this test, so just round it.

Per buildfarm member rover_firefly.

9feaba28

Fix datatype confusion with the new lossy GiST distance functions. · 98edd617

Heikki Linnakangas authored May 15, 2015

We can only support a lossy distance function when the distance function's
datatype is comparable with the original ordering operator's datatype.
The distance function always returns a float8, so we are limited to float8,
and float4 (by a hard-coded cast of the float8 to float4).

In light of this limitation, it seems like a good idea to have a separate
'recheck' flag for the ORDER BY expressions, so that if you have a non-lossy
distance function, it still works with lossy quals. There are no cases like
that with the built-in or contrib opclasses, but it's plausible.

There was a hidden assumption that the ORDER BY values returned by GiST
match the original ordering operator's return type, but there are plenty
of examples where that's not true, e.g. in btree_gist and pg_trgm. As long
as the distance function is not lossy, we can tolerate that and just not
return the distance to the executor (or rather, always return NULL). The
executor doesn't need the distances if there are no lossy results.

There was another little bug: the recheck variable was not initialized
before calling the distance function. That revealed the bigger issue,
as the executor tried to reorder tuples that didn't need reordering, and
that failed because of the datatype mismatch.

98edd617

Fix insufficiently-paranoid GB18030 encoding verifier. · a868931f

Tom Lane authored May 15, 2015

The previous coding effectively only verified that the second byte of a
multibyte character was in the expected range; moreover, it wasn't careful
to make sure that the second byte even exists in the buffer before touching
it.  The latter seems unlikely to cause any real problems in the field
(in particular, it could never be a problem with null-terminated input),
but it's still a bug.

Since GB18030 is not a supported backend encoding, the only thing we'd
really be doing with GB18030 text is converting it to UTF8 in LocalToUtf,
which would fail anyway on any invalid character for lack of a match in
its lookup table.  So the only user-visible consequence of this change
should be that you'll get "invalid byte sequence for encoding" rather than
"character has no equivalent" for malformed GB18030 input.  However,
impending changes to the GB18030 conversion code will require these tighter
up-front checks to avoid producing bogus results.
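
A minimal sketch of what a stricter check involves, based on the GB18030 byte ranges (one byte below 0x80; two bytes with a lead byte in 0x81..0xFE and a second byte in 0x40..0x7E or 0x80..0xFE; four bytes with the second and fourth bytes in 0x30..0x39 and the third in 0x81..0xFE). It is written in Python for illustration and is not the backend's verifier; the function name and the -1 return convention are assumptions.

```python
def gb18030_char_len(buf, pos=0):
    """Return the length of the valid GB18030 character at buf[pos], or -1."""
    remaining = len(buf) - pos
    if remaining < 1:
        return -1
    b1 = buf[pos]
    if b1 < 0x80:
        return 1
    # Check that a second byte exists before touching it.
    if not 0x81 <= b1 <= 0xFE or remaining < 2:
        return -1
    b2 = buf[pos + 1]
    if 0x40 <= b2 <= 0xFE and b2 != 0x7F:
        return 2
    if 0x30 <= b2 <= 0x39:
        if remaining < 4:
            return -1
        b3, b4 = buf[pos + 2], buf[pos + 3]
        if 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39:
            return 4
    return -1
```

Every byte of the character is range-checked, and the buffer length is checked before each read, which covers both weaknesses described above.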

a868931f

Remove useless pg_audit.conf · aff27e33

Stephen Frost authored May 15, 2015

No need to have pg_audit.conf any longer since the regression tests are
just loading the module at the start of each session (to simulate being
in shared_preload_libraries, which isn't something we can actually make
happen on the buildfarm itself, it seems).

Pointed out by Tom

aff27e33

Support --verbose option in reindexdb. · 458a0770
Fujii Masao authored May 15, 2015
```
Sawada Masahiko, reviewed by Fabrízio Mello
```
458a0770

Allow GiST distance function to return merely a lower-bound. · 35fcb1b3

Heikki Linnakangas authored May 15, 2015

The distance function can now set *recheck = true, like index quals. The
executor will then re-check the ORDER BY expressions, and use a queue to
reorder the results on the fly.

This makes it possible to do kNN-searches on polygons and circles, which
don't store the exact value in the index, but just a bounding box.
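
The reordering step can be sketched with a priority queue. The sketch assumes the index returns items in ascending order of a lower-bound distance, and that the exact distance of an item is never smaller than its lower bound. Names are hypothetical and the code is an illustration in Python, not the executor's implementation.

```python
import heapq

def reorder_by_exact_distance(index_stream, exact_distance):
    """Yield items in exact-distance order.

    `index_stream` yields (lower_bound, item) pairs in ascending lower-bound
    order, where lower_bound <= exact_distance(item).
    """
    queue = []  # min-heap of (exact distance, arrival number, item)
    for seq, (lower_bound, item) in enumerate(index_stream):
        # A queued item whose exact distance does not exceed the current
        # lower bound cannot be beaten by this or any later item.
        while queue and queue[0][0] <= lower_bound:
            yield heapq.heappop(queue)[2]
        heapq.heappush(queue, (exact_distance(item), seq, item))
    # The index is exhausted: drain the queue in exact-distance order.
    while queue:
        yield heapq.heappop(queue)[2]
```

For a bounding-box index, the lower bound would be the distance to the box and the exact distance would be recomputed from the stored polygon or circle.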

Alexander Korotkov and me

35fcb1b3

Support VERBOSE option in REINDEX command. · ecd222e7

Fujii Masao authored May 15, 2015

When this option is specified, a progress report is printed as each index
is reindexed.

Per discussion, we agreed on the following syntax for the extensibility of
the options.

    REINDEX (flexible options) { INDEX | ... } name

Sawada Masahiko.
Reviewed by Robert Haas, Fabrízio Mello, Alvaro Herrera, Kyotaro Horiguchi,
Jim Nasby and me.

Discussion: CAD21AoA0pK3YcOZAFzMae+2fcc3oGp5zoRggDyMNg5zoaWDhdQ@mail.gmail.com

ecd222e7

Honor traditional SGML NAMELEN limit. · 4b8f797f

Tom Lane authored May 14, 2015

We've conformed to this limit in the past, so might as well continue to.

Aaron Swenson

4b8f797f

Teach UtfToLocal/LocalToUtf to support algorithmic encoding conversions. · 7730f48e

Tom Lane authored May 14, 2015

Until now, these functions have only supported encoding conversions using
lookup tables, which is fine as long as there aren't too many code points
to convert. However, GB18030 expects all 1.1 million Unicode code points
to be convertible, which would require a ridiculously-sized lookup table.
Fortunately, a large fraction of those conversions can be expressed through
arithmetic, i.e. the conversions are one-to-one in certain defined ranges.
To support that, provide a callback function that is used after consulting
the lookup tables. (This patch doesn't actually change anything about the
GB18030 conversion behavior; it just provides infrastructure for fixing it.)
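
The "table first, then callback" order can be sketched as below. This is a Python illustration of the control flow only: the function names, the toy table and the toy callback are hypothetical and do not correspond to the real UtfToLocal/LocalToUtf signatures or to any real conversion data.

```python
def convert(code_point, table, fallback=None):
    """Look up `code_point` in `table`; if absent, ask the optional callback."""
    if code_point in table:
        return table[code_point]
    if fallback is not None:
        result = fallback(code_point)
        if result is not None:
            return result
    raise ValueError(f"character U+{code_point:04X} has no equivalent")

# Toy data for illustration only; not a real conversion table.
toy_table = {0x263A: b"\x01\x02"}

def toy_linear_range(code_point):
    """Hypothetical callback: handle one linear range arithmetically."""
    if 0x1000 <= code_point <= 0x10FF:
        return bytes([0xF0, code_point - 0x1000])
    return None  # not covered by this callback
```

Keeping the callback optional means conversions that need only lookup tables are unaffected.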

Since this requires changing the APIs of UtfToLocal/LocalToUtf anyway,
take the opportunity to rearrange their argument lists into what seems
to me a saner order. And beautify the call sites by using lengthof()
instead of error-prone sizeof() arithmetic.

In passing, also mark all the lookup tables used by these calls "const".
This moves an impressive amount of stuff into the text segment, at least
on my machine, and is safer anyhow.

7730f48e

Separate block sampling functions · 83e176ec

Simon Riggs authored May 15, 2015

Refactoring ahead of tablesample patch

Requested and reviewed by Michael Paquier

Petr Jelinek

83e176ec